Principal Software Engineer, DevOps (Full Time)
Favor
Favor’s Engineering team is responsible for the complex systems that make high-touch logistics happen in real time. This includes finding the perfect Runner (that’s what we call our delivery drivers), managing the communication between customers and Runners, keeping thousands of mobile applications in sync, and more. We are seeking a Principal DevOps Engineer to drive our cloud and configuration management and our build, deployment, and monitoring platforms.
As a Principal DevOps Engineer, you will apply our company goals to our technology. Along with a team of other motivated engineers, you will ensure world-class performance, efficiency, change management, monitoring, capacity planning, and emergency response capabilities. Your ultimate goal is to engineer operationally efficient and performant solutions, increase system observability, minimize human interactions with production systems, accelerate customer value delivery, and communicate those best practices to others.
You will work closely with the Engineering, Quality, Data, and Product Engineering teams to help define how we build, test, and ship our products at scale. You must be a self-starter who thrives in a fast-paced, agile environment and show an eagerness to learn and introduce new technologies as the need arises. Most importantly, we need a leader who can prioritize, multitask, and deliver scalable solutions.
What you'll do:
- Contribute to and architect the vision for DevOps at Favor, regularly meeting with engineers to roadmap and execute strategic initiatives to improve performance, throughput, and quality across the engineering organization.
- Create infrastructure-as-code that is scalable, performant, reliable, and secure.
- Implement and manage a containerized microservices infrastructure, delivering continuous integration and continuous delivery for new applications on Amazon Web Services (AWS).
- Maintain monitoring and alerting systems for Favor’s production services through Grafana stacks and CloudWatch.
- Assist in developing, maintaining, and improving compliant systems in PCI and SOC 2 environments.
- Build, test, and maintain AWS backup and disaster recovery systems.
- Monitor performance of production systems, give recommendations for enhancing performance, and assist in implementation.
- Improve the development pipeline from local development to production.
- Implement, maintain, and test a disaster recovery plan.
- Engage and nurture development teams so they can maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
- Share an on-call rotation and be an escalation contact for service incidents.
Skills you have:
- Deep understanding of version control systems (Git), including branching and merging strategies.
- 10+ years of experience in DevOps, with a recent focus on Kubernetes infrastructure.
- 8+ years of experience working with microservices and Service-Oriented Architectures (SOA).
- 8+ years of experience with Amazon Web Services.
- 8+ years of experience in logging, metrics, monitoring, and alerting, preferably with tools like OpsGenie, CloudWatch, Grafana.
- Must be comfortable working in a Linux/Unix environment.
- At-scale experience with containers and container orchestration platforms such as Docker and Kubernetes.
- Experience with automation/configuration management (Terraform, CloudFormation, CDK).
- A detail-oriented, organized thought process and the ability to act decisively under stressful conditions.
- An understanding of system optimization issues.
- The ability to work well with others to solve problems.
- Self-motivation and excellent communication skills.
Who you are:
- You understand lean and agile software development principles and help up-level the Engineering team in these areas.
- You are an expert at defining and communicating technical solutions and strategies.
- You are a force multiplier who can move an Engineering team forward through direct contributions and influence.
- You enjoy working with other engineers in a collaborative and iterative environment.
- You have experience scaling systems and teams at high-growth startups or medium-sized companies.
- You communicate well with technical and non-technical stakeholders.
- You are a true full-stack engineer who can navigate and advise in all areas of the software lifecycle, including design, development, deployment, debugging, monitoring, and support.