Role overview
- Enjoy seeking out and addressing bottlenecks and areas for performance improvement in our systems.
- Utilize Infrastructure as Code (IaC) principles to automate infrastructure provisioning and configuration management.
- Are experienced in collaborating with cross-functional teams to ensure that reliability and scalability are considered in the design and development of new features and services.
- Have a track record of accelerating engineering reliability by empowering your fellow engineers with excellent tooling and systems.
- Help create a diverse, equitable, and inclusive culture that makes all feel welcome while enabling radical candor and the challenging of groupthink.
- Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed.
- Own problems end-to-end, and are willing to pick up whatever knowledge you're missing to get the job done.
What we're looking for
- Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent work experience).
- Proven experience as a reliability engineer or in a similar role at a fast-paced, rapidly scaling company.
- Strong proficiency in cloud infrastructure.
- Proficiency in programming/scripting languages.
- Experience with containerization technologies and container orchestration platforms like Kubernetes.
- Knowledge of IaC tools such as Terraform or CloudFormation.
- Excellent problem-solving and troubleshooting skills.
- Strong communication and collaboration skills.
- Experience with observability tools such as Datadog, Prometheus, Grafana, Splunk, and the ELK stack.
- Experience with microservices architecture and service mesh technologies.
- Knowledge of security best practices in cloud environments.
This role is exclusively based in our San Francisco HQ. We offer relocation assistance to new employees.