Staff Site Reliability Engineer - (Infra)
Hybrid · Bengaluru, Karnataka, India
Job Summary
The Staff Site Reliability Engineer (Infra) at Okta builds and operates highly scalable, reliable, and secure infrastructure across AWS and GCP. You will lead reliability and modernization initiatives, including container migrations (ECS to EKS/GKE) and microservice enablement, and serve as a technical authority on Kubernetes, cloud infrastructure, and CI/CD practices. You will also design infrastructure as code with Terraform/Ansible, drive observability and cost improvements, define SLOs/SLIs, conduct blameless postmortems, mentor engineers, collaborate with security and compliance teams, and participate in on-call rotations. Requirements include 8+ years in SRE/DevOps, 3–5 years with Kubernetes (EKS/GKE) and multi-cloud, Terraform experience, Python/Go coding, and a Bachelor’s degree in CS or equivalent.
Required Qualifications
- Bachelor’s degree in Computer Science or equivalent hands-on experience
- 8+ years in SRE, DevOps, or Infrastructure Engineering roles
- 3–5 years of experience with Kubernetes (EKS/GKE) and related ecosystem tools (Helm, Karpenter, etc.) in production
- 3–5 years of experience with AWS and GCP
- 3–5 years using Terraform to manage multi-cloud infrastructure
- 5+ years of coding experience in Python, Go, or similar languages
- Proven track record leading high-impact projects, specifically migration projects (ECS → EKS/GKE) and enabling microservice architectures
- Experience implementing SLOs/SLIs, performing root cause analyses, and improving operational resilience
- Prior work in SaaS or high-scale, cloud-native environments is a strong plus
- Strong Linux and security fundamentals