Site Reliability Engineer
$114,000–$148,000 per year
Remote · United States
Job Summary
The Site Reliability Engineer is responsible for the reliability, performance, and availability of the platform. The role focuses on designing, implementing, and monitoring scalable, secure cloud services; participating in on-call rotations and post-mortem incident reviews; and partnering with Product and Engineering to build reliable systems. The engineer leads the design and architecture of large-scale services, automates processes to improve reliability, and maintains technical documentation. Mentorship and knowledge sharing are expected, as is familiarity with SOC/FedRAMP controls to assist the Compliance and Security teams.
Required Qualifications
- BS/BA in computer science, engineering, or technology-related field (or equivalent work experience)
- Proven work experience as a Site Reliability Engineer or in a similar role
- 6+ years of cloud infrastructure and software development experience
- 2+ years of hands-on experience with Azure Kubernetes Service (AKS) and container-based deployments, or with comparable platforms such as OpenShift, GKE, or EKS
- Advanced understanding of APM and observability tools such as Dynatrace, AppInsights, Datadog, Log Analytics, New Relic, Prometheus, and Grafana
- Advanced understanding of Infrastructure-as-Code (IaC) concepts and tooling (Terraform, CloudFormation templates, Bicep or ARM templates) on Microsoft Azure, Amazon Web Services (AWS), or Google Cloud Platform (GCP)
- Deep knowledge of Configuration Management/Orchestration utilities such as Ansible, PowerShell DSC, Chef, and Puppet
- Advanced understanding of cloud concepts including elasticity, security, and identity management
- Strong familiarity with Agile development methodologies using Jira or Azure DevOps Boards
- 6+ years of hands-on experience with the following technologies, tools, and concepts:
  - Automating processes using PowerShell, Bash, CLI, REST APIs, Python, ARM templates, or other scripting languages
  - Container orchestration platforms and tooling such as Kubernetes, OpenShift, AKS, GKE, or Helm
  - Microsoft Azure, Amazon Web Services (AWS), or Google Cloud Platform (GCP)
  - Azure DevOps and Git workflows
  - Monitoring/observability tools and implementing automated solutions