TechInsights · 1 month ago

Senior Site Reliability Engineer (Remote USA)

$149,100–$157,800 / year

Remote · Denver, Colorado, United States or US

Type: Full Time
Level: Senior Level
Education: Bachelor's Degree
Company size: Unknown
Industry: Information Services

Job Summary

Senior Site Reliability Engineer (Remote USA) responsible for platform reliability and AI operations on a high-scale semiconductor data platform. Responsibilities include owning SLOs/SLIs and error budgets for production services, designing reliability patterns for AI agent pipelines, managing blast radius containment, maturing the active-active architecture toward a 24-hour RTO, leading incident response and post-incident reviews, and enabling software and AI engineering teams with CI/CD standards, IDP adoption, and self-service tooling. The role requires deep SRE expertise, leadership at the senior IC level, and strong experience with AWS, Terraform/GitOps, Datadog observability, Docker/Kubernetes, Python/Bash, and CI/CD workflows. Experience with agentic AI systems, AI workload observability, and IDP tooling is preferred. The position is remote within the US with occasional travel.

Required Qualifications

  • Bachelor's degree in Computer Science, Engineering, or equivalent
  • 6–8 years of progressive experience in site reliability engineering, platform engineering, or DevOps
  • Deep expertise in AWS (EKS, Lambda, CloudWatch) and multi-region architecture
  • Proficiency with Terraform and GitOps
  • Hands-on Datadog experience
  • Strong containerization with Docker and Kubernetes (EKS preferred)
  • Proficiency in Python and/or Bash; knowledge of Java and Spring Boot
  • Experience with CI/CD pipelines (Bitbucket Pipelines, GitHub Actions)
  • Familiarity with IDP tooling (Backstage, Atlassian Compass) preferred
  • Experience with AI/ML workload infrastructure or agentic system operations considered a strong asset