Staff SRE - Observability

Focused
Chicago, Illinois, United States or New York City, New York, United States | Remote | Full Time | $160,000–$200,000 per year | Mid Level | Not Specified | Small

Posted 7 months ago

Job Summary

Seeking an experienced Staff Observability Consultant with expertise in OpenTelemetry and strong Platform Engineering skills to help organizations optimize their observability infrastructure. Candidates should have a solid background in monitoring, distributed systems, and cloud engineering.

Required Qualifications

  • 3-7 years of experience in observability, monitoring, and distributed systems
  • Deep hands-on experience with the OpenTelemetry ecosystem, including SDKs, APIs, and specifications
  • Proficiency with OpenTelemetry Collector configuration, processors, exporters, and receivers
  • Strong understanding of telemetry data models, semantic conventions, and instrumentation best practices
  • 5+ years of Platform Engineering or DevOps experience with a focus on site reliability, observability, and incident response
  • Proficiency with Infrastructure as Code tools (Terraform, Pulumi, CloudFormation, CDK)
  • Strong experience with CI/CD platforms (GitHub Actions, GitLab CI, Jenkins, ArgoCD)
  • Hands-on experience with major cloud providers (AWS, GCP, Azure) and their observability services
  • Experience with container technologies (Docker, Podman) and container registries
  • Knowledge of networking, security, load balancing, and distributed systems concepts
  • Experience implementing SRE practices including error budgets and toil metrics
  • Proficiency in incident management, on-call procedures, and post-mortem culture
  • Experience with capacity planning, performance optimization, and scalability design
  • Proficiency in multiple programming languages (Go, Python, Java, Node.js, or Rust preferred)
  • Strong scripting and automation skills (Bash, Python, PowerShell)
  • Understanding of software engineering best practices and testing methodologies

Desired Qualifications

  • Understanding of Large Language Models (LLMs) and their application in DevOps
  • Knowledge of vector databases, embeddings, and retrieval-augmented generation (RAG)
  • Experience with AI/ML model deployment and monitoring in production environments
  • Strong technical writing and documentation skills
  • Ability to present complex technical concepts to diverse stakeholders
  • A passion for knowledge sharing

Additional Requirements

  • This role requires working from the Chicago office three days per week and up to 20% travel within the United States.
  • Focused is unable to sponsor or take over sponsorship of an employment visa at this time.