Nscale Operations UK · 1 week ago

Site Reliability Engineer

$100,000–$170,000 / year

Remote · United States

Type: Full Time
Level: Mid Level
Education: Not Specified
Company size: Unknown

Job Summary

  • Build and improve automation, tooling, and infrastructure for AI workloads
  • Define and maintain basic SLOs/SLIs and monitoring dashboards
  • Participate in incident response and post-incident reviews
  • Collaborate with Engineering, Networking, and Infrastructure teams to improve system stability
  • Learn from senior engineers and grow in reliability engineering
  • Gain exposure to cloud, Kubernetes, HPC, and AI/GPU workloads
  • Compensation: competitive base salary plus equity/bonus programs, with a flexible work environment

Required Qualifications

  • 2–5 years of experience in Site Reliability Engineering, Systems Engineering, or Software Engineering in a data center environment
  • 2+ years of programming experience (e.g., Python, Go, or similar), with an interest in automation and tooling
  • Working knowledge of Linux systems, networking concepts, and distributed systems
  • Experience troubleshooting system or application issues in production environments
  • Familiarity with monitoring or observability tools (e.g., logs, metrics, dashboards)
  • Strong willingness to learn and improve reliability and operational practices
  • Ability to work in fast-paced environments and collaborate across teams