Nscale Operations UK · 1 week ago

Site Reliability Engineer

$100,000–$170,000 per year

Remote · United States

Type: Full Time

Level: Mid Level

Education: Not Specified

Company size: Unknown

Job Summary

Build and improve automation, tooling, and infrastructure for AI workloads
Define and maintain basic SLOs/SLIs and monitoring dashboards
Participate in incident response and post-incident reviews
Collaborate with Engineering, Networking, and Infrastructure teams to improve system stability
Learn from senior engineers and grow in reliability engineering
Gain exposure to cloud, Kubernetes, HPC, and AI/GPU workloads
Competitive base salary plus equity/bonus programs and a flexible work environment

Required Qualifications

2–5 years of experience in Site Reliability Engineering, Systems Engineering, or Software Engineering in a data center environment
2+ years of programming experience (e.g., Python, Go, or similar), with an interest in automation and tooling
Working knowledge of Linux systems, networking concepts, and distributed systems
Experience troubleshooting system or application issues in production environments
Familiarity with monitoring or observability tools (e.g., logs, metrics, dashboards)
Strong willingness to learn and improve reliability and operational practices
Ability to work in fast-paced environments and collaborate across teams
