Senior SRE Engineer (MLOps) - AI
On-site · Makkah, Mecca Region, Saudi Arabia
Job Summary
The Senior SRE Engineer (MLOps) is responsible for the reliability of ML and agentic AI services in production. The role covers building observability across the AI stack; designing safe-release patterns for models, prompts, and tools; and providing operational support for inference APIs and AI workflows on Kubernetes/EKS. It also includes establishing guardrails for agentic systems, defending tool-calling against prompt injection, and driving AI cost governance. All of this is done in collaboration with AI and engineering teams to enable scalable, secure, and cost-efficient production deployments.
Required Qualifications
- 4+ years in SRE, platform engineering, DevOps, or production infrastructure
- Hands-on experience with Kubernetes and cloud-native systems in production
- Familiarity with deploying ML projects
- Strong command of CI/CD, GitOps, observability, and incident response
- Solid experience with infrastructure-as-code, secrets management, and networking
- Ability to write automation or platform tooling in Python or a similar language
- Production judgment: a preference for systems that are measurable, debuggable, and safe to change
- Ability to work across teams and communicate trade-offs