Carson Group · 3 days ago

Director, Production Engineering

$127,000–$167,000 / year

Remote · United States

Type: Full Time
Level: Senior Level
Education: Bachelor’s Degree
Company size: Large

Job Summary

The Director, Production Engineering leads production reliability strategy and governance across the enterprise, driving observability (monitoring, logging, distributed tracing, telemetry), incident management, and production readiness. The role leads a function focused on service health, reliability metrics (SLOs/SLIs), operational risk, and performance improvement; partners with technology leaders to align reliability and operational excellence with business objectives; delivers executive-level reporting on operational health and risk; and steers AI-enabled operational practices (AIOps) to improve reliability and efficiency. The Director oversees governance frameworks for ownership, configuration management, feature management, and operational controls, and acts as incident commander to drive rapid, data-informed incident response and post-incident learning. The role also sets standards and governance and drives continuous improvement across production engineering, observability, incident response, and AI platform operations, collaborating closely with cross-functional teams and leadership to keep production services highly available, scalable, and resilient.

Required Qualifications

  • Bachelor’s degree in Computer Science, Information Technology, Engineering, Management Information Systems, or related field required
  • One year of relevant experience may be substituted for each year of required education
  • Minimum of ten years of technology leadership experience in production operations, reliability engineering, or platform operations required
  • Experience leading observability, monitoring, incident management, and operational governance programs required
  • Experience with AIOps strategy, AI-enabled operational practices, or AI platform operations preferred
  • Architecture-level mastery of AI and cloud-based operational systems required
  • Expertise in reliability engineering, observability, monitoring, and service health management required
  • Strong knowledge of incident management, root cause analysis, and operational risk practices required
  • Experience with Service Level Objective (SLO), Service Level Indicator (SLI), and operational metrics frameworks required
  • Proven ability to communicate operational performance and risk to executive leadership required
  • Strong leadership, communication, and cross-functional collaboration skills required