Incident Manager
$136,125–$165,000 per year
On-site · San Francisco, California, United States
Job Summary
In this Incident Manager role, you will manage high-visibility incidents and customer escalations, keeping disruption to a minimum and resolving critical issues quickly. You will lead incident response during crises, use data analytics to improve resiliency, and develop preventive strategies through detailed post-incident reviews. You will also troubleshoot and resolve complex technical issues in InfiniBand and HPC infrastructure, and collaborate with engineering teams to continuously improve processes. Strong technical proficiency in Linux, virtualization, and Kubernetes is essential, along with crisis management experience and excellent communication skills.
Required Qualifications
- 4-5 years of customer-facing experience
- 3-5+ years in a team leadership role
- proven track record in crisis management
- strong technical experience with Linux, virtualization, and Kubernetes
- excellent communication skills
Desired Qualifications
- NVIDIA certification
- Linux certification
- Kubernetes certification
Additional Requirements
- Equal Opportunity Employer