Inference Optimization Manager
$234,000–$286,000 per year
Remote · United States or Canada
Job Summary
Lead a high-impact performance engineering team to drive LLM inference optimization across Modular Cloud, shaping the technical roadmap and collaborating with GTM, Product, and Engineering to deliver state-of-the-art performance across GPUs and ASICs. Translate real customer workloads into continuous optimization, guide cross-stack improvements from GPU kernels to cloud infrastructure, and foster exponential team growth. Champion external thought leadership through blog posts and industry best practices while ensuring the platform scales with demand and evolving architectures.
Required Qualifications
- 5+ years in distributed systems or performance engineering
- experience leading or managing engineering teams
- track record of shipping durable, reusable software tools and libraries
- ability to translate customer/product needs into engineering direction
- creative, curious problem-solving mindset
- collaborative, team-oriented mindset