OpenAI · 12 months ago

Software Engineer, Inference - Multi Modal

OpenAI

On-site · San Francisco, California, United States

Type

Full Time

Level

Mid Level

Education

Not Specified

Company size

Large

Industry

AI Services

Job Summary

Join OpenAI's Inference team to design and implement high-performance infrastructure for serving multimodal models at scale. Collaborate with researchers and product teams to optimize systems for real-time audio and image processing. Responsibilities include making system-level improvements, turning experimental workflows into reliable production services, and solving complex data-handling problems. Ideal candidates have experience with LLMs or multimodal models, knowledge of GPU dynamics, and a passion for innovative, cross-functional work.

Required Qualifications

Experience building and scaling inference systems for LLMs or multimodal models
Experience with GPU-based ML workloads
Familiarity with inference tooling like vLLM, TensorRT-LLM, or custom model parallel systems

Desired Qualifications

Experience working with image generation or audio synthesis models in production
Exposure to distributed ML training or system-efficient model design

Additional Requirements

Qualified applicants with arrest or conviction records will be considered for employment consistent with applicable laws.
