OpenAI · 8 months ago

Software Engineer, Inference – AMD GPU Enablement

On-site · San Francisco, California, United States

Type: Full Time
Level: Mid Level
Education: Not Specified
Company size: Large
Industry: AI Services

Job Summary

OpenAI is hiring engineers to scale and optimize its inference infrastructure on emerging GPU platforms, with a focus on AMD. Responsibilities include debugging and optimizing distributed inference workloads, validating performance on large GPU clusters, and collaborating with other teams to optimize GPU kernels and collective communication libraries. Required skills include knowledge of GPU kernel development, an understanding of communication libraries, and experience with distributed systems. Ideal candidates enjoy solving complex performance challenges in a fast-paced environment.

Required Qualifications

  • Experience writing or porting GPU kernels using HIP, CUDA, or Triton
  • Familiarity with communication libraries like NCCL/RCCL
  • Experience with distributed inference systems
  • Ability to diagnose and solve end-to-end performance problems across hardware and system libraries
  • Enthusiasm for building infrastructure from first principles

Desired Qualifications

  • Contributions to open-source libraries like RCCL, Triton, or vLLM
  • Experience with GPU performance tools (Nsight, rocprof, perf) and memory/comms profiling
  • Prior experience deploying inference on other non-NVIDIA GPU environments
  • Knowledge of model/tensor parallelism, mixed precision, and serving 10B+ parameter models

Additional Requirements

  • Background checks for applicants will be administered in accordance with applicable law