Anthropic · 8 months ago

Performance Engineer, GPU

Anthropic

$280,000–$850,000 per year

Hybrid · San Francisco, California, United States

Type

Full Time

Level

Mid Level

Education

Bachelor’s Degree

Company size

Startup

Industry

AI Services

Job Summary

The GPU Performance Engineer role involves architecting and implementing foundational GPU performance systems that maximize utilization and inference efficiency for large language models. Responsibilities range from low-level tensor core optimizations to coordinating thousands of GPUs in distributed environments, with opportunities to develop custom kernels, co-design attention mechanisms, and optimize end-to-end training and inference pipelines. Preferred skills include CUDA, Triton, and CUTLASS; Flash Attention; tensor core optimization; NCCL and NVLink; mixed precision; and experience with production-scale ML infrastructure. Visa sponsorship is offered. The hybrid policy expects staff to be in the office at least 25% of the time. The listed location is San Francisco, CA, USA.

Required Qualifications

Bachelor’s degree or equivalent
Experience with GPU programming and optimization at scale
Strong collaboration and pair-programming skills
Experience with distributed systems and multi-node GPU clusters
Proficiency in GPU kernel development and optimization techniques
Familiarity with ML frameworks and compilers (e.g., PyTorch, JAX, XLA)
