Baseten · 3 months ago

Software Engineer — GPU Networking & Distributed Systems

$150,000–$300,000 / year

On-site · San Francisco, California, United States

Type: Full Time
Level: Mid Level
Education: Not Specified
Company size: Startup
Industry: AI Software

Job Summary

In this role, you will integrate RDMA/RoCE/InfiniBand capabilities into the inference stack to significantly improve bandwidth and latency. You will optimize distributed inference by implementing networking layers for efficient disaggregated KV cache offload, enable fast LLM startup, validate networking performance on cutting-edge hardware, and build observability tools. Candidates should have experience with high-performance networking protocols, be proficient in C++ or Python, and have a strong understanding of NVIDIA architectures.

Required Qualifications

  • Deep experience with high-performance networking protocols (InfiniBand, RoCE v2)
  • Fluency in C++ or Python
  • Understanding of the memory hierarchy in modern NVIDIA architectures (H100/Blackwell)
  • Ability to work with communication libraries (NCCL, NVSHMEM)

Desired Qualifications

  • Deep knowledge of NCCL, NVSHMEM, and UCX
  • Experience with GPUDirect Storage (GDS) or high-performance filesystems like Weka or 3FS
  • Familiarity with TensorRT-LLM, vLLM, or SGLang
  • Experience running low-level benchmarks to "qualify" new hardware clusters