Senior Software Engineer, CUTLASS Platform
$152,000–$287,500 per year
On-site · Austin, Texas, United States or Santa Clara, California, United States
Job Summary
Develop core components of the CUTLASS platform, including Tensor Core MMAs, copies, synchronization barriers, and schedulers, in CUDA C++ and the CUTLASS Python DSL. Contribute to the MLIR-based backend compiler stack for the CUTLASS Python DSL by designing dialects and associated compiler passes. Author example kernels that use CUTLASS abstractions to showcase novel GPU hardware features critical for achieving high performance. Collaborate with GPU architecture, CUDA, and NVVM/PTX compiler teams to provide feedback on programming models and to assess the performance of future GPU hardware features.
Required Qualifications
- Master's or PhD degree in Computer Science, Computer Engineering, or a related field (or equivalent experience)
- 3+ years of relevant industry experience
- Strong proficiency in C++ programming and software design, including debugging, performance evaluation, and testing
- Experience working with high-performance code generation and knowledge of compiler transformations and optimizations
- A deep understanding of computer architecture and parallel computing programming models
- Hands-on compiler design experience, particularly in MLIR (optional)
- Experience writing high-performance kernels at low levels of abstraction, such as NVVM/PTX, for GPUs or similar parallel processing architectures (preferred)