Lead Software Engineer - AI/ML Deep Learning & GPU ML Serving
On-site · Palo Alto, California, United States
Job Summary
Lead Software Engineer responsible for designing, developing, and troubleshooting secure, scalable software for ML/AI workloads. The role focuses on optimizing deep learning models for production inference (including quantization and batching), deploying GPU workloads in Kubernetes, building scalable, low-latency web services and APIs, and guiding architecture discussions toward reliability and scalability. You will collaborate with software engineering teams to adopt emerging technologies, manage data analysis and visualization, and deliver production-ready ML systems using frameworks such as TensorFlow, PyTorch, TorchServe, and Triton, drawing on cloud, container, and NoSQL experience (AWS/GCP, Docker, Kubernetes, Cassandra).
Required Qualifications
- Formal training or certification in software engineering concepts and 5+ years of applied experience
- Professional software development experience, with emphasis on ML systems
- Strong proficiency in Python and experience with ML frameworks (TensorFlow, PyTorch, or similar)
- Experience with container and orchestration technologies (Docker, Kubernetes, EKS) and public clouds (AWS, GCP)
- Hands-on experience with ML model serving frameworks (TorchServe, TensorFlow Serving, Triton Inference Server)
- Experience deploying and managing GPU workloads in Kubernetes
- Familiarity with scalable, low-latency systems based on web services and APIs
- Experience with NoSQL databases (Cassandra or equivalent) for high-throughput data access
- Understanding of GPU resource management and cost optimization
- Experience with modern microservices architecture
- Ability to lead the design of large-scale systems and evaluate tradeoffs