Software Engineer - ModelZoo
On-site · Hyderabad, Telangana, India
Job Summary
Port and deploy deep learning models from PyTorch/TensorFlow to ML accelerator hardware. Optimize performance with a focus on latency and throughput, contribute to model quantization (e.g., INT8) to shrink model size while preserving accuracy, and profile and debug the ML inference pipeline on accelerators. The role requires hands-on experience deploying and optimizing models on GPUs or specialized accelerators, proficiency in C++ and Python, knowledge of CUDA/cuDNN, familiarity with inference engines (TensorRT, OpenVINO, TensorFlow Lite), and an educational background in CS/EE or a related field. Knowledge of cloud platforms and CI/CD for ML is a plus.
Required Qualifications
- Proficiency in PyTorch and TensorFlow
- Hands-on experience deploying and optimizing models on GPUs or specialized accelerators
- Experience with model quantization (Post-Training Quantization)
- Strong proficiency in C++ and Python
- Experience with GPU programming models like CUDA/cuDNN
- Familiarity with ML inference engines/runtimes (TensorRT, OpenVINO, TensorFlow Lite)
- Foundational understanding of computer architecture
- Version control with Git and collaborative workflows
- Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field