Lead Software Engineer - MLOps Platform
On-site · London, England, United Kingdom
Job Summary
The Lead MLOps Platform Engineer's responsibilities include:
- Designing and developing a scalable ML platform to support model training, deployment, and monitoring
- Building and maintaining infrastructure for automated ML pipelines, with reliability and reproducibility across frameworks
- Implementing tools for model versioning, experiment tracking, and lifecycle management
- Developing systems to monitor model performance and address data drift
- Collaborating with data scientists and engineers on model integration and deployment patterns
- Optimizing resource utilization for training and inference
- Designing and implementing a testing framework (unit, component, integration, end-to-end, performance, champion/challenger)
- Ensuring platform compliance with data privacy, security, and regulatory standards
- Mentoring team members on platform design principles, coding practices, and implementation patterns for high-quality, maintainable solutions
Required Qualifications
- Proficiency in recent versions of Java and/or Python
- Experience with MLOps tools and platforms (e.g., MLflow, Amazon SageMaker, Google Vertex AI, Databricks, BentoML, KServe, Kubeflow)
- Experience with cloud technologies (AWS/Azure/GCP), distributed systems, web technologies, and event-driven architectures
- Understanding of data versioning and ML model lifecycle management
- Hands-on experience with CI/CD tools (e.g., Jenkins, GitHub Actions, GitLab CI)
- Knowledge of infrastructure-as-code tools (e.g., Terraform, Ansible)
- Strong knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes)
- Proficiency in operating, supporting, and securing mission-critical software applications