MLOps Platform Engineer (SageMaker)
$187,200–$212,160 per year
On-site · Plano, Texas, United States
Job Summary
The MLOps Platform Engineer will set up SageMaker Unified Studio, create domain configurations and multi-environment promotion workflows, and build MLOps pipelines with SageMaker Pipelines (data extraction from Snowflake, preprocessing, training, evaluation, model registration). The role also manages the SageMaker Model Registry with cross-account promotion and lineage, configures MLflow for experiment tracking, implements IAM (Okta SSO, SailPoint entitlements, execution and service roles), develops real-time and batch model serving, implements model monitoring for data and model drift, sets up the data catalog and lineage, and owns platform observability and operations (CloudWatch, Datadog).
Requires 10-15 years of software engineering experience in cloud infrastructure or ML platform operations; 5+ years of AWS experience, including SageMaker; 3+ years building production MLOps pipelines; and experience with IaC (Terraform, CDK, or CloudFormation), IAM design, MLflow, SageMaker Pipelines, Snowflake, Kubernetes, and networking and security.
Nice to have: SageMaker Unified Studio domain provisioning, SageMaker Feature Store, SageMaker Model Monitor, and the AWS ML Specialty certification.
Required Qualifications
- 10-15 years of software engineering experience focused on cloud infrastructure or ML platform operations
- 5+ years hands-on with AWS, including SageMaker
- 3+ years building and operating production MLOps pipelines
- Experience with SageMaker Unified Studio or Studio Classic
- Infrastructure-as-Code with Terraform, CDK, or CloudFormation
- IAM design for ML platforms
- MLflow or equivalent experiment tracking
- SageMaker Pipelines or similar workflow orchestration
- Model serving with real-time endpoints and batch prediction
- Snowflake as a data source for ML pipelines
- Kubernetes (EKS) and container orchestration
- Networking and security — VPC, security groups, private endpoints, cross-account connectivity
Desired Qualifications
- 5+ years hands-on with AWS, including deep expertise in Amazon SageMaker (Studio, Pipelines, Model Registry, Endpoints, Feature Store)
- 3+ years building and operating production MLOps pipelines — training, versioning, deployment, monitoring, rollback
- Infrastructure-as-Code with Terraform, CDK, or CloudFormation
- IAM design for ML platforms — execution roles, service roles, cross-account access, Lake Formation, SSO/SAML
- MLflow or equivalent experiment tracking
- SageMaker Pipelines or similar workflow orchestration (Airflow, Step Functions)
- Model serving — real-time endpoints, batch transform, auto-scaling, endpoint monitoring
- Snowflake as a data source for ML pipelines
- Kubernetes (EKS) and container orchestration
- Networking and security — VPC, security groups, private endpoints, cross-account connectivity
- SageMaker Unified Studio domain provisioning, custom blueprints, project standardization
- SageMaker Feature Store for online/offline feature management
- SageMaker Model Monitor — data quality checks, bias detection, drift detection
- AWS Machine Learning Specialty certification