DevOps - Lead Software Engineer
On-site · Columbus, Ohio, United States
Job Summary
As a Lead Software Engineer, you will be responsible for the reliability, scalability, and automation of AI-powered applications and infrastructure. You will partner with engineering teams and stakeholders to deliver modern observability, intelligent incident response, and autonomic operations across our applications. Responsibilities include:
- Ensuring the reliability, scalability, and performance of AI-assisted application and platform operations
- Designing and implementing AI-driven solutions for intelligent alerting, noise reduction, and auto-correlation
- Building and maintaining observability, monitoring, and telemetry
- Building and supporting automation for alerting, anomaly detection, and self-healing workflows
- Mentoring engineers on AIOps standards and operational excellence
- Defining and executing the roadmap for AI-assisted SRE and observability
Required Qualifications
- Formal training or certification in software engineering concepts and 5+ years of applied experience
- Strong experience in SRE, DevOps, or Platform Engineering roles
- Strong hands-on experience with AWS (ECS, Lambda, API Gateway, Bedrock, CloudWatch, RDS, EKS)
- Hands-on experience with AWS and LLM APIs
- Expertise in observability tools: OpenTelemetry, Grafana, Prometheus, ELK, CloudWatch
- Experience with CI/CD tools (GitHub Actions, Jenkins, Spinnaker)
- Proven track record in automation, operational tooling, and event-driven workflows
- In-depth understanding of distributed systems, microservices, and cloud architectures