AI/LLM SRE Software Engineer III
On-site · Columbus, Ohio, United States
Job Summary
Lead Software Engineer specializing in AI/LLM-powered SRE and observability within JPMorgan Chase's Employee Platforms. Responsibilities:
- Ensure the reliability, scalability, and automation of AI-assisted applications and infrastructure
- Design and implement AI-driven alerting, noise reduction, and auto-correlation systems
- Build and maintain observability, monitoring, and telemetry for AI applications
- Develop automation for alerting, anomaly detection, and self-healing workflows
- Mentor engineers on AIOps standards
- Define and execute the roadmap for AI-assisted SRE and observability
- Collaborate with engineering teams and stakeholders to drive operational excellence
Required Qualifications
- Formal training or certification in software engineering concepts and 5+ years of applied experience
- Strong experience in SRE, DevOps, or Platform Engineering roles
- Strong hands-on experience with AWS (ECS, Lambda, API Gateway, Bedrock, CloudWatch, RDS, EKS)
- Hands-on experience with AWS Bedrock, OpenAI, or LLM APIs
- Expertise in observability tools: OpenTelemetry, Grafana, Prometheus, ELK, CloudWatch
- Experience with CI/CD tools (GitHub Actions, Jenkins, Spinnaker)
- Proven track record in automation, operational tooling, and event-driven workflows
- In-depth understanding of distributed systems, microservices, and cloud architectures
Preferred Qualifications
- Experience with AI-powered coding assistants such as GitHub Copilot or Windsurf
- Familiarity with prompt engineering, embeddings, and RAG pipelines
- Experience building operational copilots or chatbots for runbooks or troubleshooting
- Proficiency in Python (Go is a plus)