Senior Site Reliability Engineer - Support
On-site · Pune, Maharashtra, India
Job Summary
The Senior Site Reliability Engineer - Support diagnoses, troubleshoots, and resolves complex production and customer issues across Kubernetes-based environments for a banking platform. This is a hands-on, customer-facing role that requires investigation, root-cause analysis, and clear communication with both technical and non-technical stakeholders. You will use observability tools (Prometheus, Grafana, Loki, OpenTelemetry, Datadog, Splunk) and AI-assisted operations for incident insights, auto-remediation, and runbook improvements. You will also provide platform support across cloud, hybrid, and on-prem deployments, including deployment troubleshooting as needed. Ideal candidates bring an ownership mindset, practical problem-solving skills, and the ability to move between logs, metrics, customer context, and engineering discussions to resolve issues.
Required Qualifications
- Hands-on troubleshooting experience in production or customer-facing technical environments
- Experience with Kubernetes and containers
- Ability to read, interpret, and correlate logs, metrics, traces, and alerts
- Strong understanding of Linux, networking fundamentals, APIs, and distributed system behavior
- Customer-facing experience in investigations, escalations, or production support
- Experience with AI-native tooling and integrating LLMs into incident investigation and support workflows