DevOps Engineer with Splunk
Remote · Bengaluru, Karnataka, India
Job Summary
We are seeking a Junior Observability Engineer to design, implement, and optimize enterprise observability solutions across applications, infrastructure, and cloud environments. You will develop dashboards, alerts, and telemetry frameworks that provide real-time visibility into system health and performance, and build automation that eliminates repetitive tasks and enables runbook automation, self-healing workflows, and automated incident triage. You will define and implement SLIs, SLOs, and alerting strategies to improve service reliability, and use telemetry-driven insights to reduce MTTD and MTTR. The role also involves using AIOps capabilities for alert correlation and intelligent incident response, integrating observability platforms with CI/CD pipelines, AWS/GCP, and ITSM tools such as ServiceNow, and collaborating with engineering, product, and operations teams to establish observability standards and operational readiness. The role requires 3+ years of experience in Observability/SRE; hands-on experience with Splunk, Dynatrace, Grafana, OpenTelemetry, AWS, GCP, MELT telemetry, and Terraform; Python automation skills; and strong troubleshooting and collaboration skills. A bachelor's degree in a related field (or equivalent experience) is also required.
Required Qualifications
- 3+ years of experience in Observability Engineering, Site Reliability Engineering, or related domains
- Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience)
- Hands-on experience with observability platforms such as Splunk, Dynatrace, Grafana, and OpenTelemetry
- Strong knowledge of AWS and GCP, with familiarity with cloud-native architectures
- Proficiency in Python for automation and operational tooling
- Experience implementing metrics, logs, events, and distributed tracing (MELT) across distributed systems
- Hands-on experience with Terraform and Infrastructure as Code practices
- Strong understanding of SLIs, SLOs, alerting strategies, and incident response frameworks
Desired Qualifications
- Nice to Have: Experience with AIOps platforms and intelligent alerting solutions; Kubernetes knowledge; ServiceNow integration; relevant certifications
- A relocation program with relocation assistance is available