philips/jobs-and-careers · 3 days ago

Site Reliability Engineer (SRE)

On-site · Bengaluru, Karnataka, India

Type: Full Time

Level: Senior Level

Education: Not Specified

Company size: Unknown

Job Summary

Design and scale observability frameworks (metrics, logs, traces, event streams) across cloud environments. Define and manage SLIs/SLOs to ensure high availability, performance, and reliability. Build proactive, AI-driven monitoring systems to detect anomalies and predict failures. Develop automation and self-healing capabilities to reduce manual intervention and improve system resilience. Enable event-driven operations, integrating with tools like ServiceNow, PagerDuty, and Slack. Collaborate with engineering, SecOps, and FinOps teams to improve reliability, security, and cost efficiency.

You have 8+ years of SRE/Cloud/Platform Engineering experience. You are proficient with Prometheus, Grafana, Datadog, OpenTelemetry, and CloudWatch, you code in Python, Go, or Bash, and you have deployed Docker and Kubernetes in cloud-native environments.

This is an onsite role in Bangalore, India; we work in-office at least 3 days per week.

Required Qualifications

8+ years in SRE/Cloud/Platform Engineering with AWS production environment experience
Expertise in Prometheus, Grafana, Datadog, OpenTelemetry, CloudWatch, and managing SLIs/SLOs
Strong skills in Python, Go, or Bash
Experience with distributed systems, microservices, Docker, and Kubernetes
Knowledge of event-driven operations and incident tools (ServiceNow, PagerDuty, Slack)
Experience collaborating across functions and a drive for reliability, security, and cost optimization
