Senior Site Reliability Engineer — Observability Engineer | NordVPN
Remote · Poland
Job Summary
Senior SRE focused on observability. The role involves:
- Designing, building, and improving monitoring pipelines and observability tooling across globally distributed infrastructure
- Defining service-level monitoring based on golden signals
- Reducing alert fatigue with meaningful, actionable alerts
- Developing and maintaining custom exporters, scripts, and integrations for metrics and log collection
- Collaborating with data teams on anomaly detection and data-driven operational insights
- Understanding service signals and what to measure
Required Qualifications
- Distributed systems observability
- Monitoring architecture and dashboards
- Golden signals thinking (latency, traffic, errors, saturation)
- Alert design and on-call management
- Python scripting and automation
- Linux administration and debugging
- Networking fundamentals
- Experience with monitoring/exporters (Prometheus-based, Telegraf, Grafana, OpenSearch)
- Familiarity with Naemon (a Nagios fork) and related tooling
- Onboarding new systems/services into monitoring from scratch
- Habit of using data-driven operational insights and anomaly detection
- Hybrid work flexibility and remote collaboration