Senior Site Reliability Engineer — Observability Engineer | NordVPN
Remote · Lithuania or Vilnius, Vilnius, Lithuania
Job Summary
Senior SRE role focused on observability. You will design, build, and improve monitoring pipelines and observability tooling across globally distributed infrastructure; define and implement service-level monitoring based on the golden signals; and reduce alert fatigue by building meaningful alerts. You will also develop and maintain custom exporters, scripts, and integrations for metrics and log collection, and collaborate with the data team on anomaly detection and data-driven operational insights. The role calls for understanding service signals and what they mean. Tools include Naemon (Nagios), Prometheus exporters, Telegraf, Fluent Bit, VictoriaMetrics, OpenSearch, and Grafana. Required skills include Python scripting, Linux administration, and networking fundamentals. Bonus points for SaltStack, advanced networking, data knowledge, experience onboarding systems into monitoring, and familiarity with agentic engineering.
Required Qualifications
- Distributed systems observability
- Monitoring architecture and dashboards
- Golden signals (latency, traffic, errors, saturation)
- Alert design to reduce noise and improve actionability
- Custom exporters, scripts, and integrations for metrics/log collection
- Collaboration with data team on anomaly detection and data-driven operational insights
- Understanding service signals and what to measure