Schwarz Digits · 4 months ago

(Senior) Site Reliability Engineer - STACKIT Control Plane (m/f/d)

On-site · Heilbronn, Baden-Württemberg, Germany

Type: Full Time
Level: Senior Level
Education: Not Specified
Company size: Unknown

Job Summary

Collaborate with development teams to shorten time-to-detect by improving the monitoring and alerting infrastructure and ensuring services meet their defined SLOs. Improve time-to-mitigation by creating playbooks, designing dashboards for first responders, and ensuring telemetry data (logs and metrics) is comprehensive. Act as a reliability consultant who educates teams on reliability patterns and shift-left practices. Design and refine development practices, including CI/CD pipelines, to support progressive delivery (Canary, Blue/Green). Proactively analyze and optimize the scalability of the Control Plane, addressing bottlenecks in distributed consensus, database throughput, and kernel-level networking. Participate in a compensated on-call rotation, leading incident response, blameless post-mortems, and root cause analyses.

Required Qualifications

  • 3+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering
  • expert-level knowledge of Kubernetes Control Plane internals (API Server, Controller Manager, Scheduler, etcd)
  • proficiency in Go for production-grade code to build automation tools or Kubernetes Operators
  • experience with Infrastructure as Code and container infrastructure
  • knowledge of Linux system internals (kernel tuning, memory management) and networking (TCP/IP, CNI, Load Balancers, eBPF)
  • experience with datastores (PostgreSQL, Redis) and messaging systems (Kafka, NATS) in scalable environments
  • on-call rotation experience
  • focus on reliability patterns and shift-left philosophy
  • ability to design CI/CD pipelines and enable progressive delivery strategies (Canary, Blue/Green)
  • ability to optimize scalability of distributed control planes
  • experience designing monitoring, logging, and telemetry to meet defined SLOs