Senior Site Reliability Engineer
$87,000–$103,500 per year
Remote · Poland (PL)
Job Summary
Senior Site Reliability Engineer at PandaDoc (Remote, Europe). Responsibilities: own and influence the incident management process end-to-end; maintain and evolve the on-prem observability stack; keep production applications running smoothly through on-call rotations; develop automations and tools that support platform reliability; contribute to production services with performance and resiliency in mind; collaborate with product engineers to foster SRE principles within the R&D organization; and mentor the SRE team and product engineers. Requirements: solid programming experience in Python (Django and AsyncIO) and/or Java (Spring Boot); experience with the LGTM observability stack (Loki, Grafana, Tempo, Mimir); experience with production Python services; experience with AWS and Kubernetes; experience with relational databases (PostgreSQL) and messaging systems (RabbitMQ, NATS, Kafka); strong on-call troubleshooting skills; and proficiency in English.
Required Qualifications
- Solid programming experience in Python (Django and AsyncIO) and/or Java (Spring Boot)
- Experience maintaining an observability tools suite (Loki, Grafana, Tempo, Mimir)
- Experience developing and maintaining Python services in production
- Strong experience with AWS and Kubernetes
- Proficiency with relational databases (PostgreSQL) and messaging systems (RabbitMQ, NATS, Kafka)
- Experience as an on-call SRE, with hands-on troubleshooting of distributed systems in production
- Ownership mentality and strong communication/knowledge-sharing skills
- Proficiency in English (written and spoken)