Senior Site Reliability Engineer
Hybrid · São Paulo, São Paulo, Brazil
Job Summary
The Senior Site Reliability Engineer on Braze's MongoDB Platform is responsible for the reliability, scalability, and observability of MongoDB infrastructure.
Responsibilities: own MongoDB reliability at scale; design and operate the MongoDB platform to meet enterprise-grade SLAs; build proactive monitoring and alerting with MongoDB-specific observability; lead capacity planning and sharding strategy; drive root-cause analysis and permanent improvements; improve developer experience with schema and index guidance and self-service tooling; manage the lifecycle of MongoDB clusters on Kubernetes using an Operator and infrastructure as code (Terraform/Ansible); implement automated backup, restore, and point-in-time recovery workflows; contribute to internal tooling in Ruby and Go; participate in PagerDuty on-call rotations; and drive incident retrospectives with a bias toward automation and documentation.
Candidates should have 5+ years of experience in software, DevOps, or SRE roles; hands-on MongoDB expertise; strong Linux fundamentals; programming skills in Python, Go, Ruby, or JavaScript; experience with Terraform or Ansible, Docker, and Kubernetes; and a bias toward asynchronous collaboration across global remote teams.
Nice-to-have: experience operating multi-terabyte or sharded MongoDB deployments, experience with MongoDB Atlas or Ops Manager, familiarity with other technologies in the Braze stack, and a DBRE background.