Asset & Wealth Management - Site Reliability Engineer - Vice President - Richardson
On-site · Richardson, Texas, United States
Job Summary
Site Reliability Engineer (Vice President) at Goldman Sachs, responsible for delivering highly available, scalable, fault-tolerant platform services across on-prem and cloud environments. The role leads architectural design, automation, and incident management while mentoring senior engineers and collaborating with executive stakeholders to drive reliability improvements. Key focus areas include capacity planning, observability (monitoring, logging, tracing), deployment automation (canary releases), and adoption of SRE best practices across the organization. Required skills include strong software and systems engineering fundamentals; proficiency in Java, Python, or Go; cloud platforms (AWS, GCP); Docker and Kubernetes; IaC (Terraform, CloudFormation); configuration management (Puppet, Chef, Ansible); monitoring and alerting tooling (Prometheus, Grafana, ELK, Datadog); CI/CD (Jenkins, GitLab, Maven); and a track record of solving complex operational challenges. The role requires an advanced degree in Computer Science or a related field, or equivalent practical experience. Preferred qualifications are experience with distributed databases, Kafka, and GCP BigQuery, plus strong communication skills for engaging global teams and executives.
Required Qualifications
- Minimum of 6 years of hands-on Site Reliability Engineering experience
- Proficiency in Java, Python, or Go
- Extensive experience with cloud platforms (AWS, GCP)
- Containerization and orchestration (Docker, Kubernetes)
- Infrastructure as Code tools (Terraform, CloudFormation)
- Configuration management (Puppet, Chef, Ansible)
- Monitoring, logging, and tracing (Prometheus, Grafana, ELK, Datadog)
- CI/CD tools (Jenkins, GitLab, Maven)
- Strong problem-solving and analytical abilities
- Excellent communication and collaboration skills
- Advanced degree in Computer Science or a related field (Bachelor's, Master's, or PhD), or equivalent practical experience
- Experience with distributed databases, BigQuery, Kafka (preferred)
- On-call leadership experience
- Mentorship and technical leadership capabilities
- Experience architecting scalable, fault-tolerant systems