Staff Data Engineer
Hybrid · Gurugram, Haryana, India
Job Summary
Lead the Data Engineering team in designing and building scalable ETL pipelines using Spark and other big data technologies, and help design the architecture of the Internal Data Platform to support a robust medallion architecture. Provide thought leadership on reducing cloud infrastructure costs, and design and build AI agents to automate development and support tasks. Collaborate with Security/Compliance to ensure data permissions and regulatory requirements are met, and work with the Data Platform and Governance teams to make data scalable, consumable, and discoverable. Requires 10+ years of experience with enterprise data lakes/warehouses, 5+ years of Spark and Python, 5+ years of hands-on AWS or GCP experience, thorough AI knowledge (particularly codegen tools and agentic frameworks), experience with Hive, Iceberg, Glue, or other technologies that expose big data as tables, and familiarity with big data file types such as Parquet, Avro, and JSON.
Required Qualifications
- 10+ years of experience working on enterprise data lakes/warehouses
- 5+ years of Spark and Python experience
- 5+ years of direct hands-on experience working with AWS or GCP
- Thorough AI knowledge, particularly with codegen tools and agentic frameworks
- Experience with Hive, Iceberg, Glue, or other technologies that expose big data as tables
- Familiarity with different big data file types such as Parquet, Avro, and JSON