SDE 3
On-site · Lucknow, Uttar Pradesh, India
Job Summary
Architect a self-serve Data Platform-as-a-Product that powers InMobi's global-scale data ecosystem, integrating OSS tools, proprietary services, and Cloud/SaaS offerings into unified infrastructure. The role requires deep data engineering expertise (batch/streaming pipelines, data modeling, query optimization) combined with platform engineering skills to build production-grade solutions for data engineers and analysts.
Core Responsibilities
- Bridge OSS tools (Spark, Flink, Airflow, Iceberg) with internal services and cloud offerings into cohesive data platform infrastructure that enables push-button data workflows
- Operate distributed systems processing petabytes of data daily
- Own multi-region Kubernetes infrastructure with elastic scalability and fault tolerance
- Optimize compute utilization for large-scale batch and real-time streaming with sub-second latency
- Build telemetry and data quality frameworks for 24/7 uptime
- Enforce SLAs/SLOs with automated incident response and data validation
Required Qualifications
- 7–10 years building, optimizing, and operating production data platforms
- Deep data engineering fundamentals: data modeling, partitioning strategies, query optimization
- Distributed compute: Spark (PySpark/Scala), Flink streaming, performance tuning at petabyte scale
- Data lake architecture: Iceberg table format, Polaris catalog, schema evolution, time travel
- Orchestration: Airflow DAG development, dependency management, SLA monitoring
- Data transformation: dbt modeling, testing, documentation, incremental builds
- Data quality: Great Expectations, Deequ validation frameworks, drift detection
- Query acceleration: Velox, Gluten integration, columnar formats (Parquet, ORC)
- Data governance: OpenMetadata catalog, lineage tracking, access control
- Kubernetes platform development: operators (Spark/Flink), YuniKorn scheduler, multi-tenancy, autoscaling
- Cloud infrastructure: GKE multi-region clusters, GCS object storage, hybrid cloud/on-prem architecture
- Programming: Python, PySpark, Scala for data pipelines and platform tooling
- IaC: Terraform, Helm, GitOps for reproducible deployments
- CI/CD: Automated testing, deployment pipelines for data platform components