Senior Data Engineer
On-site · Warsaw, Mazovia, Poland
Job Summary
The Senior Data Engineer will design and build cloud-native data platforms, migrate legacy on-premises systems to the cloud, and establish AI-ready data infrastructure. The role leads greenfield initiatives, implements near-real-time ingestion with event-driven patterns, enforces platform standards (Data Lake/Lakehouse, medallion architecture, data contracts), and refactors Spark/PySpark scripts for performance. It also drives CI/CD and testing practices across data pipelines, promotes AI tooling and agentic workflows, ensures data quality, observability, and reliability, and develops self-service tooling and microservices that simplify platform usage for other teams.
Required Qualifications
- 5+ years of professional experience in Data Engineering
- Strong Python and SQL development skills for pipeline development and optimisation
- Proficiency in Apache Spark / PySpark, including query optimisation and performance tuning
- Hands-on experience with Databricks (preferred) or Snowflake
- Experience with at least one major cloud provider: Azure (preferred), AWS, or GCP
- Experience with stream processing technologies (Kafka, Spark Structured Streaming)
- Solid understanding of ETL/ELT patterns, data modelling (dimensional, Data Vault), and data warehousing
- Experience with orchestration tools (Apache Airflow, Azure Data Factory, or equivalent)
- Knowledge of Infrastructure as Code (Terraform or equivalent)
- Understanding of production-grade system requirements: reliability, scalability, observability, and performance
- Upper-Intermediate English level
- Familiarity with RAG pipeline design and LLM integration patterns
- Knowledge of data governance frameworks and tools (Unity Catalog, Apache Atlas, or similar)
- Experience with dbt for data transformation and modelling
- Familiarity with MLflow, Feature Stores, or ML platform integration
- Self-driven and proactive in identifying improvements
- Strong problem-solving mindset with attention to detail
- Open to experimenting with emerging technologies and approaches