Senior AI Data Engineer
On-site · Kraków, Lesser Poland, Poland or Warsaw, Mazovia, Poland
Job Summary
The Senior AI Data Engineer designs, builds, and scales robust ETL/ELT pipelines for AI workloads, transforming unstructured data into structured, vectorized formats for LLMs. The role maintains the data-to-model lifecycle and real-time feature pipelines, integrates with Kafka and event-driven systems, and manages Feature Stores along with data quality, lineage, and governance across on-premises and cloud data platforms. It also enables vector database storage for high-performance AI search and involves close collaboration with data scientists, ML engineers, software engineers, and stakeholders to deliver scalable AI data solutions. The position requires deep expertise in Python and SQL, real-time streaming (Kafka/Flink/Spark), orchestration tools (Airflow/dbt/Prefect), vector databases, and cloud and Kubernetes environments.
Required Qualifications
- Bachelor's degree in Computer Science, Software Engineering, Information Systems, or related technical field. Equivalent practical experience will also be considered.
- 10+ years of experience in Data Engineering or Backend Engineering with a strong focus on data platforms and pipelines.
- 2+ years of hands-on experience supporting AI/ML data pipelines, including data preparation for machine learning and generative AI applications.
- Expert-level proficiency in Python and SQL; experience with Java or Scala is an advantage.
- Strong experience building and maintaining real-time data streaming solutions using Apache Kafka, Flink, or Spark Streaming.
- Hands-on experience with modern data orchestration and transformation tools such as Airflow, dbt, or Prefect.
- Experience working with Vector Databases and Feature Stores to support AI and machine learning workloads.
- Strong knowledge of cloud-based data services on AWS, Azure, or GCP, including services such as Glue, Kinesis, Data Factory, or Dataflow.
- Experience deploying and managing data workloads in Kubernetes (K8s) environments.
- Proven experience handling sensitive data within regulated industries such as Fintech, Healthcare, or other compliance-driven environments.
- Strong understanding of data quality, governance, security, and privacy best practices.
- Excellent problem-solving skills and the ability to collaborate effectively with cross-functional engineering, data, and AI teams.