Senior Data Engineer
Remote · United States
Job Summary
As a Senior Data Engineer, you will build Ceresti’s end-to-end data architecture for dementia-care data. You will own the durable landing zone for raw partner files and API payloads, implement validated ingestion pipelines into a Postgres-based transactional system, and operate a curated analytics layer that supports analytics, ML, and BI workloads without impacting production. You will design data contracts with health-plan and ACO partners, enforce HIPAA-compliant governance (PHI/PII classification, encryption, least privilege, audit logging, retention, de-identification), and drive observability with SLAs and data-freshness metrics. Key responsibilities include implementing scalable data pipelines (CSV/JSON/XML/HL7/X12, REST/SFTP), standing up a lightweight data warehouse/lakehouse, selecting a minimal toolset (object storage, Dagster/Prefect/Airflow, dbt, a validation library), and mentoring engineers on SQL, schema design, and data systems practices. The role requires collaborating with backend, ML, product, and clinical stakeholders; delivering reliable feature data for ML models; and contributing to an agile, fast-moving data team that improves patient and caregiver outcomes.
Required Qualifications
- BS/BA degree or higher in Computer Science, Engineering, or a related technical field
- 8+ years of professional data engineering experience, with a track record of shipping production data systems end-to-end
- Mastery of PostgreSQL: schema design, indexing, query tuning, partitioning, logical replication, JSONB, extensions (pg_partman, pg_cron, pgvector, etc.)
- Experience designing and operating data pipelines (file-based ingestion and API-based ingestion)
- Hands-on experience with cloud platforms (AWS preferred) and their data primitives (S3, managed Postgres)
- Experience designing data warehouses and/or data lakes
- Strong experience with dbt and modern data modeling patterns (Kimball, Data Vault, One Big Table)
- Experience with orchestration frameworks (Dagster, Prefect, or Airflow)
- Strong Python skills for ingestion, validation, and tooling
- Experience with data validation and data-quality frameworks (Great Expectations, Pandera, Soda)
- Experience with change-data-capture from Postgres (logical replication)
- Data governance experience in a HIPAA-regulated environment or strong protective instincts for PHI/PII (encryption, least privilege, audit, de-identification); HITRUST or SOC 2 is a plus
- Familiarity with infrastructure-as-code and CI/CD for data systems
- Experience supporting ML workloads: feature tables, embeddings, vector search (pgvector), and LLM integration patterns
- Excellent written and verbal communication skills, able to explain complex schema decisions to both business stakeholders and partners
- Demonstrated experience working in Agile/Scrum teams
- Background check clearance required