CenterWell Home Health · 2 weeks ago

Decision Intelligence Engineer - Next Best Action

CenterWell Home Health

$129,300–$177,800 per year

Remote · United States

Type

Full Time

Level

Senior Level

Education

Not Specified

Company size

Large

Industry

HEALTHCARE

Job Summary

The Decision Intelligence Engineer - Next Best Action role focuses on designing, training, and evaluating reinforcement learning (RL) policies for Humana’s Next Best Action platform in healthcare. The work involves developing decision-making models and implementing RL methods (e.g., PPO, A3C, DQN, CQL, Decision Transformer), framing member decision problems as MDPs or partially observable MDPs (POMDPs), and integrating clinical eligibility rules and program objectives into learning objectives.

Responsibilities include building simulation and backtesting environments (discrete-event simulation, Monte Carlo), diagnosing and remediating failure modes (policy collapse, credit assignment issues, distributional shift), establishing automated evaluation gates within Databricks workflows, and instrumenting training with MLflow. You will own the nightly Databricks training workflow, perform feature engineering with Databricks/PySpark, manage model artifacts in MLflow, and collaborate with data engineers, rules-engine teams, and platform architects to ensure production-grade, auditable decisioning.

Required and preferred backgrounds include experience with multi-agent frameworks, constrained optimization methods (LP/MIP/Lagrangian relaxation), and regulated domains (healthcare/insurance), as well as familiarity with simulation environments and observability tooling. This role is remote (nationwide in the US), with possible travel to Humana offices for training or meetings.

Required Qualifications

8+ years of software engineering or quantitative research experience building and operating large-scale production systems
3+ years of hands-on experience implementing reinforcement learning, operations research methods, or simulation-driven decision systems in production
Experience with policy gradient and value-based RL (PPO, A3C, DQN, CQL)
Experience with stochastic dynamic programming, discrete-event simulation, or large-scale constrained optimization
Deep familiarity with MDPs, Bellman-equation-based value estimation, reward shaping, and constraint formulation
Ability to diagnose failure modes in learned policies (instability, credit assignment, distributional shift)
Proficiency in Python 3.x; experience with PyTorch or TensorFlow
Experience with Ray RLlib or equivalent distributed frameworks
Experience with Databricks, PySpark, and Delta Lake for large-scale ML/data pipelines
Experience with MLflow for experiment tracking, model registry, and artifacts
Experience shipping production systems
