Member of Technical Staff — RL Research (New PhD Grad)
$250,000–$350,000 per year
On-site · Seattle, Washington, United States
Job Summary
- Own RL and post-training for large-scale omni models.
- Build Nuance’s RL/post-training stack from 0→1, including rollout generation, policy optimization, reward/reference model serving, data feedback loops, evaluation, checkpointing, observability, and debugging.
- Develop and scale post-training methods such as PPO, GRPO, DPO, rejection sampling, RLHF/RLAIF, online RL, and model-based data improvement.
- Design system abstractions that connect research ideas to production-scale RL runs (trainers, rollout workers, reward models, evaluators, data queues, experience buffers).
- Build evaluation and feedback loops for omni behavior (turn-taking, interruption, timing, emotional response, audiovisual coherence, real-time interaction quality).
- Optimize the end-to-end post-training loop for throughput, latency, GPU utilization, and researcher iteration speed.
- Evolve the platform as algorithms and data sources change.
Location: In-person in Seattle, five days a week; visa sponsorship available from day one.
Required Qualifications
- PhD completed or in final stretch
- Strong understanding of RL/post-training methods (policy optimization, reward modeling, preference optimization, rejection sampling, KL control)
- Experience with RL post-training pipelines and frameworks (open-source or research)
- Experience with rollout serving systems and data feedback loops
- Strong software engineering fundamentals and systems-building experience
- Ability to reason about model behavior, rewards, and evaluation mismatch
- Curiosity and adaptability toward new RL algorithms and architectures
- Experience with multimodal post-training for audio/video/language models (bonus)
- Visa sponsorship (O-1, H-1B, green card) is offered from day one