Nuance Labs · Posted 3 days ago

Member of Technical Staff — RL Research (New PhD Grad)

$250,000–$350,000 per year

On-site · Seattle, Washington, United States

Type: Full Time
Level: Mid Level
Education: Doctorate or Professional Degree
Company size: Startup

Job Summary

Own RL and post-training for large-scale omni models.
Build Nuance’s RL/post-training stack from 0→1, including rollout generation, policy optimization, reward/reference model serving, data feedback loops, evaluation, checkpointing, observability, and debugging.
Develop and scale post-training methods such as PPO, GRPO, DPO, rejection sampling, RLHF/RLAIF, online RL, and model-based data improvement.
Design system abstractions that connect research ideas to production-scale RL runs (trainers, rollout workers, reward models, evaluators, data queues, experience buffers).
Build evaluation and feedback loops for omni behavior (turn-taking, interruption, timing, emotional response, audiovisual coherence, real-time interaction quality).
Optimize the end-to-end post-training loop for throughput, latency, GPU utilization, and researcher iteration speed.
Evolve the platform as algorithms and data sources change.
Location: in person in Seattle, five days a week; visa sponsorship available from day one.

Required Qualifications

PhD completed or in final stretch
Strong understanding of RL/post-training methods (policy optimization, reward modeling, preference optimization, rejection sampling, KL control)
Experience with RL post-training pipelines and frameworks (open-source or research)
Experience with rollout serving systems and data feedback loops
Strong software engineering fundamentals and systems-building experience
Ability to reason about model behavior, rewards, and evaluation mismatch
Curiosity and adaptability toward new RL algorithms and architectures
Experience with multimodal post-training for audio/video/language models (bonus)
Visa sponsorship (O-1, H-1B, green card) is offered from day one
