Researcher, Computer Use - Agent Post-Training
$250,000–$380,000 per year
On-site · San Francisco, California, United States
Job Summary
Teach models to operate computers by guiding agentic behavior across desktop and browser tools. Design and run experiments to improve complex computer-use capabilities, including end-to-end improvements to the post-training stack (RL, data pipelines, graders, reward signals, evals, diagnostics, and model-behavior analysis). Build evals and environments that expose model failures, and translate findings into product fixes or new research directions. Collaborate with the Codex and ChatGPT product teams to understand user needs and turn product signal into model improvements. Work across research, product, infrastructure, data, evals, and safety boundaries to ship improvements to OpenAI's agent post-training pipelines. Candidates should have strong fundamentals in machine learning, software engineering, systems, statistics, or a related field, and hands-on experience with LLMs, RL/RLHF/RLAIF, post-training, evals, graders, synthetic data, and production ML systems. They should thrive on open-ended problems where the path is uncertain, and should value impact-driven product work.
Required Qualifications
- Full-time