Anthropic · 8 months ago

ML/Research Engineer, Safeguards

Anthropic

$350,000–$500,000 per year

Hybrid · San Francisco, California, United States

Type

Full Time

Level

Mid Level

Education

Bachelor’s Degree

Company size

Startup

Industry

AI Services

Job Summary

Develop classifiers to detect misuse and anomalous behavior at scale. Build systems to monitor harms that span multiple exchanges, and develop new methods to analyze signals across contexts. Evaluate and improve the safety of agentic products by modeling threats and deploying mitigations for prompt injection. Conduct research on automated red-teaming, adversarial robustness, and other testing methods to uncover misuse. The role requires 4+ years of experience in ML engineering or research engineering, proficiency in Python, and experience across the research-to-deployment pipeline, along with strong communication skills for explaining complex concepts. It is based in San Francisco under a location-based hybrid policy.

Required Qualifications

Bachelor’s degree or equivalent in a relevant field
4+ years of experience in ML engineering, research engineering, or applied research
Proficiency in Python and experience building ML systems
Ability to work across the research-to-deployment pipeline
Strong communication skills
Experience with language modeling and transformers (preferred)
Experience building classifiers, anomaly detection systems, or behavioral ML (preferred)
Experience with adversarial machine learning or red-teaming (preferred)
Experience with interpretability or probes (preferred)
Experience with reinforcement learning (preferred)
Experience with high-performance, large-scale ML systems (preferred)

Additional Requirements

Visa sponsorship: Anthropic sponsors visas, but sponsorship is not guaranteed for every role
