Member of the Technical Staff, Biological Data
$150,000–$250,000 per year
Remote · United States or New York City, New York, United States
Job Summary
As a Member of the Technical Staff for Biological Data, you will build and curate training datasets for a biological data–driven reasoning model. You will own data capturing how proteins and molecules interact, extend datasets beyond public sources using biological and chemical reasoning, and design benchmarks to measure biologically meaningful capabilities. You will collaborate with model researchers to determine which biological signals to prioritize, how to sequence learning across modalities, and how to ensure data quality and coverage across scales. The role requires a PhD in a biological or chemical field with deep molecular understanding, experience handling large biological datasets, strong Python skills, and a data-centric research mindset. Experience with structure prediction, molecular docking, virtual screening, cheminformatics, or protein and molecular language models is a plus, as are publications in relevant fields. Benefits include competitive salary and equity, comprehensive medical coverage, and a culture emphasizing ownership, excellence, practicality, honesty, and fun.
Required Qualifications
- PhD in computational biology, biophysics, structural biology, chemistry, biochemistry, or a related biological or chemical field
- 2+ years of postdoctoral or industry research experience, or equivalent depth through a combined biology and computational background
- deep understanding of molecular interactions, protein structure, and biological data at the molecular level
- experience with large-scale biological or molecular datasets (sourcing, cleaning, integrating, analyzing)
- strong programming skills in Python and experience building computational pipelines for data processing at scale
- understanding of what machine learning models require from training data (coverage, quality, balance, evaluation)
- ability to construct training datasets capturing protein-molecule interactions and extend data beyond public databases
- ability to design benchmarks and evaluation strategies for biologically meaningful model capabilities
- ability to integrate data across biological scales and modalities
- curiosity about biological data sources, experimental methods, and molecular databases, and a drive to stay current with them
Bonus Qualifications
- experience with structure prediction, molecular docking, or virtual screening
- publications in computational biology, bioinformatics, or molecular informatics
- experience with cheminformatics or molecular data analysis
- experience with protein or molecular language models