Senior Scientist, Synthetic Data Generation
$168,000–$304,750 per year
Remote · United States or New York City, New York, United States
Job Summary
This Senior Scientist role focuses on building and advancing synthetic data generation pipelines for training frontier NLP/AI models. You will lead open-source library and SDK development within NVIDIA's NeMo ecosystem, steer multimodal synthetic data generation (text, code, structured, image, video, audio), publish research at top conferences, and mentor interns and junior researchers. Responsibilities include designing and maintaining scalable data-generation tools, evaluating data quality with automated pipelines, collaborating across research, engineering, product, and model teams, and contributing to open-source software with strong documentation and CI/CD practices. The role requires a PhD or equivalent experience, a strong publication record, a deep technical understanding of LLMs, data shaping for pre- and post-training, and inference frameworks, as well as experience building reusable software libraries and contributing to open source.
Required Qualifications
- PhD in Computer Science, Machine Learning, Statistics, or related field, or equivalent experience
- 3+ years of research experience in synthetic data generation, generative modeling, multimodal ML, or related areas
- Strong understanding of LLMs and data shaping for pre/post-training; experience with inference frameworks like vLLM or TGI
- Proven track record developing or maintaining software libraries used by a broad developer community
- Strong publication record at top venues (NeurIPS, ICML, ICLR, ACL or similar)
- Open-source contributions in ML or data tooling (preferred)
- Experience with multimodal generation or understanding (vision-language, document AI, video, or audio)
- Experience building scalable data pipelines for large-scale model training (throughput, distributed inference)
- Experience generating data for agentic, tool-use, or reinforcement-learning post-training
- Mentorship experience (e.g., mentoring interns/junior researchers)
- Commitment to diversity and an inclusive work environment