Chegg · today

Senior Software Engineer - Model Training & AI Evals

Remote · India

Type: Full Time
Level: Senior Level
Education: Not Specified
Company size: Enterprise

Job Summary

Chegg is hiring a Senior Software Engineer to own end-to-end evaluation and benchmarking infrastructure for LLMs and base models, contribute hands-on to post-training pipelines, and lead domain-specific benchmarks and synthetic data generation to drive model improvements. Responsibilities include designing task-level evaluation frameworks, building comparative benchmarking pipelines, producing capability gap reports, tracking regressions across model versions, and collaborating with product, curriculum, and research teams to translate eval insights into post-training and data flywheel decisions. The role requires hands-on experience with SFT, RLHF, RLAIF, DPO, PPO, reward modeling, and data quality criteria, plus strong software engineering skills (Python, PyTorch/JAX) and experience with CI/CD and experiment tracking.

Required Qualifications

  • 5+ years of ML/AI engineering experience, with at least 2–3 years focused on large language models
  • Direct, hands-on experience at an LLM lab, AI research organization, or equivalent frontier AI team
  • Familiarity with the full model lifecycle: pre-training data, post-training alignment, eval, and production deployment
  • Deep practical expertise in post-training methods: SFT, RLHF, RLAIF, DPO, PPO
  • Experience with reward modeling, preference data curation, and quality control for alignment pipelines
  • Demonstrated experience designing LLM evaluation frameworks beyond standard benchmarks
  • Hands-on experience building synthetic data generation pipelines to address model capability gaps
  • Experience validating synthetic data quality through downstream model performance experiments
  • Proven track record of comparative benchmarking across leading foundation models
  • Experience training or fine-tuning vertical/industry-specific foundation models
  • Strong software engineering fundamentals: Python, PyTorch or JAX, distributed training
  • Publications or applied research contributions in LLM evaluation or alignment (preferred)
  • Experience with multi-modal models or agents with external tool/API use
  • Exposure to red-teaming, adversarial evaluation, or safety benchmarking
  • Experience with model distillation, speculative decoding, or inference optimization
  • Prior experience in an education, STEM, legal, biomedical, or enterprise software vertical