Software Engineer, Data Infrastructure & Acquisition - Chapel Hill, NC, USA
$140,000–$200,000 per year
On-site · Remote, Oregon, United States
Job Summary
Software Engineer on the data side of Speechify’s AI team, focused on collecting data to support model training, building high-quality petabyte-scale datasets, and optimizing ingestion workflows. Responsibilities include sourcing new audio data, expanding and operating the ingestion pipeline on GCP (managed with Terraform), collaborating with AI Scientists to improve cost, throughput, and quality, and helping shape the dataset roadmap for next-generation products. The ideal candidate has a BS, MS, or PhD in Computer Science, 5+ years of software development experience, strong scripting skills (bash and Python), experience with Docker and Infrastructure-as-Code, and familiarity with cloud infrastructure and data processing workflows. The role centers on building scalable data infrastructure to power speech and AI products, working closely with researchers and leadership in an entrepreneurial, asynchronous culture.
Required Qualifications
- BS in Computer Science or a related field, or equivalent
- MS or PhD in Computer Science or related field preferred
- 5+ years of industry software development experience
- Proficiency with bash and Python in Linux environments
- Proficiency with Docker and Infrastructure-as-Code concepts
- Professional experience with at least one major cloud provider (GCP)
- Experience with web crawlers or large-scale data processing workflows is a plus
- Strong communication skills (written and verbal)