Systems Engineer, HPC (US & Canada)
Hybrid · New York City, New York, United States or San Francisco, California, United States
Job Summary
Mistral AI is looking for a Systems Engineer / System Administrator to design, operate, and scale large-scale Linux environments and HPC/cloud infrastructure. This hybrid role combines systems administration, automation, and engineering to support petabyte-scale storage and clusters with hundreds to thousands of nodes.
Responsibilities include operating and maintaining Linux environments, monitoring system health, troubleshooting incidents, and ensuring high availability; scaling clusters; working on storage systems (Ceph, Lustre, NFS); automating tasks with Python, Bash, Ansible, and Terraform; contributing to deployment, provisioning, and lifecycle management; shaping system design and architecture; and collaborating with HPC/infrastructure, Platform/DevOps, and research teams.
Ideal candidates have strong Linux administration skills, experience with HPC or cloud infrastructure, familiarity with job schedulers such as Slurm, and solid cross-domain troubleshooting skills. Knowledge of containers and orchestration (Kubernetes), storage, networking (Ethernet, InfiniBand), Infrastructure as Code, and GPU/AI/ML workloads is valuable.
Required Qualifications
- Strong Linux systems administration experience
- Experience working in large-scale environments (HPC clusters or cloud infrastructure)
- Experience with job schedulers (e.g., Slurm)
- Solid troubleshooting skills across systems, hardware, and networks