Sr. HPC & IT Systems Engineer (7723)
On-site · Kanata, Ontario, Canada
Job Summary
Senior HPC & IT Systems Engineer to manage a large-scale HPC farm (Cloud/On-Premises) and ensure seamless remote access to data centers, while leading IT projects, supporting R&D engineers, and optimizing EDA tool environments. Responsibilities include Linux cluster administration, LSF workload management, NetApp storage, virtualization (VMware), OS image management, NIS/LDAP migrations, remote access reliability, security compliance, and collaboration with global IT teams. Preferred skills include cloud/IaC (AWS/Azure/GCP, Ansible, Terraform), containerization (Docker, Kubernetes), CI/CD, monitoring stacks (Prometheus, Grafana, ELK), ITIL processes, experience with IC design communities, and Mandarin and English proficiency.
Required Qualifications
- Bachelor’s degree in CS, CE, EE, or related field, or equivalent experience
- 7+ years in IT support, systems administration, and infrastructure engineering, ideally in R&D/HPC
- Expert-level Linux administration (Red Hat preferred) with advanced scripting (Python, Perl, Shell)
- Extensive experience with HPC workload managers (e.g., LSF)
- Deep expertise in enterprise storage administration (e.g., NetApp)
- Proven experience with high-performance remote desktop/visualization solutions and WAN network optimization
- Demonstrated experience with system deployment, OS image management, and directory services (NIS/LDAP)
- Solid experience with virtualization platforms (e.g., VMware)
- Strong understanding of EDA tool ecosystems, design flows, and licensing
- Experience with design data management systems
- Strong network fundamentals (TCP/IP, routing, firewalls) and hardware troubleshooting
- Exceptional problem-solving, communication, and organizational skills
- Self-motivated, high-integrity, and results-oriented
Desired Qualifications
- RHCE/A
- VCP
- NCDA
- LSF Admin
- Cloud environment management (AWS, Azure, GCP)
- IaC tools such as Ansible or Terraform
- Containerization technologies (Docker, Kubernetes)
- CI/CD pipelines
- Monitoring and logging systems (Prometheus, Grafana, ELK stack)
- ITIL or similar service management frameworks
- Familiarity with Large Language Model (LLM) concepts in an R&D context
- Mandarin and English proficiency