Inference Stack Engineer
Hybrid · Gdańsk, Pomerania, Poland
Job Summary
Design and build components of an AI inference stack, from high-level model representation to low-level execution.
Develop and extend a Python-based DSL for expressing AI workloads and kernels.
Work on compiler infrastructure, including IR design and transformation pipelines, graph lowering and optimization passes, and backend code generation for target execution environments.
Optimize model execution for latency, throughput, memory efficiency, and numerical stability.
Contribute to runtime systems responsible for model execution and scheduling.
Profile and analyze inference workloads to identify system bottlenecks.
Collaborate closely with hardware and systems engineers on execution efficiency.
Influence architecture decisions for next-generation AI execution platforms.