FocusKPI · 1 week ago

AI Infrastructure & Experience Engineer

$145,600–$164,320 per year

On-site · Mountain View, Santa Clara County, California, United States

Type: Contract
Level: Mid Level
Education: Bachelor's Degree
Company size: Startup
Industry: Data Science Services

Job Summary

Seeking an AI Infrastructure & Experience Engineer to deploy and optimize multiple LLMs and generative models on local inference hardware. Responsibilities include optimizing inference through quantization and caching, developing CUDA kernels, connecting inference backends to orchestration layers and frontends (e.g., OpenWebUI), rapidly prototyping demos that showcase model memory and context-aware web search, and integrating local AI compute with peripheral devices. The role requires hands-on experience with NVIDIA ecosystems, ARM64, C++, Python, Rust, CUDA, llama.cpp/TensorRT-LLM/Ollama, FastAPI, Docker/Kubernetes, and frontend tooling (React/Next.js). Minimum 3 years of relevant experience and a CS-related degree preferred.

Required Qualifications

  • Recent experience in model optimization
  • Hardware & Compute: Proven experience with NVIDIA ecosystems and ARM64 architecture
  • Systems Programming: Proficiency in C++, Python, and Rust; CUDA experience with custom kernels
  • AI/ML Frameworks: Experience with llama.cpp, TensorRT-LLM, Ollama; orchestration frameworks like LiteLLM
  • Software Engineering: FastAPI, Docker/Kubernetes, sandbox environments, low-latency API design
  • Full-Stack Prototyping: Frontend UIs with React/Next.js or similar