FocusKPI · 1 week ago

AI Infrastructure & Experience Engineer

FocusKPI

$145,600–$164,320 per year

On-site · Mountain View, Santa Clara County, California, United States

Type

Contract

Level

Mid Level

Education

Bachelor's Degree

Company size

Startup

Industry

Data Science Services

Job Summary

The AI Infrastructure & Experience Engineer will deploy and optimize multiple LLMs and generative models on local inference hardware. Responsibilities include inference optimization through quantization and caching, CUDA-based kernel development, bridging inference backends with orchestration layers and frontends (e.g., OpenWebUI), rapid prototyping of demos that showcase model memory and context-aware web search, and integrating local AI compute with peripheral devices. The role requires hands-on experience with NVIDIA ecosystems, ARM64, C++, Python, Rust, CUDA, llama.cpp/TensorRT-LLM/Ollama, FastAPI, Docker/Kubernetes, and frontend tooling (React/Next.js). Minimum of 3 years of relevant experience and a CS-related degree preferred.

Required Qualifications

Recent experience in model optimization
Hardware & Compute: Proven experience with NVIDIA ecosystems and ARM64 architecture
Systems Programming: Proficiency in C++, Python, and Rust; CUDA experience with custom kernels
AI/ML Frameworks: Experience with llama.cpp, TensorRT-LLM, and Ollama; orchestration frameworks such as LiteLLM
Software Engineering: FastAPI, Docker/Kubernetes, sandbox environments, low-latency API design
Full-Stack Prototyping: Frontend UIs with React/Next.js or similar
