Haystack Verified
Software, Developer Tools, Analytics

AI Engineer

San Diego, California, United States | Onsite | Full Time | $200,800–$301,200 /yr | Posted 1 month ago

Principal SW Engineer - LLM Serving (Cloud AI) | San Diego, California | On-site | $200,800 - $301,200

We're working with a global leader in semiconductor innovation and wireless technology on this exciting opportunity. Join a powerhouse engineering team dedicated to revolutionizing Cloud AI through high-performance LLM inference acceleration and next-generation silicon software.

As a Principal Engineer, you will lead the architecture and deployment of large-scale commercial software solutions. You’ll dive deep into LLM serving frameworks such as vLLM, along with PyTorch, to optimize carrier-grade machine learning workloads on multi-core SoC architectures. This is a high-impact role driving the future of AI infrastructure from R&D to global commercial deployment.

The Role

• Lead the design and development of high-performance software for LLM serving, utilizing frameworks like vLLM to maximize inference throughput.

• Architect and optimize neural networks across the full product lifecycle, focusing on multimodal and reasoning models for cloud-scale AI environments.

• Perform deep-dive bottleneck analysis and performance modeling on multicore architectures, including NoCs, caches, memory subsystems, and PCIe interfaces.

• Collaborate cross-functionally to bridge the gap between AI compiler technology and hardware acceleration, ensuring seamless integration with machine learning accelerators.

• Write and maintain high-performance, low-latency code in C++ and Python for sophisticated SoC architectures and math libraries.

What You'll Need

• 8+ years of professional software or systems engineering experience (or 6+ years with a PhD) in high-performance computing environments.

• Proven expertise in LLM serving frameworks such as vLLM, and strong PyTorch development skills for optimizing neural networks.

• Advanced proficiency in C++, Python, and Linux systems programming, with a focus on multicore architecture fundamentals (memory, bus, SoC).

• Deep understanding of linear algebra, math libraries, and neural network operators essential for machine learning acceleration.

• Master's or PhD in Computer Science or Computer Engineering with a track record of delivering complex commercial software projects at scale.

What's On Offer

• Competitive base salary of $200,800 - $301,200 plus a significant discretionary annual bonus program.

• Generous annual RSU grants, providing true ownership in a global tech pioneer.

• Comprehensive benefits package designed to support health, wealth, and work-life balance.

• Opportunity to work at the forefront of the AI revolution, influencing how the world’s largest models are served and scaled.

Apply via Haystack today!
