AI Engineer
Principal SW Engineer - LLM Serving (Cloud AI) | San Diego, California | On-site | $200,800 - $301,200
We're working with a global leader in semiconductor innovation and wireless technology on this exciting opportunity. Join a powerhouse engineering team dedicated to revolutionizing Cloud AI through high-performance LLM inference acceleration and next-generation silicon software.
As a Principal Engineer, you will lead the architecture and deployment of large-scale commercial software solutions. You’ll dive deep into LLM serving frameworks like vLLM, and into PyTorch, to optimize carrier-grade machine learning workloads on multicore SoC architectures. This is a high-impact role driving the future of AI infrastructure from R&D to global commercial deployment.
The Role
• Lead the design and development of high-performance software for LLM serving, using frameworks such as vLLM to maximize inference throughput.
• Architect and optimize neural networks across the full product lifecycle, focusing on multimodal and reasoning models for cloud-scale AI environments.
• Perform deep-dive bottleneck analysis and performance modeling on multicore architectures, including NoCs, caches, memory subsystems, and PCIe interfaces.
• Collaborate cross-functionally to bridge the gap between AI compiler technology and hardware acceleration, ensuring seamless integration with machine learning accelerators.
• Write and maintain high-performance, low-latency code in C++ and Python for sophisticated SoC architectures and math libraries.
What You'll Need
• 8+ years of professional software or systems engineering experience (or 6+ years with a PhD) in high-performance computing environments.
• Proven expertise in LLM serving frameworks (vLLM) and strong development skills in PyTorch for optimizing neural networks.
• Advanced proficiency in C++, Python, and Linux systems programming, with a focus on multicore architecture fundamentals (memory, bus, SoC).
• Deep understanding of linear algebra, math libraries, and neural network operators essential for machine learning acceleration.
• Master's or PhD in Computer Science or Computer Engineering with a track record of delivering complex commercial software projects at scale.
What's On Offer
• Competitive base salary of $200,800 - $301,200 plus a significant discretionary annual bonus program.
• Generous annual RSU grants, providing true ownership in a global tech pioneer.
• Comprehensive benefits package designed to support health, wealth, and work-life balance.
• Opportunity to work at the forefront of the AI revolution, influencing how the world’s largest models are served and scaled.
Apply via Haystack today!