AI Engineer

San Diego, California, United States | Remote | Full Time | $111,300–$166,900 /yr | Posted 2 months ago

Role summary

A world-leading pioneer in wireless innovation and semiconductor technology seeks a Senior AI Platform Engineer to build and operate high-performance infrastructure for next-generation generative AI. This role involves designing massive-scale LLM hosting environments using Kubernetes and multi-cloud architectures (AWS, GCP, Azure). Responsibilities include architecting LLM serving infrastructure, scaling Kubernetes clusters with GPU scheduling, deploying agentic workflow orchestration systems, owning the observability stack, and developing hybrid-search solutions for RAG workflows. The ideal candidate will have 5-7 years of experience in Platform Engineering, MLOps, or SRE roles, with expertise in Kubernetes, Infrastructure as Code, LLM serving frameworks, Python, Linux, and cloud security.

Senior AI Platform Engineer

We're working with a world-leading pioneer in wireless innovation and semiconductor technology on this exciting opportunity.

Join a powerhouse engineering team building the high-performance infrastructure that powers the next generation of generative AI. You will design and operate massive-scale LLM hosting environments using Kubernetes and multi-cloud architectures across AWS, GCP, and Azure to move AI from research into production reality.

The Role

• Architect and manage large-scale LLM hosting and serving infrastructure using AWS Bedrock, GCP Vertex AI, and Azure AI Foundry.

• Build and scale production-grade Kubernetes clusters featuring GPU scheduling, advanced autoscaling, and high availability for ML workloads.

• Deploy and optimize agentic workflow orchestration systems such as n8n to automate complex AI-driven processes at enterprise scale.

• Own the observability stack, implementing deep-visibility systems using Elasticsearch, Prometheus, Grafana, and OpenTelemetry to monitor performance and latency.

• Develop hybrid-search solutions utilizing large-scale Elasticsearch clusters and vector search technologies to support RAG workflows.

What You'll Need

• 5–7 years of deep experience in Platform Engineering, MLOps, or SRE roles with a focus on cloud-native deployments.

• Expert-level hands-on experience with Kubernetes orchestration, including GPU resource management and Infrastructure as Code using Terraform and Helm.

• Proven track record hosting and serving LLMs in production using frameworks like vLLM, Triton, KServe, or Ray Serve.

• Strong proficiency in Python and scripting for automation, alongside deep Linux systems administration and cloud security (IAM/Networking) knowledge.

• Experience with vector databases (Milvus, Pinecone, or Elasticsearch Vector) and advanced service mesh architectures like Istio or Linkerd.

What's On Offer

• Competitive base salary of $111,300–$166,900 plus a significant annual discretionary bonus program.

• Generous annual RSU (Restricted Stock Unit) grants, allowing you to share in the long-term success of a global tech leader.

• A flexible "Remote-Friendly" work environment backed by a highly competitive benefits package for you and your family.

• The chance to work at the absolute cutting edge of the AI revolution with the resources of a global semiconductor giant.

Apply via Haystack today!
