AI Engineer

San Diego, California, United States · Remote · Full Time · $111,300–$166,900 /yr · Posted 2 months ago

Role summary

A leading semiconductor technology company focused on wireless innovation is seeking a Senior AI Platform Engineer. The role involves designing and operating massive-scale LLM hosting environments using Kubernetes and multi-cloud architectures (AWS, GCP, Azure) to bring generative AI from research to production. Key responsibilities include architecting LLM serving infrastructure, building scalable Kubernetes clusters with GPU support, deploying agentic workflow orchestration systems, owning the observability stack, and developing hybrid-search solutions for RAG workflows. The position requires 5–7 years of experience in Platform Engineering, MLOps, or SRE, with expertise in Kubernetes, Infrastructure as Code, Python, and cloud security.

Senior AI Platform Engineer

We're working with a world-leading pioneer in wireless innovation and semiconductor technology on this exciting opportunity.

Join a powerhouse engineering team building the high-performance infrastructure that powers the next generation of generative AI. You will design and operate massive-scale LLM hosting environments using Kubernetes and multi-cloud architectures across AWS, GCP, and Azure to move AI from research into production reality.

The Role

• Architect and manage large-scale LLM hosting and serving infrastructure using Amazon Bedrock, Google Cloud Vertex AI, and Azure AI Foundry.

• Build and scale production-grade Kubernetes clusters featuring GPU scheduling, advanced autoscaling, and high availability for ML workloads.

• Deploy and optimize agentic workflow orchestration systems like n8n to automate complex AI-driven processes at enterprise scale.

• Own the observability stack, implementing deep-visibility systems using Elasticsearch, Prometheus, Grafana, and OpenTelemetry to monitor performance and latency.

• Develop hybrid-search solutions utilizing large-scale Elasticsearch clusters and vector search technologies to support RAG workflows.

What You'll Need

• 5–7 years of deep experience in Platform Engineering, MLOps, or SRE roles with a focus on cloud-native deployments.

• Expert-level hands-on experience with Kubernetes orchestration, including GPU resource management and Infrastructure as Code using Terraform and Helm.

• Proven track record hosting and serving LLMs in production using frameworks like vLLM, Triton, KServe, or Ray Serve.

• Strong proficiency in Python and scripting for automation, alongside deep Linux systems administration and cloud security (IAM/Networking) knowledge.

• Experience with vector databases (Milvus, Pinecone, or Elasticsearch vector search) and service mesh architectures like Istio or Linkerd.

What's On Offer

• Competitive base salary of $111,300–$166,900 plus a significant annual discretionary bonus program.

• Generous annual RSU (Restricted Stock Unit) grants, allowing you to share in the long-term success of a global tech leader.

• A flexible "Remote-Friendly" work environment backed by a highly competitive benefits package for you and your family.

• The chance to work at the absolute cutting edge of the AI revolution with the resources of a global semiconductor giant.

Apply via Haystack today!
