Vectara
Artificial Intelligence, Software, Machine Learning, Search Technology

Platform Engineer

United States | Hybrid | Full Time | Posted 2 months ago | Visa sponsorship available

### Role summary

We are seeking a Platform Engineer with 2+ years of experience in platform engineering, DevOps, SRE, or backend infrastructure roles. The ideal candidate will have strong Kubernetes experience, proficiency in infrastructure-as-code tools like Terraform or Helm, and hands-on experience with at least one major cloud provider (AWS preferred). You should be comfortable reading and contributing to backend codebases in languages such as Go, Python, or Java, and have working knowledge of CI/CD systems. Experience with ML inference workloads, streaming/messaging systems, data stores, GitOps, observability tooling, and supporting enterprise customers is also highly valued. This is a hands-on role responsible for owning and maintaining the infrastructure that runs our deploy-anywhere platform, including Kubernetes clusters, CI/CD pipelines, IaC, and the observability stack.

### Who you are
- 2+ years in platform engineering, DevOps, SRE, or backend infrastructure roles
- Strong Kubernetes experience (deployment, debugging, scaling — not just `kubectl apply`)
- Hands-on with infrastructure-as-code: Terraform, Helm, or Pulumi
- Experience with at least one major cloud provider (AWS preferred; GCP or Azure also valued)
- Proficiency in one or more of: Go, Python, Java. Comfortable reading and contributing to backend codebases
- Working knowledge of CI/CD systems (GitHub Actions, Bazel, ArgoCD, or similar)
- Solid fundamentals in Linux, networking, and distributed systems
- Experience deploying or operating ML inference workloads (model serving, GPU scheduling, vLLM, TensorFlow Serving, or similar)
- Familiarity with streaming/messaging systems (Kafka, Pulsar) and data stores (MariaDB/PostgreSQL, Aerospike, ClickHouse, OpenSearch)
- Experience with GitOps workflows (ArgoCD, Flux)
- Exposure to air-gapped or on-premises Kubernetes deployments
- Background in observability tooling (Prometheus, Grafana, OpenTelemetry, Datadog)
- Experience providing technical support or working directly with enterprise customers on infrastructure issues
- Comfort with AI-assisted development workflows and managing AI coding agents

### What the job involves
- You'll own the infrastructure that runs our deploy-anywhere platform, from Kubernetes clusters serving ML inference at scale to the CI/CD pipelines, IaC, and observability stack that keep it all reliable
- This is a hands-on role: you'll write Helm charts and Terraform one day, debug a Kafka consumer lag issue the next, and ship a backend service feature the day after
- You'll deploy across AWS, GCP, and on-premises (including air-gapped environments), and you'll participate in an on-call rotation supporting enterprise customers
- Build and maintain infrastructure-as-code (Terraform, Helm) for our AWS EKS and GCP GKE clusters, plus on-premises deployments (including Tanzu and air-gapped environments)
- Own CI/CD pipelines (GitHub Actions, Bazel, ArgoCD) and drive GitOps adoption
- Deploy, scale, and optimize ML/NLP inference workloads (vLLM, PyTorch, GPU scheduling with various Kubernetes scalers)
- Build and improve observability: Prometheus, Grafana, Datadog, and OpenTelemetry
- Collaborate with Field Engineering to support PoCs and platform deployments in customer cloud VPCs and on-prem environments
- Contribute to backend services (Java 21, Python, gRPC) and platform features
- Improve system reliability, scalability, and developer experience across the engineering org

### Benefits
- Comprehensive medical insurance
- Life & disability insurance
- Employee assistance program
- Free gym access at HQ
- Annual global summit
- Free snacks and beverages
- Bereavement leave
- Monthly stipend for phone/internet
- Maternity/Paternity leave
- Minimal meeting Fridays
- Flexible working hours
- Unlimited vacation time policy
- Home office stipend for remote employees
