ML Infrastructure Engineer

Seattle, Washington, United States | Onsite | Full Time | $250,000–$450,000 /yr | Posted 2 months ago | Visa sponsorship available

Role summary

A well-funded AI startup is seeking an ML Infrastructure Engineer to build and scale production-grade infrastructure for their real-time visual conversational AI product. The role involves architecting and optimizing distributed systems, data pipelines, and GPU clusters using technologies like Kubernetes and Terraform. Candidates should have at least 2 years of experience in ML infrastructure or distributed systems, proficiency in Python and either Rust or Go, and experience supporting production services. This is a full-time, onsite position in Seattle, WA, with compensation ranging from $250K to $450K plus equity.

ML Infrastructure / Systems Engineer

*Seattle, WA (5 days onsite)*

*Compensation: $250K – $450K + equity*

*Relocation support and visa sponsorship available*

About:

Our client is a well-funded AI startup building real-time visual conversational AI that allows users to interact with AI through live video and voice experiences. They recently raised a $50M Series A and are scaling their engineering team ahead of their first major product release. The founding team includes researchers and engineers from leading AI labs and technology companies, and they are building a small, highly technical team focused on developing core AI systems and infrastructure.

What they are looking for

Our client is seeking engineers who enjoy building production-grade infrastructure for machine learning systems. The ideal candidate has experience designing scalable distributed systems, optimizing infrastructure for performance and reliability, and supporting machine learning workloads in production environments.

What you will work on

- Building and scaling the serving infrastructure for multimodal AI models
- Optimizing inference systems for latency, throughput, and cost
- Architecting real-time systems that support long-lived video and audio connections
- Designing distributed data pipelines for large-scale processing and evaluation
- Managing and optimizing GPU clusters using Kubernetes and Terraform
- Building CI/CD, evaluation, and deployment systems for safe and reliable model iteration
- Working closely with ML researchers and product engineers to bring new models into production

What they are looking for in candidates

- 2+ years of experience building machine learning infrastructure or distributed systems
- Strong experience building or operating large-scale data pipelines
- Experience supporting production services, including monitoring, incident response, and capacity planning
- Proficiency in Python and either Rust or Go
- Strong experience with Kubernetes, Terraform, and cloud infrastructure
- Experience working in fast-paced engineering environments

Experience with real-time systems, multimedia models, or GPU inference infrastructure is a strong plus.

This is a full-time role based in Seattle, and candidates should be open to working onsite five days per week.
