Data Engineer

San Francisco, California, United States | Onsite | Full Time | $180,000–$250,000 /yr | Posted 2 months ago | Visa sponsorship available

Role summary

MetaVoice is seeking a Data Engineer to build AI for voice conversations that are natural and human-like. This role involves developing infrastructure and distributed data pipelines to process large volumes of multimodal data for AI/ML products. The ideal candidate will have experience with batch processing, real-time streaming systems, and distributed orchestration tools like Spark, Kafka, Flyte, and Kubernetes. A bonus would be experience designing complex transformation pipelines for speech processing. This is an in-person role in San Francisco, offering significant equity and a chance to work on cutting-edge AI research for production scenarios.

At MetaVoice, our goal is an AI for voice conversations that's as natural as talking to a person.

Today's voice AI fails at real-world conversations. It’s slow, turn-based like a walkie-talkie, breaks with interruptions, and doesn’t understand emotion.

Developers can't build compelling experiences and users disengage. This limits voice AI to simple receptionist tasks and basic customer support, blocking meaningful services (sales, therapy, coaching) where dialogue and emotional intelligence matter most. Scaling current tech does not work.

We're building voice AI that's as natural as talking to a person. Our approach is a duplex speech-to-speech model that learns conversational behaviour directly from data.

MetaVoice is founded by:

Sid, founding engineer at Wayve.ai ($2B+ raised)
Vatsal, who co-created Alexa’s first AI voice

We’re passionate about building products people truly love.

Requirements and Experience

- Experience building infrastructure & distributed data pipelines to process 10s of TBs of data
- Experience working with multimodal data in the context of AI/ML products or systems
- Demonstrated ability to learn quickly and adapt in fast-paced environments
- Experience with batch processing, real-time streaming systems and distributed orchestration (e.g., Spark, Kafka, Flyte, Kubernetes)
- *(Bonus)* Experience designing complex transformation pipelines for speech processing (e.g., transcription, diarization, enhancement, filtering)

What we offer

Change the world (when we succeed)
Environment to do the best work of your life.
Small team, great people.
Opportunity to make cutting-edge research work from scratch for production scenarios.
Work with 100+ TBs of data and 10B+ parameter models.

Culture

We're an in-person team in San Francisco & love working together. It helps us learn from one another and make decisions quickly.
We ship fast & obsess over making customers happy.
We offer high autonomy, allowing everyone to do their best work.

Compensation & Benefits

Significant equity as a founding team member
$180K-$250K base
Immigration support (we are immigrants ourselves)
Fully covered medical, dental, and vision insurance
401(k)

Interview Process

We move with speed, and aim to go from first interview to offer within a week.

Vibe check (30 mins): chat with one of the founders about the vision and your background
Technical interviews (2 x 45 mins)
Onsite co-work: Spend half a day with us. We'll build and iterate on a data system together.
