Openmart Verified
Retail Technology, Artificial Intelligence, SaaS, E-commerce
First ML engineer
Foster City, California, United States · Onsite · Full Time · $150,000–$250,000 /yr · Posted 2 months ago · Hidden Gem · YC Startup
Role summary
Openmart is hiring its first Machine Learning Engineer to build human-like, real-time voice models. Responsibilities include designing and generating conversational audio training data, training and fine-tuning audio/speech models, and building evaluations for latency, overlap, and interruption. The ideal candidate has a strong ML background with PyTorch, experience with audio or speech models, and an intuition for timing, latency, and real-time systems. A startup/ownership mindset is crucial. Nice-to-have experience includes TTS, ASR, speech-to-speech, or streaming inference.
## Machine Learning Engineer (Audio Training)
We’re building **human-like, real-time voice models** focused on natural turn-taking, interruption handling, and low-latency speech.
### What you’ll do
* Design and generate **conversational audio training data**
* Train and fine-tune **audio / speech models**
* Build evaluations for **latency, overlap, and interruption**
* Own the loop: **data → training → eval → production**
### What we’re looking for
* Strong ML background (PyTorch)
* Experience with **audio or speech models**
* Solid intuition for **timing, latency, and real-time systems**
* Startup / ownership mindset
### Nice to have
* TTS, ASR, speech-to-speech, or streaming inference experience
**Competitive comp + meaningful equity. Founding-level ownership.**