Human Data Operations Lead

San Mateo, California, United States · Onsite · Full Time · $100,000–$150,000 /yr · Posted 2 months ago · Hidden Gem · YC Startup

Role summary

Besimple is seeking a founding Human Data Operations Lead to establish and scale their AI-first data operations. This remote, full-time role involves designing processes, recruiting and training a global annotator bench, and implementing quality systems. The lead will translate customer needs into clear rubrics, utilize AI coding tools and basic Python/SQL for automation and analysis, and collaborate with Product/Eng to refine models and platform roadmaps. Success requires a company-builder mindset, customer-facing clarity, people leadership, hands-on data/AI tool experience, and a bias for execution. The role offers significant scope and equity in defining the company's operational foundation.

**Type:** Full-time Remote (US-first)\
**Team:** Founding Ops & Customer Delivery

## About Besimple

We provide the data layer for audio models. Our mission is to bring AI into the real world naturally. We believe that AI can meaningfully empower humanity **through the most natural interface: voice**. We’re a small, nimble team of passionate builders who believe humans must remain in the loop.

## The Role (Founding)

This is our **founding operations** role. You won’t “run a process”; you’ll **design the process, the playbooks, and the bar** for what world-class, AI-first data operations look like. You’ll take ambiguous customer needs, turn them into crisp rubrics and workflows, **recruit and train a global annotator bench**, and stand up the quality systems, dashboards, and SLAs that become Besimple’s operating backbone. As we grow, you’ll scale the org you built by hiring, coaching, and evolving best practices.

You’ll use **AI coding tools** (Copilot/Cursor/Codex) and lightweight **Python/SQL** to automate processes, analyze variance/drift, and accelerate delivery. You’ll partner with customers to **define and refine annotation requirements**, and with Product/Eng to shape UX, guardrails, and platform roadmap.

## What You’ll Do

* **Own customer programs end-to-end:** translate goals into schemas, rubrics, gold sets, and success metrics; pilot → scale with clear reporting and write-ups.
* **Define & refine requirements with customers:** run scoping sessions, lock criteria/edge-case taxonomies/IAA targets; iterate as models and prompts change.
* **Recruit, onboard, and train annotators:** source SMEs, design paid trials, build training artifacts, calibrate on gold data, and manage QA and arbitration loops.
* **Ship with AI-accelerated ops:** write quick scripts and notebooks for data transforms, audits, log parsing, schema reconciliation, and quality analytics.
* **Build the operating system:** SLAs, sampling plans, consensus/appeals, audit trails, and continuous calibration; make quality measurable and repeatable.
* **Close the loop:** drive prompt/model/policy experiments; surface insights to Product/Eng; propose UI tweaks and guardrails that raise signal-to-noise.

## What Will Make You Successful

* **Company-builder mindset:** you’ve built 0→1 programs or teams, created playbooks, and raised the bar for quality and speed.
* **Customer-facing clarity:** you convert open-ended asks into precise pass/fail criteria and aren’t afraid to propose a better spec.
* **People leadership:** you attract, calibrate, and motivate high-judgment annotators while holding a crisp, documented bar.
* **Hands-on with data & AI tools:** comfortable with AI coding assistants plus **basic Python/SQL** to answer questions fast and automate the dull bits.
* **Execution bias:** you prefer small pilots and rapid iteration over lengthy specs, and you over-communicate risks and status.

## Qualifications

* 2–4+ years in data/product/research operations for ML/AI, relevance, or safety—or equivalent “high-judgment at scale” experience.
* Track record **recruiting, onboarding, and training** annotators/raters with gold-set calibration and QA loops.
* Demonstrated **program ownership**: requirements, change management, stakeholder updates, and postmortems.
* Excellent writing: rubrics, edge-case guides, SOPs, and crisp weekly reports.

## Nice to Have

* Trust & Safety, RLHF/RLAIF, search/relevance, or regulated domains (medical, legal, finance).
* Experience designing evaluator UIs, prompt templates, or judgment tasks for LLMs/multimodal models.
* Familiarity with IAA stats, sampling methods, or experiment design.

## Compensation & Ownership

Founding-level role with meaningful equity and scope to **define what it means to build an AI-first data annotation company**—from playbooks and metrics to culture and hiring.
