Principal Data Platform Engineer

Portland, Oregon, United States | Remote | Full Time | Principal | $200,000–$300,000/yr | Posted 2 months ago

Role summary

Audiience is seeking a Principal Data Platform Engineer to design and build the data infrastructure powering their AI models in the publishing industry. This founding engineering role focuses on creating data pipelines, quality systems, and feedback loops from scratch, without legacy technical debt. The ideal candidate will have strong experience with production data pipelines (Spark, dbt, Airflow), ML data quality, annotation workflows, vector databases, and cloud data stacks. They will define data strategy, ensure data health through observability, and partner with ML research. This is a fully remote position requiring Pacific Time zone hours.

## About Audiience

We're transforming how content is created and trusted in publishing. We deliver technology that is accurate, scalable, and creative – built to elevate both craft and integrity. We attract the best in the business not through traditional methods, but through the solutions we create and the culture we've built.

## Our Culture

- Low ego, high confidence – We sharpen each other through continuous improvement
- Open communication – Even when it creates necessary conflict
- Systems thinking – We solve complex problems through collaboration
- Human-centered – We work because we love what we do, but we are human first
- Integrity and creativity – We win together or not at all

## The Role

The dirtiest secret in AI is this: your models are only as good as your data. Everyone knows it. Very few can actually do something about it. At Audiience, we're entering a niche market in publishing that has never had an AI-native solution - which means we get to define what good data looks like, from scratch, without the legacy technical debt that bogs down every other team working on similar problems.

We're building a founding engineering team of rare individuals who can think from first principles and build with precision. This is a seat reserved for someone who is obsessed with the data flywheel - who understands that in AI, data infrastructure isn't a support function, it's the competitive moat. You'll design and own the pipelines, quality systems, and feedback loops that make our models measurably better over time.

This role is not about maintaining someone else's stack. It's about architecting the system that powers a category-defining product. If you've ever wanted to build the data foundation for something important - from the very first line - this is that opportunity.

## What You'll Do

- Design and build the data ingestion, transformation, and labeling pipelines that power our AI models
- Define and enforce data quality standards, annotation schemas, and governance frameworks from the ground up
- Build feedback loops that capture real-world model performance and translate it back into training signal
- Partner closely with ML research and infrastructure to ensure data formats, volumes, and quality match training needs
- Create observability into data health - drift detection, quality degradation, and coverage gaps
- Shape the data strategy for a domain where ground truth doesn't exist yet - you'll help invent it

## What We're Looking For

### Core Technical Expertise

- Strong experience building production-grade data pipelines (Spark, dbt, Airflow, or equivalent)
- Deep understanding of data quality, schema evolution, and versioning in ML contexts
- Familiarity with annotation and labeling workflows and tooling (Label Studio, Scale AI, or equivalent)
- Experience with vector databases, embedding pipelines, and retrieval infrastructure
- Proficiency with cloud data stacks (S3/GCS, Snowflake, BigQuery, or equivalent)
- Solid understanding of how data decisions directly impact model performance and training dynamics

### Communication

- Communication excellence – Can write clear data contracts, schema documentation, and governance policies that engineers and non-technical stakeholders can actually use
- Demonstrated ability to explain data quality tradeoffs and their downstream model consequences in writing and in conversation

### Background

- Degree not required
- Prior experience as a data engineer, ML data platform engineer, or data infrastructure lead
- Startup or fast-moving environment experience is a plus

## Your Mindset

- Problem-solving prowess – You see problems others don't and solve them in ways others can't
- Tenacious learner – Self-taught capabilities and continuous improvement are in your DNA
- Systems thinker – You understand how complex systems interact and create elegant solutions
- Results-oriented – Bias toward flexibility, impact, and getting it done
- Collaborative by nature – You believe we can only win if we do it together

## Nice to Have

- Experience with RLHF data pipelines or human preference data collection
- Prior work in media, publishing, or content-heavy domains
- Contributions to open data tooling or data-centric AI initiatives
- Experience with synthetic data generation or augmentation strategies
- Previous startup or early-stage engineering experience
- Volunteer work

## Why Join Us

- Build the data foundation for something that has never existed - in a market that has never been touched
- Join a founding core of technical builders who treat data as the strategic asset it actually is
- Solve extraordinary problems with fewer resources than competitors - your impact is magnified
- Work with brilliant misfits who value craft, integrity, and creativity over politics
- Own the data flywheel - your work directly determines how fast and how well our models improve
- Continuous learning - work at the bleeding edge of AI data infrastructure with teammates who challenge and sharpen you

## Location

This role is fully remote; however, you must be willing to work Pacific Time zone hours. Occasional travel will be required for team workshops.

## Come Work With Us!

We offer competitive compensation and benefits, equity, generous time off to recharge, and flexible working hours.

We are an equal opportunity employer committed to building a diverse team. We welcome applications from all backgrounds, especially those who might not check every box but possess the savant-level problem-solving abilities we seek.
