Senior Data Engineer
# Role Summary
Tebra is seeking a Senior Data Engineer with an AI/ML focus to architect, build, and operate specialized data infrastructure. This hands-on role involves transforming healthcare data into high-quality training sets and real-time inference features, partnering closely with ML Engineers. You will own large data sub-systems, focusing on scalable pipelines for feature extraction, training data generation, and model monitoring. Key responsibilities include ensuring data availability through Feature Stores or Data Lakehouses, monitoring production pipelines for anomalies, leading design reviews, and developing frameworks for data quality. The role requires advanced Python and SQL proficiency, experience with modern data infrastructure like Spark and Kafka, and familiarity with MLOps concepts.
Tebra only initiates contact with candidates via email from an official Tebra email address (@tebra.com, @patientpop.com, or @kareo.com) or through our applicant tracking system, Greenhouse. We will only ask you to provide sensitive personal information through our official application portal — not via social media or text message. We do not conduct interviews via instant messaging.
# About the Role
As a Senior Data Engineer focused on AI/ML, you'll architect, build, and operate the specialized data infrastructure that powers Tebra's intelligent features. You will serve as a technical subject matter expert in data systems, partnering closely with Machine Learning Engineers to transform raw, messy healthcare data into high-quality training sets and real-time inference features.
This is a hands-on role where you will own large data sub-systems, translating business requirements into software solutions that accelerate our ability to deploy AI. You'll tackle technical challenges head-on—from data versioning to feature serving—ensuring our ML models are fed by reliable, scalable, and performant pipelines.
# Your Area of Focus
- Architect and write software that solves complex business problems, specifically designing scalable pipelines for feature extraction, training data generation, and model monitoring logs.
- Own and serve as a Subject Matter Expert (SME) for large software systems, such as the organization's Feature Store or Data Lakehouse, ensuring data availability for both experimentation and production inference.
- Continuously monitor data pipelines in production, detect data drift or quality anomalies, and implement automated recovery systems to ensure the reliability and freshness of features and training data over time.
- Lead Engineering Design Reviews, providing well-articulated and reasoned explanations for architecture decisions (e.g., choosing between batch processing for training vs. real-time streaming for inference).
- Write software frameworks that can be extended by others on the team, such as automated data quality checks and schema validation tools that prevent training-serving skew.
- Translate business requirements into software solutions, bridging the gap between raw data sources and the structured inputs needed for advanced ML models.
- Know when and how to optimize complex code, specifically tuning Spark jobs or SQL queries to handle the massive datasets required for Large Language Model (LLM) fine-tuning or deep learning.
- Collaborate cross-functionally, including with ML Engineers, to implement MLOps best practices such as data versioning, lineage tracking, and reproducibility.
- Scope tasks expertly, breaking down complex data infrastructure initiatives into manageable deliverables for the squad.
# Your Professional Qualifications
- 5+ years of professional software development experience.
- Deep technical subject matter expertise in 3+ general areas of software development (e.g., Big Data Processing, Distributed Systems, Data Modeling).
- 3+ years of hands-on experience in Data Engineering with a focus on supporting analytics or data science teams.
- Advanced proficiency in Python and SQL. You are comfortable writing production-grade code for data transformation and orchestration (not just scripts).
- Proven ability to architect and write software that enables ML at scale—moving beyond simple ETL to building robust data platforms.
- Strong background in modern data infrastructure relevant to AI (e.g., Spark, Airflow, Kafka, Vector Databases).
- Experience with Data Lake/Lakehouse architectures (e.g., Databricks, Snowflake, Delta Lake) and an understanding of how to structure data for efficient model training.
- Familiarity with MLOps concepts: You understand the difference between a training set and a test set, and you know what "data leakage" is and how to prevent it in the pipeline.
- Proven ability to deploy and maintain data systems in production with CI/CD, monitoring, and alerting.
- Excellent technical communication and a product mindset—comfortable driving initiatives from concept to delivery.
# Bonus Points
- Background in healthcare software operations or working with structured business data.
- Experience implementing or managing a Feature Store (e.g., Feast, Tecton).
- Familiarity with data version control tools (e.g., DVC, lakeFS).
- Published research or conference papers in data engineering, distributed systems, or machine learning.
- Experience with retrieval-augmented generation (RAG) pipelines or vector search infrastructure.
- Contributions to open-source data or ML infrastructure projects.
# About Tebra
Kareo and PatientPop have joined forces to become Tebra, the digital backbone for practice well-being. While our teams are still supporting both products, our new hires and current employees are now united as Team Tebra.
Tebra aims to unlock better healthcare by helping independent practices bring modernized care to patients everywhere. Well over 100,000 providers trust Tebra to elevate their patient experience and help them grow their practices. At Tebra, we're building the future of well-being together. That shared vision for tomorrow begins with compassion and humanity today.
# Our Values
## Start with the Customer
We get to know our customers, and their patients, and look at the world through their lens.
## Keep It Simple
Healthcare is too complex. We aim to simplify it for everyone.
## Stay Entrepreneurial
We reject the status quo and solve problems with creativity, perseverance, and a bias to action.
## Better Together
We are diverse, humble, and collaborative. We put the team first and win together.
## Celebrate Success
Life is short and joy is underrated. We take time to have fun and celebrate success.
# Perks & Benefits
United States: In addition to our healthcare benefits, we also offer amazing perks! Need work-from-home basics? We offer a discount through Dell! We also offer a number of resources to help you keep your mind and body healthy. Check out Gympass for a great workout, or the Telus Employee Assistance Program to find mental health resources, along with other resources for everyday occurrences.
Costa Rica: To assist with all of life's needs, Tebra also offers a wellness and childcare subsidy and a University/Education discount! We also offer a number of resources to help you keep your mind and body healthy. Check out Gympass for access to health and fitness apps, or Telus Employee Assistance Program to find mental health resources, along with other resources for everyday occurrences.
# Compliance & Privacy Disclosures
*NOTE: Tebra is an equal opportunity employer. All applicants will be considered for employment without attention to age, race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.*
*California residents who apply or are recruited for a job with us: please carefully review our California-specific Privacy Notice under the California Consumer Privacy Act here: https://www.tebra.com/privacy-policy/california-supplemental-notice/*
*If you would like to report a fraudulent Tebra job posting, please contact us at talentacquisition@tebra.com and consider reporting your experience to the FBI's Internet Crime Complaint Center or the Better Business Bureau to help keep others safe online, too.*
*As part of our commitment to a fair and efficient hiring process, Tebra utilizes BrightHire, an interview intelligence platform, for our phone and video screenings. This technology records and transcribes interviews to help us ensure consistency, reduce bias, and make more informed hiring decisions. By applying for this position, you acknowledge that your interview may be recorded.*