Director of Data Engineering

United States | Onsite | Full Time | Director | Posted 2 months ago

Role summary

This Director of Data Engineering role is a senior individual contributor position focused on designing, building, and optimizing a core data platform for healthcare data. The position requires hands-on technical expertise in developing scalable data pipelines (ETL/ELT), implementing data quality checks, and driving architectural improvements. Responsibilities include transforming clinical data into common models (OMOP, FHIR), optimizing cloud resources, and collaborating with other teams. The role also involves technical leadership through code reviews and mentorship, with a strong emphasis on SQL, Python, cloud platforms (AWS, Azure, GCP), distributed processing (PySpark), and healthcare data standards.

About the Role

As Director of Data Engineering, you will be a hands-on technical expert responsible for designing, building, and optimizing the core data platform that transforms complex healthcare data into actionable insights. You will write high-quality, production-grade code to develop robust data pipelines, implement best-in-class data quality checks, and drive the technical evolution of our data architecture. This role is a senior individual contributor position, focused on solving the most challenging technical problems in clinical decision support and research, and on ensuring the integrity and performance of our large-scale clinical datasets.

Responsibilities

- Pipeline Development: Design, build, and maintain scalable and reliable data pipelines (ETL/ELT) for the ingestion and processing of large volumes of customer clinical data.
- Data Transformation: Directly implement code to efficiently map diverse customer datasets into our common data models (e.g., OMOP, FHIR), ensuring data fidelity and consistency.
- Architecture & Optimization: Identify and implement technical improvements to the data engineering architecture, including optimizing distributed data processing and cloud resource utilization for cost and performance.
- Quality & Governance: Develop and embed advanced data quality checks, monitoring, and validation frameworks to maintain the highest standards of data reliability in clinical datasets.
- Technical Collaboration: Partner with Software Engineering and Data Science teams to translate complex business requirements into robust, scalable technical solutions and data models.
- Mentorship & Standards: Act as a technical leader, establishing coding standards, performing code reviews, and mentoring mid-level engineers on deep technical subjects.

Minimum Qualifications

- Bachelor's degree in Computer Science, Data Engineering, or a related field; Master's degree preferred.
- 8+ years of hands-on experience in data engineering, focused on large-scale data systems.
- Expert-level proficiency in SQL (any dialect) and Python, with deep experience in cloud platforms such as AWS, Azure, or GCP.
- Extensive, proven track record of working hands-on with healthcare data, including advanced knowledge of relevant standards and data models (e.g., FHIR, OMOP).
- Deep technical mastery of distributed data processing and streaming frameworks like PySpark, and experience with workflow orchestration tools (e.g., Airflow, Dagster).

Ready to apply?

You'll be redirected to Cohort AI's application page.