Staff Data Engineer

Boston, Massachusetts, United States | Remote | Full Time | Staff | $150,000–$190,000 /yr | Posted 2 months ago | Visa sponsorship available

Role summary

Gradient AI is seeking a Staff Data Engineer for a fully remote opportunity to enhance its AI-powered insurance solutions. This role involves leading the improvement of scalable data infrastructure, platforms, and architecture, focusing on building reliable data pipelines and orchestration frameworks. The engineer will design and implement data systems for ML/AI models, process complex healthcare datasets (EHR, claims, genomic), and ensure compliance with regulations like HIPAA. Key responsibilities include ETL development using SQL, AWS, and big data technologies, and collaborating with data scientists to prepare data for modeling. The ideal candidate has 7+ years of experience, deep expertise in healthcare data, proficiency in Python and SQL, and hands-on experience with big data tools (Spark, Databricks, Snowflake) and orchestration frameworks (Airflow, Dagster, Prefect). Experience with DevOps practices (CI/CD, IaC, Docker, Kubernetes) is also required.

*This is a fully remote opportunity.*

Gradient AI:

Gradient AI is revolutionizing Group Health and P&C insurance with AI-powered solutions that help insurers predict risk more accurately, improve profitability, and automate underwriting and claims. Our SaaS platform taps into one of the industry's largest data lakes—tens of millions of policies and claims—to deliver deep, actionable insights. Trusted by leading carriers, MGAs, TPAs, and self-insured employers, Gradient AI has grown rapidly since our founding in 2018. Backed by $56M in Series C funding, we're scaling fast—and it's an exciting time to join the team.

About the Role:

We are seeking a Staff Data Engineer to lead the improvement and refinement of the scalable data infrastructure, data platforms, and data architecture that power our predictive analytics solutions. This role focuses on building reliable, high-performance data pipelines and orchestration frameworks that enable efficient data movement across systems. The ideal candidate brings deep expertise in modern data platforms and a solid understanding of big data tools and distributed systems, paired with experience working with complex healthcare datasets (e.g., claims or clinical data). You'll play a key role in shaping our data foundation, ensuring robustness, scalability, and operational excellence across the platform.

How you will make an impact:

Design, build, and implement data systems to support ML and AI models for our health insurance clients, ensuring strict compliance with healthcare data privacy and security regulations (e.g., HIPAA).
Develop tools for extracting, processing, and profiling diverse healthcare data sources, including EHRs, medical claims, pharmacy data, and genomic data.
Collaborate with data scientists to transform large volumes of health-related and bioinformatics data into modeling-ready formats, prioritizing data quality, integrity, and reliability in healthcare applications.
Build and maintain infrastructure for the extraction, transformation, and loading (ETL) of data from a variety of sources using SQL, AWS, and healthcare-specific big data technologies and analytics platforms.
Ensure data pipelines meet the unique requirements of health, medical, and bioinformatics data processing, including translating complex medical and biological concepts into actionable data requirements.

Skills needed to succeed:

BS in Computer Science, Bioinformatics, or another quantitative discipline with 7+ years of relevant working experience.
Deep expertise in health, medical, and bioinformatics data, including real-world healthcare datasets, with a strong understanding of the complexities and challenges of processing medical and biological information.
Proficiency in Python and SQL within a professional environment.
Hands-on knowledge of big data tools like Apache Spark (PySpark), Databricks, Snowflake, or similar platforms.
Experience with data orchestration frameworks such as Airflow, Dagster, or Prefect.
Experience with modern DevOps practices, including CI/CD, IaC (Terraform), containerization (Docker/Kubernetes), and cloud environments (AWS preferred).
Knowledge of data transformation tools, such as dbt, is a plus.

What We Offer:

A fun, team-oriented startup culture.
Generous stock options - we all get to own a piece of what we're building.
Unlimited vacation days.
Flexible schedule that supports working from home.
Full benefits package, including medical, dental, vision, 401(k), paid paternal leave, and more.
Ample opportunities to learn and take on new responsibilities.

We are an equal opportunity employer.

Salary Range: $150,000–$190,000 annual base salary.
This role is also eligible for an annual performance bonus, equity grant, and a comprehensive benefits package. In accordance with the Massachusetts Pay Transparency Law, we are providing a good-faith salary range for this position at the time of posting. The actual salary offered will depend on the level at which the candidate is hired, as well as their experience, skills, qualifications, and location. Compensation may grow over time through merit-based increases, promotions, and company-wide adjustments. If your salary expectations fall outside this range, we still encourage you to apply so we can have a conversation.
