
Data Engineer II
At SCP Health, What You Do Matters
As part of the SCP Health team, you have an opportunity to make a difference. At our core, we work to bring hospitals and healers together in the pursuit of clinical effectiveness. With a portfolio of over 8 million patients, 7,500 providers, and 400 healthcare facilities across 30 states, SCP Health is a leader in clinical practice management spanning the entire continuum of care, including emergency medicine, hospital medicine, wellness, telemedicine, intensive care, and ambulatory care.
Why You Will Love Working Here
- Strong track record of providing excellent work/life balance.
- Comprehensive benefits package and competitive compensation.
- Commitment to fostering an inclusive culture of belonging and empowerment through our core values - collaboration, courage, agility, and respect.
Responsibilities
- Pipeline Development: Design, build, and maintain scalable data pipelines to ingest internal application data (Scheduling, HR, Finance, etc.) and external clinical data (EHR extracts, HL7, FHIR) into the Data Platform.
- Medallion Lifecycle Management: Design and implement transformation logic to move data from Bronze through Silver to Gold, ensuring adherence to data quality standards, documentation expectations, and business rules.
- Domain Stewardship: Partner with business and technical stakeholders to map source system values to standardized enterprise models, supporting core workflows and consistent definitions across teams.
- Performance, Reliability & Cost Optimization: Monitor and optimize Snowflake usage, query performance, and pipeline latency; apply practical cost controls (e.g., right-sizing warehouses, resource monitors) and ensure dependable batch and near-real-time data availability.
- Data Governance, Security & Access Controls: Implement HIPAA-compliant data handling practices, including role-based access control, row-level security, data masking, and audit logging; support access request validation and periodic access reviews for sensitive datasets.
- Integration, Source Onboarding & Reusable Patterns: Partner with the Facility Integration team and App Dev teams to onboard new sources; profile data, document source-to-target mappings, and build reusable ingestion/validation patterns that support reliable handoffs and downstream consumption.
- On-Call Support & Incident Response: Participate in an on-call rotation to respond to pipeline failures and data availability issues; triage incidents, communicate status to stakeholders, and drive issues to resolution with appropriate post-incident follow-up.
- Data Quality, Testing & Service Levels: Develop and maintain automated tests and reconciliation checks (e.g., row counts, referential integrity, threshold checks); define and monitor data freshness, completeness, and availability targets for key datasets.
- Documentation, Metadata & Standards: Create and maintain pipeline documentation, data dictionaries, runbooks, and key metadata (definitions, owners, refresh cadence) to improve discoverability, auditability, and consistent engineering practices.
- Release & Change Management: Coordinate safe deployment of data pipeline and model changes across environments (dev/test/prod), ensuring version control, peer reviews, and rollback plans are followed.
- Requirements & Delivery Partnership: Work with business and analytics partners to clarify requirements, define acceptance criteria, and deliver curated datasets that support reporting, dashboards, and downstream operational workflows.
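As an illustration of the reconciliation and threshold checks described above, here is a minimal sketch in Python. All function names, tolerances, and thresholds are hypothetical examples, not SCP Health's actual implementation.

```python
# Hypothetical data quality checks of the kind a pipeline might run after each
# load: row-count reconciliation, freshness, and null-rate thresholds.
from datetime import datetime, timedelta, timezone


def check_row_counts(source_count: int, target_count: int, tolerance: float = 0.0) -> bool:
    """Pass when the target row count is within `tolerance` (a fraction) of the source."""
    if source_count == 0:
        return target_count == 0
    return abs(source_count - target_count) / source_count <= tolerance


def check_freshness(last_loaded_at: datetime, max_age_hours: float) -> bool:
    """Pass when the most recent load is newer than the agreed freshness target."""
    age = datetime.now(timezone.utc) - last_loaded_at
    return age <= timedelta(hours=max_age_hours)


def check_null_rate(null_count: int, total_count: int, max_null_rate: float) -> bool:
    """Pass when a column's null rate stays under the agreed threshold."""
    if total_count == 0:
        return True
    return null_count / total_count <= max_null_rate
```

In practice such checks would run as automated tests (for example, dbt tests or scheduled tasks) and page the on-call engineer when a target is missed.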
Knowledge, Skills, and Abilities
- Core SQL & Programming: Strong proficiency in writing and optimizing complex SQL queries. Competency in Python for data scripting, API interactions, and basic automation tasks.
- Snowflake Proficiency: Solid working knowledge of Snowflake fundamentals, including virtual warehouses, stages, and the use of Tasks and Streams for change data capture (CDC).
- Data Transformation & Medallion Logic: Practical experience using dbt (data build tool) to move data through Bronze, Silver, and Gold layers. Ability to apply business logic to transform raw clinical data into structured, joinable tables.
- Healthcare Data Literacy: Familiarity with healthcare-specific data formats (HL7, FHIR, or flat-file EMR extracts). Understanding of how clinical data (diagnoses, procedures, provider IDs) supports clinical, financial, and operational workflows.
- Data Quality, Observability & Operations: Ability to implement automated tests and monitoring (e.g., null/threshold checks, freshness checks, alerts) and troubleshoot pipeline issues using root-cause analysis and runbooks to restore service safely.
- Problem Solving: A disciplined approach to troubleshooting data discrepancies between source systems and the Data Platform.
- Data Modeling & Warehousing Concepts: Knowledge of dimensional modeling (star/snowflake schemas), slowly changing dimensions (SCD), and the tradeoffs between normalized and denormalized designs for analytics and reporting workloads.
- ETL/ELT & Data Integration Patterns: Ability to design reliable batch and near-real-time loads using incremental strategies (e.g., watermarking, CDC patterns), idempotent processing, and backfill/reprocessing techniques; working knowledge of RESTful APIs, authentication (API keys/OAuth), and common data formats (JSON, CSV, Parquet).
- Engineering Practices (CI/CD, Version Control & Code Quality): Familiarity with automated build/test/deploy pipelines for analytics engineering (e.g., dbt jobs), environment promotion (dev/test/prod), and rollback approaches; ability to follow code review standards and create reusable components (macros, shared modules) using consistent conventions.
- Documentation, Metadata & Communication: Skill in producing clear technical documentation (data dictionaries, lineage notes, operating procedures), maintaining key metadata (definitions, ownership, refresh cadence), and explaining data concepts and tradeoffs to technical and non-technical partners.
- Stakeholder Partnership: Ability to gather requirements, ask clarifying questions, and translate business needs (e.g., revenue cycle, coding, scheduling) into scalable data solutions with well-defined acceptance criteria.
- Prioritization & Ownership: Ability to manage multiple initiatives, communicate progress and risks early, and take end-to-end ownership from design through production support in an Agile environment.
- Security, Privacy & Access Controls: Demonstrated discretion handling sensitive data; working knowledge of least-privilege RBAC and auditing concepts; ability to follow HIPAA-aligned handling practices and escalate potential compliance concerns appropriately.
- Cost Awareness (Snowflake/FinOps): Ability to interpret warehouse usage and query profiles, apply practical cost controls (resource monitors, scheduling, right-sizing), and balance performance with consumption.
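To make the incremental-load pattern named above concrete, here is a minimal watermarking sketch in Python. The `updated_at` column and all names are hypothetical; a real pipeline would pair this with a MERGE keyed on the business key so replays stay idempotent.

```python
# Hypothetical watermark-based incremental extraction: select only rows
# changed since the last high-water mark, then advance the mark.
from typing import Iterable, Tuple, List, Dict


def incremental_extract(rows: Iterable[Dict], last_watermark: str) -> Tuple[List[Dict], str]:
    """Return rows updated after `last_watermark` and the new high-water mark.

    Rows are assumed to carry an ISO-8601 `updated_at` string, so lexical
    comparison matches chronological order. Re-running with the same
    watermark yields the same batch, which keeps reprocessing safe.
    """
    new_rows = [r for r in rows if r["updated_at"] > last_watermark]
    new_watermark = max((r["updated_at"] for r in new_rows), default=last_watermark)
    return new_rows, new_watermark
```

A backfill is then just a deliberate reset of the stored watermark to an earlier point in time.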
Pay Range
74,840.00 - 123,816.00 USD annually
This range represents the anticipated base salary for this role. Actual compensation will be determined based on experience, qualifications, and internal equity considerations.
We offer a comprehensive benefits package designed to support your health, financial well-being, and work-life balance, including medical, dental, and vision insurance, a 401(k) plan with a company match, paid time off and holidays, professional development support, and employee wellness resources.
Visit our website for further information.
https://myscpbenefits.com/
Login name: corp-guest
Password: weheal