Programmers.io Verified
IT Services, Software Development, Staff Augmentation, Consulting

Data Engineer - Python/PySpark

Irving, Texas, United States · Hybrid · Full Time · Posted 1 month ago · Visa sponsorship available

Job Role: Data Engineer - Python/PySpark

Location: Irving TX (3 Days onsite/week)

Duration: Full-Time

Job Description:

  • Strong hands-on development experience in Python, PySpark, and SQL.
  • Experience building large-scale ETL/ELT pipelines for structured and unstructured data.
  • Deep understanding of Spark and distributed computing fundamentals (transformations, shuffles, optimization).
  • Experience with big data frameworks such as Hadoop and Spark.
  • Proficiency with Git-based repositories (Bitbucket / GitHub).
  • Experience working with AWS, Azure, or GCP environments.
  • Strong understanding of database design, data modeling, and warehouse schemas (star/snowflake).
  • Experience with CI/CD automation and pipeline development.
  • Strong analytical and troubleshooting skills for resolving complex data issues.
  • Ability to collaborate with cross-functional teams and convert business requirements into technical solutions.

Responsibilities:

  • Design, develop, and maintain robust, scalable ETL/ELT pipelines.
  • Write efficient, reusable, and scalable code in Python and PySpark for distributed data processing.
  • Review existing data engineering code and identify opportunities for refactoring or performance improvement.
  • Implement data validation, cleansing, reconciliation, and quality checks across the data lifecycle.
  • Collaborate with IT and business stakeholders to understand data requirements and translate them into solutions.
  • Monitor pipeline performance, troubleshoot failures, and optimize for latency, throughput, and cost.
  • Participate in code reviews, enforce coding standards, and contribute to engineering best practices.
  • Build and maintain CI/CD pipelines for testing, packaging, and deployment of data pipelines.
  • Ensure data reliability, security, and consistency across environments.
  • Work with cloud services and big data platforms to support modern data architecture.