Data & Software Engineer

Virginia, United States | Onsite | Full Time | Posted 2 months ago | Visa sponsorship available

Role summary

We are seeking a Data & Software Engineer to join a small team focused on building complex data flows for a custom application. The role requires advanced Python programming, familiarity with Java, and a strong understanding of data security, privacy, governance, and compliance principles. You will be responsible for building production data pipelines and ETL workflows at scale, leveraging tools like Python, Spark, Docker, AWS services, SQL databases (MySQL, PostgreSQL), and orchestration tools such as Airflow. Experience with data catalogs, lineage tracking, geospatial data, and AI/ML integration is also essential. The ideal candidate will work with stakeholders to design solutions with minimal oversight and contribute to documentation and best practices.

Overview:

We are seeking a Data & Software Engineer to work with a small team building complex data flows for a custom application. The successful candidate will have advanced Python programming skills, familiarity with Java, an understanding of data security, privacy, governance, and compliance principles, and a demonstrated history of building production data pipelines and ETL workflows at scale. The candidate must have experience with the following:

Responsibilities:

- Building end-to-end data pipelines leveraging Python
- Using orchestration tools to deploy data pipelines, including configuring and updating Spark jobs
- Containerizing and deploying applications in cloud environments like AWS
- Working with MySQL and PostgreSQL, including performance tuning, schema design, and query optimization for complex analytical workloads
- Leveraging industry-standard tools for code control (Git, IaC, etc.)
- Working with data catalogs, tracking data lineage, and handling a variety of data formats, including geospatial
- Using Bash scripting for automation and data processing tasks
- Integrating AI/ML services and models
- Work with stakeholders to understand data requirements, assess feasibility, and design appropriate solutions with minimal oversight
- Leverage strong problem-solving and debugging skills for data quality issues, pipeline failures, and performance bottlenecks
- Leverage a background in large-scale data migration or platform modernization efforts
- Contribute to data engineering documentation, best practices, and design patterns

Qualifications:

- Active TS/SCI with polygraph required.
- Bachelor's degree in Computer Science, Engineering, Finance, or a related technical field, or equivalent practical experience.
- Minimum of 5 years' experience with:
  - Apache Spark and PySpark
  - Advanced Python skills (including Pandas and NumPy)
  - Docker, Podman
  - AWS S3, Lambda, and Step Functions
  - Apache Iceberg, Airflow, etc.
  - SQL (with Trino)
  - NoSQL, DynamoDB
  - Unity Catalog OSS, Apache Polaris
  - Apache Superset
  - Terraform or CloudFormation
  - OpenLineage
  - H3, PostGIS

Ready to apply?

You'll be redirected to VTG Defense's application page.
