
Data Automation Engineer
Role summary
Seeking a Data Automation Engineer to design and implement AI-driven automation solutions on AWS and Azure. Responsibilities include building scalable data pipelines that integrate cloud services, enterprise tools, and Generative AI for analytics, reporting, and customer engagement. Key tasks include developing ETL/ELT processes, using GenAI for data quality checks and LLM-assisted transformations, optimizing SQL, applying CI/CD best practices, and ensuring security and compliance. Experience with data engineering tools, cloud platforms, and GenAI frameworks is required, along with strong troubleshooting skills and the ability to obtain a Public Trust clearance.
Job Description
Location: Washington, DC - Remote (fully remote with potential quarterly travel to the Gaithersburg, MD / Washington, DC metro area)
Clearance: Public Trust (or willingness to obtain; must be a U.S. Citizen)
Note: NOT OPEN TO C2C OR W2 REFERRALS AT THIS TIME
Seeking a Data Automation Engineer to design and implement innovative, AI-driven automation solutions across AWS and Azure hybrid environments. Responsible for building intelligent, scalable data pipelines and automations integrating cloud services, enterprise tools, and Generative AI for mission-critical analytics, reporting, and customer engagement platforms.
Key Responsibilities
- Design and maintain data pipelines in AWS using S3, RDS/SQL Server, Glue, Lambda, EMR, DynamoDB, and Step Functions
- Develop ETL/ELT processes between DynamoDB, SQL Server (AWS), and AWS ↔ Azure SQL systems
- Integrate AWS Connect CRM data into enterprise data pipelines for analytics and reporting
- Engineer ingestion pipelines with Apache Spark, Flume, and Kafka for real-time and batch processing into Apache Solr and AWS OpenSearch
- Leverage Generative AI services (AWS Bedrock, Amazon Q, Azure OpenAI, Hugging Face, LangChain) for:
  - Vector generation and embeddings from unstructured data
  - Automated data quality checks, metadata tagging, and lineage tracking
  - LLM-assisted transformation and anomaly detection in ETL
  - Conversational BI interfaces for natural language access to Solr and SQL data
  - AI-powered copilots for pipeline monitoring and troubleshooting
- Implement SQL Server stored procedures, indexing, query optimization, and performance tuning
- Apply CI/CD best practices using GitHub, Jenkins, or Azure DevOps
- Ensure security and compliance via IAM, KMS encryption, VPC isolation, RBAC, and firewalls
- Support Agile DevOps processes with sprint-based delivery
Required Qualifications
- BS in Computer Science or related field with 2+ years of data engineering/automation experience
- Hands-on experience with SQL, SSIS, Python, Spark, Bash, PowerShell, AWS/Azure CLIs
- Experience with AWS services (S3, RDS/SQL Server, Glue, Lambda, EMR, DynamoDB)
- Familiarity with Apache Flume, Kafka, and Solr for large-scale data ingestion and search
- Familiarity with LLM/Gen AI frameworks (AWS Bedrock, Azure OpenAI, or open-source platforms/tools)
- Experience integrating REST API calls in data pipelines and workflows
- Familiarity with JIRA, GitHub / Azure DevOps / Jenkins for SDLC and CI/CD automation
- Strong troubleshooting and performance optimization skills in SQL, Spark, or other data engineering solutions
- Experience operationalizing Generative AI (GenAI Ops) pipelines, including model deployment, monitoring, retraining, and lifecycle management
- Good communication and presentation skills
- Ability to obtain a federal government Public Trust clearance
Preferred Qualifications (Plus)
- Certifications: AWS Data Engineer, AWS AI/ML Specialty, Azure AI Engineer, Databricks Certified Data Engineer
- Experience implementing RAG pipelines, embeddings, and vector search with Solr, OpenSearch, FAISS, Pinecone, pgvector, or SQL Server vector types
- Experience with GenAI-powered coding tools (Claude Code, OpenAI Codex, VS Code)
- Experience with multi-cloud data integration (AWS ↔ Azure SQL)
- Familiarity with Microsoft BizTalk and SSIS for SQL Server ETL workflows
- Knowledge of data lineage/governance tools (Purview, Unity Catalog, AWS Glue Catalog)
- Familiarity with Infrastructure-as-Code (Terraform, CloudFormation, Bicep) for automated deployments
- Experience with compliance frameworks (FedRAMP, PCI-DSS, HIPAA)