Data Engineer with AI/ML Experience
Please note: Only H1B/H4-EAD visa candidates will be considered.
Job Description
We are seeking a Data Engineer with 8–10 years of experience and strong AI/ML expertise to design, develop, and maintain data pipelines and AI-driven applications. The ideal candidate will have hands-on experience with LLMs, data orchestration tools, and modern CI/CD practices for AI model deployment.
Key Responsibilities
- Design and build scalable data pipelines to support generative AI and machine learning workloads.
- Implement and maintain RAG (Retrieval-Augmented Generation) pipelines integrating LLMs into production systems.
- Develop and optimize data workflows using orchestration tools such as Airflow, Prefect, or Dagster.
- Collaborate closely with ML engineers and data scientists to operationalize AI models into production.
- Manage data storage, transformation, and retrieval for model training and inference.
- Set up and maintain CI/CD pipelines using GitHub Actions for code and model deployments.
- Ensure high data quality, monitoring, and performance across data platforms.
Required Skills and Experience (Mandatory)
- 8–10 years of total experience in Data Engineering, with exposure to AI/ML systems.
- Strong programming proficiency in Python.
- Solid experience with SQL and large-scale data processing pipelines.
- Hands-on experience with Large Language Models (LLMs) and Generative AI frameworks.
- Experience implementing and maintaining RAG pipelines.
- Familiarity with workflow orchestration tools (Airflow, Prefect, Dagster, or similar).
- Experience in CI/CD pipelines for AI/model deployment using GitHub and GitHub Actions.
- Strong understanding of data integration, transformation, and versioning best practices.
Preferred Qualifications
- Experience with cloud platforms (AWS, Azure, or GCP).
- Familiarity with vector databases and embedding-based retrieval.
- Knowledge of containerization (Docker, Kubernetes).
- Excellent problem-solving and collaboration skills.