Machine Learning Engineer

United States | Onsite | Full Time | Posted 2 months ago | Visa sponsorship available

Role Summary

We are seeking an AI/Machine Learning Engineer to build and maintain production-grade ML systems at scale. This role involves owning the end-to-end ML lifecycle, from translating business needs into ML problems to deploying, monitoring, and continuously improving models. Responsibilities include designing scalable ML pipelines, developing and training models, engineering robust feature pipelines, deploying models as low-latency APIs or streaming services, and implementing robust monitoring and alerting systems. The ideal candidate will have strong Python programming skills, a deep understanding of ML fundamentals, experience with ML frameworks, and proficiency in data processing, MLOps, and cloud platforms.

Role Overview

We are hiring an AI / Machine Learning Engineer to build production-grade ML systems that operate at scale. This role focuses on turning data and models into reliable, high-performance services that directly impact business outcomes.

You will own the end-to-end ML lifecycle, from problem framing to deployment, monitoring, and continuous improvement.

What You’ll Own

- Translate business problems into ML problem statements and measurable objectives
- Design and build scalable ML pipelines (batch + real-time)
- Develop and train models for prediction, ranking, and optimization
- Engineer robust feature pipelines and data transformations
- Deploy models as low-latency APIs or streaming services
- Implement continuous training and model retraining pipelines
- Monitor data drift, model drift, and system performance
- Conduct experimentation (A/B testing, offline/online validation)
- Optimize systems for accuracy, latency, and cost
- Ensure reliability with logging, monitoring, and alerting systems

Core Technical Requirements

- Strong programming in Python with production-level standards
- Deep understanding of machine learning fundamentals, supervised & unsupervised learning
- Ensemble methods (Random Forest, XGBoost)
- Solid foundation in statistics and probability
- Experience with feature engineering and data pipelines
- Hands-on with ML frameworks (Scikit-learn, TensorFlow, PyTorch)
- Experience deploying models using REST/gRPC APIs
- Understanding of system design (scalability, fault tolerance, latency)

Data & Systems Engineering Expectations

- Strong SQL and data modeling skills
- Experience with large-scale data processing (Spark or equivalent)
- Understanding of data pipelines (ETL/ELT workflows)
- Familiarity with streaming systems (Kafka or similar)
- Ability to debug and optimize data + model pipelines end-to-end

MLOps & Production Readiness

- Experience with CI/CD for ML systems
- Model versioning, experiment tracking, and reproducibility
- Monitoring pipelines for data quality and model performance
- Experience with containerization (Docker) and orchestration (Kubernetes)
- Handling rollback, failure recovery, and deployment strategies

Cloud & Infrastructure

- Experience with at least one cloud platform (AWS / Azure / GCP)
- Understanding of distributed systems and scalable architecture
- Ability to optimize compute cost vs performance trade-offs

Good to Have (Strong Differentiators)

- Deep learning (CNNs, transformers)
- Exposure to Generative AI / LLM systems
- Experience with recommendation systems or ranking models
- Knowledge of feature stores and online/offline consistency
- Familiarity with model explainability and fairness techniques

Qualifications

- Bachelor’s or Master’s in Computer Science, AI, Data Science, or related field
- 3–8+ years of experience building ML systems in production
- Strong fundamentals in math, algorithms, and data structures

How You Will Be Measured

- Model performance (accuracy, precision/recall, business metrics)
- System latency and throughput
- Reliability (uptime, failure rate, recovery time)
- Impact on business KPIs (revenue, cost, efficiency)

What Top Candidates Do Differently

- Think in systems, not just models
- Balance accuracy vs latency vs cost
- Build reusable, scalable ML infrastructure
- Communicate clearly with both engineers and business stakeholders

Typical High-Impact Projects

- Real-time recommendation and ranking systems
- Fraud detection and risk scoring pipelines
- Demand forecasting and supply optimization
- Customer churn prediction and targeting models
- Personalization engines for large-scale platforms

Reality of the Role (No Fluff)

This is not a “train model and done” role.

You are expected to:

- Work with messy, incomplete data
- Debug pipelines in production
- Handle scale and failures
- Deliver measurable business value

Why This Role Matters

You will directly influence:

- Product decisions
- Customer experience
- Revenue and cost optimization

Ready to apply?

You'll be redirected to CodeX Tech-IT LLC's application page.
