Realign Verified
Software, Business Process Management, Enterprise Architecture

ML Engineer / AI Operations

Morristown, New Jersey, United States | Onsite | Temporary | $170,000 /yr | Posted 2 months ago

Role summary

This role is for an ML Engineer focused on AI Operations, based in Morristown, NJ. The primary responsibility is to own and operate the CI/CD pipelines for existing ML services, standardizing deployment strategies such as blue/green and canary releases with automated rollbacks. Key duties include running model/data drift and performance monitoring, defining alerts and retraining triggers, and building production dashboards and incident workflows. The engineer will provide L2/L3 support for production ML systems, conduct root-cause analysis, and maintain operational documentation. They will also productionize research notebooks into maintainable services, operate model-serving APIs and batch scoring jobs, integrate models with CI/CD, observability, and monitoring stacks, and enforce traceability through experiment and model registries.

Morristown, New Jersey 07960 | Posted April 2nd, 2026

Job Type: Full Time

Job Category: IT

Job Description

Job Title: ML Engineer - AI Operations
Location: Morristown, NJ (Onsite)
Full-time

Skill: ML Engineer - AI Operations

Key responsibilities:

Own and operate CI/CD for existing ML services across dev/test/prod; standardize blue/green and canary releases with automated rollbacks.

Run model/data drift and performance monitoring with SLAs; define alerts, thresholds, and retraining triggers.

Build and maintain production dashboards, alerts, and incident workflows; codify on-call runbooks and escalation paths.

Partner with onshore model owners to diagnose metric degradation and land mitigations aligned to governance and controls.

Provide day-to-day L2/L3 support for production ML: triage, root-cause analysis, permanent fixes, and post-incident reviews.

Own operational documentation: runbooks, standard operating procedures, and recurring health checks.

Coordinate hotfixes and safe rollbacks with onshore teams; verify recovery via automated smoke tests.

Harden and productionize research notebooks into maintainable, testable services with CI, unit/integration tests, and linting.

Operate and evolve model-serving APIs and batch scoring jobs; integrate with enterprise schedulers and data platforms.

Ensure models are fully integrated into CI/CD, observability, and monitoring stacks; enforce traceability with experiment and model registries.

Validate successful delivery of model outputs to apps, chatbots, reports, and downstream systems with contract tests and data quality checks.

Required Skills:

Git/GitLab, Python, SQL, MLflow, Power BI, Snowflake.

OLAP/OLTP data modeling and architecture.

API frameworks (FastAPI/Flask).

Nice to have:

Modern ELT tools (Fivetran/Airbyte).

Streaming/real-time data pipelines (e.g., Kafka, Kinesis, Redpanda).

Production ML service operations experience (experience in broader full-stack environments is a plus).
