ML Engineer
Role summary
This MLOps Engineer role focuses on deploying and managing large-scale deep learning inference workloads in data center environments, leveraging AI hardware accelerators. The engineer will be responsible for provisioning, orchestrating, and managing inference pipelines, automating infrastructure with IaC, and optimizing performance and uptime. Key duties include monitoring systems, performing root cause analysis, and collaborating with vendors and internal teams to manage physical infrastructure. The role requires 2+ years of experience in software or systems development with an MLOps/DevOps focus, proficiency in scripting languages like Python or Bash, and hands-on experience with Linux, bare-metal servers, and AI/ML inference hardware.
MLOps Engineer
We're working with a world leader in next-generation AI hardware and semiconductor innovation on this exciting opportunity.
Step into the heart of the generative AI revolution by leading the deployment of rack-scale deep learning workloads. You will leverage cutting-edge Cloud AI inference accelerators to optimize end-to-end pipelines for high-performance, energy-efficient data center environments.
The Role
• Lead the provisioning, orchestration, and lifecycle management of massive deep learning inference pipelines using advanced AI accelerators.
• Deploy and maintain Infrastructure-as-Code (IaC) tools to automate the management of servers, storage, and networking components.
• Optimize rack-scale performance and uptime, ensuring high availability for critical Cloud AI data center deployments.
• Monitor system health and usage trends, driving root cause analysis (RCA) and scaling infrastructure to meet global demand.
• Collaborate with external vendors and internal engineering teams to commission and decommission equipment and manage physical infrastructure.
What You'll Need
• 2+ years of professional experience in Software Engineering or Systems Development (MLOps/DevOps focus preferred).
• Strong technical proficiency in Python, Bash, or C/C++ for automation and infrastructure scripting.
• Hands-on experience with Linux environments, bare-metal servers, and virtualization platforms.
• Practical knowledge of AI/ML inference workloads and managing hardware accelerators/GPUs in a data center context.
• Proven ability to troubleshoot complex networking, storage, and server issues in high-pressure environments.
What's On Offer
• Competitive base salary of $108,300 - $162,500 plus a discretionary annual bonus program.
• Opportunity for annual RSU (Restricted Stock Unit) grants, aligning your success with the company's growth.
• Comprehensive benefits package covering health, wellness, and professional development.
• The chance to work at the forefront of AI hardware acceleration and large-scale data center innovation.
Apply via Haystack today!