Machine Learning Engineer
Role summary
We are seeking a Machine Learning Engineer to build large-scale neural simulators for modeling physical world dynamics. This role involves designing spatiotemporal architectures for long-horizon prediction, training multi-billion-parameter models on distributed GPU clusters, and owning the end-to-end training loop. You will improve model behavior through failure analysis, data curation, and scaling strategies. The ideal candidate has strong coding skills in Python, C++, or Rust, hands-on experience with video generation, diffusion, or multimodal models, and a deep understanding of model architecture and scaling laws. Experience with large-scale distributed training and comfort working at the intersection of modeling and systems are essential. This is a highly technical, individual contributor role focused on structured, long-horizon world modeling.
Machine Learning Engineer — World Models
We are building large-scale neural simulators that model the dynamics of the physical world, enabling robots to understand and anticipate events, so they can plan actions instead of reacting. The focus is on extending video generation systems into controllable, persistent world models capable of reasoning over time, causality, and interaction.
What You’ll Work On
- Designing spatiotemporal architectures for long-horizon prediction
- Training multi-billion parameter models on distributed GPU clusters
- Owning the end-to-end training loop (data, optimization, evaluation, iteration)
- Improving model behavior through failure analysis, data curation, and scaling strategies
Requirements
- Strong coding skills (Python, C++, or Rust)
- Hands-on experience with video generation, diffusion, or multimodal models
- Deep understanding of model architecture and scaling laws
- Experience running and debugging large-scale distributed training jobs
- Comfortable working at the intersection of modeling and systems
Nice to Have
- Prior work on world models, simulation, or model-based RL
- Experience with spatiotemporal transformers or latent dynamics models
- Familiarity with GPU optimization / distributed training frameworks
This is a highly technical, IC-heavy role focused on pushing beyond short-form generation into structured, long-horizon world modeling.