AWS ML Engineer

Philadelphia, Pennsylvania, United States | Onsite | Full Time | Posted 2 months ago | Visa sponsorship available

Role summary

Elsevier is seeking an AWS ML Engineer with an extensive AWS/Cloud engineering background to join their Health platforms team. This role bridges Data Science and Engineering to operationalize experimental NLP/IR/GenAI models into secure, reliable, and scalable services. You will focus on AI-based features, search/ranking quality, and knowledge graph aware retrieval within a vast medical and scholarly data landscape. Key responsibilities include automating ML workflows, managing model registries, developing CI/CD for ML, implementing MLOps solutions on AWS, scaling SageMaker pipelines, designing GAR+RAG systems, building evaluation pipelines, and optimizing infrastructure costs. The role requires current experience in ML Engineering, MLOps, and shipping ML/GenAI systems to production, with a strong understanding of cloud platforms and search technologies.

AWS ML Engineer: we are not looking for an AI/ML Engineer. Candidates must have an extensive AWS/Cloud engineering background.

About the team: we power Elsevier’s Health platforms (Clinical Key AI, Sherpath AI, and AI-driven automated clinical and content workflows). You will bridge Data Science and Engineering to turn experimental NLP/IR/GenAI models into secure, reliable, and scalable services. Our systems operate over one of the world’s largest medical and scholarly landscapes.

About the role: as a Senior Machine Learning Engineer, you’ll work on AI-based features (GenAI, Agentic AI, RAG, etc.), search/ranking quality, and knowledge-graph-aware retrieval while enforcing content rights and editorial confidentiality.

Key Responsibilities

ML & LLM Engineering, Search and Recommendation Engines

Automate and orchestrate machine learning workflows across major cloud and AI platforms (AWS, Azure, Databricks, and foundation model APIs such as OpenAI).
Maintain and version model registries and artifact stores to ensure reproducibility and governance.
Develop and manage CI/CD for ML, including automated data validation, model testing, and deployment.
Implement ML Engineering solutions using popular MLOps platforms such as AWS SageMaker, MLflow, and Azure ML.
Scale end-to-end custom SageMaker pipelines.
Design and implement the engineering components of GAR+RAG systems (e.g., query interpretation and reflection, chunking, embeddings, hybrid retrieval, semantic search), and manage prompt libraries, guardrails, and structured output for LLMs hosted on Bedrock/SageMaker or self-hosted.
Design and implement ML pipelines that utilize Elasticsearch/OpenSearch/Solr, vector DBs, and graph DBs.
Build evaluation pipelines: offline IR metrics (NDCG, MAP, MRR), LLM quality metrics (faithfulness, grounding), and A/B testing.
Optimize infrastructure costs through monitoring, scaling strategies, and efficient resource utilization.
Stay current with the latest GenAI, NLP, and RAG research, and apply the state of the art in our experiments and systems.

Collaboration

Partner with Subject-Matter Experts, Product Managers, Data Scientists, and Responsible AI experts to translate business problems into cutting-edge data science solutions.
Collaborate and interface with Operations Engineers who deploy and run production infrastructure.

Qualifications

Current experience in ML Engineering, MLOps platforms, and shipping ML or search/GenAI systems to production.
Strong Python, Java, and/or Scala experience will be considered a plus.
Hands-on experience with major cloud vendor solutions (AWS, Azure, and/or Google).
Experience with Search/vector/graph technologies (e.g., Elasticsearch / OpenSearch / Solr / Neo4j).
Experience evaluating LLMs.
A strong understanding of the Data Science Life Cycle including feature engineering, model training, and evaluation metrics.
Background in health technology and/or medical content workflows is preferred.
Familiarity with ML frameworks, e.g., PyTorch, TensorFlow, PySpark.
Experience with large-scale data processing systems, e.g., Spark.
Experience with statistical analysis, machine learning theory and natural language processing.

Elsevier is a renowned global information analytics company that primarily focuses on providing scientific, technical, and medical (STM) research content, tools, and services. It is one of the largest publishers of academic journals and scholarly literature in the world. Elsevier operates in various domains, including science, technology, medicine, social sciences, and more. They publish a vast number of peer-reviewed journals covering a wide range of disciplines. These journals act as platforms for researchers and academics to share their findings and contribute to the advancement of knowledge in their respective fields.
