T. Mims Corp.
Construction, Civil Engineering

Senior AI/ML Engineer

Lakeland, Florida, United States | Onsite | Full Time | Senior | $175,000–$230,000/yr | Posted 2 months ago

Role summary

We are seeking a Senior AI/ML Engineer with 4+ years of experience in building, fine-tuning, and deploying large language models (LLMs) in production. This role focuses on designing scalable ML systems, optimizing inference efficiency, and delivering production-grade AI solutions across the full ML lifecycle, from distributed GPU training to cloud deployment. Responsibilities include architecting end-to-end ML systems, optimizing LLM performance (latency, throughput, cost), implementing prompt engineering and RAG systems, and deploying APIs. The ideal candidate will have a Master's or PhD in a related field, strong Python skills, and experience with ML frameworks like PyTorch or TensorFlow, cloud platforms (AWS/GCP), and MLOps. Success involves delivering scalable, cost-efficient, and reliable LLM systems with robust monitoring and safety features.

Overview

We are seeking a highly motivated Senior AI/ML Engineer with 4+ years of experience building, fine-tuning, and deploying large language models (LLMs) in production. This role is focused on designing scalable, high-performance ML systems, improving inference efficiency, and delivering reliable, production-grade AI solutions.

You will work across the full ML lifecycle, from distributed GPU training to cloud deployment, optimizing cost and performance and collaborating cross-functionally to deliver impactful AI products.

Key Responsibilities

  • Design, build, and deploy scalable AI/ML systems for production environments
  • Optimize LLM performance across latency, throughput, memory usage, and cost
  • Architect end-to-end ML systems, making tradeoffs across performance, scalability, and reliability
  • Develop, fine-tune, and evaluate LLMs and deep learning models
  • Implement advanced prompt engineering strategies to improve output quality, consistency, and reliability
  • Build and optimize retrieval-augmented generation (RAG) systems, including integration with vector databases
  • Apply model optimization techniques such as quantization, pruning, batching, and efficient inference strategies
  • Deploy and maintain production-grade APIs and model endpoints (e.g., FastAPI)
  • Design and maintain distributed data pipelines and cloud-based ML infrastructure
  • Build and maintain MLOps pipelines, including experiment tracking, model versioning, and CI/CD workflows
  • Implement monitoring, logging, and alerting systems for model performance, drift detection, and system reliability
  • Develop robust evaluation frameworks, including offline evaluation, online testing, and A/B experimentation
  • Implement safety, alignment, and guardrail mechanisms to mitigate hallucinations, bias, and unsafe outputs
  • Optimize infrastructure and deployment strategies for cost efficiency
  • Partner with product, engineering, and leadership teams to translate business requirements into scalable AI solutions
  • Stay current with emerging research, tools, and best practices in AI/ML

Required Qualifications

  • Master’s or PhD in Computer Science, Machine Learning, Artificial Intelligence, or a related field
  • 4+ years of hands-on experience building and deploying ML/LLM systems in production
  • Strong proficiency in Python (required) and experience with C++ (preferred)
  • Deep experience with ML frameworks such as PyTorch and/or TensorFlow
  • Strong understanding of NLP, LLMs, and deep learning architectures
  • Proven experience optimizing models for production, including GPU acceleration and efficient inference
  • Hands-on experience with distributed training
  • Experience deploying models at scale with AWS or Google Cloud
  • Experience building APIs using FastAPI
  • Strong experience with Linux and scripting
  • Proficiency with Git
  • Solid understanding of databases (PostgreSQL, MySQL)

Nice to Have

  • Experience with TensorRT-LLM, vLLM, or DeepSpeed
  • Experience with LangChain or LlamaIndex
  • Experience with OpenAI, Anthropic, or open-weight models
  • Familiarity with MLflow, Weights & Biases, or Kubeflow
  • Experience with LLM evaluation frameworks
  • Experience with RLHF or DPO
  • Experience with multimodal models
  • Contributions to open-source or research publications

What Success Looks Like

  • Deliver scalable LLM systems in production
  • Reduce latency and infrastructure costs while maintaining quality
  • Build reliable systems with strong monitoring and safety
  • Contribute to scalable architecture decisions
  • Drive measurable improvements in model performance

Pay: $175,000.00 - $230,000.00 per year

Work Location: In person
