T. Mims Corp.

Construction, Civil Engineering

Senior AI/ML Engineer

Lakeland, Florida, United States | Onsite | Full Time | Senior | $175,000–$230,000 /yr | Posted 2 months ago

Role summary

We are seeking a Senior AI/ML Engineer with 4+ years of experience in building, fine-tuning, and deploying large language models (LLMs) in production. This role focuses on designing scalable ML systems, optimizing inference efficiency, and delivering production-grade AI solutions across the full ML lifecycle, from distributed GPU training to cloud deployment. Responsibilities include architecting end-to-end ML systems, optimizing LLM performance (latency, throughput, cost), implementing prompt engineering and RAG systems, and deploying APIs. The ideal candidate will have a Master's or PhD in a related field, strong Python skills, and experience with ML frameworks like PyTorch or TensorFlow, cloud platforms (AWS/GCP), and MLOps. Success involves delivering scalable, cost-efficient, and reliable LLM systems with robust monitoring and safety features.

Overview

We are seeking a highly motivated Senior AI/ML Engineer with 4+ years of experience building, fine-tuning, and deploying large language models (LLMs) in production. This role is focused on designing scalable, high-performance ML systems, improving inference efficiency, and delivering reliable, production-grade AI solutions.

You will work across the full ML lifecycle, from distributed GPU training to cloud deployment, optimizing cost and performance and collaborating cross-functionally to deliver impactful AI products.

Key Responsibilities

Design, build, and deploy scalable AI/ML systems for production environments

Optimize LLM performance across latency, throughput, memory usage, and cost

Architect end-to-end ML systems, making tradeoffs across performance, scalability, and reliability

Develop, fine-tune, and evaluate LLMs and deep learning models

Implement advanced prompt engineering strategies to improve output quality, consistency, and reliability

Build and optimize retrieval-augmented generation (RAG) systems, including integration with vector databases

Apply model optimization techniques such as quantization, pruning, batching, and efficient inference strategies

Deploy and maintain production-grade APIs and model endpoints (e.g., FastAPI)

Design and maintain distributed data pipelines and cloud-based ML infrastructure

Build and maintain MLOps pipelines, including experiment tracking, model versioning, and CI/CD workflows

Implement monitoring, logging, and alerting systems for model performance, drift detection, and system reliability

Develop robust evaluation frameworks, including offline evaluation, online testing, and A/B experimentation

Implement safety, alignment, and guardrail mechanisms to mitigate hallucinations, bias, and unsafe outputs

Optimize infrastructure and deployment strategies for cost efficiency

Partner with product, engineering, and leadership teams to translate business requirements into scalable AI solutions

Stay current with emerging research, tools, and best practices in AI/ML

Required Qualifications

Master’s or PhD in Computer Science, Machine Learning, Artificial Intelligence, or a related field

4+ years of hands-on experience building and deploying ML/LLM systems in production

Strong proficiency in Python (required) and experience with C++ (preferred)

Deep experience with ML frameworks such as PyTorch and/or TensorFlow

Strong understanding of NLP, LLMs, and deep learning architectures

Proven experience optimizing models for production, including GPU acceleration and efficient inference

Hands-on experience with distributed training

Experience deploying models at scale with AWS or Google Cloud

Experience building APIs using FastAPI

Strong experience with Linux and scripting

Proficiency with Git

Solid understanding of databases (PostgreSQL, MySQL)

Nice to Have

Experience with TensorRT-LLM, vLLM, or DeepSpeed

Experience with LangChain or LlamaIndex

Experience with OpenAI, Anthropic, or open-weight models

Familiarity with MLflow, Weights & Biases, or Kubeflow

Experience with LLM evaluation frameworks

Experience with RLHF or DPO

Experience with multimodal models

Contributions to open-source projects or research publications

What Success Looks Like

Deliver scalable LLM systems in production

Reduce latency and infrastructure costs while maintaining quality

Build reliable systems with strong monitoring and safety

Contribute to scalable architecture decisions

Drive measurable improvements in model performance

Pay: $175,000.00 - $230,000.00 per year

Work Location: In person
