Senior AI Engineer – Inference & Agent Systems

New York, New York, United States | Hybrid | Full Time | Senior | $175,000–$250,000 /yr | Posted 2 months ago

Role summary

A high-growth AI + fintech company is seeking a Senior AI Engineer to own LLM-powered inference and agent systems in production. This role focuses on optimizing inference performance, orchestrating agents, and ensuring reliability at scale. The engineer will design and implement agent architectures, integrate advanced models, build robust orchestration using Temporal, and develop observability infrastructure. The ideal candidate has 3–7 years of AI/ML or backend experience, a proven track record of shipping AI applications, and experience building production systems resilient to LLM non-determinism. The position is hybrid, with remote options available.

About the Role

DeepRec is partnered with a high-growth AI + fintech company building advanced analytics platforms used by institutional investors.

They’re looking for a Senior AI Engineer to take ownership of LLM-powered systems in production, focusing on inference performance, agent orchestration, and reliability at scale.

This role sits at the core of the engineering team, working directly with leadership to define how agent-based systems are designed, evaluated, and deployed in real-world environments.

What You’ll Do

Optimise LLM inference performance, targeting sub-400ms Time to First Token (TTFT) across multi-step agent pipelines
Own the evaluation framework end-to-end, including ground truth datasets and automated scoring to detect regressions
Design and implement Plan → Execute → Synthesize agent architectures, enabling parallel sub-agent execution
Integrate and productionise state-of-the-art models (e.g. GPT, Claude, Gemini)
Build robust orchestration systems using Temporal (retries, timeouts, fault tolerance)
Develop observability infrastructure to trace model behaviour, tool usage, and system performance

What They’re Looking For

3–7 years of experience in AI/ML or backend engineering
Proven track record of shipping AI applications used by real customers
Experience building production systems resilient to LLM non-determinism
Strong understanding of LLMs, agent workflows, or distributed systems
Comfortable operating in a fast-paced, high-ownership startup environment
Bonus: experience within financial services or fintech

Tech Stack

Go, Python, Temporal, Kafka, PostgreSQL, Docker + modern LLM ecosystem

Compensation & Benefits

Salary: $175K – $250K
Competitive equity package

Working Model

Hybrid: New York or San Francisco (preferred)
Open to strong remote candidates across tier-one US tech hubs (e.g. Seattle)

About the Company

A fast-scaling AI-driven fintech platform helping institutional investors better understand portfolio risk, performance, and market positioning through proprietary datasets and advanced analytics.

Backed by leading investors, the company is rapidly growing and building cutting-edge infrastructure at the intersection of AI and financial markets.
