Senior AI Engineer – Inference & Agent Systems
Role Summary
A high-growth AI + fintech company is seeking a Senior AI Engineer to own LLM-powered inference and agent systems in production. The role focuses on optimising inference performance, orchestrating agents, and ensuring reliability at scale. The engineer will design and implement agent architectures, integrate state-of-the-art models, build robust orchestration using Temporal, and develop observability infrastructure. The ideal candidate has 3–7 years of AI/ML or backend engineering experience, a proven track record of shipping AI applications, and experience building production systems resilient to LLM non-determinism. The position is hybrid in New York or San Francisco, with remote options for strong candidates in tier-one US tech hubs.
About the Role
DeepRec is partnered with a high-growth AI + fintech company building advanced analytics platforms used by institutional investors.
They’re looking for a Senior AI Engineer to take ownership of LLM-powered systems in production, with a focus on inference performance, agent orchestration, and reliability at scale.
This role sits at the core of the engineering team, working directly with leadership to define how agent-based systems are designed, evaluated, and deployed in real-world environments.
What You’ll Do
- Optimise LLM inference performance, targeting sub-400ms Time to First Token (TTFT) across multi-step agent pipelines
- Own the evaluation framework end-to-end, including ground truth datasets and automated scoring to detect regressions
- Design and implement Plan → Execute → Synthesize agent architectures, enabling parallel sub-agent execution
- Integrate and productionise state-of-the-art models (e.g. GPT, Claude, Gemini)
- Build robust orchestration systems using Temporal (retries, timeouts, fault tolerance)
- Develop observability infrastructure to trace model behaviour, tool usage, and system performance
What They’re Looking For
- 3–7 years of experience in AI/ML or backend engineering
- Proven track record of shipping AI applications used by real customers
- Experience building production systems resilient to LLM non-determinism
- Strong understanding of LLMs, agent workflows, or distributed systems
- Comfortable operating in a fast-paced, high-ownership startup environment
- Bonus: experience within financial services or fintech
Tech Stack
Go, Python, Temporal, Kafka, PostgreSQL, Docker, and the modern LLM ecosystem
Compensation & Benefits
- Salary: $175K – $250K
- Competitive equity package
Working Model
- Hybrid: New York or San Francisco (preferred)
- Open to strong remote candidates across tier-one US tech hubs (e.g. Seattle)
About the Company
A fast-scaling AI-driven fintech platform helping institutional investors better understand portfolio risk, performance, and market positioning through proprietary datasets and advanced analytics.
Backed by leading investors, the company is rapidly growing and building cutting-edge infrastructure at the intersection of AI and financial markets.