
Software Engineer, Data License Monitoring & Resiliency
Role summary
Bloomberg is seeking a Software Engineer for the Data License Monitoring & Resiliency team in New York. This role focuses on building AI-powered systems to enhance production infrastructure reliability and customer support. Responsibilities include designing and developing LLM-based chatbots for end-to-end customer issue resolution, creating AI-driven anomaly detection and predictive models for capacity management, and automating chaos testing. The engineer will also advise application teams on system designs for observability, scalability, and resiliency, and automate incident response. The ideal candidate has 4+ years of production code experience in an object-oriented language, a data-driven mindset, and strong communication skills. Experience with LLMs, ML for operations, containerization, and chaos engineering is highly preferred.
Location: New York
Business Area: Engineering and CTO
Ref #: 10050114
## Description & Requirements
Bloomberg delivers billions of data points to hundreds of thousands of customers daily. When markets move, our customers need their data immediately and without fail. That's not a nice-to-have. It's the whole product.
The Data License Monitoring & Resiliency team exists to make sure that promise holds. We build the systems that watch, test, and heal our production infrastructure before customers ever notice a problem. And we're using AI and large language models to fundamentally change how we handle reliability and customer support.
Right now, too much of our support workflow depends on humans doing repetitive, context-heavy work: reading tickets, diagnosing known issues, walking customers through the same resolution paths. We're building AI-powered systems to change that. Think LLM-driven chatbots that understand our product deeply enough to resolve support issues autonomously, context-aware triage systems that route and diagnose problems before a human ever gets involved, and intelligent knowledge bases that learn from every resolved incident.
On the reliability side, we're applying machine learning to anomaly detection, capacity forecasting, and incident response so we catch and fix problems faster than any manual process allows.
What you'd actually be working on:
- AI-powered support automation. Design and build LLM-based chatbots and support tools that handle customer issues end-to-end. You'll work on context retrieval, prompt engineering, response quality evaluation, and the feedback loops that make these systems smarter over time.
- Intelligent reliability tooling. Build AI-driven anomaly detection that spots degradation patterns before they become incidents. Develop predictive models for capacity management and failure forecasting across production infrastructure.
- Automated chaos testing and game days. Design and run resilience tests, then use ML to analyze results and prioritize improvements that matter most.
- Architecture advisory. Work directly with application development teams across Data License to review system designs, identify reliability risks early, and advocate for patterns that make services observable, scalable, and resilient by default. You'll be the person teams come to when they want to build something that won't break at 3am.
- Incident response automation. Shrink mean-time-to-resolution by building intelligent runbooks that diagnose root causes and recommend or execute fixes without waiting for a human.
- Toil elimination. If a human is doing it repeatedly, you'll be figuring out how to make a machine do it instead.
What you bring:
- 4+ years writing production code in an object-oriented language (C/C++, Python, Java)
- Degree in Computer Science, Engineering, Mathematics, or equivalent hands-on experience
- Comfort working across the full stack, from application code down to infrastructure and hardware
- A data-driven mindset: you measure before you optimize, and you're skeptical of gut-feel decisions
- Strong communication skills. You'll be advising other teams on architecture decisions, so you need to explain trade-offs clearly and build consensus without authority.
- Willingness to pick up new tools fast
What would set you apart:
- Experience building with LLMs: prompt engineering, RAG pipelines, evaluation frameworks, or deploying conversational AI in production
- Hands-on work with ML applied to operations: anomaly detection, predictive scaling, AIOps
- Background in containerization (Docker, Kubernetes, Mesos)
- Chaos engineering or game day experience
- Infrastructure-as-code and configuration management tooling
- Track record defining and measuring SLIs/SLOs for production services
- Experience in a consulting or advisory role where you influenced engineering decisions across multiple teams
Salary Range: 160,000 - 240,000 USD annually + benefits + bonus
The referenced salary range is based on the Company's good faith belief at the time of posting. Actual compensation may vary based on factors such as geographic location, work experience, market conditions, education/training and skill level.
We offer one of the most comprehensive and generous benefits plans available, with a range of total rewards that may include merit increases, incentive compensation (exempt roles only), paid holidays, paid time off, medical, dental, vision, short- and long-term disability benefits, 401(k) + match, life insurance, and various wellness programs, among others. The Company does not provide benefits directly to contingent workers/contractors and interns.
Discover what makes Bloomberg unique - watch our for an inside look at our culture, values, and the people behind our success.
Sample Bloomberg interview questions
1. Design a stock ticker tracking system (system design, medium)
2. Design a Train Reservation System (system design, medium)
3. Design a Health Monitoring System for Database Servers (system design, medium)
4. Design Asset Price Management System (system design, medium)
5. How would you manage communication and resolution efforts during a server outage affecting multiple Bloomberg clients? (technical, medium)