CloudIngest Verified
Cloud Computing, Data Management, Software, Information Technology

Data Scientist (GenAI & Utilities)

Corpus Christi, Texas, United States | Hybrid | Contract | Entry-level (exp-based) | Posted 2 months ago | Visa sponsorship available

Role summary

A leading utility provider is seeking a Senior Data Scientist with expertise in Generative AI (GenAI) and Large Language Models (LLMs) to join their AI task force. This role focuses on leveraging GenAI and Retrieval-Augmented Generation (RAG) systems to improve operational efficiency and customer experience within the utility sector. Responsibilities include developing RAG pipelines, fine-tuning LLMs, integrating vector databases with Python, analyzing unstructured utility data, and deploying models in cloud environments (Azure/AWS). The ideal candidate will have 15 years of data science experience, with at least 1 year in LLMs/GenAI, advanced Python and SQL skills, and domain knowledge in utilities or related industries.

Hi,

Title: Data Scientist (GenAI & Utilities)

Location: Corpus Christi, TX (Hybrid/On-site preferred)

Client: Direct

The client is a leading utility provider serving the South Texas region and is committed to improving operational efficiency and customer experience through advanced technology. The client is currently building a specialized AI task force that will use generative AI for operational excellence, including infrastructure maintenance, grid optimization, and enhanced customer support.

Position Summary

We are seeking a highly skilled and motivated Senior Data Scientist with a strong background in Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems. The ideal candidate has direct experience applying AI within the utility, energy, or industrial sector. You will lead the design, development, and deployment of intelligent systems that translate unstructured operational data (technical manuals, maintenance logs, field reports) into actionable insights.

Key Responsibilities

  • Develop RAG Pipelines: Design and optimize Retrieval-Augmented Generation (RAG) models to enhance LLM-driven troubleshooting and field operations.
  • AI Model Development: Fine-tune and evaluate LLMs (e.g., GPT-4, Llama 3) for task-specific reasoning and accuracy.
  • Knowledge Integration: Write high-quality Python code that integrates vector databases (e.g., Pinecone, Milvus, Weaviate) for efficient storage and retrieval of technical utility documents.
  • Operational Intelligence: Apply LLM solutions to analyze unstructured utility data, such as asset management logs, SCADA alerts, and maintenance records.
  • Deployment: Work with software engineers to deploy and scale models in cloud environments (Azure/AWS).
  • Collaboration: Work cross-functionally with field operations, engineering teams, and stakeholders to understand data needs and deliver solutions.

Required Qualifications

  • Experience: 15 years of professional experience in Data Science/Machine Learning, with at least 1 year focusing specifically on LLMs and Generative AI.
  • Technical Skills: Advanced proficiency in Python (PyTorch/TensorFlow, LangChain, LlamaIndex) and SQL.
  • Domain Knowledge: Proven experience in the utility, energy, oil & gas, or heavy industrial sectors.
  • RAG Expertise: Hands-on experience with vector embeddings, semantic search, and prompt engineering.
  • Education: Master’s or Ph.D. in Computer Science, Mathematics, Data Science, or a related quantitative field.

Thanks,

Dilip Kumar
