Jobs via Dice
Online Services, Human Resources, Technology Recruitment

Data Engineer - Hybrid

Minneapolis, Minnesota, United States | Hybrid | Full Time | Posted 2 months ago | Visa sponsorship available

Role summary

VIVA USA INC is seeking a Data Engineer with expertise in building and operating scalable batch and streaming data pipelines, particularly using Kafka and Google Cloud Platform services like BigQuery, Dataflow, and Vertex AI. The role involves developing high-performance distributed processing with Python and Spark, optimizing jobs for cost efficiency, and ensuring data quality and governance. A key responsibility is partnering with data scientists to operationalize ML models by building MLOps pipelines for training, deployment, and CI/CD. The engineer will also integrate on-prem Hadoop with GCP, provide technical mentorship, and uphold security and compliance standards. A BS/BA or equivalent and 4+ years of experience building large-scale data systems are required.

Dice is the leading career destination for tech experts at every stage of their careers. Our client, VIVA USA INC, is seeking the following. Apply via Dice today!
Title: Data Engineer - Hybrid
Mandatory skills:
streaming data pipelines,
Python, SQL,
data streaming technologies, Kafka, Flink, Spark Streaming,
Google Cloud Platform services (BigQuery, Dataflow, Dataproc, Pub/Sub, Vertex AI, AI Platform),
MLOps pipelines, scaling ML models,
ETL, ELT workflows, data quality, CI/CD, automated retraining,
data engineering, design patterns, package configuration, package deployment,
distributed frameworks, secure solutions
Description:
Data Engineer
Senior Engineer, Fraud and Abuse Data Engineering
Design, build, and operate scalable batch and streaming data pipelines (Kafka) and ETL/ELT workflows across Hadoop and Google Cloud Platform (GCP); implement monitoring and alerting to meet reliability and SLA targets.
Develop high-performance distributed processing with Python, Spark, and Hive; optimize jobs, storage, and throughput for large-scale, high-volume datasets and cost efficiency.
Deliver curated, trustworthy datasets for analytics, reporting, and ML with strong data quality, lineage, and governance.
Partner with data scientists to operationalize ML on Google Cloud Platform (e.g., Vertex AI), building MLOps pipelines for training, deployment, CI/CD, monitoring, and automated retraining.
Integrate on-prem Hadoop data lakes with Google Cloud Platform services to enable seamless hybrid data and model workflows.
Collaborate with analysts and product engineers to ensure data is accessible, high-quality, and actionable; provide technical mentorship to junior engineers.
Uphold security, privacy, and regulatory compliance across all data engineering practices.
Continuously evaluate technologies and design patterns, and drive improvements in performance, scalability, and cost across Hadoop and Google Cloud Platform environments.
BA/BS or equivalent; 4+ years building large-scale data systems.
Proficient in core platforms; writes organized, maintainable code across multiple languages and distributed frameworks.
Skilled in package configuration/deployment and building custom solutions.
Designs robust tests; troubleshoots and resolves routine and non-routine issues independently.
Delivers high-performance, scalable, secure solutions (high throughput/low latency).
Operates effectively in Agile: communicates clearly with partners, aligns team priorities, and understands guest/business impact.
Influences and applies data/engineering standards and policies; maintains expertise and stays current through ongoing learning.
Technical Skills:
Strong proficiency in Python and SQL.
Hands-on experience with Apache Spark and the Hadoop ecosystem (HDFS, Hive, Pig, Oozie, Sqoop, YARN).
Experience with real-time data streaming technologies (Kafka, Flink, Spark Streaming).
Expertise with Google Cloud Platform services: BigQuery, Dataflow, Dataproc, Pub/Sub, Vertex AI (AI Platform).
Experience building and maintaining MLOps pipelines for deploying and scaling ML models.
Experience:
4 - 5 years
Notes:
Hybrid - Two days a week in the office
VIVA USA is an equal opportunity employer and is committed to maintaining a professional working environment that is free from discrimination and unlawful harassment. The management, contractors, and staff of VIVA USA shall respect others without regard to race, sex, religion, age, color, creed, national or ethnic origin, physical, mental, or sensory disability, marital status, sexual orientation, or status as a Vietnam-era veteran, recently separated veteran, active wartime or campaign badge veteran, Armed Forces service medal veteran, or disabled veteran. Please contact us with any complaints, comments, or suggestions.
Contact Details:
Account co-ordinator: Ramadas Kumaresan
VIVA USA INC.
3601 Algonquin Road, Suite 425
Rolling Meadows, IL 60008