Software Engineer (Big Data)
Role summary
Quantcast is seeking a Software Engineer for its Big Data Services team. This role involves building, maintaining, and optimizing large-scale data processing systems and distributed storage/compute platforms. The engineer will collaborate with data scientists and researchers to ensure a scalable and efficient data infrastructure for advanced analytics and modeling. Experience with distributed systems like HDFS and Spark is preferred. The role offers exposure to real-world data science, modeling workflows, and best practices in scalable data infrastructure.
What you'll do:
Contribute to the development and optimization of large-scale data workflows using technologies such as Apache Spark or similar frameworks
Develop and operate our large-scale data processing systems, making them more elastic and fault-tolerant and solving distributed data and compute challenges
Assist in deploying and maintaining production systems, including CI/CD workflows
Provide technical input into roadmaps for the team
Write clean, maintainable, and well-tested code
Who you are:
Recent graduate (0-1 years) with a Bachelor's Degree in computer science or equivalent experience
Familiarity with data processing frameworks such as Apache Spark, Hadoop, or similar tools
Familiarity with containerization tools (e.g., Docker and Kubernetes)
Experience with workflow orchestration tools (e.g., Airflow) is a plus
Proficiency in Java and/or Python
Linux system administration/automation is a plus
Strong problem-solving and debugging skills
Organized and detail-oriented
What you'll learn in this role:
- Hands-on experience with large-scale distributed systems in production
- Exposure to real-world data science and modeling workflows
- Best practices in building scalable and reliable data infrastructure
- Collaboration across engineering and modeling teams
Sample Quantcast interview questions
1. Design a system for real-time audio/video streaming. (system design, medium)
2. Create a real-time intrusion detection system based on network traffic analysis. (system design, medium)
3. Build a real-time satellite imagery processing system. (system design, medium)
4. Clone an Undirected Graph: Clone an undirected graph. Input: adjList = [[]] Output: [[]] Explanation: Creates a new, deeply cloned graph containing only one single node with zero connected neighbors. (coding, medium)
5. Serialize and Deserialize N-ary Tree: Serialize and deserialize an N-ary tree. Input: root = [1] Output: 1 Explanation: The tree only contains a root node, resulting in a minimal serialized string representation that can be accurately rebuilt. (coding, medium)
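As an illustration of the graph-cloning question above, here is a minimal Python sketch of one common approach: a breadth-first traversal that keeps a map from each original node to its copy. The posting does not specify a node type or input encoding, so the `Node` class, the helper names `build_graph` and `to_adj_list`, and the 1-indexed adjacency-list convention are all assumptions made for this example, not part of the original question.

```python
from collections import deque


class Node:
    """Graph node. The field names are assumptions for this sketch."""

    def __init__(self, val):
        self.val = val
        self.neighbors = []


def build_graph(adj_list):
    """Build nodes from a 1-indexed adjacency list; return node 1, or None if empty."""
    if not adj_list:
        return None
    nodes = [Node(i + 1) for i in range(len(adj_list))]
    for node, neighbor_vals in zip(nodes, adj_list):
        node.neighbors = [nodes[v - 1] for v in neighbor_vals]
    return nodes[0]


def clone_graph(start):
    """Deep-copy the connected graph reachable from `start` using BFS."""
    if start is None:
        return None
    copies = {start: Node(start.val)}  # original node -> its clone
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in current.neighbors:
            if neighbor not in copies:
                # First visit: create the clone and schedule the original.
                copies[neighbor] = Node(neighbor.val)
                queue.append(neighbor)
            # Wire the cloned edge, preserving neighbor order.
            copies[current].neighbors.append(copies[neighbor])
    return copies[start]


def to_adj_list(start):
    """Convert a graph back to a 1-indexed adjacency list, ordered by node value."""
    if start is None:
        return []
    seen = {start.val: start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in current.neighbors:
            if neighbor.val not in seen:
                seen[neighbor.val] = neighbor
                queue.append(neighbor)
    return [[n.val for n in seen[v].neighbors] for v in sorted(seen)]
```

For the single-node input from the question, `to_adj_list(clone_graph(build_graph([[]])))` gives back `[[]]`, and the returned node is a different object from the original. The map from originals to copies is what keeps cycles from causing repeated work, since each node is cloned exactly once.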