Data Engineer I
Role summary
Manus is seeking a Data Engineer I to build the data backbone for its production systems, connecting operations, quality, supply chain, finance, maintenance, and lab functions. This role involves discovering, structuring, and preparing data from diverse on-prem and cloud sources, transforming raw manufacturing and lab data into reliable datasets for analytics, reporting, and AI/ML applications. The Data Engineer will work with industrial OT systems and cloud data platforms, focusing on data ingestion, cleaning, transformation, and warehouse modeling. This Augusta, GA-based position offers opportunities for growth within a company focused on sustainable alternatives.
Augusta, GA
Data Engineer
Manus works across industries and value chains to accelerate the transition to BioAlternatives - better performing and more sustainable versions of complex molecules traditionally sourced from plants, animals, or fossil fuels. Our platform is proven to work across scales, bridging the Valley of Death between lab and manufacturing more efficiently and more reliably to deliver the benefits of synthetic biology today.
The Data Engineer will play a critical role in building the data backbone that connects Manus’ production-related systems, including Operations, Quality, Supply Chain, Finance, Maintenance, and laboratory functions. This Augusta, GA-based role sits at the intersection of industrial OT systems (DCS, historians, LIMS, CMMS) and modern cloud data platforms, with responsibility for discovering, structuring, and preparing data from heterogeneous on-prem and cloud sources. By transforming raw manufacturing and laboratory data into reliable, well-modeled datasets, the Data Engineer enables analytics, reporting, and future AI/ML applications across scale-up, routine manufacturing operations, and continuous improvement.
Why Work At Manus
- Opportunity – For motivated, results-oriented team members, our growth creates opportunities for personal and professional advancement.
- Accountability – You are given the resources you need to succeed and the freedom to make it happen; in return, we hold each other accountable for our high expectations.
- Passion – We love what we do and enjoy working with others who feel the same way. We embrace the challenge and hard work that come with working on the cutting edge.
Education And Experience
- Master’s degree in Data Science, Computer Science, Information Systems, or a related field
- 1-2 years of industry experience
Core Responsibilities
- Support Data Survey
  - Map workflows, systems, data owners, and data flows across all production-related activities, including Operations, Quality, Supply Chain, Finance, Maintenance, and Labs
  - Document data types, formats, quality, retention, and access controls
  - Help classify data sources for the ingestion pipeline (real-time, batch, API, file-based)
- Ingestion Layer Development
  - Build connectors for on-prem systems
  - Develop ingestion jobs using Python or ETL tools
  - Implement Kafka producers to stream data to the cloud warehouse
  - Work with the India team to ensure schema consistency and metadata requirements
- Data Cleaning & Transformation
  - Normalize datasets from multiple systems into standard schemas
  - Handle missing values, outliers, timestamp alignment, and unit harmonization
  - Apply mapping tables, reference data, and business rules
  - Prepare data for Silver (Data Vault) and Gold (Star Schema) layers
- Assist Warehouse Modeling Team
  - Work with senior warehouse engineers in India to implement:
    - Hubs, Links, Satellites (Data Vault 2.0)
    - Dimension and Fact tables
    - Data Quality checks (freshness, completeness, uniqueness)
- Documentation & Collaboration
  - Maintain detailed documentation for ingestion pipelines
  - Work closely with Manus operations, QA, engineering, and IT
  - Provide weekly updates to the Program Lead
Required Technical Skills
Programming & Scripting
- Strong Python skills
- Working knowledge of SQL (joins, window functions, CTEs)
- Experience using Pandas, PySpark, or similar tools for transformation
Streaming & Messaging
- Understanding of Apache Kafka:
  - Producers/consumers
  - Topics, partitions, offset management
  - Kafka connectors (optional but preferred)
ETL / ELT & Data Integration
- Experience with batch ingestion using:
  - REST APIs
  - ODBC/JDBC
  - CSV/JSON pipelines
  - Scheduled jobs
- Familiarity with Azure Data Factory, Airflow, or any orchestration tool is a plus
Data Modeling & Architecture
- Understanding of:
  - Bronze/Silver/Gold patterns (Medallion Architecture)
  - Data Lake concepts
  - Data cleaning techniques
  - Slowly Changing Dimensions (SCD) (optional)
Cloud & DevOps Exposure (nice to have)
- Basic understanding of:
  - Azure Storage
  - Event Hubs
  - Synapse or Databricks
- Familiarity with Git and CI/CD is a plus
Soft Skills
- Strong communication — required for the data survey
- Curiosity and willingness to work across manufacturing + biotech systems
- Ability to document findings clearly and consistently
- Collaborative mindset: must coordinate with geographically distributed teams and be willing to work across multiple time zones