Data Engineer
Role:
Data Engineer
Location:
Hybrid / In-Office (Downtown Toronto)
Company:
www.getboundless.ai
About Boundless AI
Boundless AI is redefining business financing with an AI-driven funding marketplace that connects companies of all sizes and industries with 150+ capital providers - including SMB lenders, private credit funds, commercial banks, and venture debt firms. We help businesses unlock the right capital at the right time with speed, transparency, and intelligence.
The Role
We’re looking for a
Data Engineer
to design, build, and maintain the data pipelines that power our AI-driven platform. You will build pipelines for our existing web application, process documents, and integrate LLM-based workflows to extract structured data. This is a hands-on role that requires solid engineering fundamentals, a pragmatic approach to problem-solving, and the ability to balance immediate delivery with long-term scalability.
Responsibilities
- Design, build, and maintain batch and event-driven data pipelines that ingest, transform, and deliver data reliably across the platform.
- Process and extract structured data from unstructured sources, including text-based and image-based documents, leveraging OCR and LLM integrations where appropriate.
- Collaborate with product and engineering teams to model, organize, and govern data assets, ensuring consistency, quality, and accessibility as the product grows.
- Implement and maintain workflow orchestration using tools such as Apache Airflow, ensuring pipelines are observable, recoverable, and well-documented.
- Work within our Django-based backend to extend and optimize data models and integrations through the ORM and supporting services.
- Apply security best practices to protect sensitive data throughout the pipeline, including encryption, access control, and compliance with data handling policies.
- Contribute to infrastructure tasks in AWS, including provisioning and configuring services using containerized deployments with Docker.
- Participate in an agile environment, contributing to code reviews, technical discussions, and collaborative problem-solving with the team.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
- 4+ years of experience in a data engineering, backend engineering, or similar role with a focus on building data pipelines.
- Experience integrating LLM APIs (e.g., Claude, GPT) into data workflows for tasks such as classification, extraction, or enrichment.
- Strong proficiency in Python, with experience using data processing libraries such as Pandas or Polars.
- Solid understanding of relational databases and SQL, with hands-on experience writing and optimizing queries (e.g., PostgreSQL, MySQL).
- Experience designing and implementing ETL/ELT pipelines for batch processing workloads, including familiarity with workflow orchestration tools such as Apache Airflow.
- Working knowledge of Docker and containerized application deployment.
- Experience with cloud platforms, particularly AWS, and an understanding of core services relevant to data workloads (e.g., S3, RDS, Lambda).
- Strong understanding of data quality, validation, and security best practices.
- Effective communication skills and the ability to work collaboratively across cross-functional teams.
- Demonstrated ability to take ownership of projects end-to-end, from scoping and design through implementation and maintenance.
- Familiarity with Git, CI/CD pipelines, and DevOps practices for automated testing and deployment.
Desirable Qualifications
- Experience with OCR technologies and document processing pipelines for extracting structured data from images and PDFs.
- Exposure to machine learning concepts, particularly around data preparation for model training and inference.
- Experience with Infrastructure as Code tools such as Terraform or CloudFormation.
- Knowledge of event-driven architectures and streaming data patterns (e.g., Kafka, SQS, or similar).
- Experience with cloud data warehouse tools such as Snowflake, BigQuery, or Redshift.
Why Join Boundless AI
- Be part of a mission-driven company reshaping business funding.
- Work directly with a small, high-calibre team - including the CEO and CTO - on high-impact initiatives.
- Competitive compensation, equity, and benefits.
- Fast-paced environment with significant ownership and autonomy.