SUP! Hubert’s Substack

How to implement Near-ZeroETL in a Data Mesh

Using ksqlDB, Apache Kafka, and Real-Time OLAP

Hubert Dulay
Feb 13, 2023

ZeroETL

At re:Invent 2022, AWS introduced the concept of ZeroETL, which in their case meant a native integration between their Aurora (OLTP) and Redshift (OLAP) services.

Let’s first define what ETL and ELT are:

  • ETL (extract, transform, load) - data is extracted from the source, transformed in the data pipeline, then loaded into an OLAP (analytical) database.

  • ELT (extract, load, transform) - data is extracted from the source, loaded into the OLAP database, then transformed there using SQL.
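
The difference between the two orderings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the OLAP database, and the `orders` table, the `SOURCE_ROWS` data, and the cents-to-dollars transformation are hypothetical names invented for this example. They do not come from any specific product.

```python
import sqlite3

# Hypothetical source rows: (order_id, amount_cents).
SOURCE_ROWS = [(1, 1250), (2, 399), (3, 10000)]


def run_etl(rows):
    """ETL: transform in the pipeline, then load the finished rows."""
    olap = sqlite3.connect(":memory:")  # stand-in for the OLAP database
    olap.execute("CREATE TABLE orders (order_id INTEGER, amount_usd REAL)")
    # Transform step happens here, outside the database.
    transformed = [(order_id, cents / 100) for order_id, cents in rows]
    olap.executemany("INSERT INTO orders VALUES (?, ?)", transformed)
    return olap.execute(
        "SELECT order_id, amount_usd FROM orders ORDER BY order_id"
    ).fetchall()


def run_elt(rows):
    """ELT: load the raw rows first, then transform inside the database."""
    olap = sqlite3.connect(":memory:")  # stand-in for the OLAP database
    olap.execute(
        "CREATE TABLE raw_orders (order_id INTEGER, amount_cents INTEGER)"
    )
    olap.executemany("INSERT INTO raw_orders VALUES (?, ?)", rows)
    # Transform step happens here, as SQL over data that is already loaded.
    olap.execute(
        "CREATE TABLE orders AS "
        "SELECT order_id, amount_cents / 100.0 AS amount_usd FROM raw_orders"
    )
    return olap.execute(
        "SELECT order_id, amount_usd FROM orders ORDER BY order_id"
    ).fetchall()


if __name__ == "__main__":
    print("ETL:", run_etl(SOURCE_ROWS))
    print("ELT:", run_elt(SOURCE_ROWS))
```

Both functions end with the same `orders` table. What changes is where the transformation runs: in `run_etl` the database only receives finished rows, while in `run_elt` it stores the raw rows and does the transformation work itself.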

ZeroETL removes the need for separate data pipelines to perform ETL and provides built-in replication of transactional data for near real-time analytics. ZeroETL is similar to ELT in that both transform data in the OLAP database. But doing so can force batching semantics. This is unlike ETL, where the transformation can remain in real time. ETL pre-processes the data before it reaches the OLAP database, so the database is relieved of performing transformations.

So how can we create a ZeroETL solution that supports real-time use cases?
