ELT vs. ETL: What's the difference?
ELT, ETL - what's the difference, and where is data observability most important to integrate? We explore.
Many of Bigeye’s customers are cutting-edge tech companies that build with the modern data stack. The modern data stack uses the data warehouse not only for storage, but also for transformation.
This approach makes it possible for analysts to write transformations in SQL, but it also means that data arrives in the warehouse without having been cleaned up first. In this post, we'll explain the difference between ETL and ELT, and show why data observability is even more important when you are using ELT.
ETL vs. ELT: What's the difference?
In the world of data warehousing, ETL and ELT are two common approaches for managing data pipelines. ETL (extract, transform, load) pulls data from various sources, standardizes its format, then loads it into a target warehouse. On the other hand, ELT (extract, load, transform) flips the order of operations: raw data loads into the target warehouse first, and then is transformed as needed.
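To make the two orders of operations concrete, here is a minimal ELT sketch in warehouse SQL. The table names, the S3 path, and the COPY syntax (which loosely follows Snowflake's) are illustrative assumptions, not a prescribed setup:

```sql
-- ELT step 1: load the raw data as-is into a landing table.
-- (Exact COPY syntax and auth setup vary by warehouse; this loosely follows Snowflake's.)
COPY INTO raw_orders
FROM 's3://example-bucket/orders/'
FILE_FORMAT = (TYPE = CSV);

-- ELT step 2: transform inside the warehouse, in SQL.
CREATE TABLE orders_clean AS
SELECT
    order_id,
    TRIM(customer_email)                AS customer_email,  -- normalize formatting
    CAST(order_total AS DECIMAL(12, 2)) AS order_total,
    CAST(ordered_at  AS TIMESTAMP)      AS ordered_at
FROM raw_orders
WHERE order_id IS NOT NULL;                                 -- basic cleanup
```

Under ETL, the second step would instead run in an external tool before the load, and only the cleaned table would ever reach the warehouse.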
Bigeye monitoring with ELT and ETL setups
Data observability solutions like Bigeye work with both ETL and ELT setups. Data reliability engineers routinely turn to Bigeye in both situations to monitor for data errors and anomalies.
With ETL setups, the transformation step happens first, so Bigeye will monitor tables (post-transformation) as they land in the data warehouse. Use Bigeye’s Metadata Metrics to track things like:
- Whether the data has landed
- Volume of data that has landed
- Freshness of data that has landed
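To make those checks concrete, here is roughly what hand-rolled equivalents might look like. Bigeye's Metadata Metrics compute these automatically; the table name, the loaded_at column, and the DATEDIFF syntax (Snowflake-style) are illustrative assumptions:

```sql
-- Volume: has today's data landed, and how many rows came in?
SELECT COUNT(*) AS rows_landed
FROM analytics.fct_orders
WHERE loaded_at >= CURRENT_DATE;

-- Freshness: how long since the newest row arrived?
SELECT DATEDIFF('hour', MAX(loaded_at), CURRENT_TIMESTAMP) AS hours_since_last_load
FROM analytics.fct_orders;
```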
Alternatively, Bigeye might monitor an upstream OLTP database like Postgres or MySQL, before the transformation step.
With ELT setups, on the other hand, transformations occur inside the data warehouse, so Bigeye can monitor every step of the process.
Bigeye’s Metadata Metrics provide out-of-the-box monitoring for freshness and volume. Putting these checks in place automates monitoring of raw tables, ensuring that they have actually landed.
Alternatively, Bigeye Deltas can create a Delta between the application database and the snapshotted raw table in the data warehouse. A Delta tracks the two tables and checks that they match. That way, if anything goes wrong in the “load” step of the data pipeline, Bigeye will send an alert.
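Conceptually, that comparison boils down to computing the same fingerprint on both sides and alerting on a mismatch. A rough sketch, with hypothetical table and column names (Deltas runs the real comparison across both systems for you):

```sql
-- Run against the application database (e.g. Postgres):
SELECT COUNT(*) AS row_count, SUM(order_total) AS total_amount
FROM public.orders;

-- Run the same fingerprint against the raw snapshot in the warehouse:
SELECT COUNT(*) AS row_count, SUM(order_total) AS total_amount
FROM raw.orders_snapshot;

-- If the two result sets differ, the "load" step dropped or corrupted rows.
```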
Why data observability is vital for ELT setups (and less so for ETL)
With ELT setups, data transformation happens inside the data warehouse, which means duplicate records and incorrect formatting are not cleaned up before the data lands there.
In such a scenario, a data observability solution that monitors the raw bronze layer can go a long way toward catching pipeline breakages before they filter downstream, and toward reducing warehouse costs.
With modern data warehouses like Snowflake or Redshift, you pay not only for storage but also for compute - i.e. you are charged for every query and every transformation that you run. By using a data observability solution to track raw tables as they land in the warehouse, you catch problems before transforming bad data, reducing the number of transformations you have to run and, with them, your ELT expenditures.
Ideally, over time, data observability on your raw bronze layer should help inform your transformation models. If you perennially get low-quality data, with nulls, duplicates, string lengths outside their expected bounds, or malformed UUIDs, you may want to modify your transformation SQL to handle those cases.
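As an illustration, a transformation model hardened against those recurring issues might look something like the sketch below. All table and column names are hypothetical, and REGEXP_LIKE follows Snowflake's syntax:

```sql
CREATE TABLE orders_clean AS
SELECT
    order_id,
    user_uuid,
    LEFT(customer_name, 255) AS customer_name             -- clamp over-long strings
FROM (
    SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY order_id
                           ORDER BY loaded_at DESC) AS rn
    FROM raw_orders
) deduped
WHERE rn = 1                                              -- drop duplicate records
  AND order_id IS NOT NULL                                -- drop null keys
  AND REGEXP_LIKE(user_uuid,                              -- drop malformed UUIDs
      '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$',
      'i');                                               -- case-insensitive match
```

Observability metrics on the raw layer then tell you which of these guards are actually earning their keep.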