What is ETL (Extract, Transform, Load)?

ETL Definition

ETL (Extract, Transform, Load) is a data integration process that pulls data from one or more source systems, converts it into a consistent and usable format, and loads it into a destination system such as a data warehouse, a PIM, or an MDM platform. It is one of the most widely used methods for moving and consolidating data across an organisation.

What happens at each stage?

  • Extract — data is retrieved from source systems, which might include ERP software, spreadsheets, supplier feeds, databases, or third-party APIs. The extract step does not change the data; it simply copies it.
  • Transform — the extracted data is cleaned, reformatted, and standardised. This might involve converting units of measurement, correcting inconsistent values, merging fields from different sources, or filtering out records that do not meet quality rules. This is where most of the logic lives.
  • Load — the transformed data is written into the destination system, either in bulk (a full replacement) or incrementally (only new or changed records).
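The three stages above can be sketched in a few lines of Python. This is a minimal illustration rather than a production pipeline: the supplier feed, the field names (sku, name, weight, unit), the "every record needs a SKU" quality rule, and the SQLite products table are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical supplier feed; in practice this could be a file, a database, or an API.
SUPPLIER_CSV = """sku,name,weight,unit
A-100, Steel Bolt ,250,g
A-200,Copper Pipe,1.2,kg
,Unnamed Item,5,kg
"""


def extract(raw_text):
    """Extract: copy the rows out of the source without changing them."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: trim text, convert weights to kilograms, drop rows with no SKU."""
    to_kg = {"g": 0.001, "kg": 1.0}
    cleaned = []
    for row in rows:
        sku = row["sku"].strip()
        if not sku:  # assumed quality rule: every record needs a SKU
            continue
        weight_kg = float(row["weight"]) * to_kg[row["unit"].strip().lower()]
        cleaned.append((sku, row["name"].strip(), round(weight_kg, 3)))
    return cleaned


def load(conn, records, incremental=True):
    """Load: upsert the given records, or wipe the table first for a full replacement."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products "
        "(sku TEXT PRIMARY KEY, name TEXT, weight_kg REAL)"
    )
    if not incremental:
        conn.execute("DELETE FROM products")
    conn.executemany(
        "INSERT OR REPLACE INTO products (sku, name, weight_kg) VALUES (?, ?, ?)",
        records,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the destination system
load(conn, transform(extract(SUPPLIER_CSV)))
result = conn.execute(
    "SELECT sku, name, weight_kg FROM products ORDER BY sku"
).fetchall()
```

Calling load with incremental=False empties the table before writing, which corresponds to the bulk (full replacement) mode; the default upserts only the records passed in and leaves other rows untouched.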

When is ETL used?

ETL is used any time data needs to move between systems that do not share a format or a direct connection. Typical scenarios include:

  • Consolidating product data from multiple suppliers into a central PIM
  • Moving transactional data into a data warehouse for reporting
  • Migrating records from a legacy system to a new platform
  • Syncing master data between an ERP and an MDM system

What is the difference between ETL and ELT?

In ETL, the data is transformed before it is loaded, so the processing happens outside the destination system, often in a dedicated tool. In ELT, the raw data is loaded first and then transformed inside the destination system using its own processing power. The practical difference is where the transformation logic runs: ELT is only an option when the destination system has the capacity to store raw data and transform it itself.
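To make the ELT ordering concrete, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for the destination system. The product rows, the raw_products staging table, and the products table are hypothetical names chosen for the example. The raw rows are loaded untouched, and the cleaning is then done by the destination's own SQL engine.

```python
import sqlite3

# Hypothetical raw rows, loaded exactly as received (text values, mixed units, one missing SKU).
RAW_ROWS = [
    ("A-100", " Steel Bolt ", "250", "g"),
    ("A-200", "Copper Pipe", "1.2", "kg"),
    ("", "Unnamed Item", "5", "kg"),
]

conn = sqlite3.connect(":memory:")

# Load first: the raw data lands in a staging table unchanged.
conn.execute("CREATE TABLE raw_products (sku TEXT, name TEXT, weight TEXT, unit TEXT)")
conn.executemany("INSERT INTO raw_products VALUES (?, ?, ?, ?)", RAW_ROWS)

# Transform second: the destination's SQL engine trims text, converts
# weights to kilograms, and filters out rows that have no SKU.
conn.execute(
    """
    CREATE TABLE products AS
    SELECT
        TRIM(sku) AS sku,
        TRIM(name) AS name,
        ROUND(
            CAST(weight AS REAL)
            * CASE LOWER(TRIM(unit)) WHEN 'g' THEN 0.001 WHEN 'kg' THEN 1.0 END,
            3
        ) AS weight_kg
    FROM raw_products
    WHERE TRIM(sku) <> ''
    """
)
conn.commit()

elt_result = conn.execute(
    "SELECT sku, name, weight_kg FROM products ORDER BY sku"
).fetchall()
```

In an ETL version of the same job, the trimming and unit conversion would run in application code before anything is written, and only the cleaned rows would reach the destination.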