What is ETL and Why Use It?

ETL stands for extract, transform and load.of what the software does is examine individual
These are the processes that enable companiesdata fields and apply rules to consistently convert
to move required data from a multitude ofthe contents to the form required by the data
sources, reformat and cleanse it, and then load itwarehouse.
into a data warehouse for analysis, or maybeFor example, a category might be represented in
onto another operational system to supportthree different ways in three different systems.
another business process.The ETL software would recognize that these
All Companies are well aware that they haveentries mean the same thing and convert them
valuable information spread throughout theirto the target format.
company networks that needs to be moved to aIn addition, the ETL process might involve some
data warehouse for analysis and if they do notform of standardization and verification of fields
move it all this information will become worthless.such as addresses or dates of birth or even
The problem is that this data is stored in aexpanding records with additional fields containing
variety of systems and formats. And differentdemographic information or data from other
systems are capable of using different methodssystems.
of definitions.The transformation occurs when the data from
So this problem has to be solved. The besteach source is mapped, cleansed and reconciled
solution is to implement extract, transform andso it all can be tied together, with receivables tied
load (ETL) software. By definition this software'sto invoices and so on.
capabilities includes reading data from the source,After reconciliation, the data is transported and
cleaning and formatting it, and then writing it to itsloaded into the data warehouse for analysis of
target where it can be utilized efficiently.things such as cycle times and total outstanding
The source data for the ETL processes canreceivables.
come from almost any source: a mainframe orIn the past, companies that were doing data
ERP application, a CRM tool, a flat file orwarehousing projects often used in-house code to
spreadsheet, it matters not.support ETL processes, but as the sources and
After extraction, the data is transformed, ordata evolved, the ETL code had to be modified
modified, so that it can be sent to the targetand maintained. And companies encountered
repository.problems and scalability became a serious issue
There are a variety of ways to perform thewith in-house ETL software.
transformation, and the work involved varies. TheProviders of packaged ETL systems and
data may require reformatting only, but mostthird-party vendors that offer bolt-on tools
ETL operations also involve cleaning the data toemerged to replace the in-house systems..
remove duplicates and enforce consistency. Part