| ETL stands for extract, transform and load. | | | | of what the software does is examine individual |
| These are the processes that enable companies | | | | data fields and apply rules to consistently convert |
| to move required data from a multitude of | | | | the contents to the form required by the data |
| sources, reformat and cleanse it, and then load it | | | | warehouse. |
| into a data warehouse for analysis, or maybe | | | | For example, a category might be represented in |
| onto another operational system to support | | | | three different ways in three different systems. |
| another business process. | | | | The ETL software would recognize that these |
| All Companies are well aware that they have | | | | entries mean the same thing and convert them |
| valuable information spread throughout their | | | | to the target format. |
| company networks that needs to be moved to a | | | | In addition, the ETL process might involve some |
| data warehouse for analysis and if they do not | | | | form of standardization and verification of fields |
| move it all this information will become worthless. | | | | such as addresses or dates of birth or even |
| The problem is that this data is stored in a | | | | expanding records with additional fields containing |
| variety of systems and formats. And different | | | | demographic information or data from other |
| systems are capable of using different methods | | | | systems. |
| of definitions. | | | | The transformation occurs when the data from |
| So this problem has to be solved. The best | | | | each source is mapped, cleansed and reconciled |
| solution is to implement extract, transform and | | | | so it all can be tied together, with receivables tied |
| load (ETL) software. By definition this software's | | | | to invoices and so on. |
| capabilities includes reading data from the source, | | | | After reconciliation, the data is transported and |
| cleaning and formatting it, and then writing it to its | | | | loaded into the data warehouse for analysis of |
| target where it can be utilized efficiently. | | | | things such as cycle times and total outstanding |
| The source data for the ETL processes can | | | | receivables. |
| come from almost any source: a mainframe or | | | | In the past, companies that were doing data |
| ERP application, a CRM tool, a flat file or | | | | warehousing projects often used in-house code to |
| spreadsheet, it matters not. | | | | support ETL processes, but as the sources and |
| After extraction, the data is transformed, or | | | | data evolved, the ETL code had to be modified |
| modified, so that it can be sent to the target | | | | and maintained. And companies encountered |
| repository. | | | | problems and scalability became a serious issue |
| There are a variety of ways to perform the | | | | with in-house ETL software. |
| transformation, and the work involved varies. The | | | | Providers of packaged ETL systems and |
| data may require reformatting only, but most | | | | third-party vendors that offer bolt-on tools |
| ETL operations also involve cleaning the data to | | | | emerged to replace the in-house systems.. |
| remove duplicates and enforce consistency. Part | | | | |