When we begin our IT career, one of the first things we learn is the difference between DATA and INFORMATION. We learn that data must be processed, and that processed data becomes information. The concept is practically the same for the transformation step in ETL.
We have already talked in previous articles about the definition of ETL and also about the STAGING AREA (a very important concept for anyone who will work with ETL).
Now it is time to talk about one of the main ETL processes: TRANSFORMATION. It is a simple subject to describe, but its execution can go from simple to complex quickly.
By definition, ETL Transformation is the process that takes data from the source and transforms it into the desired information. "Desired" here means whatever was agreed with the client. Nothing more than this.
Let’s take an example to make it clearer and easier to understand: suppose you are involved in a project that must generate reports with information from the last 24 hours (counted back from the report generation date). This report must show the sum of all customer sales, with dates in a specific format.
It is a very simple example, but it shows the need for data transformation, since we will have to convert the date format used in the source (American) to another format (the one requested by the client).
Therefore, it is up to you to identify the fields and apply the necessary functions to transform the information as requested.
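To make the report scenario concrete, here is a minimal sketch in Python. It assumes the source stores dates in the American month/day/year format and that the client asked for day/month/year. The rows, the field layout, the per-customer grouping and the function name `build_report` are all hypothetical and only serve as illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical source rows: (customer, sale date in American format, amount)
SOURCE_ROWS = [
    ("ACME", "03/15/2024 09:30", 100.0),
    ("ACME", "03/15/2024 18:45", 50.5),
    ("Globex", "03/14/2024 08:00", 75.0),  # older than 24 hours, left out
    ("Globex", "03/15/2024 07:15", 20.0),
]

SOURCE_FORMAT = "%m/%d/%Y %H:%M"  # American: month/day/year
TARGET_FORMAT = "%d/%m/%Y %H:%M"  # assumed client format: day/month/year


def build_report(rows, generated_at):
    """Sum sales per customer for the 24 hours before generated_at."""
    window_start = generated_at - timedelta(hours=24)
    totals = defaultdict(float)
    for customer, raw_date, amount in rows:
        # Parse the source text into a real datetime before comparing
        sold_at = datetime.strptime(raw_date, SOURCE_FORMAT)
        if window_start <= sold_at <= generated_at:
            totals[customer] += amount
    return {
        # The report date is written in the format requested by the client
        "generated_at": generated_at.strftime(TARGET_FORMAT),
        "totals": dict(totals),
    }


report = build_report(SOURCE_ROWS, datetime(2024, 3, 15, 20, 0))
print(report)
```

Parsing the date into a `datetime` first, instead of rearranging the text, lets the same value be used both for the 24-hour filter and for the final formatting.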
It is in the process of transformation that you:
- Apply the famous DE-PARA (from-to mapping) to the data;
- Standardize names in uppercase;
- Standardize dates in a single format (Brazilian standard);
- Convert numeric information to text (TXT) to load into systems;
- Feed a database, among other tasks.
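The transformations listed above can be sketched for a single record as follows. This is only an illustration: the DE-PARA table, the field names and the function `transform_row` are hypothetical, and the date formats assume an American source and a Brazilian (day/month/year) target.

```python
from datetime import datetime

# Hypothetical DE-PARA table: source code -> target value
DE_PARA_STATUS = {"A": "ACTIVE", "I": "INACTIVE"}


def transform_row(row):
    """Apply the listed transformations to one source record."""
    return {
        # Standardize names in uppercase (and trim stray spaces)
        "name": row["name"].strip().upper(),
        # DE-PARA: map the source code to the target value
        "status": DE_PARA_STATUS.get(row["status"], "UNKNOWN"),
        # Month/day/year in the source becomes day/month/year in the target
        "date": datetime.strptime(row["date"], "%m/%d/%Y").strftime("%d/%m/%Y"),
        # Numeric value written as text with two decimal places
        "amount": f"{row['amount']:.2f}",
    }


source_row = {
    "name": " maria silva ",
    "status": "A",
    "date": "12/31/2023",
    "amount": 1234.5,
}
target_row = transform_row(source_row)
print(target_row)
```

Falling back to `"UNKNOWN"` for codes missing from the DE-PARA table is one possible choice; a real project might prefer to reject such rows so the gap is noticed.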
I hope this quick article has helped.
If you have any questions, please contact us.
A big hug.
Eduardo Santana
bufallos@bufallos.com.br