January 23, 2013

Pentaho Data Integration - How To Use a Transformation (#002)

Explore the repository.


At the moment the repository is empty. Right-click on the folder icon and select "New Folder."


The newly created folder will be named "hr". In this example Oracle's "hr" schema will be used.


The folder is created.


Click the "New File" icon and choose "Transformation."


Click "File", "Save as..."


The "Transformation Properties" window


Fill the fields in the following way:
"Transformation Name": tr_hr_countries_initial_load
change "Directory" to "hr" by clicking the folder incon.

Transformations' naming convention:
tr_[project]_[table]_[transformations_task]


Transformation is saved.


After exploring the repository, we can see that the transformation exists i. the "hr" folder.


Click the "View" tab.


Click a "New File" icon and choose "Database Connection" to create a source connection.


The "Database Connection" window


Fill fields in the "Database Connection" window.
Connection Name: conn_oracle11g_orcl_hr_hr
Host Name: host name or IP address
Database Name: orcl
User Name: hr
Password: hr


Click "Test" to check it.


Create a connection for target schema.
Connection Name: conn_oracle11g_orcl_dwh_dwh
Host Name: host name or IP address
Database Name: orcl
User Name: dwh
Password: dwh


Now the "Database Connections" folder contains two connections definitions.


Click the "Design" tab.
From the "Input" folder select the "Table input" step, drag and drop it into the workspace.
From the "Output" folder select the "Table output" step, drag and drop it into the workspace.
Connect the steps by holding a shift key, clicking on the first step and a moving mouse pointer to the next step.
(hold shift, click and hold on the first step, drag and release over a target step)
Save the transformation.


Doubleclick the "Table input" step
The "Table input" window:
- "Step Name": src_countries
- "Connection": use connection defined in previous steps "conn_oracle11g_orcl_hr_hr"
Click "Get SQL select statement" to automatically create a select statement.


Doubleclick the "Table output" step.
The "Table output" window:
- "Step Name": tgt_countries
- "Connection": use connection defined in previous steps "conn_oracle11g_orcl_dwh_dwh"
Click "Get SQL select statement" to automatically create a select statement.


Click the "SQL" button to see CRETE TABLE statement and click "Execute" to create a table in "dwh" schema.
Save the transformation.



The transformation is ready to run the initial load of a "countries" table.


Click "Action" and then "Run."


In the "Execute a transformation" window click "Lunch."


The transformation was executed flawlessly.


Let's check the results.



Looks great!

No comments:

Post a Comment