Thursday, April 18, 2013

Joiner transformation

The Joiner Transformation is active and connected transformation.

The Joiner transformation to join source data from two related heterogeneous sources residing in different locations or file systems. We can also join data from the same source. The use Joiner transformation sources should have at least one matching column. The Joiner transformation uses a condition that matches one or more pairs of columns between the two sources.

The two input pipelines include a master pipeline and a detail pipeline or a master and a detail branch. The master pipeline ends at the Joiner transformation, while the detail pipeline continues to the target.

No. Of Joiner = No Of source (N) - 1.

The following limitations on the pipelines you connect to the Joiner transformation:
  • You cannot use a Joiner transformation when either input pipeline contains an Update Strategy transformation.
  • You cannot use a Joiner transformation if you connect a Sequence Generator transformation directly before the Joiner transformation.
Join Type:
The join is a relational operator that combines data from multiple tables into a single result set. The Joiner transformation is similar to an SQL join except that data can originate from different types of sources systems or databases.

We can define the join type on the Properties tab in the transformation. The Joiner transformation supports the following types of joins:
  • Normal
  • Master Outer
  • Detail Outer
  • Full Outer
Note: A normal or master outer join performs faster than a full outer or detail outer join.
If a result set includes fields that do not contain data in either of the sources, the Joiner transformation populates the empty fields with null values. If you know that a field will return a NULL and you do not want to insert NULLs in the target, you can set a default value on the Ports tab for the corresponding port.

Normal Join
With a normal join, the Integration Service discards all rows of data from the master and detail source that do not match, based on the condition.

Master Outer Join
A master outer join will keeps all rows of data from the detail source and the matching rows from the master source. It discards the unmatched rows from the master source.

Detail Outer Join
A detail outer join keeps all rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source.

Full Outer Join
A full outer join keeps all rows of data from both the master and detail sources.

Using Sorted Input
We can improve session performance by configuring the Joiner transformation to use sorted input. When you configure the Joiner transformation to use sorted data, the Integration Service improves performance by minimizing disk input and output. If the sorted input option is select then we are telling Informatica integration service that we have passing sorted data to the joiner transformation.

The following tasks to configure the mapping:
  • Configure the sort order. Configure the sort order of the data you want to join. You can join sorted flat files, or you can sort relational data using a Source Qualifier transformation. You can also use a Sorter transformation.
  • Add transformations. Use transformations that maintain the order of the sorted data.
  • Configure the Joiner transformation. Configure the Joiner transformation to use sorted data and configure the join condition to use the sort origin ports. The sort origin represents the source of the sorted data.
  • If you have selected sorted input option and passed un-sorted data to the transformation then the fail will failed.
Cache:
When the Integration Service processes a Joiner transformation, it reads rows from both sources concurrently and builds the index and data cache based on the master rows. The Integration Service then performs the join based on the detail source data and the cache data.

The number of rows the Integration Service stores in the cache depends on the partitioning scheme, the source data, and whether you configure the Joiner transformation for sorted input.

To improve performance for an unsorted Joiner transformation, use the source with fewer rows as the master source. To improve performance for a sorted Joiner transformation, use the source with fewer duplicate key values as the master.

Sample Mapping:

1. Connect and Open the folder if not already opened.

2. Select Tools --> Mapping Designer
3. Select Mappings --> Create

It will pop-up "Mapping Name". Enter the mapping name of your choice" and Click on 'OK'. Example: m_emp_dept_joiner
4. Drag the Source and Target definitions into workspace if they are already exist. If not click hereto know how to create or import Table definitions.
5. Select 'Transformation' from Menu --> Create
a) That will appear you 'Select the transformation type to create:'
b) Select 'Joiner' from drop down and 'Enter a new name for this transformation:' as " jnr_emp_dept"
c) Click 'Create' and 'Done'

or

a) Click on Joiner Transformation icon marked below in below snapshot.
b) Click in the workspace in Mapping Designer.
c) Select JNRTRANS in workspace and Right Click --> Edit.
d) In Transformation tab --> Click on 'Rename' highlighted above which will pop-up 'Rename Transformation'. Enter the Transformation Name: "jnr_emp_dept"
e) Click on 'OK'
f) Click on 'Apply' and 'OK'.
6). Drag required ports from 'SQ_EMP' and 'SQ_DEPT' Source Qualifiers to 'jnr_emp_dept' Joiner Transformation.
7) Select 'jnr_emp_dept' Joiner Transformation and Right Click --> Edit --> Properties Tab.
 
a) Select the Join Type: Normal

Note: You can see the type of joins we have in joiner transformation in below snapshot.
 
b) Select 'Sorted Input' if the data is already sorted in Sql Override for relation database or using Sorter transformation before passing to Joiner transformation. If data is not sorted then do not select it.
 
c) Go to the 'Condition' tab for conditional columns. Select the Master columns for drop down for join condition
Like this:
 
d) Click 'Apply' and 'OK'.
8). Drag required port from 'jnr_emp_dept' joiner to 'EMP_DEPT_FILE' Target port as below:
 
9) Select 'Mapping' from Menu --> Validate.
10) Select 'Repository' from Menu --> Save.
 

Note:
  1. Choose Sorted Input to join sorted data. Using sorted input can improve performance.
  2. The Joiner transformation does not match null values.
  3. A normal or master outer join performs faster than a full outer or detail outer join.
  4. We can improve session performance by configuring the Joiner transformation to use sorted input.
  5. The Joiner transformation does not match null values. For example, if both EMP_ID1 and EMP_ID2 contain a row with a null value, the Integration Service does not consider them a match and does not join the two rows. To join rows with null values, replace null input with default values, and then join on the default values.
  6. You can also join data from output groups of the same transformation, such as the Custom transformation or XML Source Qualifier transformation. Place a Sorter 

11 comments:

  1. Why you are taking emp as Master table and Dept as Child table.In emp there is more record than that of Dept.

    ReplyDelete
  2. Yeah..even same i was thinking..why we have taken emp table as master table..

    ReplyDelete
  3. In formatica... integration service automatically takes,first table dragged ports are taken as master source and second table dragged ports are taken as detail source.
    If you want to change you will change it as dept table ports as master source.
    But you will not find any difference in session performance because you can have only 14 rows of data.

    ReplyDelete
  4. really i feel good treasure.. to learn

    ReplyDelete
  5. usually each Master record in the Joiner Cache will compares with detailed records. if master records are less then the number of hits to master will be less because of which performance increases. - Rajeev.

    ReplyDelete
  6. Example has 1 condition and 9 as output port, what would be index and data cache?

    ReplyDelete
  7. super post keep update you information like this Informatica Online Course

    ReplyDelete
  8. Thank you.Well it was nice post and very helpful information on Informatica Online Training Hyderabad

    ReplyDelete
  9. can we use operators other than '=' in joiner

    ReplyDelete