Friday, April 5, 2013

Union Transformation

Union transformation is an Active and Connected transformation.

The Union transformation is a multiple input group transformation that you use to merge data from multiple pipelines or pipeline branches into one pipeline branch. It merges data from multiple sources similar to the 'UNION ALL' SQL statement to combine the results from two or more SQL statements. Similar to the UNION ALL statement, the Union transformation does not remove duplicate rows.

The Integration Service processes all input groups in parallel. It concurrently reads sources connected to the Union transformation and pushes blocks of data into the input groups of the transformation. The Union transformation processes the blocks of data based on the order it receives the blocks from the Integration Service.

We can connect heterogeneous sources to a Union transformation. The transformation merges sources with matching ports and outputs the data from one output group with the same ports as the input groups.

The Union transformation is developed using the Custom transformation.

Rules and Guidelines for Union Transformations
Use the following rules and guidelines when you work with a Union transformation:
  • We can create multiple input groups, but only one output group.
  • All input groups and the output group must have matching ports. The precision, datatype, and scale must be identical across all groups.
  • The Union transformation does not remove duplicate rows. To remove duplicate rows, you must add another transformation such as a Router or Filter transformation.
  • We cannot use a Sequence Generator or Update Strategy transformation upstream from a Union transformation.
  • The Union transformation does not generate transactions.
Working with Groups and Ports
A Union transformation has multiple input groups and one output group. Create input groups on the Groups tab, and create ports on the Group Ports tab.

We can create one or more input groups on the Groups tab. The Designer creates one output group by default. You cannot edit or delete the output group.

We can create ports by copying ports from a transformation, or we can create ports manually. When we create ports on the Group Ports tab, the Designer creates input ports in each input group and output ports in the output group. The Designer uses the port names you specify on the Group Ports tab for each input and output port, and it appends a number to make each port name in the transformation unique. It also uses the same metadata for each port, such as datatype, precision, and scale.

The Ports tab displays the groups and ports you create. You cannot edit group and port information on the Ports tab. Use the Groups and Group Ports tab to edit groups and ports.

Sample Mapping:
Here we are combining data coming from 4 different regions which are in the same format (structure). EMP_HYD, EMP_CHN, EMP_PUN and EMP_BLORE.

1. Connect and Open the folder if not already opened.

2. Select Tools --> Mapping Designer
3. Select Mappings --> Create

It will pop-up "Mapping Name". Enter the mapping name of your choice" and Click on 'OK'. Example: m_emp_all_regions
4. Drag the Source and Target definitions into workspace if they are already exist. If not click hereto know how to create or import Table definitions.
5. Select 'Transformation' from Menu --> Create
a) That will appear you 'Select the transformation type to create:'
b) Select 'Union' from drop down and 'Enter a new name for this transformation:' as "uni_emp_all_regions"
c) Click 'Create' and 'Done'
  
or

a) Click on Union Transformation icon marked below in below snapshot.
b) Click in the workspace in Mapping Designer.
c) Select Union in workspace and Right Click --> Edit.
d) In Transformation tab --> Click on 'Rename' highlighted above which will pop-up 'Rename Transformation'. Enter the Transformation Name: "uni_emp_all_regions"
e) Click on 'OK'
f) Click on 'Apply' and 'OK'.
6) Select Union Transformation and Right Click --> Edit.

a) Go to Groups tab and Create 4 Groups.
b) Click 'Apply' and Click 'OK'. (You will able to see 4 groups)
7). Drag required ports from 'SQ_EMP_HYD' Source Qualifier to 'uni_emp_all_regions' union Transformation.
Then Link the port from SQ_EMP_CHN, SQ_EMP_PUN, SQ_EMP_BLORE to respective group in uni_emp_all_regions union transformation as below.
8) Link required ports from 'OUTPUT' group of (uni_emp_all_regions) Union transformation to EMP target.
9) Select 'Mapping' from Menu --> Validate.
10) Select 'Repository' from Menu --> Save.
 
Note: Union transformation will not delete duplicate records. It will only merge the input records from difference sources with same definition.

8 comments:

  1. Hi,
    Can we use Source qualifier transformation for the same scenario.Can you please let me know whats the difference between Source qualifier and Union transformations.
    Thank you in advance !

    ReplyDelete
    Replies
    1. difference is comes when source is relational or flat file

      Delete
  2. Hi,
    Can you please explain why union is an active T/R?

    ReplyDelete
  3. Hi All,

    I am not clear about this statement "The Union transformation does not generate transactions".
    Does it mean that Union T/R has no effect in terms of transaction boundary over the rows passing through it.?
    If not can anyone pls explain me this point. :)

    Thanks in advance!

    ReplyDelete
  4. Hi Martin,

    I mean that the union transformation will not generate any values (intermediate value) at the run time like expression and aggregator transformation.

    Thanks,
    Gowtham

    ReplyDelete
    Replies
    1. Thanks Gowtham
      I was thinking in a diff direction.

      Delete
  5. can we use source qualifier instead of union?

    ReplyDelete