Thursday, April 4, 2013

Sorter Transformation

The Sorter Transformation is an Active and Connected Transformation.

We can sort data with the Sorter transformation and it will allow us to sort data in ascending or descending order according to a specified sort key. You can also configure the Sorter transformation for case-sensitive sorting, and specify whether the output rows should be distinct. We can sort data in ascending or descending order according to a specified sort key.

When you create a Sorter transformation in a mapping, you specify one or more ports as a sort key and configure each sort key port to sort in ascending or descending order. You also configure sort criteria the Integration Service applies to all sort key ports and the system resources it allocates to perform the sort operation.

Sorting Data
The Sorter transformation contains only input/output ports. All data passing through the Sorter transformation is sorted according to a sort key. The sort key is one or more ports that you want to use as the sort criteria.

Sorter Transformation Properties:

Sorter Cache Size
The Integration Service uses the Sorter Cache Size property to determine the maximum amount of memory it can allocate to perform the sort operation. The Integration Service passes all incoming data into the Sorter transformation before it performs the sort operation.

The Integration Service to determine the cache size a maximum amount of memory of 2 GB (2,147,483,648 bytes) or greater, you must run the session on a 64-bit Integration Service.

If it cannot allocate enough memory, the Integration Service fails the session. For best performance, configure Sorter cache size with a value less than or equal to the amount of available physical RAM on the Integration Service machine. Allocate at least 16 MB (16,777,216 bytes) of physical memory to sort data using the Sorter transformation. Sorter cache size is set to 16,777,216 bytes by default.

The Integration Service requires disk space of at least twice the amount of incoming data when storing data in the work directory. If the amount of incoming data is significantly greater than the Sorter cache size, the Integration Service may require much more than twice the amount of disk space available to the work directory.

Case Sensitive
The Case Sensitive property determines whether the Integration Service considers case when sorting data. When you enable the Case Sensitive property, the Integration Service sorts uppercase characters higher than lowercase characters.

Work Directory
Integration Service uses to create temporary files while it sorts data. After the Integration Service sorts the data, it deletes the temporary files. By default, the Integration Service uses the value specified for the $PMTempDir process variable.

When you partition a session with a Sorter transformation, you can specify a different work directory for each partition in the pipeline. To increase session performance, specify work directories on physically separate disks on the Integration Service system.

Distinct Output Rows
You can configure the Sorter transformation to treat output rows as distinct to remove duplicate rows from incoming data. If you configure the Sorter transformation for distinct output rows, the Mapping Designer configures all ports as part of the sort key. The Integration Service discards duplicate rows compared during the sort operation.

Null Treated Low
You can configure the way the Sorter transformation treats null values. Enable this property if you want the Integration Service to treat null values as lower than any other value when it performs the sort operation. Disable this option if you want the Integration Service to treat null values as higher than any other value.

Transformation Scope
The transformation scope specifies how the Integration Service applies the transformation logic to incoming data: 
  • Transaction. Applies the transformation logic to all rows in a transaction. Choose Transaction when a row of data depends on all rows in the same transaction, but does not depend on rows in other transactions.
  • All Input. Applies the transformation logic on all incoming data. When you choose All Input, the PowerCenter drops incoming transaction boundaries. Choose All Input when a row of data depends on all rows in the source.
Sample Mapping:

1. Connect and Open the folder if not already opened.

2. Select Tools --> Mapping Designer
3. Select Mappings --> Create

It will pop-up "Mapping Name". Enter the mapping name of your choice" and Click on 'OK'. Example: m_emp_dept_sort
4. Drag the Source and Target definitions into workspace if they are already exist. If not click here to know how to create or import Tabledefinitions.
5. Select 'Transformation' from Menu --> Create
a) That will appear you 'Select the transformation type to create:'
b)  Select 'Sorter' from drop down and 'Enter a new name for this transformation:' as "srt_emp_dept"
c) Click 'Create' and 'Done'

or

a) Click on Expression Transformation icon marked below in below snapshot.
b) Click in the workspace in Mapping Designer.
c) Select SRTTRANS in workspace and Right Click --> Edit.
d) In Transformation tab --> Click on 'Rename' highlighted above which will pop-up 'Rename Transformation'. Enter the Transformation Name: "srt_emp_dept"
e) Click on 'OK'
f) Click on 'Apply' and 'OK'.
6). Drag required ports from 'SQ_EMP' Source Qualifier to 'srt_emp_dept' expression Transformation.
7) Select 'srt_emp_dept' Sorter Transformation and Right Click --> Edit --> Ports Tab.
a) Check 'Key' for those columns those you want to sort in Ascending/Descending order.
b) Properties tab: 

  • Select 'Case Sensitive' if you want to sort case sensitive while comparisons.
  • Select 'Distinct' if you want to remove duplicate records.
  • Select 'Null Treated Low' if you want to Null value in a port is treated lower or higher than any other value.
Note: Select 'Key' for all the port in 'Port Tab' will not remove duplicate records. It is for only to sort the records by all columns in Ascending or descending order. To remove duplicate records select 'Distinct' option in 'Property Tab'.

c) Click 'Apply' and 'OK'.


8). Drag required port from 'srt_emp_dept' to 'EMPLOYEE' Target as below: 
9) Select 'Mapping' from Menu --> Validate.
10) Select 'Repository' from Menu --> Save.

0 comments:

Post a Comment