Tuesday, September 2, 2014

Working with Excel as Source

Working with Microsoft Excel as Source:

PowerCenter treats a Microsoft Excel source as a relational database, not a flat file. Like relational sources required the ODBC connection to import definition, the Designer uses ODBC to import a Microsoft Excel source.

Note: No need database permissions to import Microsoft Excel sources.

Pre-required tasks before we import an Excel source definition need to complete:
1. Install the Microsoft Excel ODBC driver on the system.
2. Create a Microsoft Excel ODBC data source for each source file in the ODBC Data Source.
3. Prepare Microsoft Excel spreadsheets by defining ranges and formatting columns of numeric data.

Defining Ranges

The Designer creates source definitions based on ranges you define in Microsoft Excel. We can define one or more ranges in a Microsoft Excel sheet. If you have multiple sheets, define at least one range for each sheet. When you import sources in the Designer, each range displays as a relational source object.

We must define a range in the Microsoft Excel sheets to import the Excel source definition.

To define a range:
1. Open the Microsoft Excel file.
2. Highlight the column or group of columns of data to import.
3. Click Insert > Name > Define.
4. Enter a name for the selected range and click OK.
5. If you have multiple sheets, select each sheet and repeat steps 2 to 4 to define ranges for each set of data.
6. Click File > Save.

Formatting Columns of Numeric Data

In Microsoft Excel, we can assign datatypes to columns of data. The Microsoft Excel datatypes are ODBC datatypes. PowerCenter supports ODBC datatypes and converts them to transformation datatypes.

If we do not assign datatypes in Microsoft Excel, the Designer imports each column as VARCHAR. If you want to perform numeric or aggregate calculations in a mapping, assign numeric datatypes in Microsoft Excel before importing the spreadsheet.

To format columns in Microsoft Excel:
1.  Open the Microsoft Excel file.
2. Select the columns of data that consist of numeric data.
3. Click Format > Cells.
4. In the Number tab, select Number.
5. Specify the number of decimal places.
6. Click OK.
7. Click File > Save.

Steps to Import a Microsoft Excel Source Definition

After we define ranges and format cells, we can import the ranged Microsoft Excel Sheet in the Designer. Ranges display as source definitions when you import the source.

To import a Microsoft Excel source definition:
1. In the Designer, connect to the repository --> open the folder --> the source definition.
2. Open the Source Analyzer and click Sources --> Import from Database.
3. Select Excel Files (Microsoft Excel Driver (*.xls)) for the data source.
4. Click the Browse button to open the ODBC.
5. In the User or System DSN tabs, depending on where you created the data source, double-click the Microsoft Excel driver.
6. Click Select Workbook and browse for the Microsoft Excel file, which is considered a relational database.
7. Click OK three times to return to the Import Tables dialog box.
8. Click Connect in the Import Tables dialog box.

Note:

·         You do not need to enter a database user name and password.
·         The ranges you defined in the Microsoft Excel file appear as table names.
·         The database owner is No Owner because you are not required to enter a database user name.

9. Select the table you want to import.
To select more than one table, hold down the Ctrl or Shift keys to highlight multiple tables.

10. Click OK.
In the Navigator, the source definition appears in the Sources node, under the database name.

4 comments: