[转]Using the Microsoft Connector for Oracle by Attunity with SQL Server 2008 Integration Services

时间:2023-03-08 15:37:50

本文转自:http://technet.microsoft.com/en-us/library/ee470675(v=sql.100).aspx

SQL Server Technical Article

Writer: Doug Wheaton (Attunity Ltd.)

Technical Reviewer: Ramakrishnan Krishnan (Microsoft), Robert Zare (Microsoft), Jeff Bernhardt (Microsoft), David Noor (Microsoft)

Published: September 2009

Applies to: SQL Server 2008

Summary: The Microsoft Connector for Oracle by Attunity provides optimal performance during loading or unloading of data from Oracle with Microsoft SQL Server Integration Services packages. The Attunity connector offers a user experience similar to that of the built-in connectors, with intuitive capabilities including configuration, metadata browsing, and error routing. It is available for download at no cost to users of SQL Server 2008 Enterprise. This paper discusses how to install and configure the Microsoft Connector for Oracle by Attunity, the use cases for the connector, and the performance of the connector. The paper also discusses how to troubleshoot and analyze run-time failures related to the connector.

Introduction

The new high-speed Microsoft® Connector for Oracle by Attunity is available download for Microsoft SQL Server® Enterprise customers as a Web download from the Microsoft and Attunity Web sites. This connector was designed to be significantly faster than existing connectors for Oracle. For more information, see “Performance” later in this paper. The connector enables massive data extraction from and to Oracle within the Microsoft SQL Server environment, via standard interfaces in both Full and Incremental modes. The Oracle connector supports the Fast Load method and can run several processes in parallel.

The Microsoft Oracle Connector 1.0 by Attunity has three main components:

  • An Oracle source, to extract bulk data from Oracle
  • An Oracle destination, to import bulk data into Oracle
  • An Oracle Connection Manager, to enable a package to connect to an Oracle data source

Prerequisites and Installation

Supported product versions:

  • Microsoft SQL Server 2008 data management software
  • Oracle Client version 10.x or 11.x

Supported operating systems:

  • The 32-bit (x86) and 64-bit (x64) editions of the Windows® XP operating system
  • The 32-bit (x86) and 64-bit (x64) editions of the Windows Vista® operating system
  • The 32-bit (x86), 64-bit (x64), and 64-bit Itanium (IA64) editions of the Windows Server® 2003 operating system
  • The 32-bit (x86), 64-bit (x64), and 64-bit Itanium (IA64) editions of Windows Server 2008 32-bit (x86), 64-bit (x64), and 64-bit Itanium (IA64)

Installation:

  • The Oracle client software must be installed on the computer running SQL Server Integration Services (SSIS)
  • Oracle database versions 9.2.0.4 and higher are supported

Additional considerations for 64-bit environments:

  • On 64-bit platforms (both x64 and IA64), you need only install the Attunity Oracle 64 bit connector for the appropriate architecture (x64 or IA64). You should not install the Attunity Oracle 32 bit connector alongside the 64-bit versions, because the 64 bit connector installs both 32-bit and 64-bit versions.
  • The Oracle 32 Bit client is required in order to develop packages on a 64-bit computer because SQL Server Business Intelligence Studio runs in 32-bit mode.
  • The Oracle 64 Bit client is required in order to run packages on a 64-bit computer.
  • If you run the package in 32-bit mode, you do not need the Oracle 64 Bit client.

In summary, if you want both to develop and to run SSIS packages on a 64-bit computer:

  1. Install the Attunity Oracle 64 bit connector.
  2. Install the Oracle 11G 32 Bit client.
  3. Install the Oracle 11G 64 Bit client.
  4. Restart the computer after the Oracle installations are complete.
  5. You should be able to develop and run SSIS packages.

If after completing these steps you still experience functional issues, you can use the SQLPLUS Oracle utility to connect to the database in order to confirm that the client was installed properly.

The following screen capture shows both Oracle clients installed on a computer running a 64-bit edition of Windows Server 2008.

Figure 1: Registry entries showing the 32-bit and 64-bit Oracle client software installed on the same computer, which is running a 64-bit edition of Windows Server.

Figure 2: Supported platforms and versions for the Oracle connector

Install and Add the Microsoft Connector for Oracle by Attunity

The installation steps for the connector are the same for all of the Use Cases described in this article. Before you proceed with any individual Use Case, follow these steps:

  1. Install the Microsoft Connector for Oracle by Attunity.
  2. Enable and  as data flow sources in the Toolbox in Business Intelligence Development Studio.
  3. Set up the Connection Manager for Oracle.

Each step is described in more detail in the following sections.

Install the Microsoft Connector for Oracle by Attunity

Installation Process

Figure 3: Starting the installation process for the Oracle Connector by Attunity

Figure 4: Accepting the license agreement

Figure 5: Selecting the installation folder

Figure 6: Finishing the installation

After installation is complete, you are prompted to restart SQL Server Integration Services.

Figure 7: Restarting the Integration Services service after installation of the Oracle Connector by Attunity

In addition to the component installation, a Start menu item is created to provide easy access to documentation.

Figure 8: New item on the Start menu for the Help file for the Oracle Connector by Attunity

Add the Oracle Source and Oracle Destination Components to the Toolbox

After installation, open SQL Server Business Intelligence Development Studio and create a new Integration Services project. The first time after the SQL Server 2008 Feature Pack installation, the Microsoft Connector components are not enabled by default as data flow sources. To enable them, in the Toolbox, right-click Data Flow Sources, click Choose Items, and then on the SSIS Data Flow Items tab, and select the Oracle Source check box, as shown in the following figures.

Oracle Source Component

1. In the Toolbox, right-click Data Flow Sources and then click Choose Items.

Figure 9: The first step in adding the Oracle source to the Toolbox

2. Click the SSIS Data Flow Items tab.

3. Under Name, select the Oracle Source check box, and then click OK.

Figure 10: Selecting the Oracle source

At this point Oracle Source is now available as a data flow source.

Figure 11: Oracle Source now appears in the Toolbox

Oracle Destination Component

Repeat the steps to enable display of the Oracle destination component.

Configure the Microsoft Connector for Oracle by Attunity

Oracle Connection Manager

In the SSIS project, add a new connection of type MSORA.

Figure 12: Adding an Oracle connection to a package

After the connection is created, edit it and fill out the system and logon information. To validate connectivity, click Test Connection.

  • The connection parameters are the same you would use for SQLPLUS.
  • The connection manager is used by both the source and destination components.
  • The connection manager uses the Oracle client to connect to the Oracle database.

Figure 13: Setting up the Oracle connection

Oracle Source Component

The Oracle source extracts data from Oracle databases by using a database table, a view, or an SQL command.

The Oracle source has the following data access modes for extracting data:

  • A table or view.
  • The results of an SQL statement.

The source uses an Oracle connection manager, which specifies the Oracle provider to use.

The Oracle source has one regular output and one error output.

Error Handling in the Oracle Source Component

The Oracle source has an error output. The component error output includes the following output columns:

  • Oracle Error Code: The number that corresponds to the current error. See the Oracle documentation for a list of relevant error codes.
  • Error Column: The source column causing the error (for conversion errors).
  • Error Row Columns: The record data that causes the error.

Depending on the error behavior setting, the Oracle source supports returning errors (data conversion, truncation) that occur during the extraction process in the error output.

Parallelism

There is no limit on the number of Oracle sources that can run in parallel against the same table or different tables, on the same computer, or on different computers (other than normal global session limits). The Oracle source can also run in parallel to the Oracle destination working against the same table.

Troubleshooting the Oracle Source

You can log the calls that the Oracle source makes to the Oracle Call Interface (OCI). You can use this logging ability to troubleshoot the extraction of data from Oracle data sources that the Oracle source performs. To log the calls that the Oracle source makes to an Oracle data source, enable package logging and select the Diagnostic event at the package level.

Configuring the Oracle Source

You configure the Oracle source in the Oracle Source dialog box.

Oracle Source Dialog Box (Connection Manager Page)

Use the Connection Manager page of the Oracle Source dialog box to select the Oracle connection manager for the source. On this page you can also select a table or view from the database.

Figure 14: Connection Manager page of the Oracle Source dialog box

Options on the Connection Manager Page of the Oracle Source Dialog Box

Connection Manager - Select an existing connection manager from the list, or click New to create a new connection.

New - Click New to open the Oracle Connection Manager Editor, where you can create a new connection manager.

Data Access Mode - Select the method for selecting data from the source. The options are shown in the following table.

                                          Option                                                                               Description                                    

Table or view

Retrieve data from a table  or view in the Oracle data source. When you select this option, select a  value from the drop-down list for the name of the table or the view. Select  an available table or view from the database from the list. This list  contains the first 1,000 tables only. If your database contains more than  1,000 tables, you can type the beginning of a table name or use the (*) wild  card to enter any part of the name to display the table or tables you want to  use. If no default database is selected for the connection manager, all  tables for all databases are returned except for SYS, MDSYS, OLAPSYS, WKSYS,  CTXSYS, XDB, WMSYS, SYSTEM, EXFSYS, ORDSYS, DMSYS, and WK_TEST.

SQL command

Retrieve data from the  Oracle data source by using an SQL query. When you select this option, enter  a query in one of the following ways:

Enter the text of the SQL  query in SQL command.

Click Browse to load the SQL query from a text file.

Click Parse query to verify the syntax of the query text.

Table 1: Data access mode options for the Oracle source

Oracle Source Dialog Box (Columns Page)

Use the Columns page of the Oracle Source dialog box to map an output column to each external (source) column.

Figure 15: Columns page of the Oracle Source dialog box

Options on the Columns Page of the Oracle Source Dialog Box

Available External Columns - A list of available external columns in the data source. You cannot use this table to add or delete columns. Select the columns to use in the source. The selected columns are added to the External Column list in the order they are selected. Select the Select All check box to select all of the columns.

External Column - A view of the external (source) columns in the order that you see them when configuring components that consume data from the Oracle source. To change this order, first clear the selected columns in the Available External Columns list, and then select external columns from the list in a different order. The selected columns are added to the External Column list in the order you select them.

Output Column - Enter a unique name for each output column. The default is the name of the selected external (source) column; however, you can choose any unique, descriptive name. The name entered is displayed in the SSIS Designer.

Note: Columns of unsupported data types are shown as external columns, but they are not exposed as output columns.

Oracle Source Dialog Box (Error Output Page)

Use the Error Output page of the Oracle Source dialog box to select error handling options.

Figure 16: Error Output page of the Oracle Source dialog box

Options on the Error Output Page of the Oracle Source Dialog Box

Error - Select how the Oracle source should handle errors in a flow: ignore the failure, redirect the row, or fail the component.

Truncation - Select how the Oracle source should handle truncation in a flow: ignore the failure, redirect the row, or fail the component.

The following are the options for handling errors and truncation:

  • Fail Component. The Data Flow task fails when an error or a truncation occurs. This is the default behavior.
  • Ignore Failure. The error or the truncation is ignored and the data row is directed to the Oracle source output.
  • Redirect Flow. The error or the truncation data row is directed to the error output of the Oracle source. In this case the Oracle source error handling is used.

Oracle Source Advanced Editor

The Advanced Editor contains properties that cannot be set in the Oracle Source dialog box. You can view and configure these properties using the Component Properties tab.

To open the Advanced Editor: In the Data Flow screen of your Integration Services project, right-click the Oracle source and then click Show Advanced Editor.

Figure 17: The Component Properties tab of the Advanced Editor for the Oracle source

Custom Properties of the Oracle Source

The Component Properties tab includes a Custom Properties list. The following table describes the custom properties of the Oracle source. All properties are read/write.

                          Custom Property                                                 Data Type                                                 Description                      

AccessMode

Integer (enumeration)

The mode used to access the  database. The possible options are Table  Name and SQL Command. The  default is Table Name.

BatchSize

Integer

The size of the batch for  bulk loading. This is the number of records extracted as an array.

Note: This property is not  available in the Oracle Source  dialog box, but it can be set by using the Advanced Editor.

DefaultCodePage

Integer

The code page to use when  code page information is unavailable from the data source.

Note: This property is not  available in the Oracle Source  dialog box, but can be set by using the Advanced Editor.

PreFetchCount

Integer

The possible number of  prefetched (look ahead) rows.

Note: This property is not  available in the Oracle Source  dialog box, but can be set by using the Advanced Editor.

SqlCommand

String

The SQL command to be  executed.

TableName

String

The name of the table with  the data that is being used.

Table 2: Custom properties of the Oracle source

Oracle Destination Component

The Oracle destination connects to a local or remote Oracle database and bulk loads data into Oracle databases.

The destination uses the Oracle Connection Manager to connect to a data source.

An Oracle destination includes mappings between input columns and columns in the destination data source. You do not have to map input columns to all destination columns, but depending on the properties of the destination columns, errors can occur if no input columns are mapped to the destination columns. For example, if a destination column does not allow null values, an input column must be mapped to that column. In addition, the data types of mapped columns must be compatible. For example, you cannot map an input column with a string data type to a destination column with a numeric data type.

The input column and the column in the destination data source are compatible if:

  • They have the same data type, precision, scale, and code page. The length can be different.
  • The input column type is DT_NUMERIC and the destination data source type is DT_R8.

The Oracle destination has one regular input and one error output.

Note: Columns of unsupported data types are shown, but cannot be mapped. Tables that have columns of unsupported data types that are not nullable cannot be loaded.

Load Options

The Oracle destination can use one of two access load modes. You set the mode on the Connection Manager page of the Oracle Destination Editor. The two modes are:

  • Arrayed load: In this mode, data is loaded into Oracle table in batches and the entire batch is inserted under the same transaction.
  • Fast load using Direct Path: In this mode, the destination component uses the OCI direct path protocol for loading the Oracle table.

When using fast load mode, the following are some restrictions:

  • Triggers are not supported.
  • Referential integrity constraints are not supported.
  • Clustered tables are not supported.
  • Loading of remote objects is not supported.

For more information, see Oracle Corporation’s Call Interface Programmer’s Guide.

Error Handling

The Oracle destination has a single error output. The component error output includes the following output columns:

  • Error Code: The number that corresponds to the current error. See the Oracle documentation for a list of relevant error codes.
  • Error Column: The source column causing the error (for conversion errors).
  • Error Row Columns: The record data that causes the error.

Depending on the error behavior setting, the Oracle destination supports returning errors (such as, data conversion, truncation, or constraint violation) that occur during the loading process in the error output.

You can set the maximum number of errors that can occur using the Maximum number of errors (MaxErrors) property. When the error limit is reached, the Oracle destination returns an error and stops. In all cases the target table will include all records completed successfully up to the point where the component stopped. Only the record that exceeded the limit is not included in the target table.

Parallelism

When arrayed loading is used, there is no restriction on the number of Oracle destinations that run in parallel against the same tables as long as all of the Oracle destinations are configured for arrayed loading. However, this may reduce performance because of standard record locking. The amount of performance loss depends on the data and table organization.

When the direct path protocol (fast load) is used, only one Oracle destination can be configured to run against the same table in the same time. However, you can choose to use Parallel mode, which denotes a parallel direct path load. A parallel direct path load allows multiple direct path load sessions to concurrently load the same data segments (multisegment parallelism). A parallel direct path is naturally more restrictive than a normal direct path. In this case, you should plan any use of parallelism in advance by splitting the source data into segments and applying them concurrently in parallel mode. There is no reason to use a single parallel session.

The following restrictions apply when parallel direct path loads are used:

  • The load cannot maintain local or global indexes.
  • Referential integrity and CHECK constraints must be disabled.
  • Triggers must be disabled.

For more information, see the Oracle documentation.

Troubleshooting the Oracle Destination

You can log the calls that the Oracle destination makes to the Oracle Call Interface (OCI). You can use this logging ability to troubleshoot the saving of data to Oracle data sources that the Oracle destination performs. To log the calls that the Oracle destination makes to an Oracle data source, enable package logging and select the Diagnostic event at the package level.

Configuring the Oracle Destination

You configure the Oracle destination in the Oracle Destination Editor.

Oracle Destination Editor (Connection Manager Page)

Use the Connection Manager page of the Oracle Destination Editor to select the Oracle connection manager for the destination. This page also lets you select a table or view from the database.

Options on the Connection Manager Page of the Oracle Destination Editor

Connection manager - Select an existing connection manager from the list, or click New to create a new connection.

New - Click New to open the Oracle Connection Manager Editor, where you can create a new connection manager.

Data access mode - Select the method for selecting data from the source. The options are shown in the following table.

                          Option                                                 Description                      

Table  Name

Select this option to  configure the Oracle destination to work in arrayed mode. When you select  this option, the following options are available.

Name  of the table or the view:  Select an available table or view from the database from the list. This list  contains the first 1,000 tables only. If your database contains more than  1,000 tables, you can type the beginning of a table name or use the (*) wild  card to enter any part of the name to display the table or tables you want to  use. If no default database is selected for the connection manager, all  tables for all databases are returned except for SYS, MDSYS, OLAPSYS, WKSYS,  CTXSYS, XDB, WMSYS, SYSTEM, EXFSYS, ORDSYS, DMSYS, and WK_TEST.

Maximum  number of errors: Type  the number of errors that can occur before the data flow is stopped. By  default the value is 0, which indicates that there is no error limit. If you  select Redirect flow in the Error  handling page, all errors that are returned before the data flow reaches the  error limit are returned in the error output. For more information, see the   “Options on the Error Output Page of the Oracle Destination Editor” section.

Transaction  size: Type the number of  inserts that can be carried out in a single transaction. The default is the  value of BatchSize.

Batch  size: Type the size of the  batch for bulk loading. This is the number of rows loaded as a batch.

Table  Name-Fast Load

Select this option to  configure the Oracle destination to work in fast (Direct Path) load mode. If  this option is not used, the Oracle destination is configured to work in  arrayed mode. When you select this option, the following options are  available.

Name  of the table or the view:  Select an available table or view from the database from the list. This list  contains the first 1,000 tables only. If your database contains more than  1,000 tables, you can type the beginning of a table name or use the (*) wild card  to enter any part of the name to display the table or tables you want to use.  If no default database is selected for the connection manager, all tables for  all databases are returned except for SYS, MDSYS, OLAPSYS, WKSYS, CTXSYS,  XDB, WMSYS, SYSTEM, EXFSYS, ORDSYS, DMSYS, and WK_TEST.

Parallel  load: Select this check box to  indicate whether parallel loading is enabled.

No  logging: Select this check box to  disable database logging. This is not related not to tracing but to the  logging that the Oracle database carries out for recovery purposes.

Maximum  number of errors: Type  the number of errors that can occur before the data flow is stopped. By  default the value is 0, which indicates that there is no error limit. All  errors that are returned before the data flow reaches the error limit are  returned in the error output. For more information, see the “Options on the  Error Output Page of the Oracle Destination Editor” section.

Transfer  buffer size (KB): Type  the size of the transfer buffer. The default size is 64 KB.

Table 3: Data Access Mode options for the Oracle destination

Oracle Destination Editor (Mappings Page)

Use the Mappings page of the Oracle Destination Editor to map input columns to destination columns.

Options on the Mappings Page of the Oracle Destination Editor

Available Input Columns - The list of available input columns. Drag an input column to an available destination column to map the columns.

Note: Columns of unsupported data types are shown, but they cannot be mapped.

Available Destination Columns - The list of available destination columns. Drag a destination column to an available input column to map the columns.

Note: Columns of unsupported data types are shown, but they cannot be mapped.

Input Column - View the input columns that you selected. You can remove mappings by selecting <ignore> to exclude columns from the output.

Destination Column - View all available destination columns, both mapped and unmapped.

Oracle Destination Editor (Error Output Page)

Use the Error Output page of the Oracle Destination Editor to select error handling options.

Options on the Error Output Page of the Oracle Destination Editor

Error behavior - Select how the Oracle destination should handle errors in a flow: ignore the failure, redirect the row, or fail the component.

Truncation - Select how the Oracle destination should handle truncation in a flow: ignore the failure, redirect the row, or fail the component.

The following are the options for handling errors and truncation:

  • Fail Component. The Data Flow task fails when an error or a truncation occurs. This is the default behavior.
  • Ignore Failure. The error or the truncation is ignored and the data row is not inserted.
  • Redirect Flow. The error or the truncation data row is directed to the error output of the Oracle destination.

Oracle Destination Advanced Editor

The Advanced Editor contains the properties that can be set programmatically.

To open the Advanced Editor: In the Data Flow screen of your Integration Services project, right-click the Oracle destination and then click Show Advanced Editor.

Figure 18: The Component Properties page of the Advanced Editor for the Oracle destination

Oracle Destination Custom Properties

The following table describes the custom properties of the Oracle destination. All properties are read/write.

                                          Property                                                                               Data Type                                                                               Description                                                                               Loading Mode                                    

BatchSize

Integer

The size of the batch for  bulk loading. This is the number of rows loaded as a batch.

Used only in arrayed mode.

DefaultCodePage

Integer

The code page to use when  code page information is unavailable from the data source.

Note: This property is not  available in the Oracle Destination Editor, but can be set by using the  Advanced Editor.

Used for both modes.

FastLoad

Boolean

A value that indicates  whether Fast Loading is used. The default value is True. This can also be set  in the Oracle Destination Editor (Connection Manager Page). For more  information, the “Options on the Error Output Page of the Oracle Destination  Editor” section.

Used for both modes.

MaxErrors

Integer

The number of errors that  can occur before the data flow is stopped. By default the value is 0, which  indicates that there is no error limit. If you select Redirect flow in the Error handling page, all errors that are  returned before the data flow reaches the error limit are returned in the  error output. For more information, see the “Options on the Error Output Page  of the Oracle Destination Editor” section.

Used for both modes.

NoLogging

Boolean

A value that indicates  whether database logging is disabled. The default value is False, indicating  that logging is enabled.

Used for both modes.

Parallel

Boolean

A value that indicates  whether parallel loading is enabled. For information about how to use  parallel loading and restrictions on its use, see the “Options on the Error  Output Page of the Oracle Destination Editor” section.

Used only in Fast Load  mode.

TableName

String

The name of the table with  the data that is being used.

Used for both modes.

TableSubName

String

The subname or  subpartition. This value is optional.

Note: This property is not  available in the Oracle Destination Editor, but it can be set by using the  Advanced Editor.

Used only in Fast Load  mode.

TransactionSize

Integer

The number of inserts that  can be made in a single transaction. The default is the value of BatchSize.

Used only in arrayed mode

TransferBufferSize

Integer

The size of the transfer  buffer. The default value is 64 KB.

Used only in Fast Load mode.

Table 4: Custom properties of the Oracle destination

Use Cases

Usage Scenario 1: Bulk Extract from Oracle to SQL Server Using OCI Array Binding

Solution Architecture Overview

Figure 19: Extracting from Oracle using OCI array binding

  • Source table: in Oracle
  • Target table: in SQL Server
  • Data flow: Configure Oracle source component and configure SQL Server destination component

Define the SSIS Data Flow

Add and Configure the Oracle Source Component

1. Add the source component to the SSIS package data flow.

2. Select the Oracle source component from the Data Flow Sources box.

Figure 20: Selecting the Oracle source in the Toolbox

Figure 21: The Data Flow tab of SSIS Designer, showing the Oracle source and the SQL Server destination before they are connected

3. Configure the Oracle source component.

The Oracle source extracts data from Oracle databases by using a database table, a view, or an SQL command. The source uses an Oracle connection manager, which specifies the Oracle provider to use.

The Oracle source has one regular output and one error output.

Data Access Modes

                          Option                                                 Description                      

Table Name

Retrieve data from a table  or view in the Oracle data source defined in the Connection Manager. When you  select this option, select a value from the Name of the table or the view list

This list contains the  first 1,000 tables only. If your database contains more than 1,000 tables,  you can type the beginning of a table name or use the (*) wild card to enter  any part of the name to display the table or tables you want to use.

If no default database is  selected for the connection manager, all tables for all databases are  returned except for SYS, MDSYS, OLAPSYS, WKSYS, CTXSYS, XDB, WMSYS, SYSTEM,  EXFSYS, ORDSYS, DMSYS, and WK_TEST.

SQL Statement

Retrieve data from the  Oracle data source by using an SQL query. When you select this option, enter  a query in one of the following ways:

Enter the text of the SQL  query in SQL command.

Click Browse to load the SQL query from a text file.

Click Parse query to verify the syntax of the query text.

Table 5: Data access modes for the Oracle source

Figure 22: Connection Manager page of the Oracle Source dialog box, configured to use a table or view as the data access mode

Figure 23: Connection Manager page of the Oracle Source dialog box, configured to use an SQL command as the data access mode

Add and Configure SQL Server Destination Component

After the Oracle source component is set up, define the destination node. After the mapping has been completed, the SQL Server Integration Services package is ready to run.

Figure 24: The Data Flow tab of SSIS Designer, showing the Oracle source and the SQL Server destination after they are connected

Usage Scenario 2: Bulk Load Using FastLoad

Solution Architecture Overview

Figure 25: Loading into Oracle using OCI Direct Path

Description

This scenario typically entails moving large amounts of data into an Oracle database.

Define SSIS Data Flow

Add and Configure Source and Oracle Destination Components

1. Add a source component. Any supported source can be used.

Figure 26: Selecting a source in the Toolbox

2. Add the Oracle destination component.

Figure 27: Selecting the Oracle destination in the Toolbox

3. Link the source and the destination components.

Figure 28: The Data Flow tab of SSIS Designer, showing the OLE DB source and the Oracle destination after they are connected

Configure the Oracle destination component.

Figure 29: Connection Manager page of the Oracle Destination Editor, configured to use a table and fast load using direct path as the data access mode

After the mapping is done, the SSIS package is ready to run.

Usage Scenario 3: Incremental Data Load

Description

The Oracle destination connects to a local or remote Oracle database and bulk loads data into Oracle databases using the Array Binding API. In this mode, data is loaded into Oracle table in batches and the batches are inserted under the transactions. This mode allows for bulk loading to an Oracle table that already has data in it.

The destination uses the Oracle Connection Manager to connect to a data source.

Architecture

Figure 30: Loading into Oracle using OCI Array Binding

Define SSIS Data Flow

Add and Configure Source and Oracle Destination Components

1. Add a source component. Any supported source can be used.

Figure 31: Selecting a source in the Toolbox

2. Add the Oracle destination component.

Figure 32: Selecting the Oracle destination in the Toolbox

3. Link the source and the destination components.

Figure 33: The Data Flow tab of SSIS Designer, showing the OLE DB source and the Oracle destination after they are connected

4. Configure the Oracle destination component.

Figure 34: Connection Manager page of the Oracle Destination Editor, configured to use a table as the data access mode

After the mapping is done, the SSIS package is ready to run.

Performance

Methodology

Test Scenario and Configuration

This section discusses test scenarios that were performed and the results of the tests.

All testing was done using the LINEITEM table from the TPC-H dataset. The LINEITEM table is the largest table and contains the largest variety of data types.

Two identical servers were used, each with Intel 64-bit processors (24 cores at 2.4 GHz), 32 GB of RAM and an HP disk array partitioned into 16 logical drives for data plus one for database logs (each logical drive two physical disks, striped).

Software used:

  • Windows Server 2008
  • SQL Server 2008
  • Oracle 11g
  • Attunity SSIS Connectors 1.0 GA release – 2008-09-16.

64-bit software is used in all tests except where noted.

Up to eight streams of data were used. A stream is a single SSIS package reading from a single flat file and loading into the database (or in the reverse direction).

To ensure that I/O does not become the bottleneck, each stream sources data from a different logical drive, so that LINEITEM.TBL.1 is on Drive1, LINEITEM.TBL.2 is on Drive2, and so on. Except where noted, source files have about 75,000,000 rows and are about 10 GB in size.

Likewise, the target database is striped across eight logical drives, Drive9 to Drive16. For SQL Server, the target database is striped onto eight file groups, each of which has one file on each drive. For Oracle, storage extents were preallocated across the eight drives, and the target table mapped to a tablespace, which spans the extents.

In SQL Server, the following table definition was used:

Transact-SQL
create table LINEITEM
       (L_SHIPDATE          smalldatetime        not null,
        L_ORDERKEY          bigint               not null,
        L_DISCOUNT          smallmoney           not null,
        L_EXTENDEDPRICE     money                not null,
        L_SUPPKEY           int                  not null,
        L_QUANTITY          smallint             not null,
        L_RETURNFLAG        char(1)              not null,
        L_PARTKEY           int                  not null,
        L_LINESTATUS        char(1)              not null,
        L_TAX               smallmoney           not null,
        L_COMMITDATE        smalldatetime        not null,
        L_RECEIPTDATE       smalldatetime        not null,
        L_SHIPMODE          varchar(10)          not null,
        L_LINENUMBER        int                  not null,
        L_SHIPINSTRUCT      varchar(25)          not null,
        L_COMMENT           varchar(44)          not null
)

SQL Server and Oracle data types do not have a simple one-to-one mapping. The following table definition was used for Oracle:

Transact-SQL
create table LINEITEM
      (L_SHIPDATE          DATE                 not null,
        L_ORDERKEY          NUMBER(19,0)         not null,
        L_DISCOUNT          NUMBER(10,4)         not null,
        L_EXTENDEDPRICE     NUMBER(19,4)         not null,
        L_SUPPKEY           NUMBER(10,0)         not null,
        L_QUANTITY          NUMBER(5,0)          not null,
        L_RETURNFLAG        char(1)              not null,
        L_PARTKEY           NUMBER(10,0)         not null,
        L_LINESTATUS        char(1)              not null,
        L_TAX               NUMBER(10,4)         not null,
        L_COMMITDATE        DATE                 not null,
        L_RECEIPTDATE       DATE                 not null,
        L_SHIPMODE          varchar2(10)         not null,
        L_LINENUMBER        NUMBER(10,0)         not null,
        L_SHIPINSTRUCT      varchar2(25)         not null,
        L_COMMENT           varchar2(44)         not null
)

The SSIS packages were kept as simple as possible. Except as noted, each package contains a single data flow that reads data from one source file and writes to the database. Below is an example. There are also versions of the packages that read from the database and write to flat files.

Figure 35: Simple package structure used for performance testing

Test Definitions

Three primary tests scenarios were performed:

  1. “EmptyTable” tests, which loaded data from flat files into an empty database table.
  2. “AddToTable” tests, which loaded data from flat files into a database table that already had data in it.
  3. “Extract” tests, which pulled data from a database table and wrote to a flat file. The size of the resulting file was somewhat larger than the original flat files due to differences in the formatting of the output strings.

Results of Performance Testing

Figure 36: Attunity connectors vs. Oracle OLE DB vs. Microsoft OLE DB

Figure 37: 64-bit vs. 32- bit connectors

Data Type Mapping

The following table shows the Oracle database data types and their default mapping to SSIS data types. SSIS components for Oracle do not support all data types. Columns with unsupported data types cannot be mapped. Tables with columns of unsupported data types that are not nullable cannot be loaded.

                          Oracle Data Type                                                 Integration Services Data Type                      

VARCHAR2

DT_STR

NVARCHAR2

DT_WSTR

NUMBER

DT_R81

NUMBER(P, S)

When the scale is 0,  according to the precision (P) DT_I1,  DT_I2,   DT_I4,  DT_NUMERIC(P, 0)

DT_NUMERIC(P, S)

LONG

Not supported

DATE

DT_DBTIMESTAMP

RAW

DT_BYTES

LONG RAW

Not supported

ROWID

Not supported

CHAR

DT_STR

User-defined type (object  type, VARRAY, Nested Table)

Not supported

REF

Not supported

CLOB, NCLOB, BLOB, BFILE

Not supported

TIMESTAMP TIMESTAMP WITH  TIME ZONE INTERVAL YEAR TO MONTH INTERVAL DAY TO SECOND TIMESTAMP WITH LOCAL  TIME ZONE

DT_STR

UROWID

Not supported

Table 6: Mapping of Oracle data types

You can change the output column type to DT_NUMERIC with specific precision and scale. You can set the precision and scale according to the data stored in the column. When the output column type is changed to DT_Numeric with specific precision and scale, the component will extract the column data as a number with fixed precision and scale.

Troubleshooting

This section contains troubleshooting information, including specific steps for resolution, to help you address common errors you may encounter when you are working with the Microsoft Connector for Oracle by Attunity.

Troubleshooting: Run-Time Failures

This step-by-step section describes how to troubleshoot and analyze run-time failures related to the Microsoft Connector for Oracle by Attunity.

The debugging process depends on the logging facility that SSIS provides for the external providers. Using the verbose log files is necessary if the other debugging facilities of the Microsoft SQL Server Business Intelligence Development Studio have not helped, or if the nature of the problems is related to the data provider.

STEP 1 – Eliminate the common problems

                          Problem                                                 Solution                      

Permission problems at run  time.

Check the relevant error  message and confirm that the SQLServer service and the SQL Server Agent has  the required permissions in the specified account.

Problem running a package  on 64-bit platform, if the package worked on 32-bit platform.

Make sure you are using the  64-bit connector, for the Microsoft Connector for Oracle. There are special  installation kits for 64-bit editions of Windows and Itanium IA64.

If you are calling DTS  packages using the Execute DTS 2000 Package task to run a SQL Server 2000 DTS  package, you must run the package in a 32-bit environment.

General errors in the Event  Viewer.

Always check the Windows  Event Viewer for general error messages. If the problem is indeed related to  the data provider, proceed to STEP 2.

Table 7: Eliminating common run-time failures

STEP 2 - Using the Logging Facility

The Microsoft Connector for Oracle outputs meaningful error messages to SSIS; however, there are cases where there is a need for verbose debugging log file, which can show the complete lifecycle of the interaction with the connector and the back-end database.

In these cases SSIS provides a complete logging facility and several logging providers.

To enable logging, perform the following steps:

1. In the SQL Server Business Intelligence Development Studio, open the package in which you want to enable logging.

2. On the SSIS menu, click Logging.

Figure 38: Selecting Logging on the SSIS menu in Business Intelligence Development Studio

3. On the Providers and Logs tab of the Configure SSIS Logs dialog box, choose a logging provider. For example, SSIS log provider for Text files enables you to output the logging to a simple text file in your file system.

Figure 39: Selecting a log provider on the Providers and Logs tab of the Configure SSIS Logs dialog box

4. Click Add to add the selected provider.

5. Select the check box next to the provider to enable it.

6. Configure the logging provider by clicking on the Configuration column. You can either create a new connection or use an existing one. For the Text provider, you can choose to create a new text file and output the logging to it. Another option may be to always append to an existing file. You can define several log file providers to output to multiple log files of different formats.

Figure 40: Configuring the connection for logging to a text file

7. Next, select the diagnostic level for debugging. Click the Details tab, and then select the events to log. For the Oracle connector, choose the Diagnostic debugging level, which returns important interactions with the OCI interface of Oracle as well as other types of information. Note that you can select other events to be logged as needed, such as OnError, OnInformation, and OnWarning.

Figure 41: Selecting the Diagnostic event for logging

8. To save the current logging configuration for reuse as a template, click Save.

9. To accept your logging configuration, click OK.

10. To save the changes to the package, click the Save icon.

Run the package that you want to debug, and then review the output log file.

STEP 3 - Understanding the log file

The verbose log file contains the details and interaction of the different components in your package. Quickly reviewing the log file can reveal problems that you may be able to resolve without involving your technical support department.

Many general problems related to the SQL query, the backend database, or SQL Server Integration Services can be solved by reviewing the log file. If for any reason you cannot understand the cause of the failure, a support call should be made in order to get an explanation of the failure.

Analysis Tips

1. The log contains many Pre/Post and Enter/Exit messages. Make sure for every request there is a response, for example:

2:46:34 PM,0,0x,ExternalRequest_pre: The object is ready to make the following external request: 'IRowset::GetData'.

2:46:34 PM,0,0x,ExternalRequest_post: 'IRowset::GetData succeeded'. The external request has completed.

2. For OCI calls you will see Enter/Exit pairs as follows:

2:46:36 PM,0,0x,*Enter > OCIAttrGet

2:46:36 PM,0,0x,*Exit < OCIAttrGet

Troubleshooting: Error Message: “Failed to load OCI DLL”

This error indicates that a failure occurred while the Oracle oci.dll was being loaded. The error can occur if there are multiple Oracle homes or permission problems.

SCENARIO 1 – Multiple Oracle Homes

In this scenario, the computer is configured for development. The computer is running 64-bit edition of the Windows operating system, and the following components, which are required for development, are installed:

  • Oracle Client 32Bit (required for design)
  • Oracle Client 64Bit
  • Microsoft SSIS Oracle Connector 64Bit

The following error may appear when the Oracle component is run in 32-bit (design time):

Error at Package [Connection manager "Oracle Connector 1"]: Failed to load OCI DLL.

SOLUTION

First, check that the environments are working and configured properly, and that SQLPlus is responding on both installations of the Oracle client.

On the computer on which the error occurred, it is noticed that the Oracle Home for the 64-bit Oracle client is defined in the 32-bit registry portion (Wow6432Node); this is causing the wrong version of oci.dll to be loaded.

To work around the problem, define a dummy registry entry (Z_SSIS) as follows:

Note: Incorrectly editing the registry can severely damage your system. Before making changes to the registry, you should back up any valued data on the computer.

  1. Open the Registry Editor.
  2. Locate the following key: HKEY_LOCAL_COMPUTER\SOFTWARE\Wow6432Node\ORACLE.
  3. Right-click the ORACLE node, click New, and then click Key.
  4. Call the new key Z_SSIS (to make sure it's the last entry).
  5. Right-click on the Z_SSIS node, click New, and then click String.
  6. Name the property ORACLE_HOME.
  7. Double-click ORACLE_HOME and set it to the location of the Oracle 32-bit installation home directory.

SCENARIO 2 – Permission problems to Oracle HOME directory and files

The user is working in a 32-bit or 64-bit environment and receiving the error message while designing the SSIS process.

SOLUTION

The error could occur for several reasons:

  • The wrong ORACLE_HOME is set up in your registry. If this is the case, use the Oracle Installer to configure your environment.
  • The interactive user does not have the proper permission to access the Oracle HOME directory and its files. If this is the case, check the permissions and configure them properly.

Finally, verify the connection using the Oracle SQLPlus utility.

For more information, see this Attunity Forum post (http://www.attunity.com/forums/microsoft-ssis-oracle-connector/error-failed-load-oci-dll-1308.html#post1942).

Conclusion

This paper discussed the functionality of the SSIS Connector for Oracle by Attunity 1.0, and it provided detailed step-by-step instructions on how to use the connector with SQL Server Integration Services. Three general use cases are presented with the design highlights, rationale, and performance statistics. Overall, the connector provides a high-performance means of loading and extracting data from Oracle databases.

For more information:

See the recent blog post, Using SSIS to get data out of Oracle: A big surprise! (http://blogs.msdn.com/sqlperf/archive/2009/08/14/using-ssis-to-get-data-out-of-oracle-a-big-surprise.aspx) by Len Wyatt of the Microsoft SQL Server Performance team.

http://www.microsoft.com/sqlserver/: SQL Server Web site

http://msdn.microsoft.com/en-us/sqlserver/cc511477.aspx: SQL Server Integration Services TechCenter

http://technet.microsoft.com/en-us/sqlserver/cc510302.aspx: SQL Server Integration Services DevCenter

Did this paper help you? Please give us your feedback. Tell us on a scale of 1 (poor) to 5 (excellent), how would you rate this paper and why have you given it this rating? For example:

  • Are you rating it high due to having good examples, excellent screen shots, clear writing, or another reason?
  • Are you rating it low due to poor examples, fuzzy screen shots, or unclear writing?

This feedback will help us improve the quality of white papers we release.

Send feedback.