【ogg Administering Oracle GoldenGate】7 配置实时数据仓库(多对一)
本章描述如何配置Oracle GoldenGate来实现实时数据仓库。
Topics:
7.1 Overview of the Data Warehousing Configuration
数据仓库配置是多对一配置。多个源数据库将数据发送到一个目标仓库数据库。Oracle GoldenGate支持like-to-like或异构数据传输,支持在配置中的任何系统上进行过滤和转换 (support varies by database platform).
Description of the illustration simple_config_data_whse.jpg
7.2 Considerations for a Data Warehousing Configuration
本节描述数据仓库配置的注意事项。
7.2.1 Isolation of Data Records
This configuration assumes that each source database contributes different records to the target system. If the same record exists in the same table on two or more source systems and can be changed on any of those systems, conflict resolution routines are needed to resolve conflicts when changes to that record are made on both sources at the same time and replicated to the target table. See Configuring Oracle GoldenGate for Active-Active High Availability for more information about resolving conflicts.
7.2.2 Data Storage
You can divide the data storage between the source systems and the target system to reduce the need for massive amounts of disk space on the target system. This is accomplished by using a data pump on each source, rather than sending data directly from each Extract across the network to the target.
-
A primary Extract writes to a local trail on each source.
-
A data-pump Extract on each source reads the local trail and sends it across TCP/IP to a dedicated Replicat group.
7.2.3 Filtering and Conversion
If not all of the data from a source system will be sent to the data warehouse, you can use the data pump to perform the filtering. This removes that processing overhead from the primary Extract group, and it reduces the amount of data that is sent across the network. See Mapping and Manipulating Data for filtering and conversion options.
7.2.4 Additional Information
The following documentation provides additional information of relevance to configuring Oracle GoldenGate.
-
For additional system requirements, process configuration, and database setup requirements, see the Oracle GoldenGate installation and configuration document for your database type. These guides are listed in the Preface of this book.
-
For detailed instructions on configuring Oracle GoldenGate change capture and delivery groups, see Configuring Online Change Synchronization.
-
For additional tuning options for Oracle GoldenGate, see Tuning the Performance of Oracle GoldenGate.
-
For complete syntax and descriptions of the Oracle GoldenGate commands and parameters, see Reference for Oracle GoldenGate for Windows and UNIX.
7.3 Creating a Data Warehousing Configuration
请参考图7-1以获得您将要创建的对象的可视化表示。
Figure 7-1 Configuration for Data Warehousing
Description of "Figure 7-1 Configuration for Data Warehousing"
7.3.1 Source Systems
Configure the Manager process and primary Extract groups for the source systems.
To Configure the Manager Process
-
On each source, configure the Manager process according to the instructions in Configuring Manager and Network Communications.
-
In each Manager parameter file, use the
PURGEOLDEXTRACTS
parameter to control the purging of files from the trail on the local system.
To Configure the primary Extract Groups
-
On each source, use the
ADD EXTRACT
command to create a primary Extract group. For documentation purposes, these groups are called ext_1 and ext_2.Command on source_1:
ADD EXTRACT ext_1, {TRANLOG | INTEGRATED TRANLOG}, BEGIN time [option[, ...]]
Command on source_2:
ADD EXTRACT ext_2, {TRANLOG | INTEGRATED TRANLOG}, BEGIN time [option[, ...]]
See Reference for Oracle GoldenGate for detailed information about these and other
ADD EXTRACT
options that may be required for your installation. -
On each source, use the
ADD EXTTRAIL
command to create a local trail.Command on source_1:
ADD EXTTRAIL local_trail_1, EXTRACT ext_1
Command on source_2:
ADD EXTTRAIL local_trail_2, EXTRACT ext_2
Use the
EXTRACT
argument to link each Extract group to the local trail on the same system. The primary Extract writes to this trail, and the data-pump reads it. -
On each source, use the
EDIT PARAMS
command to create a parameter file for the primary Extract. Include the following parameters plus any others that apply to your database environment. For possible additional required parameters, see the Oracle GoldenGate installation and setup guide for your database.Parameter file for ext_1:
-- Identify the Extract group: EXTRACT ext_1 -- Specify database login information as needed for the database: [SOURCEDB dsn_1][, USERIDALIAS alias] -- Log all scheduling columns if using integrated Replicat LOGALLSUPCOLS -- Specify the local trail that this Extract writes to -- and the encryption algorithm: ENCRYPTTRAIL algorithm EXTTRAIL local_trail_1 -- Specify tables and sequences to be captured: SEQUENCE [container.|catalog.]owner.sequence; TABLE [container.|catalog.]owner.table;
Parameter file for ext_2:
-- Identify the Extract group: EXTRACT ext_2 -- Specify database login information as needed for the database: [SOURCEDB dsn_2][, USERIDALIAS alias] -- Log all scheduling columns if using integrated Replicat or CDR LOGALLSUPCOLS -- Specify the local trail that this Extract writes to -- and the encryption algorithm: ENCRYPTTRAIL algorithm EXTTRAIL local_trail_2 -- Specify tables and sequences to be captured: SEQUENCE [container.|catalog.]owner.sequence; TABLE [container.|catalog.]owner.table;
To Configure the Data Pumps
- On each source, use the
ADD EXTRACT
command to create a data pump Extract group. For documentation purposes, these pumps are called pump_1 and pump_2.Command on source_1:
ADD EXTRACT pump_1, EXTTRAILSOURCE local_trail_1, BEGIN time
Command on source_2:
ADD EXTRACT pump_2, EXTTRAILSOURCE local_trail_2, BEGIN time
Use
EXTTRAILSOURCE
as the data source option, and specify the name of the trail on the local system - On each source, use the
ADD RMTTRAIL
command to create a remote trail on the target.Command on source_1:
ADD RMTTRAIL remote_trail_1, EXTRACT pump_1
Command on source_2:
ADD RMTTRAIL remote_trail_2, EXTRACT pump_2
Use the
EXTRACT
argument to link each remote trail to a different data pump. The data pump writes to this trail over TCP/IP, and a Replicat reads from it.See Reference for Oracle GoldenGate for additional
ADD RMTTRAIL
options. - On each source, use the
EDIT PARAMS
command to create a parameter file for the data pump group. Include the following parameters plus any others that apply to your database environment.Parameter file for pump_1:
-- Identify the data pump group: EXTRACT pump_1 -- Specify database login information as needed for the database: [SOURCEDB dsn_1][, USERIDALIAS alias] -- Decrypt the data only if the data pump must process it. -- DECRYPTTRAIL -- Specify the name or IP address of the target system -- and optional encryption of data over TCP/IP: RMTHOSTOPTIONS target, MGRPORT port_number, ENCRYPT encryption_options -- Specify the remote trail and encryption algorithm on the target system: ENCRYPTTRAIL algorithm RMTTRAIL remote_trail_1 -- Specify tables and sequences to be captured: SEQUENCE [container.|catalog.]owner.sequence; TABLE [container.|catalog.]owner.table;
Parameter file for pump_2:
-- Identify the data pump group: EXTRACT pump_1 -- Specify database login information as needed for the database: [SOURCEDB dsn_2][, USERIDALIAS alias] -- Decrypt the data only if the data pump must process it. -- DECRYPTTRAIL -- Specify the name or IP address of the target system -- and optional encryption of data over TCP/IP: RMTHOSTOPTIONS target, MGRPORT port_number, ENCRYPT encryption_options -- Specify the remote trail and encryption algorithm on the target system: ENCRYPTTRAIL algorithm RMTTRAIL remote_trail_2 -- Specify tables and sequences to be captured: SEQUENCE [container.|catalog.]owner.sequence; TABLE [container.|catalog.]owner.table;
7.3.2 Target System
Configure the Manager process and primary Replicat groups for the target system.
To Configure the Manager Process
-
Configure the Manager process. See Configuring Manager and Network Communications for instructions.
-
In the Manager parameter file, use the
PURGEOLDEXTRACTS
parameter to control the purging of files from the trail.
To Configure the Replicat Groups
- On the target, create a Replicat checkpoint table (unless using Oracle integrated Replicat). See Creating a Checkpoint Table for instructions.
- On the target, use the
ADD REPLICAT
command to create a Replicat group for each remote trail that you created. For documentation purposes, these groups are called rep_1 and rep_2.Command to add rep_1:
ADD REPLICAT rep_1 [, INTEGRATED | COORDINATED [MAXTHREADS number]] , EXTTRAIL remote_trail_1, BEGIN time
Command to add rep_2:
ADD REPLICAT rep_2 [, INTEGRATED | COORDINATED [MAXTHREADS number]] , EXTTRAIL remote_trail_2, BEGIN time
Use the
EXTTRAIL
argument to link the Replicat group to the trail.See Reference for Oracle GoldenGate for detailed information about these and other options that may be required for your installation.
- On the target, use the
EDIT PARAMS
command to create a parameter file for each Replicat group. Include the following parameters plus any others that apply to your database environment. For possible additional required parameters, see the Oracle GoldenGate installation and setup guide for your database.Parameter file for rep_1:
-- Identify the Replicat group: REPLICAT rep_1 -- Specify database login information as needed for the database: [TARGETDB dsn_3][, USERIDALIAS alias] -- Specify error handling rules: REPERROR (error, response) -- Specify tables for delivery and threads if using coordinated Replicat: MAP [container.|catalog.]owner.table, TARGET owner.table[, DEF template] [, THREAD (thread_ID)][, THREADRANGE (thread_range[, column_list])] ;
Parameter file for rep_1:
-- Identify the Replicat group: REPLICAT rep_2 -- Specify database login information as needed for the database: [TARGETDB dsn_3][, USERIDALIAS alias] -- Specify error handling rules: REPERROR (error, response) -- Specify tables for delivery and threads if using coordinated Replicat: MAP [container.|catalog.]owner.table, TARGET owner.table[, DEF template] [, THREAD (thread_ID)][, THREADRANGE (thread_range[, column_list])] ;
You can use any number of
MAP
statements for any given Replicat group. AllMAP
statements for a given Replicat group must specify the same objects that are contained in the trail that is linked to the group.