Oracle Data Pump in RAC

Oracle Data Pump was introduced in Oracle Database 10g to enable very high-speed transfer of data and metadata between databases. A salient feature of Data Pump is that it can parallelize the export and import jobs for maximum performance.

However, prior to Oracle Database 11g release 2 (11.2), if a Data Pump job is executed in parallel on an instance of an Real Application Cluster (RAC) database, the parallelism is employed only on that instance without utilizing the potentially valuable idle resources of other instances.

From Oracle Database 11.2 onwards, parallelism is no longer confined to a single instance. Rather worker processes can be distributed across multiple Oracle RAC instances to better utilize Oracle RAC resources. The newly-introduced client parameter CLUSTER can be employed to specify whether you want to use the cluster resources. When CLUSTER=Y (default) is specified, Data Pump attempts to use all active Oracle RAC instances.

To exercise more control over the instances on which the job should run, the SERVICE_NAME parameter can be mentioned where SERVICE_NAME must be a valid known service of the database. In this case, the Data Pump job will run only on those instance where the service is available.

If any Oracle RAC instance where the job that is running dies or leaves the cluster, the job aborts and can be restarted at some future time.

In this article, I will demonstrate:

Case-I: Parallelization of an export job across all active Oracle RAC instances
Case-II: Parallelization of an export job across a subset of Oracle RAC instances

Current scenario:

Name of the cluster: cluster01
Number of nodes: 3 (host01, host02, host03)
RAC Database version: 11.2.0.3
Name of RAC database: orcl
Number of instances: 3
Names of instances: orcl1, orcl2, orcl3

In order that the Data Pump job is distributed across multiple instances of an RAC database, the following prerequisites should be met:

The directory object should point to shared storage that is accessible by any Oracle RAC instances designated to run Data Pump worker processes.
The PARALLEL parameter should be more than 1 and less than or equal to the number of files in the dump file set.
To specify enough dump files by specifying DUMPFILE parameter, the preferred approach is to use the wildcard option by using a substitution variable (%U) in the filename.
LOGFILE parameter should point to a location on the local file system. Log file is created on the node hosting the instance where master process is running.

Demonstration

Case-I: Export job distributed across all active Oracle RAC instances

First, let’s create a directory object which points to shared storage and is accessible by all the three instances of the database:

SQL>drop directory dp_shared_dir;

SQL>create directory DP_SHARED_DIR as '+DATA/orcl/';

SQL>grant read, write on directory dp_shared_dir to public;

Issue the command to export SH schema in parallel across all active Oracle RAC instances:

Degree of parallelism = 6
Dumpfile – Create in the newly created directory DP_SHARED_DIR with names expsh%U.dmp beginning with 01 for %U, then using 02, 03, and so on.
Logfile – Create in the location on local file system pointed to by the directory object DATA_PUMP_DIR

[oracle@host01 root]$ expdp system/oracle@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile='expsh%U.dmp' reuse_dumpfiles=y

Export: Release 11.2.0.3.0 - Production on Thu Dec 3 12:49:58 2015

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production

With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,

Data Mining and Real Application Testing options

Starting "SYSTEM"."SYS_EXPORT_SCHEMA_06": system/********@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile=expsh%U.dmp reuse_dumpfiles=y

Estimate in progress using BLOCKS method...

Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA

Total estimation using BLOCKS method: 273.8 MB

Processing object type SCHEMA_EXPORT/USER

Processing object type SCHEMA_EXPORT/SYSTEM_GRANT

Processing object type SCHEMA_EXPORT/ROLE_GRANT

Processing object type SCHEMA_EXPORT/DEFAULT_ROLE

Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA

. . exported "SH"."CUSTOMERS" 9.853 MB 55500 rows

. . exported "SH"."COSTS":"COSTS_Q2_1998" 79.52 KB 2397 rows

. . exported "SH"."COSTS":"COSTS_Q2_1999" 132.5 KB

….

Verify that the Data Pump job is distributed across 6 worker processes on all the three instances of the RAC database:

SQL> select inst_id, session_type from dba_datapump_sessions;

INST_ID SESSION_TYPE

---------- --------------

3 WORKER

2 WORKER

1 DBMS_DATAPUMP

1 MASTER

1 WORKER

While the export is running, let us abort the database instance on host01 to verify that the Data Pump job can be restarted in case any instance where the job is running aborts.

[root@host01 log]# ps -ef |grep pmon

grid 5597 1 0 08:25 ? 00:00:00 asm_pmon_+ASM1

oracle 27612 1 0 09:46 ? 00:00:00 ora_pmon_orcl1

root 30103 8495 0 11:39 pts/1 00:00:00 grep pmon

[root@host01 log]# kill -9 27612

As the instance orcl1 has crashed, the Data Pump job aborts.

……

. . exported "SH"."SALES":"SALES_Q2_1999" 1.754 MB 54233 rows

. . exported "SH"."COSTS":"COSTS_Q1_1999" 183.5 KB 5884 rows

. . exported "SH"."COSTS":"COSTS_Q1_1998" 139.5 KB 4411 rows

. . exported "SH"."SALES":"SALES_Q2_2000" 1.802 MB 55515 rows

UDE-03113: operation generated ORACLE error 3113

ORA-03113: end-of-file on communication channel

Process ID: 20597

Session ID: 51 Serial number: 233

UDE-03114: operation generated ORACLE error 3114

ORA-03114: not connected to ORACLE

Let us restart the job by ATTACHing to the job using the name of the job (as identified in expdp command output) as a parameter to the expdp command:

[oracle@host01 root]$ expdp system/oracle@orcl ATTACH=SYS_EXPORT_SCHEMA_06

Export: Release 11.2.0.3.0 - Production on Thu Dec 3 12:53:06 2015

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production

With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,

Data Mining and Real Application Testing options

Job: SYS_EXPORT_SCHEMA_06

Owner: SYSTEM

Operation: EXPORT

Creator Privs: TRUE

GUID: 25FA37F1818A5075E053B7C909C0116F

Start Time: Thursday, 03 December, 2015 12:53:11

Mode: SCHEMA

Instance: orcl3

Max Parallelism: 6

EXPORT Job Parameters:

Parameter Name Parameter Value:

CLIENT_COMMAND system/********@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile=expsh%U.dmp reuse_dumpfiles=y

State: IDLING

Bytes Processed: 27,664,048

Percent Done: 23

Current Parallelism: 6

Job Error Count: 0

Dump File: +DATA/orcl//expsh%u.dmp

Dump File: +DATA/orcl//expsh01.dmp

bytes written: 4,096

Dump File: +DATA/orcl//expsh02.dmp

bytes written: 27,385,856

Dump File: +DATA/orcl//expsh03.dmp

bytes written: 147,456

Dump File: +DATA/orcl//expsh04.dmp

bytes written: 192,512

Dump File: +DATA/orcl//expsh05.dmp

bytes written: 4,096

Dump File: +DATA/orcl//expsh06.dmp

bytes written: 4,096

Worker 1 Status:

Process Name: DW00

State: UNDEFINED

Object Schema: SH

Object Type: SCHEMA_EXPORT/DEFAULT_ROLE

Completed Objects: 1

Total Objects: 1

Worker Parallelism: 1

Worker 2 Status:

Process Name: DW00

State: UNDEFINED

Object Schema: SH

Object Name: COSTS

Object Type: SCHEMA_EXPORT/TABLE/TABLE_DATA

Completed Objects: 1

Total Objects: 66

Worker Parallelism: 1

Worker 3 Status:

Process Name: DW00

State: UNDEFINED

Worker 4 Status:

Process Name: DW03

State: UNDEFINED

Worker 5 Status:

Process Name: DW01

State: UNDEFINED

Object Schema: SH

Object Name: COSTS

Object Type: SCHEMA_EXPORT/TABLE/TABLE_DATA

Completed Objects: 1

Total Objects: 66

Worker Parallelism: 1

Worker 6 Status:

Process Name: DW01

State: UNDEFINED

Export>

We will restart the job with a different degree of parallelism, say 4 (earlier it was 6):

Export> parallel=4

Export> START_JOB

It can be seen that four worker processes have been started on instances 2 and 3 as instance 1 has been aborted.

SQL> select inst_id, session_type from dba_datapump_sessions;

INST_ID SESSION_TYPE

---------- --------------

3 DBMS_DATAPUMP

3 MASTER

3 WORKER

2 WORKER

6 rows selected.

To show progress again, let us issue the CONTINUE_CLIENT command:

Export> continue_client

Job SYS_EXPORT_SCHEMA_06 has been reopened at Thursday, 03 December, 2015 12:53

Restarting "SYSTEM"."SYS_EXPORT_SCHEMA_06": system/********@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile=expsh%U.dmp reuse_dumpfiles=y

. . exported "SH"."COSTS":"COSTS_Q1_2000" 120.6 KB 3772 rows

. . exported "SH"."COSTS":"COSTS_Q1_2001" 227.8 KB 7328 rows

. . exported "SH"."SALES":"SALES_Q3_2000" 1.909 MB 58950 rows

. . exported "SH"."SALES":"SALES_Q2_2001" 2.051 MB 63292 rows

. . .

. . exported "SH"."SALES":"SALES_Q4_2002" 0 KB 0 rows

. . exported "SH"."SALES":"SALES_Q4_2003" 0 KB 0 rows

Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA

. . exported "SH"."SALES":"SALES_Q3_1998" 1.633 MB 50515 rows

Processing object type SCHEMA_EXPORT/TABLE/TABLE

Processing object type SCHEMA_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT

Processing object type SCHEMA_EXPORT/TABLE/COMMENT

Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX

Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT

Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS

Processing object type SCHEMA_EXPORT/VIEW/VIEW

Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT

Processing object type SCHEMA_EXPORT/TABLE/INDEX/BITMAP_INDEX/INDEX

Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/BITMAP_INDEX/INDEX_STATISTICS

Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS

Processing object type SCHEMA_EXPORT/TABLE/INDEX/DOMAIN_INDEX/INDEX

Processing object type SCHEMA_EXPORT/MATERIALIZED_VIEW

Processing object type SCHEMA_EXPORT/DIMENSION

Master table "SYSTEM"."SYS_EXPORT_SCHEMA_06" successfully loaded/unloaded

******************************************************************************

Dump file set for SYSTEM.SYS_EXPORT_SCHEMA_06 is:

+DATA/orcl//expsh01.dmp

+DATA/orcl//expsh02.dmp

+DATA/orcl//expsh03.dmp

+DATA/orcl//expsh04.dmp

Job "SYSTEM"."SYS_EXPORT_SCHEMA_06" successfully completed at 12:56:29

Thus, a Data Pump job:

Can be parallelized across all the available instances of a RAC database.
Restarted in future if any instance, where the job is running, aborts.

Case-II: Parallelization of export job across a subset of Oracle RAC instances

In case we want to run the job against a subset of instances, we can specify the SERVICE_NAME parameter which points to a service of the database.

In order to demonstrate this, let us first create and start a service called orcls for the database with preferred instances orcl1, orcl2 and available instance orcl3. It can be observed that the newly-created service orcls is currently available on instances orcl1 and orcl2.

[oracle@host01 root]$ srvctl add service -s orcls -d orcl -r orcl1,orcl2 -a orcl3

[oracle@host01 root]$ srvctl status service -s orcls -d orcl

Service orcls is not running.

[oracle@host01 root]$ srvctl start service -s orcls -d orcl

[oracle@host01 root]$ srvctl status service -s orcls -d orcl

Service orcls is running on instance(s) orcl1,orcl2

Let us once again export the SH schema with a parallelism of 6 as earlier, but this time I am going to specify an additional parameter of SERVICE_NAME on the command line with points to service orcls, which is currently running on orcl1 and orcl2.

[oracle@host01 root]$ expdp system/oracle@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile='expsh%U.dmp' reuse_dumpfiles=y service_name=orcls

Export: Release 11.2.0.3.0 - Production on Fri Dec 4 14:48:22 2015

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production

With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,

Data Mining and Real Application Testing options

Starting "SYSTEM"."SYS_EXPORT_SCHEMA_09": system/********@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile=expsh%U.dmp reuse_dumpfiles=y service_name=orcls

Estimate in progress using BLOCKS method...

Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA

Total estimation using BLOCKS method: 273.8 MB

Processing object type SCHEMA_EXPORT/USER

Processing object type SCHEMA_EXPORT/SYSTEM_GRANT

Processing object type SCHEMA_EXPORT/ROLE_GRANT

Processing object type SCHEMA_EXPORT/DEFAULT_ROLE

Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA

. . exported "SH"."COSTS":"COSTS_Q1_1998" 139.5 KB 4411 rows

. . exported "SH"."COSTS":"COSTS_Q2_1998" 79.52 KB 2397 rows

. . exported "SH"."COSTS":"COSTS_Q2_1999" 132.5 KB 4179 rows

…..

While the job is running, it can be verified that this time parallelism has been constrained across instances orcl1 and orcl2 only where the service is currently available.

SQL> select inst_id, session_type from dba_datapump_sessions;

INST_ID SESSION_TYPE

---------- --------------

1 WORKER

2 DBMS_DATAPUMP

2 MASTER

2 WORKER

8 rows selected.

Abort the RAC database instance on one of the nodes, say host01, where Data Pump export is currently running:

[root@host01 ~]# ps -ef |grep pmon

grid 5598 1 0 13:07 ? 00:00:00 asm_pmon_+ASM1

oracle 6157 1 0 13:08 ? 00:00:00 ora_pmon_orcl1

root 7178 6735 0 13:14 pts/1 00:00:00 grep pmon

[root@host01 ~]# kill -9 6157

As the instance aborts, the export also gets terminated:

. . exported "SH"."COSTS":"COSTS_Q2_1999" 132.5 KB 4179 rows

. . exported "SH"."COSTS":"COSTS_Q2_2000" 119.0 KB 3715 rows

. . exported "SH"."CUSTOMERS" 9.853 MB 55500 rows

. . exported "SH"."COSTS":"COSTS_Q2_2001" 184.5 KB 5882 rows

UDE-03113: operation generated ORACLE error 3113

ORA-03113: end-of-file on communication channel

Process ID: 9617

Session ID: 52 Serial number: 5

UDE-03114: operation generated ORACLE error 3114

ORA-03114: not connected to ORACLE

Restart the export by ATTACHing to the job by specifying its name:

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

[oracle@host01 root]$ expdp system/oracle@orcl ATTACH=SYS_EXPORT_SCHEMA_09

Export: Release 11.2.0.3.0 - Production on Fri Dec 4 14:49:10 2015

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production

With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,

Data Mining and Real Application Testing options

Job: SYS_EXPORT_SCHEMA_09

Owner: SYSTEM

Operation: EXPORT

Creator Privs: TRUE

GUID: 260EFD3967B32591E053B7C909C0C0D4

Start Time: Friday, 04 December, 2015 14:49:19

Mode: SCHEMA

Instance: orcl1

Max Parallelism: 6

EXPORT Job Parameters:

Parameter Name Parameter Value:

State: IDLING

Bytes Processed: 11,277,592

Percent Done: 4

Current Parallelism: 6

Job Error Count: 0

Dump File: +DATA/orcl//expsh%u.dmp

Dump File: +DATA/orcl//expsh01.dmp

bytes written: 4,096

Dump File: +DATA/orcl//expsh02.dmp

bytes written: 10,473,472

Dump File: +DATA/orcl//expsh03.dmp

bytes written: 827,392

Dump File: +DATA/orcl//expsh04.dmp

bytes written: 4,096

Dump File: +DATA/orcl//expsh05.dmp

bytes written: 4,096

Worker 1 Status:

Process Name: DW00

State: UNDEFINED

Object Schema: SH

Object Type: SCHEMA_EXPORT/DEFAULT_ROLE

Completed Objects: 1

Total Objects: 1

Worker Parallelism: 1

Worker 2 Status:

Process Name: DW00

State: UNDEFINED

Object Schema: SH

Object Name: COSTS

Object Type: SCHEMA_EXPORT/TABLE/TABLE_DATA

Completed Objects: 1

Total Objects: 66

Worker Parallelism: 1

Worker 3 Status:

Process Name: DW02

State: UNDEFINED

Worker 4 Status:

Process Name: DW01

State: UNDEFINED

Object Schema: SH

Object Name: COSTS

Object Type: SCHEMA_EXPORT/TABLE/TABLE_DATA

Completed Objects: 1

Total Objects: 66

Worker Parallelism: 1

Worker 5 Status:

Process Name: DW04

State: UNDEFINED

Worker 6 Status:

Process Name: DW02

State: UNDEFINED

Object Schema: SH

Object Name: COSTS

Object Type: SCHEMA_EXPORT/TABLE/TABLE_DATA

Completed Objects: 1

Total Objects: 66

Worker Parallelism: 1

Export> start_job

Export> continue_client

Job SYS_EXPORT_SCHEMA_09 has been reopened at Friday, 04 December, 2015 14:49

Restarting "SYSTEM"."SYS_EXPORT_SCHEMA_09": system/********@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile=expsh%U.dmp reuse_dumpfiles=y service_name=orcls

Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA

. . exported "SH"."COSTS":"COSTS_Q1_1999" 183.5 KB 5884 rows

. . exported "SH"."COSTS":"COSTS_Q4_1998" 144.7 KB 4577 rows

. . exported "SH"."COSTS":"COSTS_Q4_1999" 159.0 KB 5060 rows

. . exported "SH"."COSTS":"COSTS_Q4_2000" 160.2 KB 5088 rows

. . exported "SH"."COSTS":"COSTS_Q4_2001" 278.4 KB 9011 rows

……

….

Processing object type SCHEMA_EXPORT/TABLE/TABLE

Processing object type SCHEMA_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT

Processing object type SCHEMA_EXPORT/TABLE/COMMENT

Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX

Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT

Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS

Processing object type SCHEMA_EXPORT/VIEW/VIEW

Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT

Processing object type SCHEMA_EXPORT/TABLE/INDEX/BITMAP_INDEX/INDEX

Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/BITMAP_INDEX/INDEX_STATISTICS

Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS

Processing object type SCHEMA_EXPORT/TABLE/INDEX/DOMAIN_INDEX/INDEX

Processing object type SCHEMA_EXPORT/MATERIALIZED_VIEW

Processing object type SCHEMA_EXPORT/DIMENSION

Master table "SYSTEM"."SYS_EXPORT_SCHEMA_09" successfully loaded/unloaded

******************************************************************************

Dump file set for SYSTEM.SYS_EXPORT_SCHEMA_09 is:

+DATA/orcl//expsh01.dmp

+DATA/orcl//expsh02.dmp

+DATA/orcl//expsh03.dmp

+DATA/orcl//expsh04.dmp

+DATA/orcl//expsh05.dmp

+DATA/orcl/expsh06.dmp

Job "SYSTEM"."SYS_EXPORT_SCHEMA_09" successfully completed at 14:50:49

Thus, a Data Pump job:

Can be parallelized across a subset of the available instances of a RAC database by specifying the SERVICE_NAME parameter.
Restarted in future if any instance, where the job is running, aborts.

Note: Since there is some overhead involved with distributing Data Pump processes across an Oracle RAC environment, there will be cases where there is no performance benefit. In those cases, you can

specify CLUSTER=N to force all Data Pump processes to run only on the instance where the job is started.

Summary:

Prior to Oracle Database 11g release 2 (11.2), if a Data Pump job is executed in parallel on an instance of a RAC database, the parallelism is employed only on that instance without utilizing the potentially valuable idle resources of other instances.
From Oracle Database 11.2 onwards, CLUSTER parameter can be employed to distribute the worker processes across multiple Oracle RAC instances to better utilize Oracle RAC resources.
To exercise more control over the instances on which the job should run, SERVICE_NAME parameter can be specified so that the Data Pump job runs only on those instances where the service is available.
If any Oracle RAC instance where the job is running dies or leaves the cluster, the job aborts and can be restarted at some future time.
For more information, Oracle have a white paper on the subject.

Register for Simple Talk

Oracle Data Pump in RAC

Current scenario:

Demonstration

Case-I: Export job distributed across all active Oracle RAC instances

Case-II: Parallelization of export job across a subset of Oracle RAC instances

Summary:

Article tags

About the author

Anju Garg's contributions

Articles

Books

Top topics

Anju Garg's latest contributions:

Policy-Managed Oracle RAC One Node Databases

Duplicate Point-In-Time Recovered PDB Using Backup

Re-registering Databases In A Cluster With srvctl: Problems and Solutions

Current scenario:

Demonstration

Case-I: Export job distributed across all active Oracle RAC instances

Case-II: Parallelization of export job across a subset of Oracle RAC instances

Summary:

Article tags

Recommended

About the author

Anju Garg's contributions

Articles

Books

Top topics

Anju Garg's latest contributions:

Policy-Managed Oracle RAC One Node Databases

Duplicate Point-In-Time Recovered PDB Using Backup

Re-registering Databases In A Cluster With srvctl: Problems and Solutions