Exporting BasicFile LOBs can be quite a pain. If your maintenance window is too short, here is a way to make the export faster.
The following applies to BasicFile LOBs:
```sql
select owner, table_name, column_name
from   dba_lobs
where  securefile = 'NO';
```
Background
The old BasicFile LOBs do not offer any parallel access methods. The new, improved SecureFile LOBs offer superior functionality, including full parallel access to the LOB. This is a good reason to migrate any old BasicFile LOBs in your database.
When Data Pump starts to export a table with a BasicFile LOB, only one worker will be assigned (due to the limited parallel access methods). If you have a large table, it will take a lot of time for that one worker to process the table.
Solution
Instead of one Data Pump job with one worker processing the table with the LOB, the idea is to start multiple Data Pump exports. Each export has one worker working on a dedicated part of the table.
Imagine you have a table with 100 rows. You start four concurrent Data Pump sessions:
| Job | Rows |
|---|---|
| Data Pump job 1 | 1-25 |
| Data Pump job 2 | 26-50 |
| Data Pump job 3 | 51-75 |
| Data Pump job 4 | 76-100 |
To do this, you need to use the Data Pump query parameter.
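The even split above can be sketched in a few lines. This is a conceptual illustration only; Data Pump filters rows by predicate, not by row position, so the function below just shows how the row counts divide up:

```python
def row_ranges(total_rows, jobs):
    """Split rows 1..total_rows into `jobs` contiguous, near-even ranges."""
    base, extra = divmod(total_rows, jobs)
    ranges, start = [], 1
    for i in range(jobs):
        # The first `extra` jobs take one extra row when the split is uneven.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges

# 100 rows across 4 jobs -> [(1, 25), (26, 50), (51, 75), (76, 100)]
```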
Generating the Predicates
Now, you need a way to split the table into chunks and generate a predicate for each of the concurrent Data Pump sessions. It is important that:
- All rows are exported.
- No rows are exported more than once.
Further, you should ensure that:
- The rows are spread evenly across the Data Pump jobs.
ROWIDs
An elegant solution is to use modulus and the block number that holds the row. You can get the block number using rowid and dbms_rowid:
| Job | Predicate |
|---|---|
| Data Pump job 1 | where mod(dbms_rowid.rowid_block_number(rowid), 4) = 0 |
| Data Pump job 2 | where mod(dbms_rowid.rowid_block_number(rowid), 4) = 1 |
| Data Pump job 3 | where mod(dbms_rowid.rowid_block_number(rowid), 4) = 2 |
| Data Pump job 4 | where mod(dbms_rowid.rowid_block_number(rowid), 4) = 3 |
This might not be the fastest option, but it is a generic one that works without any knowledge of the table structure you are exporting.
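If you want to generate these predicates programmatically, a small sketch follows. The final loop also demonstrates why the approach satisfies the earlier requirements: every block number matches exactly one predicate, so every row is exported exactly once:

```python
def rowid_predicates(jobs):
    """One mod-based predicate per concurrent Data Pump job."""
    return [
        f"where mod(dbms_rowid.rowid_block_number(rowid), {jobs}) = {i}"
        for i in range(jobs)
    ]

# Sanity check: each block number satisfies exactly one of the predicates.
for block in range(1000):
    assert sum(1 for i in range(4) if block % 4 == i) == 1
```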
Keys
You can also split the table on a primary/unique key or any other column that offers an even distribution of values:
| Job | Predicate |
|---|---|
| Data Pump job 1 | where mod(--primary_key_column--, 4) = 0 |
| Data Pump job 2 | where mod(--primary_key_column--, 4) = 1 |
| Data Pump job 3 | where mod(--primary_key_column--, 4) = 2 |
| Data Pump job 4 | where mod(--primary_key_column--, 4) = 3 |
More Workers
If you need more sessions working on the table, you can change the modulus operation and add more Data Pump jobs.
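Rather than hand-editing parameter files when you scale up, you can generate them. The sketch below follows the file-naming pattern used in this article; the table name and directory are whatever fits your environment:

```python
from pathlib import Path

def write_parfiles(jobs, table, directory="."):
    """Write one Data Pump export parameter file per concurrent job."""
    paths = []
    for i in range(jobs):
        par = Path(directory) / f"exp_lob_{i}.par"
        par.write_text(
            f"logfile=exp_lob_{i}.log\n"
            f"dumpfile=exp_lob_{i}.dmp\n"
            f"tables={table}\n"
            f'query="where mod(dbms_rowid.rowid_block_number(rowid), '
            f'{jobs}) = {i}"\n'
        )
        paths.append(par)
    return paths
```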
How to Export With ROWID
Here is the first of the Data Pump parameter files:
```
logfile=exp_lob_0.log
dumpfile=exp_lob_0.dmp
tables=appuser.table_with_lob
query="where mod(dbms_rowid.rowid_block_number(rowid), 4) = 0"
```
The second one. Notice how the serial number changes in the logfile and dumpfile parameters, and how the remainder in the modulus predicate changes as well:
```
logfile=exp_lob_1.log
dumpfile=exp_lob_1.dmp
tables=appuser.table_with_lob
query="where mod(dbms_rowid.rowid_block_number(rowid), 4) = 1"
```
The third one:
```
logfile=exp_lob_2.log
dumpfile=exp_lob_2.dmp
tables=appuser.table_with_lob
query="where mod(dbms_rowid.rowid_block_number(rowid), 4) = 2"
```
The fourth one:
```
logfile=exp_lob_3.log
dumpfile=exp_lob_3.dmp
tables=appuser.table_with_lob
query="where mod(dbms_rowid.rowid_block_number(rowid), 4) = 3"
```
Now, you can start four concurrent Data Pump jobs. You must execute the commands in different terminals:
```
expdp ... parfile=exp_lob_0.par
expdp ... parfile=exp_lob_1.par
expdp ... parfile=exp_lob_2.par
expdp ... parfile=exp_lob_3.par
```
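If you would rather script the launch than juggle four terminals, a small Python sketch can start the jobs concurrently and wait for them all to finish. The expdp command line shown in the comment is a placeholder; supply your own connect string:

```python
import subprocess

def run_concurrent(commands):
    """Start every command at once, wait for all, return their exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]

# Hypothetical usage (credentials and connect string elided):
# run_concurrent(
#     [["expdp", "system@pdb1", f"parfile=exp_lob_{i}.par"] for i in range(4)]
# )
```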
How to Import
When importing, you convert the LOB to a SecureFile LOB, which allows parallel access. Then run the Data Pump jobs one at a time.

The first job does a regular import of the first dump file. It also creates the table itself:

```
impdp ... \
   dumpfile=exp_lob_0.dmp \
   logfile=imp_lob_0.log \
   transform=lob_storage:securefile \
   parallel=4
```

You should always convert the LOB to a SecureFile LOB during import, even if it originally was a BasicFile LOB. The conversion happens on the fly and allows for parallel import of the data.

Now, proceed with the other dump files in serial. Since you already converted the LOB to a SecureFile LOB (previous step), you can perform each Data Pump import using native parallelism:

```
impdp ... \
   dumpfile=exp_lob_1.dmp \
   logfile=imp_lob_1.log \
   parallel=4 \
   table_exists_action=append

impdp ... \
   dumpfile=exp_lob_2.dmp \
   logfile=imp_lob_2.log \
   parallel=4 \
   table_exists_action=append

impdp ... \
   dumpfile=exp_lob_3.dmp \
   logfile=imp_lob_3.log \
   parallel=4 \
   table_exists_action=append
```

- Notice how the dumpfile and logfile parameters change for each job.
- Run the jobs in serial. Each job uses Data Pump parallelism.

That's it!
Few Words of Caution
- You must figure out how to create the indexes, if any are present on the table. Ideally, you want to postpone index creation until the last job has loaded its rows. Otherwise, you'll have expensive index maintenance happening while you load data.
- Data Pump uses Advanced Queuing, which relies on the streams pool in the SGA. Be sure that streams_pool_size is set high enough when you start multiple Data Pump sessions. In most situations, setting it to 2G should be more than enough:

  alter system set streams_pool_size=2G scope=memory;

- Do this only in a maintenance window with no users connected.
