Faster Data Pump Import of LOBs Over Database Link

A colleague was helping a customer optimize an import using Data Pump via a database link that involved SecureFile LOBs.

Do you see a way to parallelize the direct import to improve performance and thus shorten the time it takes to import? Or is it not possible for LOB data?

Network mode imports are a flexible way of importing your data when you have limited access to the source system. However, they come at the price of restrictions, one of them being:

  • Network mode import does not use parallel query (PQ) child processes.

In Data Pump, one worker will process a table data object which is either a:

  • Table
  • Table partition
  • Table subpartition

So, for a regular table, this means just one worker is processing the table, and it doesn’t use parallel query. That’s bound to be slow for larger data sets, but can you do something about it?

Starting Point

To illustrate my point, I’ll use a sample data set consisting of:

  • One schema (BLOBLOAD)
  • With one table (TAB1)
  • Containing two columns
    • Number (ID)
    • BLOB (BLOB_DATA)
  • The table has around 16,000 rows
  • Size is 50 GB
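
For reference, the table looked roughly like this – a minimal sketch; the actual rows were generated with a script from oracle-base.com (see the thanks at the end):

create table blobload.tab1 (
   id        number,
   blob_data blob
)
lob (blob_data) store as securefile;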

Doing a regular Data Pump import over a database link is slow because there’s only one worker and no parallel query:

impdp ... \
   network_link=srclnk \
   schemas=blobload \
   parallel=4

...

21-OCT-24 05:30:36.813: Job "SYSTEM"."SYS_IMPORT_SCHEMA_01" successfully completed at Mon Oct 21 05:30:36 2024 elapsed 0 00:11:50

Almost 12 minutes!

Partitioning

Since we know that multiple workers can process different partitions of the same table, let’s try to partition the source table. I’ll use hash partitioning and ensure my partitions are equally distributed:

alter table tab1 
modify partition by hash (id) 
partitions 32 online;
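
To verify that the rows are spread evenly across the new partitions, you could gather statistics and check dba_tab_partitions – a quick sketch using the example names:

-- Rows per hash partition; the counts should be roughly equal
select partition_name, num_rows
from   dba_tab_partitions
where  table_owner = 'BLOBLOAD'
and    table_name  = 'TAB1'
order  by partition_position;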

Repeat the import:

impdp ... \
   network_link=srclnk \
   schemas=blobload \
   parallel=4

...

21-OCT-24 09:08:00.897: Job "SYSTEM"."SYS_IMPORT_SCHEMA_01" successfully completed at Mon Oct 21 09:08:00 2024 elapsed 0 00:04:26

Just 4m 26s – that’s a huge improvement!

In the log file, you’ll see that multiple workers are processing partitions individually. So, even without parallel query, I get parallelism because of multiple workers on the same table – each on different partitions.

But partitioning is a separately licensed option.

Using QUERY Parameter and Multiple Data Pump Imports

I’ve previously blogged about do-it-yourself parallelism for Data Pump exports of BasicFile LOBs. Can I use the same approach here?

The idea is to start multiple Data Pump jobs importing the same table, but each working on a subset of the data.

  • First, import just the metadata
    impdp ... \
       network_link=srclnk \
       schemas=blobload \
       content=metadata_only
    
  • Next, start four concurrent imports that load just the rows. Each import works on a subset of the data using the QUERY parameter:
    impdp ... \
       network_link=srclnk \
       schemas=blobload \
       content=data_only \
       query="where mod(id, 4)=0"
    
    impdp ... \
       network_link=srclnk \
       schemas=blobload \
       content=data_only \
       query="where mod(id, 4)=1"
    
    impdp ... \
       network_link=srclnk \
       schemas=blobload \
       content=data_only \
       query="where mod(id, 4)=2"
    
    impdp ... \
       network_link=srclnk \
       schemas=blobload \
       content=data_only \
       query="where mod(id, 4)=3"
    

No – that’s not possible. During the import, Data Pump inserts the data using the APPEND hint, which acquires an exclusive lock on the table. This is from a trace of the imports:

INSERT /*+  APPEND  NESTED_TABLE_SET_REFS   PARALLEL(KUT$,1)   */ INTO "BLOBLOAD"."TAB1"  KUT$ ("ID", "BLOB_DATA")
SELECT /*+ NESTED_TABLE_GET_REFS  PARALLEL(KU$,1)  */ "ID", "BLOB_DATA" FROM "BLOBLOAD"."TAB1"@srclnk KU$ WHERE mod(id, 4)=1

If you try to start multiple imports into the same table, you get an error:

ORA-02049: timeout: distributed transaction waiting for lock

So, let’s prevent that by adding data_options=disable_append_hint to each of the Data Pump import jobs.
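
With that parameter added, the first of the four concurrent imports would look like this – a sketch; the other three only differ in the QUERY predicate:

impdp ... \
   network_link=srclnk \
   schemas=blobload \
   content=data_only \
   data_options=disable_append_hint \
   query="where mod(id, 4)=0"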

Now, multiple Data Pump jobs may work on the same table, but it doesn’t scale linearly.

  • One concurrent job: Around 12 minutes
  • Four concurrent jobs: Around 8 minutes
  • Eight concurrent jobs: Around 7 minutes

It gives a performance benefit, but probably not as much as you’d like.

Two-Step Import

If I can’t import into the same table, how about starting four simultaneous Data Pump jobs using the do-it-yourself approach above, but importing into separate tables and then combining all the tables afterward?

I’ll start by loading 1/4 of the rows (notice the QUERY parameter):

impdp ... \
   network_link=srclnk \
   schemas=blobload \
   query=\(blobload.tab1:\"WHERE mod\(id, 4\)=0\"\)

While that runs, I’ll start three separate Data Pump jobs that each work on a different 1/4 of the data. I’m remapping the table into a new table to avoid the locking issue:

impdp ... \
   network_link=srclnk \
   schemas=blobload \
   include=table \
   remap_table=tab1:tab1_2 \
   query=\(blobload.tab1:\"WHERE mod\(id, 4\)=1\"\)

In the remaining two jobs, I’ll slightly modify the QUERY and REMAP_TABLE parameters:

impdp ... \
   network_link=srclnk \
   schemas=blobload \
   include=table \
   remap_table=tab1:tab1_3 \
   query=\(blobload.tab1:\"WHERE mod\(id, 4\)=2\"\)
impdp ... \
   network_link=srclnk \
   schemas=blobload \
   include=table \
   remap_table=tab1:tab1_4 \
   query=\(blobload.tab1:\"WHERE mod\(id, 4\)=3\"\)

Now, I can load the rows from the three staging tables into the real one:

ALTER SESSION FORCE PARALLEL DML;
INSERT /*+ APPEND PARALLEL(a) */ INTO "BLOBLOAD"."TAB1" a 
SELECT /*+ PARALLEL(b)  */ * FROM "BLOBLOAD"."TAB1_2" b;
commit;

INSERT /*+ APPEND PARALLEL(a) */ INTO "BLOBLOAD"."TAB1" a 
SELECT /*+ PARALLEL(b)  */ * FROM "BLOBLOAD"."TAB1_3" b;
commit;

INSERT /*+ APPEND PARALLEL(a) */ INTO "BLOBLOAD"."TAB1"  a
SELECT /*+ PARALLEL(b)  */ * FROM "BLOBLOAD"."TAB1_4" b;
commit;

This approach took around 7 minutes (3m 30s for the Data Pump jobs, and 3m 30s to load the rows into the real table). Slower than partitioning but still faster than the starting point.

This approach is complicated; the more data you have, the more you need to consider things like transaction size and index maintenance.
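
As an example of the index consideration: if TAB1 had a non-unique index, you could mark it unusable before the staging loads and rebuild it once at the end – a sketch with a hypothetical index name:

-- Hypothetical index; with the default skip_unusable_indexes=true, the loads
-- skip maintenance of an unusable non-unique index
alter index blobload.tab1_i1 unusable;

-- ... run the three INSERT /*+ APPEND */ statements from above ...

-- Rebuild once at the end, then reset the parallel degree
alter index blobload.tab1_i1 rebuild parallel 4;
alter index blobload.tab1_i1 noparallel;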

Conclusion

Network mode imports have many restrictions, which also affect performance. Partitioning is the easiest and fastest improvement, but it requires the appropriate license option. The last resort is to perform some complicated data juggling.

Alternatively, abandon network mode imports and use dump files. In dump file mode, one worker can use parallel query during export and import, which is also fast.

Thanks

I used an example from oracle-base.com to generate the test data.

Data Pump and Faster Export of SecureFile LOBs

SecureFile LOBs are the best LOB type, partly because they allow parallel access to and from the LOB. That parallel access allows Data Pump to unload and load data faster.

This post applies to SecureFile LOBs, which you can identify with:

select owner, table_name, column_name 
from   dba_lobs 
where  securefile='YES';

There is another blog post for Data Pump and BasicFile LOBs.

How Data Pump Works

First of all, you must allow Data Pump to work in parallel. When you do, Data Pump assigns one worker – and only one worker – per table data object, that is, a table, partition, or subpartition.

If the table data object is big enough, that one worker uses parallel query (PQ) processes to unload the data. The PQ processes hand the data over to the worker, which writes it to the dump file.

How much is big enough? By default, Data Pump examines the object statistics, and if the segment is larger than 250 MB, it will use parallel query. You can change this threshold using the parallel_threshold parameter.
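
If your table data objects are smaller than that, but you still want parallel query, you could lower the threshold – a sketch, assuming the value is given in bytes (here roughly 100 MB); check the Database Utilities guide for the exact syntax in your release:

expdp ... \
   schemas=appuser \
   parallel=8 \
   parallel_threshold=104857600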

To export a SecureFile LOB in parallel:

  • You must allow Data Pump to do parallel work using the parallel parameter.
  • The table data object must be of a certain size.

Data Pump determines the size by using either:

  • Optimizer statistics (default) – estimate=statistics
  • Calculation – estimate=blocks

Why doesn’t Data Pump use parallel query on all objects? Because it takes time to start PQ workers and keep track of them. It is a waste on small table data objects and will often be slower. Parallel query makes sense only when a table data object reaches a certain size.

How the Database Stores LOBs

With LOBs you can potentially store a lot of data.

If a LOB (in a single row) is less than 4000 bytes, it is stored in-row together with the rest of the data. Having small LOBs as part of the table is efficient. However, if a LOB is larger than 4000 bytes, it is inefficient to store it in-row. Instead, the database stores the entire LOB in a separate LOB segment. This is called out-of-row.

In Oracle Database 23ai, you can optionally store up to 8000 bytes in-row.
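
For reference, the in-row behavior is controlled by the LOB storage clause – a minimal sketch with a hypothetical table (ENABLE STORAGE IN ROW is also the default):

-- Small LOBs stay in the table segment; anything above the in-row limit
-- goes to the separate LOB segment
create table appuser.docs (
   id        number,
   blob_data blob
)
lob (blob_data) store as securefile (
   enable storage in row
);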

When Statistics Are Not Enough

Update, 6 May 2024: The issue mentioned below is solved in the 19.23.0 Data Pump Bundle Patch

If a LOB is stored in-row, the space it uses is added to the total segment space of the table. However, if the LOB is stored out-of-row, the LOB space is added to the LOB segment space usage.

The table statistics (dba_tab_statistics) show only the blocks (i.e., space) used by the table. The size of the LOB segment does not affect the size of the table in the optimizer statistics. This is by design, because the purpose of statistics is to enable the optimizer to come up with good plans, and the size of the LOB segment is irrelevant in that situation.

In other words, you can have a table with 100 rows and 1 TB of LOB data. If all that LOB data is stored out-of-row, the table looks really small.

If Data Pump determines the size of the table based on the statistics, that table with 1 TB LOBs looks really small. As a result, Data Pump will not use parallel query on it.

This also applies to partitioned tables; in that case, Data Pump looks at the partition statistics instead of the table statistics.
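
You can see the discrepancy for yourself by comparing the table statistics with the actual LOB segment size – a sketch, assuming an 8 KB block size, a non-partitioned table, and the example names APPUSER.T1:

-- Table size as seen by the optimizer statistics vs. the real LOB segment size
select round(t.blocks * 8192 / 1024 / 1024) as table_mb_per_stats,
       round(s.bytes / 1024 / 1024)         as lob_segment_mb
from   dba_tab_statistics t
join   dba_lobs     l on l.owner = t.owner and l.table_name = t.table_name
join   dba_segments s on s.owner = l.owner and s.segment_name = l.segment_name
where  t.owner = 'APPUSER'
and    t.table_name = 'T1';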

Faster Exports

Bundle Patch

The patch to fix the above issue is included in the 19.23.0 Data Pump Bundle Patch.

Don’t Estimate By Statistics

You can change the way Data Pump determines the size of a table from statistics to a proper calculation:

expdp ... estimate=blocks
  • Estimating by blocks is slower than using statistics. You will see that the startup phase in Data Pump takes longer.
  • Due to a bug in Data Pump, this will not work unless your database is on 19.18.0 with the Data Pump bundle patch.

Fake Statistics

Since Data Pump only looks at the table data object statistics, you can fake the statistics. Trick Data Pump into believing the table itself is huge by setting statistics to high values:

begin
   dbms_stats.set_table_stats(
      ownname => 'APPUSER',
      tabname => 'T1',
      numrows => 10000000,
      numblks => 1000000);
end;
/
  • You must do this for all tables with big LOBs.
  • It requires testing to get the best result.
  • The fake statistics will influence the choices made by the optimizer, so only do this in your maintenance window. Also, note that setting statistics invalidates cursors in the library cache.
  • Ensure that the statistics gathering job doesn’t overwrite your fake stats – for example, by locking the table statistics (see the sketch after this list).
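
A minimal sketch of locking the fake statistics and restoring real ones afterward, using the example names from above:

begin
   -- Prevent the automatic statistics job from replacing the fake statistics
   dbms_stats.lock_table_stats(ownname => 'APPUSER', tabname => 'T1');
end;
/

-- After the export: unlock and gather real statistics again
begin
   dbms_stats.unlock_table_stats(ownname => 'APPUSER', tabname => 'T1');
   dbms_stats.gather_table_stats(ownname => 'APPUSER', tabname => 'T1');
end;
/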

Use Partitioning

If you partition a table, Data Pump uses one worker per partition or subpartition. Data Pump might also use parallel query on a single partition/subpartition depending on the number of partitions and subpartitions and available parallel processes. But only if the partition statistics reflect that it is significant in size. Otherwise, you run into the same problem as described above.

An advantage of having multiple Data Pump workers on the same partitioned table is that more workers can write to dump files simultaneously. Not only can Data Pump read faster from the database, it can also write the data faster into dump files.
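
To benefit from that, give Data Pump a dump file template and enough parallelism – a sketch (the table name and file size are examples; the %U variable lets Data Pump create as many dump files as it needs):

expdp ... \
   tables=appuser.big_part_tab \
   parallel=8 \
   dumpfile=exp_big_%U.dmp \
   filesize=10G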

Final Words

Our developers solved the issue in the 19.23.0 Data Pump Bundle Patch. The best solution is of course to stay current and apply the Data Pump Bundle Patch.

Finally, as a reminder, always convert your LOBs to SecureFile LOBs on import:

impdp ... transform=lob_storage:securefile

Data Pump and Faster Export of BasicFile LOBs

Exporting BasicFile LOBs can be quite a pain. If your maintenance window is too short, here is a way to make the export faster.

This post applies to BasicFile LOBs, which you can identify with:

select owner, table_name, column_name 
from   dba_lobs 
where  securefile='NO';

Background

The old BasicFile LOBs do not offer any parallel access methods. The new, improved SecureFile LOBs offer superior functionality, including full parallel access to the LOB. This is a good reason to migrate any old BasicFile LOBs in your database.

When Data Pump starts to export a table with a BasicFile LOB, only one worker will be assigned (due to the limited parallel access methods). If you have a large table, it will take a lot of time for that one worker to process the table.

Solution

Instead of one Data Pump job with one worker processing the table with the LOB, the idea is to start multiple Data Pump exports. Each export has one worker working on a dedicated part of the table.

Imagine you have a table with 100 rows. You start four concurrent Data Pump sessions:

Job               Rows
Data Pump job 1   1-25
Data Pump job 2   26-50
Data Pump job 3   51-75
Data Pump job 4   76-100

To do this, you need to use the Data Pump query parameter.

Generating the Predicates

Now, you need a way to split the table into chunks and generate a predicate for each of the concurrent Data Pump sessions. It is important that:

  • All rows are exported.
  • No rows are exported more than once.

Further, you should ensure that:

  • The rows are spread evenly across the Data Pump jobs.

ROWIDs

An elegant solution is to use modulus and the block number that holds the row. You can get the block number using rowid and dbms_rowid:

Job               Predicate
Data Pump job 1   where mod(dbms_rowid.rowid_block_number(rowid), 4) = 0
Data Pump job 2   where mod(dbms_rowid.rowid_block_number(rowid), 4) = 1
Data Pump job 3   where mod(dbms_rowid.rowid_block_number(rowid), 4) = 2
Data Pump job 4   where mod(dbms_rowid.rowid_block_number(rowid), 4) = 3

This might not be the fastest option, but it is a generic one that works without any knowledge of the table structure you are exporting.
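
Before starting the exports, you could check how evenly the rows fall into the buckets – a sketch using the table name from the parameter files below:

-- Row count per bucket; ideally the four counts are roughly equal
select mod(dbms_rowid.rowid_block_number(rowid), 4) as bucket,
       count(*)                                     as row_count
from   appuser.table_with_lob
group  by mod(dbms_rowid.rowid_block_number(rowid), 4)
order  by bucket;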

Keys

You can also split the table on a primary/unique key or any other column that offers an even distribution of values:

Job               Predicate
Data Pump job 1   where mod(--primary_key_column--, 4) = 0
Data Pump job 2   where mod(--primary_key_column--, 4) = 1
Data Pump job 3   where mod(--primary_key_column--, 4) = 2
Data Pump job 4   where mod(--primary_key_column--, 4) = 3

More Workers

If you need more sessions working on the table, you can change the modulus operation and add more Data Pump jobs.

How to Export With ROWID

Here is the first of the Data Pump parameter files:

logfile=exp_lob_0.log
dumpfile=exp_lob_0.dmp
tables=appuser.table_with_lob
query="where mod(dbms_rowid.rowid_block_number(rowid), 4) = 0"

The second one. Notice how the number changes in the logfile and dumpfile parameters. Also, the predicate compares against a different value:

logfile=exp_lob_1.log
dumpfile=exp_lob_1.dmp
tables=appuser.table_with_lob
query="where mod(dbms_rowid.rowid_block_number(rowid), 4) = 1"

The third one:

logfile=exp_lob_2.log
dumpfile=exp_lob_2.dmp
tables=appuser.table_with_lob
query="where mod(dbms_rowid.rowid_block_number(rowid), 4) = 2"

The fourth one:

logfile=exp_lob_3.log
dumpfile=exp_lob_3.dmp
tables=appuser.table_with_lob
query="where mod(dbms_rowid.rowid_block_number(rowid), 4) = 3"

Now, you can start four concurrent Data Pump jobs. You must execute the commands in different terminals:

expdp ... parfile=exp_lob_0.par
expdp ... parfile=exp_lob_1.par
expdp ... parfile=exp_lob_2.par
expdp ... parfile=exp_lob_3.par

How to Import

When importing, you change the LOB to a SecureFile LOB, which allows parallel access. Then, take one Data Pump job at a time.

  • The first job does a regular schema import of the first dump file:

    impdp ... \
       dumpfile=exp_lob_0.dmp \
       logfile=imp_lob_0.log \
       transform=lob_storage:securefile \
       parallel=4
    
    • It also creates the table itself.
    • You should always convert the LOB to a SecureFile LOB during import. Conversion happens on the fly and allows for parallel import of data, even if the LOB originally was a BasicFile LOB.
  • Now, proceed with the other dump files in serial. Since you already converted the LOB to a SecureFile (previous step), you can now perform a Data Pump import using native parallelism:

    impdp ... \
       dumpfile=exp_lob_1.dmp \
       logfile=imp_lob_1.log \
       parallel=4 \
       table_exists_action=append
    impdp ... \
       dumpfile=exp_lob_2.dmp \
       logfile=imp_lob_2.log \
       parallel=4 \
       table_exists_action=append
    impdp ... \
       dumpfile=exp_lob_3.dmp \
       logfile=imp_lob_3.log \
       parallel=4 \
       table_exists_action=append
    
    • Notice how the dumpfile and logfile change for each job.
    • Run the jobs in serial. Each job will use Data Pump parallelism.

That’s it!

A Few Words of Caution

  • You must figure out how you create indexes – if they are present on the table. Ideally, you want to postpone index creation until the last job has loaded its rows. Otherwise, you’ll have expensive index maintenance happening when you load data.

  • Data Pump uses Advanced Queuing, which relies on the streams pool in the SGA. Be sure that streams_pool_size is set high enough before you start multiple Data Pump sessions. In most situations, setting it to 2G should be more than enough (a quick way to check the current allocation follows this list):

    alter system set streams_pool_size=2G scope=memory;
    
  • Do this only in a maintenance window, with no users connected.
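
A quick way to check the current streams pool allocation before starting the concurrent jobs:

-- Current size of the streams pool in the SGA
select round(current_size / 1024 / 1024) as streams_pool_mb
from   v$sga_dynamic_components
where  component = 'streams pool';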