Data Pump and Parallel Transportable Jobs

In migrations, you often use transportable tablespaces or Full Transportable Export/Import (FTEX). Downtime is always of concern when you migrate, so having Data Pump perform the transportable job in parallel is a huge benefit.

How much benefit? Let’s find out with a little benchmark.

The Results

Let’s start with the interesting part. How much time can you save using parallel transportable jobs in Data Pump.

The following table lists four different scenarios. Each scenario consists of an export and an import. The total is the time Data Pump needs to finish the migration – the sum of export and import. In a real migration, many other things are in play, but here, we are looking solely at Data Pump performance.

Export	Time	Import	Time	Total
19c, no parallel	2h 2m	19c, no parallel	6h 44m	8h 46m
23ai, parallel 4	1h 48m	23ai, parallel 4	2h 33m	4h 21m
19c, no parallel	2h 2m	23ai, parallel 16	1h 23m	3h 25m
23ai, parallel 16	1h 8m	23ai, parallel 16	1h 23m	2h 31m

The first row is what you can do in Oracle Database 19c, almost 9 hours. Compare that to the last row you can do with parallel 16 in Oracle Database 23ai, almost a 3.5x reduction.

If you migrate from Oracle Database 19c to Oracle Database 23ai (3rd row), you can still benefit from parallel import and gain a significant benefit.

The Benchmark

I used the following Oracle homes:

Oracle Database 19c, 19.19.0 plus Data Pump Bundle Patch
Oracle Database 23ai, 23.3.0

My test database:

E-Business Suite database
300 GB physical size
630.000 database objects, including
- 89.000 tables
- 66.000 indexes
- 60.000 packages
CPU_COUNT = 16
SGA_TARGET = 64G

My test machine:

OCI shape VM.Standard.E4.Flex
8 CPUs
128 GB mem
Fast disks (max. IOPS 128.000)

How to

It’s very easy to enable parallel transportable jobs. If you want to use 16 parallel workers, on export:

expdp ... parallel=16 dumpfile=ftex%L.dmp

On import:

impdp ... parallel=16

What Happens

A Data Pump job consists of several object paths that Data Pump must process. An object path could be tables, indexes, or package bodies.

Parallel Export

Each worker takes an object path and starts to process it. The worker works alone on this object path. You get parallelism by multiple workers processing multiple object paths simultaneously.

Parallel Import

During import, Data Pump must process each object path in a certain order. Data Pump can only import constraints once it has imported tables, for example.

Data Pump processes each object path in the designated order, then splits the work in one object path to many workers. You get parallelism by multiple workers processing one object path in parallel.

The Fine Print

Parallel transportable jobs work in Oracle Database 21c and later. In Oracle Database 19c, a transportable job has no parallel capabilities.
Data Pump can use parallel only on transportable jobs via a dump file. Network mode is not an option for parallel transportable jobs.
If you export in Oracle Database 19c (which does not support parallel transportable jobs), you can still perform a parallel import into Oracle Database 23ai.
The export parallel degree and import parallel degree do not have to match. You can export with parallel degree 1 and import with parallel degree 16.
When you enable parallel jobs, Data Pump starts more workers. How much extra resources do they use?
- I didn’t notice any significant difference in undo or temp tablespace use.
- I didn’t notice any extra pressure on the streams pool, either. I had the streams pool set to 256M, and the database didn’t perform any SGA resize operation during my benchmark.

Conclusion

For migrations, parallel transportable jobs in Data Pump are a huge benefit. Every minute of downtime often counts, and this has a massive impact.

How to Use Oracle Data Pump When You Have Common Objects

One of the benefits of the multitenant architecture in Oracle Database is the concept of common objects. If you want to move data around with Oracle Data Pump, is there anything you must know?

What Are Common Objects

The documentation describes common objects as:

A common phenomenon defined in a root is the same in all containers plugged in to this root. … For example, if you create a common user account while connected to CDB$ROOT, then this user account is common to all PDBs and application roots in the CDB.

A simpler explanation:

The stuff you create as C##.... in the root container (CDB$ROOT).

The principles of commonality are:

A common phenomenon is the same in every existing and future container. …

Only a common user can alter the existence of common phenomena. More precisely, only a common user logged in to either the CDB root or an application root can create, destroy, or modify attributes of a user, role, or object that is common to the current container.

For illustration purposes, imagine a common profile used by a local user. You would create such as:

alter session set container=CDB$ROOT;
create profile c##commonprofile1 ... ;
alter session set container=pdb1;
create user localuser1 ... profile c##commonprofile1;

With other object types, you can make it even more complex, but the principles remain the same.

What Happens In Oracle Data Pump

On Export

Data Pump

Connects to the PDB.
Extracts the relevant data and metadata.

Using the above example, Data Pump extracts the user DDL. The user DDL specifies the use of a common profile.

On Import

Data Pump

Connects to the PDB.
Assumes that you have already created the common objects.
Creates the local user localuser1 and specifies the use of the common profile c##commonprofile1.

If the common profile does not exist already, the creation of the user fails:

Processing object type DATABASE_EXPORT/SCHEMA/USER
ORA-39083: Object type USER:"LOCALUSER1" failed to create with error:
ORA-02380: profile C##COMMONPROFILE1 does not exist

Failing sql is:
 CREATE USER "LOCAL1" ... PROFILE "C##COMMONPROFILE1"

Why Don’t Data Pump Create The Common Objects?

First, Data Pump connects to the PDB itself to perform the import. Since you must create common objects in the root container, that’s obviously a problem. The importing user might not have access to the root container, so switching containers and creating common objects is impossible.
Also, it would break the contract in the multitenant architecture that each PDB is isolated. If one PDB were allowed to create common objects, those objects would also be available in other PDBs, and the isolation would no longer exist.
Finally, it would open for all sorts of security issues if a user connected to a PDB could create common objects.

What Can You Do?

A CDB Blueprint

Ideally, you should have some CDB blueprint; A definition of the common objects used in your databases. When you create a new CDB, you have an afterburner that creates those common objects according to your company guidelines.

Examine the Dump File

You can also examine the DDLs in the Data Pump file and create a list of common objects.

Extract the DDL:
```
impdp ... sqlfile=import_ddls.sql
```
Search for C## and build a list of common objects needed:
```
grep -n -i "C##" import_ddls.sql
```
This is a crude search, and I imagine awk aficionados can improve it.
Extract the DDL for the common objects from the source CDB:
```
l_stmt := dbms_metadata.get_ddl(...);
dbms_output.put_line(l_stmt);
l_stmt := dbms_metadata.get_granted_ddl(...);
dbms_output.put_line(l_stmt);
```
Use dbms_metadata.get_granted_ddl to extract object and role grants and system privileges.

Create the common objects in the target CDB before import:

alter session set container=cdb$root;
create profile c##commonprofile1 ... ;

Data Pump and Faster Export of SecureFile LOBs

SecureFile LOBs are the best LOB type, partly because it allows parallel access to and from the LOB. The parallel access allows Data Pump to unload and load data faster.

The following applies to SecureFile LOBs:

select owner, table_name, column_name 
from   dba_lobs 
where  securefile='YES';

There is another blog post for Data Pump and BasicFile LOBs.

How Data Pump Works

First of all, you must allow Data Pump to work in parallel. Only when so, Data Pump assigns one worker – and only one worker – per table data object. That is a table, partition or subpartition.

Suppose the table data object is big enough, that one worker uses parallel query (PQ) to unload the data. The PQ worker will hand over the data to the worker, who will write it to the dump file.

How much is big enough? By default, Data Pump examines the object statistics, and if the segment is larger than 250 MB, it will use parallel query. You can change this threshold using the parallel_threshold parameter.

To export a SecureFile LOB in parallel:

You must allow Data Pump to do parallel work using the parallel parameter.
The table data object must be of a certain size.

Data Pump determines the size by using either:

Optimizer statistics (default) – estimate=statistics
Calculation – estimate=blocks

Why doesn’t Data Pump use parallel query on all objects? Because it takes time to start a PQ worker and keep track of them. It is a waste on small table data objects and will often be slower. Parallel query makes sense only when a table data object reaches a certain size.

How the Database Stores LOBs

With LOBs you can potentially store a lot of data.

If a LOB (in a single row) is less than 4000 bytes, it is stored in-row together with the rest of the data. Having small LOBs as part of the table is efficient. However, if a LOB is larger than 4000 bytes, it is inefficient to store it in-row. Instead, the database stores the entire LOB in a separate LOB segment. This is called out-of-row.

In Oracle Database 23ai, you can optionally store up to 8000 bytes in-row.

When Statistics Are Not Enough

Update, 6 May 2024: The below mentioned issue is solved in the 19.23.0 Data Pump Bundle Patch

If a LOB is stored in-row, the space it uses, is added to the total segment space of the table. However, if the LOB is stored out-of-row, the LOB space is added to the LOB segment space usage.

The table statistics (dba_tab_statistics) show only the blocks (i.e., space) used by the table. The size of the LOB segment does not affect the size of the table in the optimizer statistics. This is by design, because the purpose of statistics is to enable the optimizer to come up with good plans, and the size of the LOB segment is irrelevant in that situation.

In other words, you can have a table with 100 rows and 1 TB of LOB data. If all that LOB data is stored out-of-row, the table looks really small.

If Data Pump determines the size of the table based on the statistics, that table with 1 TB LOBs looks really small. As a result, Data Pump will not use parallel query on it.

This applies also to partitioned tables. In this case, Data Pump looks at the partition statistics instead of the table statistics.

Faster Exports

Bundle Patch

The patch to fix the above issue is included in the 19.23.0 Data Pump Bundle Patch.

Don’t Estimate By Statistics

You can change the way Data Pump determines the size of a table from statistics to a proper calculation:

expdp ... estimate=blocks

Estimating by blocks is slower than using statistics. You will see that the startup phase in Data Pump takes longer.
Due to a bug in Data Pump, this will not work unless your database is on 19.18.0 with the Data Pump bundle patch.

Fake Statistics

Since Data Pump only looks at the table data object statistics, you can fake the statistics. Trick Data Pump into believing the table itself is huge by setting statistics to high values:

begin
 dbms_stats.set_table_stats (
   ownname => 'APPUSER', 
   tabname => 'T1',
   numrows => 10000000, 
   numblks => 1000000);
 end;
 /

You must do this for all tables with big LOBs.
It requires testing to get the best result.
The fake statistics will influence the choices made by the optimizer, so only do this in your maintenance window. Also, note that setting statistics invalidates cursors in the library cache.
Ensure that the statistics gathering job doesn’t overwrite your fake stats.

Use Partitioning

If you partition a table, Data Pump uses one worker per partition or subpartition. Data Pump might also use parallel query on a single partition/subpartition depending on the number of partitions and subpartitions and available parallel processes. But only if the partition statistics reflect that it is significant in size. Otherwise, you run into the same problem as described above.

An advantage of having multiple Data Pump workers on the same partitioned table, is that more workers can write to dump files simultaneously. Not only can Data Pump read faster from the database, it can also store it faster into dump files.

Final Words

Our developers solved the issue in the 19.23.0 Data Pump Bundle Patch. The best solution is of course to stay current and apply the Data Pump Bundle Patch.

Finally, as a reminder, always convert your LOBs to SecureFile LOBs on import:

impdp ... transform=lob_storage:securefiles

Applying Parallelism to Data Pump Exports and Imports

When you use Oracle Data Pump, you often want it to happen fast. Applying parallelism is the obvious choice. Here are a few things you might not be aware of.

Parallel and Export

During export, when a worker needs to write to a file, it needs an exclusive lock on a dump file. If your dump file specification does not allow creating multiple files, then the workers will queue up waiting for a lock on that single dump file.

You should always allow Data Pump to create as many dump files as it wants:

expdp ... dumpfile=my_exp_%L.dmp

The wildcard operator, %L, allows the creation of many files by adding a number. Previously, you would use %U, but %L allows more files. Check the documentation for other substitution variables.

Import From a Single Dump File

Even if you made the export with no parallel or just have one dump file, you can still import in parallel.

During an import, Data Pump only needs to read from the dump file, and multiple workers can read from the same file without problems. The only issue would be how much I/O a single file can handle for a large database.

Does import parallelism has to match export parallelism, and the answer is no. The two are independent. This would work fine:

expdp ... parallel=4
impdp ... parallel=64

Import From a Previous Release

In Oracle Database 12.2, Data Pump starts to support parallel metadata import during a regular import. If you import from a dump file from a previous version (that does not support it), you can still import metadata in parallel in Oracle Database 12.2 and newer. There are no specific requirements for the dump file to perform parallel operations.

The same applies to transportable tablespace jobs, which you can do in parallel in Oracle Database 21c.

Want to Know More?

How to Export Database Scheduler Jobs Using Data Pump

Which options do you have to move scheduler jobs (DBMS_SCHEDULER) to a different Oracle Database?

Data Pump

In Data Pump, scheduler jobs are referred to as PROCOBJ. You can get just the scheduler jobs using:

expdp ... include=procobj
impdp ... include=procobj

If you need just a subset of the scheduler jobs (works for impdp as well):

expdp ... include=procobj:" IN ('JOB1', 'JOB2')"

Data Pump and SQLFILE

If you have a Data Pump dump file, you can extract the DDL of the scheduler jobs using the sqlfile option:

impdp ... include=procobj \
   sqlfile=my_scheduler_jobs.sql

The DDL is written to a text file specified by sqlfile. This allows you to review the job defintions and, optionally, change them. Then, you execute the commands.

DBMS_METADATA

If you need to export just a few jobs, it might be easier to simply get the defintions using the metadata API:

select
   dbms_metadata.get_ddl('PROCOBJ','<name>','<owner>')
from 
   dual;

SYS Jobs

In older releases of Oracle Database, you could not get the definition of a job owned by SYS. A simple workaround is to copy the job into another schema and then generate the DDL:

exec dbms_scheduler.copy_job('SYS.MY_JOB','APPUSER.MY_JOB');
select
   dbms_metadata.get_ddl('PROCOBJ','MY_JOB','APPUSER')
from 
   dual;

Appendix

Credits

Thanks to my colleague, Klaus Gronau. He has done more migrations than I’ll ever do and he maintains a remarkable set of notes. From his notes, he dug out many good pointers.

Data Pump and Faster Export of BasicFile LOBs

Exporting BasicFile LOBs can be quite a pain. If your maintenance window is too short, here is a way to make the export faster.

The following applies to BasicFile LOBs:

select owner, table_name, column_name 
from   dba_lobs 
where  securefile='NO';

Background

The old BasicFile LOBs do not offer any parallel access methods. The new, improved SecureFile LOBs offer superior functionality, including full parallel access to the LOB. This is a good reason to migrate any old BasicFile LOBs in your database.

When Data Pump starts to export a table with a BasicFile LOB, only one worker will be assigned (due to the limited parallel access methods). If you have a large table, it will take a lot of time for that one worker to process the table.

Solution

Instead of one Data Pump job with one worker processing the table with the LOB, the idea is to start multiple Data Pump exports. Each export has one worker working on a dedicated part of the table.

Imagine you have a table with 100 rows. You start four concurrent Data Pump sessions:

Job	Rows
Data Pump job 1	1-25
Data Pump job 2	26-50
Data Pump job 3	51-75
Data Pump job 4	76-100

To do this, you need to use the Data Pump query parameter.

Generating the Predicates

Now, you need a way to split the table into chunks and generate a predicate for each of the concurrent Data Pump sessions. It is important that:

All rows are exported.
No rows are exported more than once.

Further, you should ensure that:

The rows are spread evenly across the Data Pump jobs.

ROWIDs

An elegant solution is to use modulus and the block number that holds the row. You can get the block number using rowid and dbms_rowid:

Job	Predicate
Data Pump job 1	`where mod(dbms_rowid.rowid_block_number(rowid), 4) = 0`
Data Pump job 2	`where mod(dbms_rowid.rowid_block_number(rowid), 4) = 1`
Data Pump job 3	`where mod(dbms_rowid.rowid_block_number(rowid), 4) = 2`
Data Pump job 4	`where mod(dbms_rowid.rowid_block_number(rowid), 4) = 3`

This might not be the fastest option, but it is a generic one that works without any knowledge of the table structure you are exporting.

Keys

You can also split the table on a primary/unique key or any other column that offers an even distribution of values:

Job	Predicate
Data Pump job 1	`where mod(--primary_key_column--, 4) = 0`
Data Pump job 2	`where mod(--primary_key_column--, 4) = 1`
Data Pump job 3	`where mod(--primary_key_column--, 4) = 2`
Data Pump job 4	`where mod(--primary_key_column--, 4) = 3`

More Workers

If you need more sessions working on the table, you can change the modulus operation and add more Data Pump jobs.

How to Export With ROWID

Here is the first of the Data Pump parameter files:

logfile=exp_lob_0.log
dumpfile=exp_lob_0.dmp
tables=appuser.table_with_lob
query="where mod(dbms_rowid.rowid_block_number(rowid), 4) = 0"

The second one. Notice how the serial number changes in the logfile and dumpfile parameters. Also, the modulus operation changes:

logfile=exp_lob_1.log
dumpfile=exp_lob_1.dmp
tables=appuser.table_with_lob
query="where mod(dbms_rowid.rowid_block_number(rowid), 4) = 1"

The third one:

logfile=exp_lob_2.log
dumpfile=exp_lob_2.dmp
tables=appuser.table_with_lob
query="where mod(dbms_rowid.rowid_block_number(rowid), 4) = 2"

The fourth one:

logfile=exp_lob_3.log
dumpfile=exp_lob_3.dmp
tables=appuser.table_with_lob
query="where mod(dbms_rowid.rowid_block_number(rowid), 4) = 3"

Now, you can start four concurrent Data Pump jobs. You must execute the commands in different terminals:

expdp ... parfile=exp_lob_0.par
expdp ... parfile=exp_lob_1.par
expdp ... parfile=exp_lob_2.par
expdp ... parfile=exp_lob_3.par

How to Import

When importing, you change the LOB to SecureFile LOB, which allows parallel access. Then, take one Data Pump job at a time.

The first job does a regular schema import of the first dump file:
```
impdp ... \
   dumpfile=exp_lob_0.dmp \
   logfile=imp_lob_0.log
   transform=lob_storage:securefile \
   parallel=4
```
- It also creates the table itself.
- You should always convert the LOB to SecureFile LOB during import. Conversion happens on-the-fly and allows for parallel import of data. Even if the LOB originally was a BasicFile LOB.

Now, proceed with the other dump files in serial. Since you already converted the LOB to a SecureFile (previous step), you can now perform a Data Pump import using native parallelism:

impdp ... \
   dumpfile=exp_lob_1.dmp \
   logfile=imp_lob_1.log
   parallel=4 \
   table_exists_action=append
impdp ... \
   dumpfile=exp_lob_2.dmp \
   logfile=imp_lob_2.log
   parallel=4 \
   table_exists_action=append
impdp ... \
   dumpfile=exp_lob_3.dmp \
   logfile=imp_lob_3.log
   parallel=4 \	  
   table_exists_action=append

Notice how the dumpfile and logfile changes for each job.
Run the jobs in serial. Each job will use Data Pump parallelism.

That’s it!

Few Words of Caution

You must figure out how you create indexes – if they are present on the table. Ideally, you want to postpone index creation until the last job has loaded its rows. Otherwise, you’ll have expensive index maintenance happening when you load data.
Data Pump uses Advanced Queueing, which relies on the streams pool in the SGA. Be sure that streams_pool_size is set high enough when you start multiple data pump sessions. In most situations, setting it to 2G should be more than enough:
```
alter system set streams_pool_size=2G scope=memory;
```
Do this in a maintenance window only with no users connected.

Things to Consider When Importing Advanced Queues using Oracle Data Pump

Oracle Data Pump supports moving Advanced Queues (AQ) using export/import and full transportable export/import. But there are some things to be aware of.

Data Pump Does Not Start Queues

Data Pump creates the queues during import but does not start the queues. At the end of the import, you must manually start the queues:

exec dbms_aqadm.start_queue(queue_name => ... );

Be sure to start all queue:

select owner, name, queue_table, enqueue_enabled, dequeue_enabled 
from   dba_queues;

This behavior is intentional. During a migration, you typically don’t want to use the queues. Depending on your use case, it can have unintended side effects. After the migration, when you confirm a successful migration and are ready to go live, you can manually start the queues.

If you forget to start the queues, your application will start to receive the following error:

ORA-25207: enqueue failed, queue string.string is disabled from enqueueing
ORA-25226: dequeue failed, queue string.string is not enabled for dequeue

The Database Does Not Create All Queue Objects

Data Pump creates the queues using the Advanced Queuing administration API DBMS_AQADM:

exec dbms_aqadm.create_queue_table( ...

The database then creates the essential parts of the underlying queue infrastructure: tables, views, IOTs, etc. This should include:

<queue_table_name>
AQ$_<queue_table_name>_E
AQ$_<queue_table_name>_I
AQ$_<queue_table_name>_T
AQ$_<queue_table_name>_F

In the source database, depending on your configuration and use of Advanced Queueing, you might also see the following objects:

AQ$_<queue_table_name>_C
AQ$_<queue_table_name>_D
AQ$_<queue_table_name>_G
AQ$_<queue_table_name>_H
AQ$_<queue_table_name>_L
AQ$_<queue_table_name>_P
AQ$_<queue_table_name>_S
AQ$_<queue_table_name>_V

This is intentional. The database creates those objects when needed. You should never create them or otherwise touch the underlying queue objects.

Typically, as part of a migration, you compare the count of objects in the source and target database to ensure nothing is lost. This comparison must consider the above.

For queue tables, it is sufficient to compare using the following query:

SQL> select count(*) 
     from dba_objects
     where owner='<schema>' 
           and object_type = 'QUEUE';

Definition Of Queue Tables Might Change

As stated, you should not worry about the underlying queue objects. However, if you compare the underlying queue objects, you might see that the definition of the object changes. I describe this situation in a different blog post.

Migrate Old Queues

If the queues are REALLY old, the queue compatibility setting might also be old. If so, I strongly recommend migrating the old queues to the newest version. You can learn more about that in this blog post.

Appendix

Honey, I Shrunk the Database! Why Is My Database Smaller after Migration?

In a recent migration, a customer noticed that the size of the database decreased dramatically after the import. The size of the data files went from 500 GB to only 300 GB. Data Pump didn’t report any errors. Nevertheless, the customer was afraid that data was lost during migration.

The customer asked:

Am I missing data?

Why Would a Database Shrink During Import?

First, the following applies to Data Pump imports. If you migrate with transportable tablespaces or Data Guard, things are different.

Fragmentation

If you perform a lot of DML on a table, it will become fragmented over time. A fragmented table will use much more space, because the database will only re-use a data block with free space, once the free space reaches a certain limit (PCTUSED). Often, this leads to blocks that are never filled and waste of space.

When you import, the database neatly packs all blocks, and the table is completed defragmented.

Depending on the nature of your DML statements, you can see a dramatic impact on space usage for a defragmented table.

Indexes

After importing the data, Data Pump rebuilds the indexes. In many cases, an index has some level of fragmentation, but a rebuild removes all that, resulting in a neatly packed and efficient index. Most often, an index rebuild ends up using less space.

PCTFREE

If you change PCTFREE for an existing object, it applies to new blocks only. The existing blocks are left untouched. Any space saving from the lower setting applies to new data blocks only.

However, during an import (or index rebuild) the database builds the object from scratch and applies the setting to all the blocks. Now, all blocks have the new settings.

Connor’s Words

Connor McDonald was part of the original conversation with the customer. He phrased it like this:

Create a 100 GB table, and a 1 GB table. Your database is now (at least) 101 GB. Drop the 100 GB table and do an export/import. You now have a 1 GB database.

How Can You Know No Data Is Lost?

If you can’t use the database’s size to indicate whether data is lost, what do you do?

We covered this in one of our webinars, Migrating Very Large Databases.

Can the Database Become Bigger?

The headline of this blog post is a reference to Honey, I shrunk the kids. The sequel is called Honey, I Blew Up the Kid.

Then, can the database become bigger as part of an import? Yes, it may.

If you move from a singlebyte character set to Unicode, then characters stored in your database may take up more space. In particular, LOBs now take up much more space.
PCTFREE setting of your segments may also cause them to increase in size. Opposite of the case above.

Can Data Pump Export to ASM Storage?

The other day, I was helping my boss, Roy, with a benchmark for our session Data Pump New Features and Best Practice at Oracle DatabaseWorld at CloudWorld. I needed to find a database with access to very fast storage, and I decided to use a Base Database Service in OCI.

But the fast storage is on ASM. I never tried exporting with Data Pump to ASM storage. Is it possible?

How to Export to ASM Storage?

First, I create a directory in ASM:

ASMCMD> cd DATA
ASMCMD> mkdir DATA_PUMP

Then, I create a database directory pointing to the ASM directory:

SQL> create directory myasmdir as '+DATA/DATA_PUMP';
SQL> grant read, write to ... ;

And start a Data Pump export:

expdp ... \
   directory=myasmdir \
   dumpfile=exp%L.dmp \
   logfile=exp.log

It fails:

Connected to: Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production
ORA-39002: invalid operation
ORA-39070: Unable to open the log file.
ORA-29283: invalid file operation: nonexistent file or path [29434]

It turns out that the ASM directory can’t hold the log file. You must store it in a regular file system (see appendix).

I create an additional directory for the log file:

SQL> create directory mylogdir as '/tmp';

Start the Data Pump export, now, redirecting the log file to regular storage:

expdp ... \
   directory=myasmdir \
   dumpfile=exp%L.dmp \
   logfile=mylogdir:exp.log

Appendix

Log File

If you don’t have any access to a regular local file system, you can also start a Data Pump job without writing to a log file:

expdp ... nologfile=y

Data Pump Bundle Patch

When you work with Data Pump in Oracle Database 19c, you should always install the Data Pump bundle patch. In 19.20.0, we have more than 150 Data Pump specific fixes included.

Is It Possible to Migrate SQL Plan Baselines Before You Migrate Data

In a recent migration, a customer had millions of SQL plan baselines. The customer performed a Full Transportable Export/Import, but the export of the SQL plan baselines took a lot of time. I suggested moving the SQL plan baselines the day before the downtime window.

But can you move SQL plan baselines into an empty database; before you move the data? Do SQL plan baselines care about the underlying schema objects the SQL accesses?

SQL Plan Baselines and Underlying Schema Objects

The optimizer matches a SQL statement with a SQL plan baseline using the signature of the SQL (see appendix). Originally, I thought that it was the SQL ID, but I was wrong:

SQL statements are matched to SQL plan baselines using the signature of the SQL statement. A signature is a unique SQL identifier generated from the normalized SQL text (uncased and with whitespaces removed).

SQL Plan Management uses the signature only to match SQLs to SQL plan baselines. There is no reference to the underlying schema objects, which is evident since the same SQL from different schemas share the same signature.

The conclusion is that importing SQL plan baselines into an empty database is safe before you move the data.

Empty Database and Plan Evolution

But there is a catch. You don’t want plan evolution to happen in a database with no data.

Plan evolution is the process that validates whether a plan performs better than existing ones and, if so, marks the plans as accepted. This can have undesired side effects in an empty database.

There are two ways to avoid this:

Disable the SPM Evolve Advisor Task:

BEGIN
   DBMS_SPM.SET_EVOLVE_TASK_PARAMETER(
      task_name => 'SYS_AUTO_SPM_EVOLVE_TASK',
      parameter => 'ALTERNATE_PLAN_BASELINE',
      value => '');
   DBMS_SPM.SET_EVOLVE_TASK_PARAMETER(
      task_name => 'SYS_AUTO_SPM_EVOLVE_TASK',
      parameter => 'ALTERNATE_PLAN_SOURCE',
      value => '');
END;
/

Don’t allow plans to auto-evolve using the accept_plans parameter:

BEGIN
   DBMS_SPM.SET_EVOLVE_TASK_PARAMETER(
      task_name => 'SYS_AUTO_SPM_EVOLVE_TASK' ,
      parameter => 'ACCEPT_PLANS',
      value => 'FALSE');
END;
/

How to Migrate SQL Plan Baselines

DBMS_SPM

The easiest way you can move SQL plan baselines is using DBMS_SPM. You can find a good example of using it in the documentation.

Data Pump

Data Pump exports SQL plan baselines as part of the SQL Management Base (SMB). Data Pump includes the SMB as part of a full export only. To export the SMB only:

expdp ... full=y include=SMB

Please note that the SMB includes the following as well:

SQL Profiles
SQL Patches
SQL Plan Directives

How to Exclude SQL Plan Baselines

If you migrate SQL plan baselines in advance, you should exclude them from the production migration. As described above, the SQL plan baselines are part of the SQL Management Base. You exclude it using:

expdp ... exclude=SMB

That will exclude the items below. Depending on how you move the SQL plan baselines and if you need the other items in the target database, you might need to move these items manually:

SQL Profiles You can move SQL Profiles using DBMS_SQLTUNE.
SQL Patches You can move SQL Patches using DBM_SQLDIAG.
SQL Plan Directives You can move SQL plan directives using DBMS_SPD. Only relevant if you use Adaptive Statistics.

Appendix

SQL Signature

The signature of an SQL is:

A numeric hash value computed using a SQL statement text that has been normalized for case insensitivity and white space. It uniquely identifies a SQL statement. The database uses this signature as a key to maintain SQL management objects such as SQL profiles, SQL plan baselines, and SQL patches.

Let me use an example. If you have the following SQL text:

select * from dual

The signature is:

14103420975540283355

There is a function that calculates the signature from the SQL text:

col signature format 99999999999999999999999
select dbms_sqltune.sqltext_to_signature ('select * from dual') as signature from dual;

Acknowledgment

Thanks to Nigel Bayliss, optimizer product manager, for valuable assistance.

The Results

The Benchmark

How to

What Happens

Parallel Export

Parallel Import

The Fine Print

Conclusion

What Are Common Objects

What Happens In Oracle Data Pump

On Export

On Import

Why Don’t Data Pump Create The Common Objects?

What Can You Do?

A CDB Blueprint

Examine the Dump File

How Data Pump Works

How the Database Stores LOBs

When Statistics Are Not Enough

Faster Exports

Bundle Patch

Don’t Estimate By Statistics

Fake Statistics

Use Partitioning

Final Words

Parallel and Export

Import From a Single Dump File

Import From a Previous Release

Want to Know More?

Data Pump

Data Pump and SQLFILE

DBMS_METADATA

SYS Jobs

Appendix

Credits

Background

Solution

Generating the Predicates

ROWIDs

Keys

More Workers

How to Export With ROWID

How to Import

Few Words of Caution

Data Pump Does Not Start Queues

The Database Does Not Create All Queue Objects

Definition Of Queue Tables Might Change

Migrate Old Queues

Appendix

Further Reading

Why Would a Database Shrink During Import?

Fragmentation

Indexes

PCTFREE

Connor’s Words

How Can You Know No Data Is Lost?

Can the Database Become Bigger?

How to Export to ASM Storage?

Appendix

Log File

Data Pump Bundle Patch

SQL Plan Baselines and Underlying Schema Objects

Empty Database and Plan Evolution

How to Migrate SQL Plan Baselines

DBMS_SPM

Data Pump

How to Exclude SQL Plan Baselines

Appendix

SQL Signature

Acknowledgment