How to Trace Oracle Data Pump

Data Pump is hanging!

If you ever think the above, the answer is probably:

No, it’s not, it’s working…

It might be working in a suboptimal way, but nevertheless, it’s working. Here’s how you can tell what’s going on.

My Ultimate Tracing Guide

I use this approach every time I work with Oracle Data Pump and need to see what happens under the hood.

Before Data Pump

  • I always ensure dictionary and fixed objects statistics are current before starting Data Pump:

    begin
       dbms_stats.gather_schema_stats('SYS');
       dbms_stats.gather_schema_stats('SYSTEM');
       dbms_stats.gather_fixed_objects_stats;
    end;
    /
    
    • I usually don’t use gather_dictionary_stats. I prefer gathering schema stats instead. I’ve seen a few edge cases with gather_dictionary_stats over the years, but mostly it works fine. But you know, old dog and new tricks…
    • You should always follow this advice even if you are not tracing a specific problem.
  • I change the AWR snapshot interval to 15 minutes to get better granularity and manually create an AWR snapshot:

    begin
       dbms_workload_repository.modify_snapshot_settings(null, 15);
       dbms_workload_repository.create_snapshot;
    end;
    /
    
    
  • Optionally, if I need to diagnose a performance problem, I enable SQL trace for Data Pump processes:

    alter system set events 'sql_trace {process: pname = dw | process: pname = dm} level=8';
    
    • Or for a specific SQL ID:
    alter system set events 'sql_trace[SQL: <sql_id>]';
    

Starting Data Pump

  • I add diagnostic information (metrics and logtime) to the log file and turn on tracing:
    expdp ... metrics=yes logtime=all trace=1FF0300
    impdp ... metrics=yes logtime=all trace=1FF0300
    
    • 1FF0300 turns on tracing for more or less everything in Data Pump, but there are other trace levels.

After Data Pump

  • I create a new AWR snapshot:
    begin
       dbms_workload_repository.create_snapshot;
    end;
    /
    
    • You might also want to reset the AWR snap interval to the previous setting.
  • Plus, I create an AWR report spanning the entire period:
    @?/rdbms/admin/awrrpt
    
  • Data Pump writes trace information into the process trace files. You can find them in the database trace directory:
    • Control process trace files named *dm*.trc.
    • Worker process trace files named *dw*.trc.
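
To locate the trace directory, you can query v$diag_info:

select value from v$diag_info where name = 'Diag Trace';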

Happy Debugging

Normally, the above information is enough to figure out the underlying problem. If you supply the above to Oracle Support when you open the next case, they’ll be super excited.

It’s a Wrap – DOAG 2023 Conference + Exhibition

Relaxing a bit after yet another exciting visit to Germany for the annual DOAG conference. The conference always delivers. Many great speakers and many interesting sessions. If you have never attended, you should try it out. Even if you don’t speak German, there are plenty of English sessions. Don’t hesitate to visit next year.

This year was slightly different, as all my sessions were with either Mike or Roy. I really enjoy presenting with other people. It brings extra energy into the session, and I’m sure the audience also feels it. When you also have a completely full room – even with a standing crowd – that serves as additional motivation.

The Slides

Move to Oracle Database 23c – CDB-architecture migration on auto-pilot

The slides contain essential information for everyone who will work on upgrading to Oracle Database 23c and adopting the multitenant architecture.

Data Pump Best Practices, Secrets and Real World Scenarios

The slides give an overview of some new features and tips and tricks for troubleshooting in Data Pump. Finally, it describes how LOBs work in the database, why Data Pump might not deliver the expected performance – and what you can do about it.

Help! My Database Is Still on 8i!

What do product managers do when they find an empty slot at a conference? Of course, they quickly throw in another presentation. Which is what happened this year. We took the audience on a walk down memory lane to the databases of the previous millennium.

Thanks

Thanks to the organizers at DOAG. Again, the conference was a huge success. I enjoy coming here. This is a great conference. If you can’t go to Oracle CloudWorld, this is a perfect second option.

Thanks to everyone who attended my sessions – especially those with all the good questions.

See you next year!


Data Pump and Parallel Transportable Jobs

In migrations, you often use transportable tablespaces or Full Transportable Export/Import (FTEX). Downtime is always of concern when you migrate, so having Data Pump perform the transportable job in parallel is a huge benefit.

How much benefit? Let’s find out with a little benchmark.

The Results

Let’s start with the interesting part: How much time can you save using parallel transportable jobs in Data Pump?

The following table lists four different scenarios. Each scenario consists of an export and an import. The total is the time Data Pump needs to finish the migration – the sum of export and import. In a real migration, many other things are in play, but here, we are looking solely at Data Pump performance.

Export             Time    Import             Time    Total
19c, no parallel   2h 2m   19c, no parallel   6h 44m  8h 46m
23ai, parallel 4   1h 48m  23ai, parallel 4   2h 33m  4h 21m
19c, no parallel   2h 2m   23ai, parallel 16  1h 23m  3h 25m
23ai, parallel 16  1h 8m   23ai, parallel 16  1h 23m  2h 31m

The first row is what you can do in Oracle Database 19c: almost 9 hours. Compare that to the last row, with parallel 16 in Oracle Database 23ai: almost a 3.5x reduction.

If you migrate from Oracle Database 19c to Oracle Database 23ai (3rd row), you can still use parallel import and gain a significant benefit.

The Benchmark

I used the following Oracle homes:

  • Oracle Database 19c
  • Oracle Database 23ai

My test database:

  • E-Business Suite database
  • 300 GB physical size
  • 630,000 database objects, including
    • 89,000 tables
    • 66,000 indexes
    • 60,000 packages
  • CPU_COUNT = 16
  • SGA_TARGET = 64G

My test machine:

  • OCI shape VM.Standard.E4.Flex
  • 8 CPUs
  • 128 GB mem
  • Fast disks (max. 128,000 IOPS)

How to

It’s very easy to enable parallel transportable jobs. If you want to use 16 parallel workers, on export:

expdp ... parallel=16 dumpfile=ftex%L.dmp

On import:

impdp ... parallel=16
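
While the job runs, you can check the requested parallel degree and the job state in dba_datapump_jobs (a quick sanity check):

select owner_name, job_name, state, degree
from   dba_datapump_jobs;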

What Happens

A Data Pump job consists of several object paths that Data Pump must process. An object path could be tables, indexes, or package bodies.

Parallel Export

Each worker takes an object path and starts to process it. The worker works alone on this object path. You get parallelism by multiple workers processing multiple object paths simultaneously.

Parallel Import

During import, Data Pump must process each object path in a certain order. Data Pump can only import constraints once it has imported tables, for example.

Data Pump processes each object path in the designated order, then splits the work in one object path across many workers. You get parallelism by having multiple workers process one object path in parallel.

The Fine Print

  • Parallel transportable jobs work in Oracle Database 21c and later. In Oracle Database 19c, a transportable job has no parallel capabilities.

  • Data Pump can use parallel only on transportable jobs via a dump file. Network mode is not an option for parallel transportable jobs.

  • If you export in Oracle Database 19c (which does not support parallel transportable jobs), you can still perform a parallel import into Oracle Database 23ai.

  • The export parallel degree and import parallel degree do not have to match. You can export with parallel degree 1 and import with parallel degree 16.

  • When you enable parallel jobs, Data Pump starts more workers. How many extra resources do they use?

    • I didn’t notice any significant difference in undo or temp tablespace use.
    • I didn’t notice any extra pressure on the streams pool, either. I had the streams pool set to 256M, and the database didn’t perform any SGA resize operation during my benchmark.
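
If you want to verify this on your own system, you can check for resize operations in v$sga_resize_ops, for example:

select component, oper_type, oper_mode, final_size
from   v$sga_resize_ops
where  component = 'streams pool';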

Conclusion

For migrations, parallel transportable jobs in Data Pump are a huge benefit. Every minute of downtime often counts, and this has a massive impact.

It’s a Wrap – UKOUG Conference ’23

I just finished my presentation at the UKOUG conference. This time, it was held at the Oracle office in Reading. Two intense days full of learning experiences.

It’s the 40th anniversary of UKOUG – that’s truly amazing. The community started when I was just a little child and still lives on today. What a change tech has undergone since then.

Congratulations to the board and the entire community on the 40th anniversary.

The Slides

Patch Me If You Can – Grid Infrastructure Edition

This is a modification of an existing talk about database patching, now focusing mostly on Oracle Grid Infrastructure. But since Oracle Database and Grid Infrastructure go hand in hand, it also covers some database topics.

You should flip through the slides if you work with Oracle Grid Infrastructure. And remember – always patch out-of-place.

Help! My Database is still on 8i!

I also had the opportunity to close the conference with my 8i talk. I really like this talk because it is a walk down memory lane. Plus, it includes demos using Oracle 8i Database. It’s cool to be old school.

For a little laugh, you can find a comparison of Oracle Database releases and mobile phones of the same age.

Thanks

Thanks to the board of UKOUG and the organizers for pulling off yet another successful conference. Thanks to the sponsors making it all possible and to everyone who attended my sessions or the conference in general.

It keeps impressing me how much you can learn in such a short time. My head is full. Luckily, the weekend is coming up.

P.S. The chocolate fountain was amazing (see below)!

Pictures

  • Welcome to UKOUG Conference '23
  • Presenting Patch Me If You Can - Grid Infrastructure Edition
  • Red Carpet at the 40th Anniversary Celebration
  • The Chocolate Fountain
  • Cool art

How to Use Oracle Data Pump When You Have Common Objects

One of the benefits of the multitenant architecture in Oracle Database is the concept of common objects. If you want to move data around with Oracle Data Pump, is there anything you must know?

What Are Common Objects

The documentation describes common objects as:

A common phenomenon defined in a root is the same in all containers plugged in to this root. … For example, if you create a common user account while connected to CDB$ROOT, then this user account is common to all PDBs and application roots in the CDB.

A simpler explanation:

The stuff you create as C##.... in the root container (CDB$ROOT).

The principles of commonality are:

  • A common phenomenon is the same in every existing and future container. …
  • Only a common user can alter the existence of common phenomena. More precisely, only a common user logged in to either the CDB root or an application root can create, destroy, or modify attributes of a user, role, or object that is common to the current container.

For illustration purposes, imagine a common profile used by a local user. You would create it like this:

alter session set container=CDB$ROOT;
create profile c##commonprofile1 ... ;
alter session set container=pdb1;
create user localuser1 ... profile c##commonprofile1;

With other object types, you can make it even more complex, but the principles remain the same.

What Happens In Oracle Data Pump

On Export

Data Pump

  • Connects to the PDB.
  • Extracts the relevant data and metadata.

Using the above example, Data Pump extracts the user DDL. The user DDL specifies the use of a common profile.

On Import

Data Pump

  • Connects to the PDB.
  • Assumes that you have already created the common objects.
  • Creates the local user localuser1 and specifies the use of the common profile c##commonprofile1.

If the common profile does not exist already, the creation of the user fails:

Processing object type DATABASE_EXPORT/SCHEMA/USER
ORA-39083: Object type USER:"LOCALUSER1" failed to create with error:
ORA-02380: profile C##COMMONPROFILE1 does not exist

Failing sql is:
 CREATE USER "LOCALUSER1" ... PROFILE "C##COMMONPROFILE1"

Why Doesn’t Data Pump Create The Common Objects?

  • First, Data Pump connects to the PDB itself to perform the import. Since you must create common objects in the root container, that’s obviously a problem. The importing user might not have access to the root container, so switching containers and creating common objects is impossible.

  • Also, it would break the contract in the multitenant architecture that each PDB is isolated. If one PDB were allowed to create common objects, those objects would also be available in other PDBs, and the isolation would no longer exist.

  • Finally, it would open the door to all sorts of security issues if a user connected to a PDB could create common objects.

What Can You Do?

A CDB Blueprint

Ideally, you should have a CDB blueprint: a definition of the common objects used in your databases. When you create a new CDB, an afterburner script creates those common objects according to your company guidelines.

Examine the Dump File

You can also examine the DDLs in the Data Pump file and create a list of common objects.

  1. Extract the DDL:
    impdp ... sqlfile=import_ddls.sql
    
  2. Search for C## and build a list of common objects needed:
    grep -n -i "C##" import_ddls.sql
    
    This is a crude search, and I imagine awk aficionados can improve it.
  3. Extract the DDL for the common objects from the source CDB:
    l_stmt := dbms_metadata.get_ddl(...);
    dbms_output.put_line(l_stmt);
    l_stmt := dbms_metadata.get_granted_ddl(...);
    dbms_output.put_line(l_stmt);
    
    Use dbms_metadata.get_granted_ddl to extract object and role grants and system privileges. See the sketch after this list.
  4. Create the common objects in the target CDB before import:
    alter session set container=cdb$root;
    create profile c##commonprofile1 ... ;
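
As a sketch of step 3, you can run something like this in CDB$ROOT on the source CDB; the profile name comes from the earlier example, so adjust object types and names to your own list:

set serveroutput on
declare
   l_stmt clob;
begin
   -- DDL for the common profile from the example above
   l_stmt := dbms_metadata.get_ddl('PROFILE', 'C##COMMONPROFILE1');
   dbms_output.put_line(l_stmt);
end;
/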
    

Understand How a Change of Database Time Zone Affects Transportable Tablespaces

In a recent migration using full transportable export/import, I noticed a lot of time spent on the following SQL:

SELECT NVL((SELECT 2
            FROM   sys.ku$_all_tsltz_tables 
            WHERE  owner = :1 AND table_name = :2), 0) 
FROM   sys.dual

The SQL ID was g3qu7py3g0yg0. Each execution of the SQL took only a few seconds, but Data Pump executed the statement many times.

Data Pump also wrote in the log file that the database timezone differed:

01-NOV-23 07:43:22.152: W-1 Source time zone is +00:00 and target time zone is -07:00.

The following applies to full transportable export/imports using the following Data Pump parameters:

full=y
transportable=always

What Happens?

In a full transportable import, if the database time zone is different, Data Pump translates any data of type Timestamp with local timezone (TSLTZ) to the new database timezone.

On startup, Data Pump detects the difference between the source and target database timezones. For each table, Data Pump checks whether it must convert data. If needed, the data is converted from the source database timezone and stored in the target database timezone.

The check and conversion take time. This is the price you must pay to change the database timezone on import.

Alternatively, you can import into a database with the same timezone; then Data Pump completely skips the check and conversion.

In the migration, we could save 22 minutes on import by importing into the same database time zone.

Changing the Database Timezone

You can find the database timezone using:

select dbtimezone from dual;

If you don’t have any tables with TSLTZ columns, you can change the database timezone:

alter database set time_zone='+00:00';
shutdown immediate
startup
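
To check for tables with TSLTZ columns, a query like this works:

select owner, table_name, column_name
from   dba_tab_columns
where  data_type like 'TIMESTAMP%WITH LOCAL TIME ZONE';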

The database timezone affects only:

  • TSLTZ columns
  • Function CURRENT_DATE
  • Function CURRENT_TIMESTAMP
  • Function LOCALTIMESTAMP

If you don’t use any of the above, it should be safe to change the database timezone.

Columns of type Timestamp with timezone (TSTZ) and the database timezone file (v$timezone_file) are totally unrelated to the database timezone.

Full Transportable vs. Traditional Transportable

In a traditional transportable import, Data Pump does not import tables with columns of type TIMESTAMP WITH LOCAL TIME ZONE (TSLTZ). You must move such tables using a regular Data Pump import.

As mentioned above, in a full transportable export/import, Data Pump translates the data to match the new database timezone.

How to Connect SQL Developer to a Base Database Service Using SSH Tunnel

Here is a quick guide showing you how to connect SQL Developer running on your computer to a Base Database Service in Oracle Cloud Infrastructure. The Base Database Service does not allow connections from the outside except SSH. So, you can use an SSH tunnel.

Create SSH Connection

  1. First, you start SQL Developer on your computer (of course, the latest version). Go to the View menu, and select SSH.
  2. To the left, you now find the SSH Hosts panel. Right-click and choose New SSH Host.
  3. Enter the following information and click OK:
    • Name: Give it a meaningful name.
    • Host: The public IP address of your Base Database Service.
    • Username: Set it to oracle.
    • Use key file: Tick the checkbox and select your private key file.
    • Add a Local Port Forward: Tick the checkbox.
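
For reference, the SSH host definition corresponds roughly to this manual tunnel (a sketch; key file, local port, and IP address are placeholders):

ssh -i <private-key-file> -N -L 1521:localhost:1521 oracle@<public-ip-address>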

Configure SSH host connection in SQL Developer

Create Database Connection

  1. In the Connections panel, click on the plus icon to create a new connection.
  2. Enter the following information:
    • Name: Give the database connection a meaningful name.
    • Supply a set of user credentials (Username, Password, Role).
    • Connection Type: Set it to SSH.
    • Port Forward: Select the SSH connection you just created.
    • Service name: The fully qualified service name (use lsnrctl status to find it).
  3. Click Test to ensure the connection works. If you get an error, jump to the troubleshooting part.
  4. Click Save and Connect.

Define connection in SQL Developer to connect using SSH tunnel

That’s it!

Now you can connect to your OCI instance from SQL Developer on your own computer.

Remember to use the latest version of SQL Developer to get access to all the goodies.

For Production?

It’s very unusual for a production database to have a public IP address. In a more secure environment, tunneling through a bastion host is a safer approach.

Troubleshooting

IO Error: Got minus one from read call

IO Error: Connection reset by peer

Vendor code 17002

I have run into this a few times. I haven’t found the root cause (yet), but this usually works for me:

  1. Stop the listener and take a backup of listener.ora:
    lsnrctl stop
    cp $ORACLE_HOME/network/admin/listener.ora $ORACLE_HOME/network/admin/listener.ora2
    
  2. Remove all references in listener.ora to the listener called LISTENER:
    vi $ORACLE_HOME/network/admin/listener.ora
    
    • LISTENER=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER))))
    • ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=ON
    • VALID_NODE_CHECKING_REGISTRATION_LISTENER=SUBNET
  3. Restart the listener:
    lsnrctl start
    
  4. Connections from SQL Developer should work now.
  5. Now comes the odd part. Add the parameters back into listener.ora and restart:
    vi $ORACLE_HOME/network/admin/listener.ora
    
    lsnrctl stop
    lsnrctl start
    
  6. The connections should still work.

Upgrade Base Database Cloud Service to Oracle Database 23c

Please see the updated post on Oracle Database 23ai.


Here’s a cool way of upgrading your database in OCI to Oracle Database 23c. I will move my PDB to a new Base Database System using refreshable clone PDBs and AutoUpgrade.

The benefits of using this approach are:

  • Shorter downtime than an in-place upgrade.
  • A brand-new Base Database System, which means the operating system and Oracle Grid Infrastructure are already on newer versions.

My Environment

I have one PDB that I want to upgrade. It’s called SALES.

Source

  • Base Database System on 19.20.0
  • Name DBS19

Target

  • Base Database System on 23.3.0
  • Name DBS23

How To

Prepare AutoUpgrade

  • I must use a version of AutoUpgrade that supports upgrades to Oracle Database 23c, and I must have AutoUpgrade on the source and target system.
    $ java -jar autoupgrade.jar -version
    
  • At the time of writing, the latest version of AutoUpgrade on My Oracle Support does not support 23 as a target release. Instead, I copy AutoUpgrade from the target Oracle Home (on DBS23) to the source Oracle Home (on DBS19). This version allows upgrades to Oracle Database 23c.
  • I create an AutoUpgrade config file, named sales.cfg, which I store on both servers:
    global.autoupg_log_dir=/u01/app/oracle/cfgtoollogs/autoupgrade
    global.keystore=/u01/app/oracle/cfgtoollogs/keystore
    upg1.source_home=/u01/app/oracle/product/19.0.0.0/dbhome_1
    upg1.target_home=/u01/app/oracle/product/23.0.0.0/dbhome_1
    upg1.sid=CDB19
    upg1.pdbs=SALES
    upg1.target_cdb=CDB23
    upg1.source_dblink.SALES=CLONEPDB 600
    upg1.target_pdb_copy_option.SALES=file_name_convert=none
    upg1.target_version=23
    upg1.start_time=05/11/2023 02:00:00
    
    • I must specify global.keystore to allow AutoUpgrade to create a keystore to work with my encrypted PDB.
    • source_home and target_home list the Oracle Home of the source and target CDB, respectively. It doesn’t matter that each home exists on only one of the servers.
    • sid and target_cdb contain the SID of the source and target CDB, respectively.
    • pdbs contains the name of the PDB I want to upgrade, sales. If needed, I could specify more PDBs.
    • source_dblink has the name of the database link (CLONEPDB) and the rate at which the target CDB brings over redo and rolls forward the copy (600 seconds or 10 minutes).
    • I want to use ASM and Oracle Managed Files, so I set target_pdb_copy_option accordingly.
    • Since my source and target CDB are not on the same host, AutoUpgrade can’t automatically determine the target version. I specify that manually using target_version.
    • start_time specifies when downtime starts. At this point, AutoUpgrade refreshes the PDB for the last time and then moves on with the upgrade.

Prepare Source PDB

  • I connect to the source PDB. I create a user (for a database link) and grant privileges:

    create user dblinkuser identified by ... ;
    grant create session, 
          create pluggable database, 
          select_catalog_role to dblinkuser;
    grant read on sys.enc$ to dblinkuser;
    
  • After the upgrade, I can drop the user again.

Prepare Target CDB

  • I connect to the target CDB and create a database link pointing to my source PDB:
    create database link clonepdb
    connect to dblinkuser
    identified by dblinkuser
    using '<connection-string-to-source-pdb>';
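    • You can verify that the database link works before starting AutoUpgrade:
    select * from dual@clonepdb;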
    

Analyze and Fix Source PDB

  • First, I analyze the source PDB for upgrade readiness. On the source system:
    java -jar autoupgrade.jar -config sales.cfg -mode analyze
    
  • The summary report lists the following precheck failures, which I safely ignore:
    • TDE_PASSWORDS_REQUIRED – I will fix that on the target system.
    • TARGET_CDB_AVAILABILITY – the target CDB is remote, and AutoUpgrade can’t analyze it.
  • Then, I execute the preupgrade fixups:
    java -jar autoupgrade.jar -config sales.cfg -mode fixups
    
    • This changes my source PDB, so I do it as close to my maintenance window as possible.

Upgrade

  • Since my PDB is encrypted, I must add the source and target CDB keystore password to the AutoUpgrade keystore. I start the TDE console on the target host:

    java -jar autoupgrade.jar -config sales.cfg -load_password
    
  • In the TDE console, I add the keystore passwords of the source and target CDB:

    TDE> add CDB19
    Enter your secret/Password:    
    Re-enter your secret/Password: 
    TDE> add CDB23
    Enter your secret/Password:    
    Re-enter your secret/Password: 
    
  • I save the passwords and convert the AutoUpgrade keystore to an auto-login keystore:

    TDE> save
    Convert the keystore to auto-login [YES|NO] ? 
    
  • I start AutoUpgrade in deploy mode:

    java -jar autoupgrade.jar -config sales.cfg -mode deploy
    
    • AutoUpgrade copies the data files over the database link.
    • Rolls the copies of the data files forward with redo from the source.
    • At one point, issues a final refresh and disconnects the PDB from the source.
    • Upgrades the PDB.
  • I have now upgraded my PDB to Oracle Database 23c.

The Fine Print

You should:

  • Check the words for caution in my previous blog post on AutoUpgrade and refreshable clone PDBs.
  • Start a new full backup of the target database after the migration.
  • Familiarize yourself with the concept of the AutoUpgrade keystore.

Also, notice:

  • The PDB is now on a different Base Database System. You need to update your connection string.
  • The source PDB must be Oracle Database 19c or newer to upgrade directly to Oracle Database 23c.
  • The OCI console will recognize and display the new PDB after a while. You don’t have to do anything other than wait for the automatic sync job.

Oracle Grid Infrastructure and Apache Tomcat

Oracle Grid Infrastructure (GI) uses some functionality from Apache Tomcat. You can find Apache Tomcat in the GI Home. How do you ensure that Apache Tomcat is up to date?

It’s Easy

The Release Updates for Oracle Grid Infrastructure also contain patches for Apache Tomcat:

Patching of Tomcat within the GI home is handled via the Quarterly Grid Infrastructure Release Updates.

The following example shows the output of a GI home on 19.19.0. You can see there is a specific patch for Apache Tomcat (TOMCAT RELEASE UPDATE):

$ cd $ORACLE_HOME/OPatch
$ ./opatch lspatches
35050341;OJVM RELEASE UPDATE: 19.19.0.0.230418 (35050341)
35004974;JDK BUNDLE PATCH 19.0.0.0.230418
35107512;TOMCAT RELEASE UPDATE 19.0.0.0.0 (35107512)
35050325;ACFS RELEASE UPDATE 19.19.0.0.0 (35050325)
35042068;Database Release Update : 19.19.0.0.230418 (35042068)
33575402;DBWLM RELEASE UPDATE 19.0.0.0.0 (33575402)

How to Find Tomcat Version

Use the following command to find the Apache Tomcat version:

$ cd $ORACLE_HOME/tomcat/lib

$ java -cp catalina.jar org.apache.catalina.util.ServerInfo
Server version: Apache Tomcat/8.5.84
Server built:   Nov 16 2022 13:34:24 UTC
Server number:  8.5.84.0
OS Name:        Linux
OS Version:     4.14.35-2047.510.5.5.el7uek.x86_64
Architecture:   amd64
JVM Version:    1.8.0_371-b11
JVM Vendor:     Oracle Corporation

Can I Update Tomcat Manually?

No, you can only update Apache Tomcat as part of a Release Update:

Oracle is continuously monitoring TOMCAT fixes for CVEs. Once a fix is found and the fix is in an object in one of the JAR files of the compact distribution, we start the process to incorporate the TOMCAT version with the fix in GI. … Patching outside of GI Release Updates is NOT supported.

My Security Team Flags Tomcat as Out-of-date

Many customers use tools to scan for vulnerabilities. Such tools might scan a GI home and find an out-of-date Apache Tomcat. To update Apache Tomcat, you must apply a newer Release Update.
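
A quick way to check whether a GI home already includes a Tomcat update (using the 19.19.0 home from above):

$ $ORACLE_HOME/OPatch/opatch lspatches | grep -i tomcat
35107512;TOMCAT RELEASE UPDATE 19.0.0.0.0 (35107512)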

If the latest Release Update does not contain a fix for a specific issue in Apache Tomcat:

  • Check Doc ID 555.1 for a one-off patch
  • Wait for the next Release Update

If you find the issue so critical that you can’t wait, reach out to Oracle Support with your concerns.

Further Reading

Tomcat in the Grid Infrastructure Home (Doc ID 2655066.1)

Data Pump and Faster Export of SecureFile LOBs

SecureFile LOBs are the best LOB type, partly because they allow parallel access to and from the LOB. The parallel access allows Data Pump to unload and load data faster.

The following applies to SecureFile LOBs, which you can identify with:

select owner, table_name, column_name 
from   dba_lobs 
where  securefile='YES';

There is another blog post for Data Pump and BasicFile LOBs.

How Data Pump Works

First of all, you must allow Data Pump to work in parallel. Only then does Data Pump assign one worker – and only one worker – per table data object, that is, a table, partition, or subpartition.

If the table data object is big enough, that one worker uses parallel query (PQ) to unload the data. The PQ workers hand the data over to the Data Pump worker, which writes it to the dump file.

How big is big enough? By default, Data Pump examines the object statistics, and if the segment is larger than 250 MB, it will use parallel query. You can change this threshold using the parallel_threshold parameter.

To export a SecureFile LOB in parallel:

  • You must allow Data Pump to do parallel work using the parallel parameter.
  • The table data object must be of a certain size.

Data Pump determines the size by using either:

  • Optimizer statistics (default) – estimate=statistics
  • Calculation – estimate=blocks

Why doesn’t Data Pump use parallel query on all objects? Because it takes time to start PQ workers and keep track of them. It is a waste on small table data objects and will often be slower. Parallel query makes sense only when a table data object reaches a certain size.

How the Database Stores LOBs

With LOBs you can potentially store a lot of data.

If a LOB (in a single row) is less than 4000 bytes, it is stored in-row together with the rest of the data. Having small LOBs as part of the table is efficient. However, if a LOB is larger than 4000 bytes, it is inefficient to store it in-row. Instead, the database stores the entire LOB in a separate LOB segment. This is called out-of-row.

In Oracle Database 23ai, you can optionally store up to 8000 bytes in-row.
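
You can see how each LOB is defined in dba_lobs; the in_row column shows whether LOBs that fit are stored in-row:

select owner, table_name, column_name, in_row
from   dba_lobs;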

When Statistics Are Not Enough

Update, 6 May 2024: The issue mentioned below is solved in the 19.23.0 Data Pump Bundle Patch.

If a LOB is stored in-row, the space it uses is added to the total segment space of the table. However, if the LOB is stored out-of-row, the LOB space is added to the LOB segment space usage.

The table statistics (dba_tab_statistics) show only the blocks (i.e., space) used by the table. The size of the LOB segment does not affect the size of the table in the optimizer statistics. This is by design, because the purpose of statistics is to enable the optimizer to come up with good plans, and the size of the LOB segment is irrelevant in that situation.

In other words, you can have a table with 100 rows and 1 TB of LOB data. If all that LOB data is stored out-of-row, the table looks really small.

If Data Pump determines the size of the table based on the statistics, that table with 1 TB LOBs looks really small. As a result, Data Pump will not use parallel query on it.
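
You can see the real footprint by comparing the table segment to its LOB segments in dba_segments; for example (schema and table names are illustrative):

select segment_name, segment_type, round(bytes/1024/1024/1024, 1) as gb
from   dba_segments
where  owner = 'APPUSER'
and    (segment_name = 'T1'
        or segment_name in (select segment_name
                            from   dba_lobs
                            where  owner = 'APPUSER'
                            and    table_name = 'T1'));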

This also applies to partitioned tables. In this case, Data Pump looks at the partition statistics instead of the table statistics.

Faster Exports

Bundle Patch

The patch to fix the above issue is included in the 19.23.0 Data Pump Bundle Patch.

Don’t Estimate By Statistics

You can change the way Data Pump determines the size of a table from statistics to a proper calculation:

expdp ... estimate=blocks

  • Estimating by blocks is slower than using statistics. You will see that the startup phase in Data Pump takes longer.
  • Due to a bug in Data Pump, this will not work unless your database is on 19.18.0 with the Data Pump bundle patch.

Fake Statistics

Since Data Pump only looks at the table data object statistics, you can fake the statistics. Trick Data Pump into believing the table itself is huge by setting statistics to high values:

begin
   dbms_stats.set_table_stats (
      ownname => 'APPUSER',
      tabname => 'T1',
      numrows => 10000000,
      numblks => 1000000);
end;
/

  • You must do this for all tables with big LOBs.
  • It requires testing to get the best result.
  • The fake statistics will influence the choices made by the optimizer, so only do this in your maintenance window. Also, note that setting statistics invalidates cursors in the library cache.
  • Ensure that the statistics gathering job doesn’t overwrite your fake stats.
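
One way to keep the statistics gathering job away from the fake stats is to lock the table statistics right after setting them:

begin
   -- lock the fake stats; unlock with dbms_stats.unlock_table_stats after the export
   dbms_stats.lock_table_stats(ownname => 'APPUSER', tabname => 'T1');
end;
/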

Use Partitioning

If you partition a table, Data Pump uses one worker per partition or subpartition. Data Pump might also use parallel query on a single partition/subpartition depending on the number of partitions and subpartitions and available parallel processes. But only if the partition statistics reflect that it is significant in size. Otherwise, you run into the same problem as described above.

An advantage of having multiple Data Pump workers on the same partitioned table is that more workers can write to dump files simultaneously. Not only can Data Pump read faster from the database, it can also write faster to the dump files.

Final Words

Our developers solved the issue in the 19.23.0 Data Pump Bundle Patch. The best solution is of course to stay current and apply the Data Pump Bundle Patch.

Finally, as a reminder, always convert your LOBs to SecureFile LOBs on import:

impdp ... transform=lob_storage:securefiles