Why Does Data Pump Need a Read Write Tablespace

The other day I had to export data using Data Pump and ran into an error:

expdp appuser/appuser ...

ORA-31626: job does not exist
ORA-31633: unable to create master table "APPUSER.SYS_EXPORT_SCHEMA_01"
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 95
ORA-06512: at "SYS.KUPV$FT", line 1163
ORA-01647: tablespace 'APPTS' is read-only, cannot allocate space in it
ORA-06512: at "SYS.KUPV$FT", line 1056
ORA-06512: at "SYS.KUPV$FT", line 1044

APPTS is the name of the default tablespace of the exporting user, APPUSER. It appears that Data Pump requires the default tablespace of the exporting user to be in read write mode. Why is that?

The Control Table

To keep track of an export or import job, Data Pump writes information to a so-called control table. The control table is a regular table stored in the database.

Data Pump creates the control table in the default tablespace of the user executing the Data Pump job. It is not possible to store the control table in a different tablespace.

At the end of an export, as the very last thing, Data Pump exports the control table to the dump file. Similarly, at the start of an import, as the very first thing, Data Pump imports the control table.

Normally, you don’t see the control table in the database because Data Pump drops it when a job ends. You can change that by setting KEEP_MASTER=YES. This allows you to query the control table. The name of the control table is the same as the Data Pump job:

SQL> select * 
     from <executing_user>.<name_of_data_pump_job>

If you change the name of the Data Pump job using the JOB_NAME parameter, the name of the control table changes accordingly.

Previously, the control table was called the master table. That’s why the parameter KEEP_MASTER is named as it is.
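
For example, here is a minimal sketch of keeping and inspecting the control table; the directory, dump file, and job name (MY_EXPORT) are made-up values for illustration:

$ expdp appuser/appuser schemas=appuser directory=mydir dumpfile=app.dmp \
    job_name=my_export keep_master=yes

SQL> -- the control table carries the name of the job
SQL> select *
     from appuser.my_export
     where rownum <= 10;

SQL> -- drop it when you are done inspecting it
SQL> drop table appuser.my_export;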

Learn more about the control table in the webinar Data Pump Extreme – Deep Dive with Development.

Other Requirements

So, if Data Pump must create a table in the database, does that mean there are other requirements?

Yes, it does. Here is a list of the requirements (one way to check them is sketched after the list):

  • Database must be open in read write mode
  • Executing user must have CREATE TABLE privilege
  • Executing user must have a quota on its default tablespace
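
Here is one way to check those requirements, using the user and tablespace from the error above (APPUSER and APPTS). Note that dba_sys_privs shows direct grants only; the privilege could also come through a role:

SQL> select open_mode from v$database;

SQL> select status
     from dba_tablespaces
     where tablespace_name = 'APPTS';

SQL> select privilege
     from dba_sys_privs
     where grantee = 'APPUSER'
     and privilege = 'CREATE TABLE';

SQL> select tablespace_name, bytes, max_bytes
     from dba_ts_quotas
     where username = 'APPUSER';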

Exporting From a Standby Database

Since there is a requirement that the database must be open in READ WRITE mode, it means that you can’t use Data Pump to export from a standby database. A standby database is always either MOUNTED or OPEN READ ONLY.

Workarounds

  1. If you have a standby database, you can temporarily convert it into a snapshot standby database and perform the export.
  2. Perform the import directly over a database link using the NETWORK_LINK option. See My Oracle Support note How To Use DataPump Export (EXPDP) To Export From Physical Standby Database (Doc ID 1356592.1) for details. Thanks to Franck Pachot for the tip.

Option 1 is preferable since Data Pump in network mode has a few restrictions.
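
If you do need workaround 2, here is a minimal sketch run on the target database (which must be open read write); the database link name, credentials, and TNS alias are placeholders:

SQL> create database link stby_link
     connect to system identified by oracle
     using 'STANDBY_TNS_ALIAS';

$ impdp system schemas=appuser network_link=stby_link directory=mydir logfile=imp_from_standby.log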

Understand Why Data Pump Produces ORA-39218 or ORA-39216 on Evolved Types

Transportable Tablespaces is a great way of migrating data from one Oracle Database to another. However, if your database uses object types and those object types have evolved over time, you might hit an error during import:

W-1 Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE
ORA-39083: Object type TABLE:"APPUSER"."CARS" failed to create with error:
ORA-39218: type check on object type "APPUSER"."CAR_TYPE" failed
ORA-39216: object type "APPUSER"."CAR_TYPE" hashcode or version number mismatch

Failing sql is:
BEGIN SYS.DBMS_METADATA.CHECK_TYPE('APPUSER','CAR_TYPE','4','613350A0E37618A6108C500936D0C6162C',''); END;

Typically, a number of dependent objects fail with ORA-39112 as well.

Let me explain the problem and what you can do about it.

Object Types and Evolution

The term object types refers to database objects that you create using the CREATE TYPE statement.

Object types are also known as user-defined types (UDT) and abstract data types (ADT).

When Oracle Database introduced object types many releases ago, they could not evolve. In simple words, you only had CREATE TYPE available. If you needed to change the type, you had to drop and recreate it. Later on, Oracle Database allowed type evolution, and you could also do ALTER TYPE.

An Example of Evolution

Let me illustrate object types and evolution using an example:

  • I create a supertype called VEHICLE_TYPE
  • I create another type called CAR_INFO_TYPE
  • I create a subtype called CAR_TYPE which inherits VEHICLE_TYPE and includes an attribute based on CAR_INFO_TYPE

Overview of the types used in the example - part 1

I add the three types:

CREATE TYPE "APPUSER"."VEHICLE_TYPE" 
   IS OBJECT (
      make VARCHAR2 (40)
) NOT FINAL;
/
CREATE TYPE "APPUSER"."CAR_INFO_TYPE" 
   IS OBJECT (
      model VARCHAR2(40)
);
/
CREATE TYPE "APPUSER"."CAR_TYPE" 
   UNDER VEHICLE_TYPE (
      car_info CAR_INFO_TYPE
);
/

I query the data dictionary to find the types and verify that their version is 1. I expect this because the types are brand-new.

select o.name, t.version# 
from obj$ o, user$ u,type$ t 
where o.owner# = u.user# and o.oid$ = t.toid and u.name='APPUSER'
order by o.name, t.version#;

NAME             VERSION#
-------------------------
CAR_INFO_TYPE           1
CAR_TYPE                1
VEHICLE_TYPE            1

Now I want to add two additional attributes to CAR_INFO_TYPE:

  • horsepower
  • color

Overview of the types used in the example - part 2

ALTER TYPE "APPUSER"."CAR_INFO_TYPE" ADD ATTRIBUTE horsepower NUMBER CASCADE
/
ALTER TYPE "APPUSER"."CAR_INFO_TYPE" ADD ATTRIBUTE color VARCHAR2(20) CASCADE
/

Now, I check the data dictionary again. CAR_INFO_TYPE has now advanced two versions to version 3. Two ALTER TYPE statements mean two new versions. But notice CAR_TYPE. It advances as well.

select o.name, t.version# 
from obj$ o, user$ u,type$ t 
where o.owner# = u.user# and o.oid$ = t.toid and u.name='APPUSER'
order by o.name, t.version#;

NAME             VERSION#
-------------------------
CAR_INFO_TYPE           1
CAR_INFO_TYPE           2
CAR_INFO_TYPE           3
CAR_TYPE                1
CAR_TYPE                2
CAR_TYPE                3
VEHICLE_TYPE            1

If you alter a type that another type uses, both types evolve. No ALTER TYPE was issued on CAR_TYPE, yet it still evolves. This becomes a problem later on.

Same Objects but Different Evolution

If I use the example above but change the order of statements, the types will evolve in a different way. They will still be the same objects, but they evolve differently.

Let me illustrate how the types can evolve differently. I execute the ALTER TYPE statements on CAR_INFO_TYPE immediately after its CREATE TYPE statement:

CREATE TYPE "APPUSER"."VEHICLE_TYPE" 
   IS OBJECT (
      make VARCHAR2 (40)
) NOT FINAL;
/
CREATE TYPE "APPUSER"."CAR_INFO_TYPE" 
   IS OBJECT (
      model VARCHAR2(40)
);
/
ALTER TYPE "APPUSER"."CAR_INFO_TYPE" ADD ATTRIBUTE horsepower NUMBER CASCADE
/
ALTER TYPE "APPUSER"."CAR_INFO_TYPE" ADD ATTRIBUTE color VARCHAR2(20) CASCADE
/
CREATE TYPE "APPUSER"."CAR_TYPE" 
   UNDER VEHICLE_TYPE (
      car_info CAR_INFO_TYPE
);
/

Here you can see the two approaches compared to each other:

First approach                           Second approach
CREATE TYPE "APPUSER"."VEHICLE_TYPE"     CREATE TYPE "APPUSER"."VEHICLE_TYPE"
CREATE TYPE "APPUSER"."CAR_INFO_TYPE"    CREATE TYPE "APPUSER"."CAR_INFO_TYPE"
CREATE TYPE "APPUSER"."CAR_TYPE"         ALTER TYPE "APPUSER"."CAR_INFO_TYPE"
ALTER TYPE "APPUSER"."CAR_INFO_TYPE"     ALTER TYPE "APPUSER"."CAR_INFO_TYPE"
ALTER TYPE "APPUSER"."CAR_INFO_TYPE"     CREATE TYPE "APPUSER"."CAR_TYPE"

When I check the version of the types, I see that CAR_TYPE doesn’t evolve together with CAR_INFO_TYPE:

select o.name, t.version# 
from obj$ o, user$ u,type$ t 
where o.owner# = u.user# and o.oid$ = t.toid and u.name='APPUSER'
order by o.name, t.version#;

NAME             VERSION#
-------------------------
CAR_INFO_TYPE           1
CAR_INFO_TYPE           2
CAR_INFO_TYPE           3
CAR_TYPE                1
VEHICLE_TYPE            1

In contrast to the previous example, because I changed the order of the statements, CAR_TYPE is now still on version 1 (not 3 as previously).

When you alter a type, you decide whether you also want to update the rows in a table that uses that type (see the ALTER TYPE sketch after this list).

  • If you choose to update the corresponding table data, it can take quite some time on a big table.
  • If you choose to skip the update, the alteration will be fast. However, when you need to read the data, the database needs to use the evolution information to determine how to read the data.
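
As a sketch, the two behaviors map to the INCLUDING TABLE DATA and NOT INCLUDING TABLE DATA variants of the CASCADE clause (plain CASCADE, as used in the example above, defaults to converting the data). The attributes added here are made up for illustration:

ALTER TYPE "APPUSER"."CAR_INFO_TYPE"
   ADD ATTRIBUTE weight NUMBER
   CASCADE INCLUDING TABLE DATA      -- convert stored rows right away
/
ALTER TYPE "APPUSER"."CAR_INFO_TYPE"
   ADD ATTRIBUTE wheels NUMBER
   CASCADE NOT INCLUDING TABLE DATA  -- fast; rows are converted when read
/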

What Data Pump Does

Data Pump can recreate the types during a regular import and during a transportable tablespace import. But it handles the types one by one:

  • Create VEHICLE_TYPE (and all alterations if there were any)
  • Then create CAR_INFO_TYPE and all alterations
  • Then create CAR_TYPE (and all alterations if there were any)

Now, the definition of the types is correct (they have the right attributes and so on). But the evolution of the types does not necessarily match that of the source database. Unless the types in the source database were created and altered in exactly that order, the versions of the types might differ.

What Happens During a Regular Import

During a regular (not transportable) export, Data Pump always uses the external table access method when an evolved type is involved. This access method performs a CREATE TABLE <external table> ... AS SELECT, which converts each row’s version of a type to the highest version. All rows in the dump file are now of the highest version of the type.

On import, Data Pump creates the types, and their versions may differ from those in the source. When Data Pump loads the data into the table, all rows are on the same version because they were converted during export. The data access layer underneath Data Pump is more lenient here: if a type version is found with a matching hash code, the data can be imported safely.

Data Pump then checks for a matching object type using a hash code only. The version does not have to match. Here’s a query to get the hash code:

select o.name, t.version#, t.hashcode 
from obj$ o, user$ u, type$ t 
where o.owner# = u.user# and o.oid$ = t.toid and u.name='APPUSER'
order by o.name, t.version#;

What Happens During a Transportable Tablespace Import

In transportable tablespace mode, things are different. Data Pump does not unload the data into a dump file. Instead, the data stays in the data files. To read the data correctly, you must have the correct type definitions and information on how they evolved.

If the type evolution doesn’t match that of the source database, Data Pump can’t be sure the data is accurate and will issue an error. Using the above query, the type evolution must match on both version and hash code.

Solution

This is not a bug in Data Pump. Data Pump ensures that your data is not corrupted during import. Due to a restriction in the implementation of type evolution, Data Pump is not able to completely recreate a type’s evolution and this might cause problems.

These are your options:

Option 1: Use Regular Data Pump Import

As described above, during a regular import Data Pump handles the types and dependent objects differently.

  1. Perform transportable tablespace import.
  2. Find types and dependent objects that Data Pump can’t import.
  3. Import the affected types and objects using a regular Data Pump import.

Option 2: Recreate Types and Dependent Objects Without Data

In one case I worked on, a queue table used the evolved type. Before migration, the application team would ensure that the application processed all messages in the queue. Thus, at the time of migration, the queue was empty.

  1. The transportable tablespace import failed to create the queue table due to the evolved type.
  2. After import, the DBA recreated the queue using DBMS_AQADM (see the sketch below).
  3. The DBA also granted the necessary permissions on the queue.

If a regular table used the evolved type, you would need to recreate the table using CREATE TABLE syntax instead.
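
As a sketch of step 2, reusing the queue definitions from the test case in the appendix:

BEGIN
   DBMS_AQADM.CREATE_QUEUE_TABLE (
      queue_table        => 'APPUSER.CAR_QUEUE_TABLE',
      queue_payload_type => 'APPUSER.CAR_TYPE');

   DBMS_AQADM.CREATE_QUEUE (
      queue_name  => 'APPUSER.CAR_QUEUE',
      queue_table => 'APPUSER.CAR_QUEUE_TABLE');

   DBMS_AQADM.START_QUEUE (
      queue_name => 'APPUSER.CAR_QUEUE');
END;
/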

Option 3: Recreate Types Without Evolution Before Export

Recreate the types in the source database in the same way as Data Pump will create them on the target database: one type at a time. First the CREATE TYPE statement and any ALTER TYPE statements on that type, then the next type, and the next, and the next…

If a dependent table or queue uses the evolved type, it can be quite cumbersome and time consuming to recreate the type in the source database.

Option 4: Create Type in Target Using Same Evolution Before Import

If you know the evolution history of the types in the source database (i.e., in which order the CREATE TYPE and ALTER TYPE statements were issued), you can create the types manually in the target database and ensure they evolve in the same way. You must create the types in exactly the same way (including the number of attributes, the order in which they were added, and so forth).

  1. In the target database, you create and alter the types in the correct order. To do that, you must also create the schema first.

  2. You verify that types are of the same version in source and target database. The hash code must be identical as well:

    select o.name, t.version#, t.hashcode 
    from obj$ o, user$ u, type$ t 
    where o.owner# = u.user# and o.oid$ = t.toid and u.name='APPUSER'
    order by o.name, t.version#;
    
  3. You perform the transportable tablespace import.

  4. Data Pump reports an error for the types that already exist. You ignore those errors. You can also exclude the types from the import (previous step) using EXCLUDE=TYPES (see the sketch after these steps).

  5. Ensure that dependent objects that use the affected types are valid. It could be other types, tables, or queues.
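
As a sketch of steps 3 and 4, here is the transportable import from the appendix test case with the types excluded (directory, database link, and data file names come from that test case):

$ impdp system/oracle@pdb2 network_link=sourcedb full=y \
transportable=always metrics=y exclude=types directory=mydir \
logfile=pdb2_no_types.log \
transport_datafiles='/u02/oradata/CDB2/pdb2/df4' \
transport_datafiles='/u02/oradata/CDB2/pdb2/df5'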

Conclusion

Evolved types can hit ORA-39218 or ORA-39216 during a transportable tablespace import. Restrictions in the implementation of object type evolution cause these errors: Data Pump can’t evolve the types in the same way as the source database, so it issues an error. You must find a different way of moving the affected types and dependent objects.

Happy migrating!

Appendix

Complete Test Case

For reference, here is the test case I used for research. You can run it in our hands-on lab, Hitchhiker’s Guide for Upgrading to Oracle Database 19c.

1. Create Test Data

export ORAENV_ASK=NO
export ORACLE_SID=UP19
. oraenv
export ORAENV_ASK=YES

CONN / AS SYSDBA

DROP USER APPUSER CASCADE;
DROP TABLESPACE APPS INCLUDING CONTENTS AND DATAFILES;

CREATE TABLESPACE APPS;
CREATE USER APPUSER IDENTIFIED BY APPUSER DEFAULT TABLESPACE APPS;
ALTER USER APPUSER QUOTA UNLIMITED ON APPS;
GRANT CONNECT TO APPUSER;
GRANT CREATE TYPE TO APPUSER;
GRANT CREATE TABLE TO APPUSER;
GRANT AQ_ADMINISTRATOR_ROLE TO APPUSER;

CONN APPUSER/APPUSER

CREATE TYPE "APPUSER"."VEHICLE_TYPE" 
   IS OBJECT (
      make VARCHAR2 (40)
) NOT FINAL;
/

CREATE TYPE "APPUSER"."CAR_INFO_TYPE" 
   IS OBJECT (
      model VARCHAR2(40)
);
/

CREATE TYPE "APPUSER"."CAR_TYPE" 
   UNDER VEHICLE_TYPE (
      car_info CAR_INFO_TYPE
);
/

ALTER TYPE "APPUSER"."CAR_INFO_TYPE" ADD ATTRIBUTE horsepower NUMBER CASCADE
/
ALTER TYPE "APPUSER"."CAR_INFO_TYPE" ADD ATTRIBUTE color VARCHAR2(20) CASCADE
/

CREATE TABLE "APPUSER"."CARS" (
   id number, 
   car_info car_type
);

INSERT INTO "APPUSER"."CARS"
VALUES (1, APPUSER.CAR_TYPE('FORD', APPUSER.CAR_INFO_TYPE('FIESTA', 200, 'BLUE')));
COMMIT;

BEGIN
   DBMS_AQADM.CREATE_QUEUE_TABLE (
      queue_table        => 'APPUSER.CAR_QUEUE_TABLE',
      queue_payload_type => 'APPUSER.CAR_TYPE');

   DBMS_AQADM.CREATE_QUEUE (
      queue_name  => 'APPUSER.CAR_QUEUE',
      queue_table => 'APPUSER.CAR_QUEUE_TABLE');

   DBMS_AQADM.START_QUEUE (
      queue_name => 'APPUSER.CAR_QUEUE',
      enqueue    => TRUE);
END;
/

DECLARE
  l_enqueue_options     DBMS_AQ.enqueue_options_t;
  l_message_properties  DBMS_AQ.message_properties_t;
  l_message_handle      RAW(16);
  l_event_msg           APPUSER.CAR_TYPE;
BEGIN
  l_event_msg := APPUSER.CAR_TYPE('VOLVO', APPUSER.CAR_INFO_TYPE('V90', 400, 'GREY'));

  DBMS_AQ.enqueue(queue_name          => 'APPUSER.CAR_QUEUE',        
                  enqueue_options     => l_enqueue_options,     
                  message_properties  => l_message_properties,   
                  payload             => l_event_msg,             
                  msgid               => l_message_handle);

  COMMIT;
END;
/

conn / as sysdba
select o.name, o.oid$, t.hashcode, t.version# 
from obj$ o, user$ u,type$ t 
where o.owner# = u.user# and o.oid$ = t.toid and u.name='APPUSER'
order by o.name, t.version#;

2. Transportable Tablespace Using FTEX

mkdir /tmp/dpdir
rm /tmp/dpdir/*
mkdir /u02/oradata/CDB2/pdb2

sqlplus / as sysdba<<EOF
ALTER TABLESPACE USERS READ ONLY;
ALTER TABLESPACE APPS READ ONLY;
create or replace directory dpdir as '/tmp/dpdir';
grant read, write on directory dpdir to system;
EOF

export ORAENV_ASK=NO
export ORACLE_SID=CDB2
. oraenv
export ORAENV_ASK=YES
sqlplus / as sysdba<<EOF
   startup
   alter pluggable database PDB2 close;
   drop pluggable database PDB2 including datafiles;
   create pluggable database PDB2 admin user adm identified by adm file_name_convert=('pdbseed', 'pdb2');
   alter pluggable database PDB2 open;
   alter pluggable database PDB2 save state;
   alter session set container=PDB2;
   create directory mydir as '/tmp/dpdir';
   grant read, write on directory mydir to system;
   create public database link SOURCEDB connect to system identified by oracle using 'UP19';
EOF

export ORAENV_ASK=NO
export ORACLE_SID=UP19
. oraenv
export ORAENV_ASK=YES
sqlplus / as sysdba<<EOF
set line 300
select 'cp ' || file_name || ' /u02/oradata/CDB2/pdb2/df' || file_id as c1 from dba_data_files where tablespace_name in ('USERS', 'APPS');
EOF

#Copy the files

export ORAENV_ASK=NO
export ORACLE_SID=CDB2
. oraenv
export ORAENV_ASK=YES
impdp system/oracle@pdb2 network_link=sourcedb full=y \
transportable=always metrics=y exclude=statistics directory=mydir \
logfile=pdb2.log \
transport_datafiles='/u02/oradata/CDB2/pdb2/df4' \
transport_datafiles='/u02/oradata/CDB2/pdb2/df5'

sqlplus appuser/APPUSER@PDB2<<EOF
  select * from appuser.cars;
  select * from APPUSER.CAR_QUEUE_TABLE;
EOF

How to Apply Compression During Data Pump Import

When you migrate your data using Oracle Data Pump, you can transform your data as part of the import. One of the popular transformations is applying compression.

Check the License Guide and ensure your licenses are in place. Most compression options require a separate license.

Table Compression

Data Pump has a dedicated TRANSFORM clause that allows you to apply table compression on import:

$ impdp ... transform=table_compression_clause:\"compress for oltp\"

It works for table partitions and subpartitions as well.

How about Tablespace Default Compression Clause?

You can create a tablespace with a default compression clause. Any new tables, table partitions or table subpartitions will inherit this compression clause:

SQL> create tablespace demo1 ... default table compress for oltp;

If you import your tables into such a tablespace, it will not change the compression type for the tables. The tables will preserve the compression type from the source database.

Unless you instruct Data Pump to remove any compression clause from the DDL. Then the tables inherit the tablespace compression clause:

$ impdp ... transform=table_compression_clause:\"none\"

Using the SQLFILE command line parameter, you can extract the DDL from a dump file. If you have uncompressed tables, you can see that the CREATE TABLE statements have a dedicated NOCOMPRESS clause. This will override the tablespace setting unless you instruct Data Pump to remove it.
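
For example, here is a quick way to check the compression clauses in a dump file (the directory and dump file names are assumptions):

$ impdp system directory=dpdir dumpfile=myexp%L.dmp sqlfile=ddl.sql
$ grep -i compress ddl.sql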

Index Compression

In Oracle Database 21c, you can use the TRANSFORM parameter to apply compression on indexes during import:

$ impdp ... transform=index_compression_clause:\"...\"

Oracle Database 19c

No transformation clause allows you to change the compression type of your indexes.

In contrast to default table compression, the default index compression type for a tablespace is honored during import.

If you want to compress your indexes:

  1. Create a tablespace with a default index compression clause:
    SQL> create tablespace tbs_index_1 ... default index compress advanced high;
    
  2. Instruct the database to inherit the tablespace compression clause when you create new indexes:
    SQL> alter system set db_index_compression_inheritance=tablespace;
    
  3. Import the data and remap to the new tablespace:
    $ impdp ... remap_tablespace=users:tbs_index_1
    

The db_index_compression_inheritance parameter works for global indexes on partitioned tables as well.

Partitioned Indexes

However, the db_index_compression_inheritance parameter does not work for:

  • Global partitioned indexes
  • Local indexes

There is an enhancement request to get that fixed, but it is yet to be implemented.

You can’t apply compression on such objects during a Data Pump import. You have two options:

  1. Compress the index after the import using the ALTER INDEX command. You can rebuild the index online with no interruption (see the sketch after this list).
  2. Create the index manually.
    • Exclude the index from the Data Pump import using the EXCLUDE parameter
    • Extract the DDL for the index from a Data Pump SQL file using the SQLFILE parameter
    • Change the DDL to apply the proper compression and execute it after the Data Pump import
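
Here is a hedged sketch of option 1 for a partitioned index, using the i3local index from the test case in the appendix. You repeat the rebuild for each partition, and advanced index compression requires the appropriate license:

SQL> alter index appuser.i3local
     rebuild partition <partition_name>
     compress advanced low online;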

LOB Compression

No transformation clause allows you to change the compression type of your LOBs. But it is on our to-do list.

If you want to apply Advanced LOB Compression:

  1. First, import the metadata only:

    $ impdp system/oracle ... content=metadata_only
    
  2. Then change the LOB storage to enable compression:

    SQL> alter table ... modify lob (<column_name>)(compress high);
    
  3. Finally, import the data only:

    $ impdp system/oracle ... content=data_only
    
  4. Be aware that importing with content=metadata_only will lock statistics. Be sure to unlock them after import:

    exec dbms_stats.unlock_schema_stats(...
    

Another option is to create the LOB tables in advance with the desired LOB compression type. Then instruct Data Pump to accept existing tables and load data anyway:

$ impdp ... table_exists_action=append
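
A minimal sketch of that approach; the table T_DOCS and its CLOB column DOC are made-up names, and Advanced LOB Compression assumes a SecureFiles LOB:

SQL> create table appuser.t_docs (
        id  number,
        doc clob
     )
     lob (doc) store as securefile (compress high);

$ impdp system ... tables=appuser.t_docs table_exists_action=append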

Appendix

Test Case

You can use the test case in our hands-on lab:

sqlplus / as sysdba<<EOF
create or replace directory dpdir as '/u01/app/oracle/admin/UP19/dpdump';
drop tablespace apps including contents and datafiles;
drop user appuser cascade;

create tablespace apps;
create user appuser identified by APPUSER;
grant dba to APPUSER;
alter user APPUSER default tablespace APPS;
connect appuser/APPUSER
create table t1 as select * from dba_objects;

ALTER TABLE t1 MODIFY
PARTITION BY HASH (object_id)
PARTITIONS 2
STORE IN (apps, apps, apps, apps);  

CREATE INDEX i1local ON t1 (last_ddl_time) LOCAL;
CREATE INDEX i1global ON t1 (owner) global;

create table t3 as select * from dba_objects;
ALTER TABLE t3 MODIFY
PARTITION BY HASH (object_id)
SUBPARTITION BY HASH (object_name)
SUBPARTITIONS 2 PARTITIONS 2;  
CREATE INDEX i3local ON t3 (last_ddl_time) LOCAL;
CREATE INDEX i3global ON t3 (owner) global;
CREATE INDEX i3globalpart ON t3 (object_id) global partition by hash (object_id) partitions 2;

EOF

rm /u01/app/oracle/admin/UP19/dpdump/*
expdp system/oracle schemas=appuser directory=dpdir

sqlplus / as sysdba<<EOF
drop user appuser cascade;
drop tablespace apps including contents and datafiles;
create tablespace apps default table compress for oltp index compress advanced low;
alter system set db_index_compression_inheritance=tablespace scope=both;
EOF

impdp system/oracle directory=dpdir transform=table_compression_clause:\"none\"

sqlplus / as sysdba<<EOF
prompt "TABLES"
select table_name, compression, compress_for from all_tables where owner='APPUSER';
prompt "TAB PARTITIONS"
select table_name,partition_name, compression, compress_for from all_tab_partitions where table_owner='APPUSER';
prompt "TAB SUBPARTITIONS"
select table_name,subpartition_name, compression, compress_for from all_tab_subpartitions where table_owner='APPUSER';

prompt "INDEXES"
select index_name, compression from all_indexes where table_owner='APPUSER';
prompt "IND PARTITIONS"
select index_name, compression from all_ind_partitions where index_owner='APPUSER';
prompt "IND SUBPARTITIONS"
select index_name, compression from all_ind_subpartitions where index_owner='APPUSER';
EOF

Oracle Data Pump and BFILEs

If you need to move data using Oracle Data Pump and have BFILEs in your database, what do you need to be aware of?

What Is a BFILE?

From the documentation:

BFILEs are data objects stored in operating system files, outside the database tablespaces. Data stored in a table column of type BFILE is physically located in an operating system file, not in the database. The BFILE column stores a reference to the operating system file. BFILEs are read-only data types. The database allows read-only byte stream access to data stored in BFILEs. You cannot write to or update a BFILE from within your application.

They are sometimes referred to as external LOBs.

You can store a BFILE locator in the database and use the locator to access the external data:

To associate an operating system file to a BFILE, first create a DIRECTORY object that is an alias for the full path name to the operating system file. Then, you can initialize an instance of BFILE type, using the BFILENAME function in SQL or PL/SQL …

In short, it is stuff stored outside the database that you can access from inside the database. Clearly, this requires special attention when you want to move your data.
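
A minimal sketch of a BFILE setup; the directory, path, table, and file names are made up for illustration:

SQL> create directory doc_dir as '/u01/app/docs';

SQL> create table appuser.manuals (
        id     number,
        manual bfile
     );

SQL> insert into appuser.manuals
     values (1, bfilename('DOC_DIR', 'manual_1.pdf'));

SQL> commit;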

How Do I Move It?

There are three things to consider:

  1. The file outside the database – in the operating system.
  2. The directory object.
  3. The BFILE locator stored in the table.

Table and Schema Mode Export

  1. You must copy the file in the operating system. Since a BFILE is read-only, you can copy the file before you perform the actual export.
  2. You must create the directory object. Directory objects are system-owned objects and not part of a table or schema mode export.
  3. Data Pump exports a BFILE locator together with the table. It exports the BFILE locator just like any other column. On import, Data Pump inserts the BFILE locator but performs no sanity checking. The database will not throw an error if the file is missing in the OS or if the directory is missing or erroneous.

Full Export

  1. Like table and schema mode, you must copy the file.
  2. Directory objects are part of a full export. On import, Data Pump creates a directory object with the same definition. If you place the external files in a different location on the target system, you must update the directory object (see the sketch after this list).
  3. Like table and schema mode, Data Pump exports the BFILE locator as part of the table.
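
For example, continuing the sketch from above, you would repoint the directory after the import (the directory name and path are made up):

SQL> create or replace directory doc_dir as '/u05/app/docs';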

Do I Have BFILEs in My Database?

You can query the data dictionary and check if there are any BFILEs:

SQL> select owner, table_name 
     from dba_tab_cols 
     where data_type='BFILE';

Oracle Data Pump and Compression – Also Without a License

Whenever you use Data Pump to export from Oracle Database, you should use compression. It’s conveniently built into Data Pump.

Pros:

  • The dump file is much smaller:
    • Less disk space is needed.
    • Easier to transfer over the network.
  • Often it is faster to use compression when you measure the entire workflow (export, transfer, and import).
  • Imports are often faster because less data needs to be written from disk.

Cons:

  • Compression during export requires a license for the Advanced Compression Option (see below).
  • Compression uses extra CPU during export.

How Do I Enable Data Pump Compression

You simply set the COMPRESSION option:

$ expdp ... compression=all

You use the COMPRESSION option only for exports. When you import, Data Pump handles decompression automatically.

You only need a license for the Advanced Compression Option when you use compression during export. You don’t need a license to import a compressed dump file.

Medium Is a Good Compression Algorithm

I recommend you use the medium compression algorithm:

$ expdp ... compression=all compression_algorithm=medium

Our experience and tests show that it offers the best balance between compression ratio and CPU usage.

Here are the results of a test my team did:

Algorithm   File Size (MB)   Compression Ratio   Elapsed Time
NONE        5,800            1.0                 2m 33s
BASIC       705              8.2                 3m 03s
LOW         870              6.6                 3m 11s
MEDIUM      701              8.2                 3m 01s
HIGH        509              11.3                12m 16s

I would recommend the high algorithm only if you need to transfer the dump file over a really slow network.

But I Don’t Have a License

gzip

You can still compress the dump file but not using Data Pump. Use OS utilities. In this case, I recommend splitting the dump file into pieces. It is easier to handle, and you can start transferring the dump files as they are compressed:

$ expdp ... filesize=5G dumpfile=myexp%L.dmp
$ gzip -r /u01/app/oracle/dpdir

Now, you transfer the files, uncompress and import:

[target]$ gunzip -r /u01/app/oracle/dpdir
[target]$ impdp ...

rsync

Another option is to use rsync. It has the option to compress the dump file over the network only:

$ expdp ... filesize=5G dumpfile=myexp%L.dmp
$ rsync -z ...
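
A hedged example of such a transfer, assuming the dump files live in /u01/app/oracle/dpdir and the target host is reachable as target-host:

$ rsync -z --progress /u01/app/oracle/dpdir/myexp*.dmp oracle@target-host:/u01/app/oracle/dpdir/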

Cheatsheet

If you have the proper license, use Data Pump compression during export:

$ expdp ... compression=all compression_algorithm=medium

If you don’t have a license, compress the dump file over the wire only:

$ rsync -z ....

Don’t combine Data Pump compression and gzip/rsync! Compressing compressed stuff is not a good idea.

Why Does It Take so Long to Import Check Constraints?

Short answer: If the data was validated originally, it must be validated (again) during import. If you are short on time, you can add the constraints as NOT VALIDATED instead. But be aware of the drawbacks.

Update 3 October 2024: Data Pump now supports transforming constraints into NOVALIDATE constraints. Check blog post for details.

The Problem

I got a question from a colleague importing customer data as part of a migration. Data Pump was spending a lot of time in TABLE_EXPORT/TABLE/CONSTRAINT/CONSTRAINT.

We attached to the Data Pump session and used the status command:

$ impdp attach=<job_name>

Import> status
...
Worker 1 Status:
  ...
  State: EXECUTING                      
  Object Schema: SH
  Object Name: C_CUST2_EMAIL
  Object Type: TABLE_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
  Worker Parallelism: 1

In another session, we ran a new Data Pump import, but this time into a SQL file so we could examine the metadata for the object C_CUST2_EMAIL:

$ impdp parfile=<par_file_name> sqlfile=ddl.sql

$ cat ddl.sql | grep C_CUST2_EMAIL
ALTER TABLE "SH"."CUSTOMERS2" ADD CONSTRAINT "C_CUST2_EMAIL" CHECK (cust_email like '%@%') ENABLE;

Data Pump adds a check constraint and enables it with the ENABLE clause. But there is a missing detail: the constraint is also validated even though there is no VALIDATE clause. That is the default.

What Happens

In the source database, the check constraint was enabled and validated. During import, Data Pump adds the constraint with the same attributes. The only way to add a validated constraint is to validate it.

But if the constraint was validated during export, then we know data in the dump file is validated as well. When importing data that we know is validated, why do we need to validate it again?

Because we know it, but the database doesn’t. For a check constraint to be marked as validated, the database must ensure that this is truly the case. It trusts no one and insists on validating the data.

And when you import data from a dump file, can you actually be sure all data is good? What if …

  • someone tampered with the dump file
  • data was changed during export using the remap option
  • data was changed as part of character set conversion
  • data was changed as part of timezone file conversion

Why Does It Take So Long To Validate A Constraint?

To answer that, let’s trace:

SQL> ALTER SESSION SET TRACEFILE_IDENTIFIER = "CONST";
SQL> ALTER SESSION SET EVENTS '10046 trace name context forever, level 4';
SQL> ALTER TABLE sh.customers2 ADD CONSTRAINT c_cust2_valid 
     CHECK (cust_valid in ('I', 'A')) 
     ENABLE VALIDATE;

Run the trace file through tkprof and have a look at the first recursive statement after the ALTER TABLE ... ADD CONSTRAINT statement:

select /*+ all_rows ordered dynamic_sampling(2) */ A.rowid, :1, :2, :3 
from "SH"."CUSTOMERS2" A 
where not ( cust_email like '%@%')

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1     49.03      66.61    3103448    3103510          1           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3     49.03      66.62    3103448    3103510          1           0

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100     (recursive depth: 3)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  TABLE ACCESS FULL CUSTOMERS2 (cr=3103510 pr=3103448 pw=0 time=66618486 us starts=1 cost=2 size=39 card=1)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file scattered read                      24274        0.20         22.45
  PGA memory operation                            1        0.00          0.00

The database checks the table for any existing data that violates the constraint. It does so by issuing a SELECT with the negated check condition. This results in a full table scan, and that is what takes so long.

If you have multiple check constraints on the same table, Data Pump adds them one by one, and that results in multiple full table scans.

Since full table scans rarely populate the buffer cache, you end up with a lot of disk reads as well. You can verify that in the row source part of the example above: physical reads (pr) are almost the same as consistent gets (cr), so almost every read requires a physical read.

But I Know Data Is Good

If you are sure that the data is good and you want to avoid the validation of data, you can create the check constraints as enabled but not validated. Existing data is not checked, but new data will be validated.

There are three ways of doing that:

1. Change Constraints in Source

Before exporting the data, in the source database, change all validated check constraints to NOVALIDATE:

ALTER TABLE "SH"."CUSTOMERS2" 
MODIFY CONSTRAINT "C_CUST2_EMAIL" 
ENABLE NOVALIDATE;

But be aware that you are changing the source database. If you need to use it for other purposes or as fallback, this might not be a good idea.

2. Change Constraints During Import

First, import the data, but exclude the constraints. This will be much faster because no time is spent on validating constraints:

$ impdp ... exclude=constraint

Next, extract the definition of all constraints from the Data Pump dump file and into a SQL file:

$ impdp ... include=constraint sqlfile=ddl.sql

Now, add the NOVALIDATE clause to all the statements:

ALTER TABLE "SH"."CUSTOMERS2" 
ADD CONSTRAINT "C_CUST2_EMAIL" 
CHECK (cust_email like '%@%') 
ENABLE NOVALIDATE;

3. Use Data Pump Transformation On Import

Check blog post for details.

But There Is a Catch!

Regardless of which method you choose, there is a catch. Validated constraints are very useful to the database because they enable the optimizer to perform query rewrite and potentially improve query performance. Also, an index access path might become available instead of a full table scan.

You want to get those constraints validated. But you don’t have to do it during the import. Validating an enabled, not validated constraint does not require a lock on the table, so you can do it while the database is open for business.
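
When you get around to it, you validate the constraint in place, for example:

SQL> alter table "SH"."CUSTOMERS2"
     modify constraint "C_CUST2_EMAIL" validate;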

What About Rely

If you follow option 2 and create the constraints after import as enabled and not validated, you could also add the rely clause:

ALTER TABLE "SH"."CUSTOMERS2" 
ADD CONSTRAINT "C_CUST2_EMAIL" 
CHECK (cust_email like '%@%') 
RELY ENABLE NOVALIDATE;

Rely tells the database that you know the data is good. The optimizer still doesn’t trust you until you set the parameter QUERY_REWRITE_INTEGRITY to TRUSTED. Then the optimizer can benefit from some query rewrite options, but not all of them.
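
For example (you can also set it at session level):

SQL> alter system set query_rewrite_integrity = trusted;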

Nothing beats a truly validated constraint!

Conclusion

Importing validated check constraints takes time. If you want to save time in your maintenance window, instruct Data Pump to exclude the constraints, and add them yourself as enabled and not validated. This will save a lot of time that Data Pump otherwise would have to spend on validating the constraints.

The database benefits from validated constraints, so you should validate them. The database can validate constraints without a lock on the table if the constraints are already enabled. Thus, you can postpone the validation to a later time in your maintenance window, and you can perform other activities at the same time. Perhaps you can validate constraints while users are testing the database. Or wait until the next maintenance window. Just be aware that you might see some SQLs degrade until all constraints are validated.

If Importing Statistics Using DBMS_STATS Is Slow

When migrating Oracle Databases, you often transport the statistics using dbms_stats. Our team got involved in an interesting case the other day, and I would like to share the solution with you.

Problem

A customer migrated to Oracle Database 19c and decided to move the statistics using dbms_stats.import_schema_stats. They noticed that the procedure started to burn a lot of CPU, and after 38 hours, they gave up.

A SQL Monitor report showed one statement that spent almost all the time: an insert into sys.dbms_stats_id_map_tab with a subselect referencing the staging table. The staging table is the one you specify in the stattab argument of the call to dbms_stats.import_schema_stats. It holds the optimizer statistics in a portable format, from which the procedure can insert them into the data dictionary in the proper format.

Obviously, they could already see that the procedure would take far too much time.

Diagnosing

The first shot was to look at the code behind dbms_stats. But this specific code hadn’t changed since its introduction many releases ago.

Recursive statements that touch the data dictionary immediately brought our attention to dictionary statistics. But the customer told us that they were not stale.

The customer ran SQL Tuning Advisor on the offending statement, and one of the findings was about stale statistics. SQL Tuning Advisor recommended gathering statistics on the staging table and corresponding indexes.

One of our performance experts looked at the execution plan and found a pattern he had seen before. He tried to disable Join Predicate Push Down in the session. It helped, but this was just a workaround. We wanted to find the root cause.

Solution

SQL Tuning Advisor had pointed at the real problem: stale statistics on the staging table and corresponding indexes. Once the customer gathered statistics on the staging table and indexes, the import of statistics finished in 2 hours and 27 minutes, which was acceptable.

We also discovered that the dictionary statistics were not as accurate as the customer had initially concluded. In fact, by mistake, they had misinformed us. A fresh run of dbms_stats.gather_dictionary_stats gave a slight performance improvement as well.

Recommendations

These recommendations are now part of our best practices for migrations.

  1. Before importing statistics using dbms_stats.import_schema_stats, gather stats on the staging table.

  2. Immediately after importing a lot of data, and before you import statistics with dbms_stats or anything else, you should gather dictionary statistics. This applies to regular Data Pump imports and transportable tablespace metadata imports.

The run book should look like this:

$ impdp parfile=import.par ...
$ sqlplus / as sysdba

SQL> exec dbms_stats.gather_dictionary_stats; 
SQL> exec dbms_stats.gather_table_stats(
		ownname=>'SYSTEM',
		tabname=>'MY_STG_TAB_1',
		cascade=>TRUE);
SQL> exec dbms_stats.import_schema_stats(
		ownname=>'SYSTEM', 
		stattab=>'MY_STG_TAB_1', 
		statown=>'SYSTEM');

In the above example, the staging table is called SYSTEM.MY_STG_TAB_1.
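
For completeness, the source side of that run book typically looks like this; the application schema name APPUSER is an assumption, while the staging table matches the run book above:

SQL> exec dbms_stats.create_stat_table(ownname => 'SYSTEM', stattab => 'MY_STG_TAB_1')

SQL> exec dbms_stats.export_schema_stats(ownname => 'APPUSER', stattab => 'MY_STG_TAB_1', statown => 'SYSTEM')

You then move SYSTEM.MY_STG_TAB_1 to the target database, for instance with a table mode Data Pump export.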

The recommendation applies as well if you are using the procedures dbms_stats.import_database_stats or dbms_stats.import_table_stats.

Dictionary Statistics

Why is it important to gather dictionary statistics immediately after the import?

When you import data or use transportable tablespaces, you often have a brand new, empty database. Then you import a lot of objects. Those objects are represented as rows in the data dictionary. For instance, the tables you import now appear as rows in SYS.TAB$, the partitions in SYS.TABPART$, the indexes in SYS.IND$, and so forth.

Those internal tables were almost empty before; there were not a lot of tables. Now they have a lot of rows. This means that the statistics are stale. When you start to use functionality in the database, like importing statistics, the database executes recursive queries against those internal tables. With stale statistics on the dictionary, you can get suboptimal execution plans and bad performance. Gathering dictionary statistics fixes this for you.

Conclusion

Statistics are always vital, whether optimizer statistics on user data or internally in the data dictionary. Be sure to verify the accuracy of your statistics when you have problems.

Also, SQL Tuning Advisor is a great tool. It can quickly come up with suggestions for fixing problems. Use the recommendations as input to your troubleshooting. SQL Tuning Advisor also works on internal tables.

Additional Information

We have a few videos on our YouTube channel which have more information about transporting statistics with dbms_stats.

Data Pump Export from Data Guard

Exporting data from your production database can be a pain for several reasons:

  • You probably want a consistent export, so you set the Data Pump parameter FLASHBACK_TIME. That requires a lot of undo space. If there is heavy activity on the database, you risk running into ORA-01555.
  • You want to export the data quickly using the parallel options in Data Pump. The quicker you export, the less susceptible you are to ORA-01555. But parallel uses more resources.

If you have an Active Data Guard, you can export your data from the standby database. It should be straightforward, right? An export is after all just reading data…

But Data Pump is not just a simple export tool. It is quite advanced, and to support all that functionality, it requires a master table to coordinate the operation. Further, it uses Advanced Queueing (AQ) for communication between the processes, and AQ also requires a writable database.

But how about DML Redirect, which was introduced in Oracle Database 19c? Can it be used to handle those few DMLs? No, because the creation of the master table is DDL, not DML. If you try, you will fail:

$  expdp system@localhost:1521/pdb1 schemas=SH directory=mydir

Export: Release 19.0.0.0.0 - Production on Mon Apr 12 18:19:22 2021
Version 19.10.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Password:

Connected to: Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production
ORA-31626: job does not exist
ORA-31633: unable to create master table "SYSTEM.SYS_EXPORT_SCHEMA_05"
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 95
ORA-06512: at "SYS.KUPV$FT", line 1163
ORA-16000: database or pluggable database open for read-only access
ORA-06512: at "SYS.KUPV$FT", line 1056
ORA-06512: at "SYS.KUPV$FT", line 1044

Solution

I turned on my bat signal and got help from my fellow Product Manager, Pieter Van Puymbroeck.

Use a snapshot standby, he said. Of course – Snapshot Standby is part of Enterprise Edition, so if you have Data Guard, you also have Snapshot Standby. If you need a quick recap on what a snapshot standby database is, you should check out this video.

First, I convert my physical standby database (named CDB19_fra3zt) into a snapshot standby database:

DGMGRL> convert database 'CDB19_fra3zt' to snapshot standby;

Next, create a directory object that can be used with Data Pump:

SQL> create directory mydir as '/tmp';

Finally, start an export. Notice that I haven’t set FLASHBACK_TIME. When exporting from a snapshot standby there are probably no other users on the database, so you don’t need to worry about consistency of the export:

$ expdp system schemas=sales directory=mydir

My data has now been exported, and it is time to convert my snapshot standby database back into a physical standby database to properly protect my precious database:

DGMGRL> convert database 'CDB19_fra3zt' to physical standby;

But

Will this jeopardize my primary database? No, but…

Your primary database is still protected because log files are still sent to the standby database; they are just not applied. What happens, then, if I need to switch over or fail over?

  1. Convert the snapshot standby database back into a physical standby database
  2. That will implicitly issue a Flashback Database and revert all the changes made
  3. Then catch up with the primary by applying all the log files
  4. Finally, complete the switchover or failover

Bottom line: it will take longer! The flashback operation is really fast and should be done within minutes. After all, Data Pump isn’t making that many changes, only the master table and some AQ objects. How much time does the log apply take? That depends entirely on how big the apply lag is.

So, if you choose to go down this path, be prepared that in the event of a switchover or a failover, it will take a little longer until the standby is open for business. Unless you have two standby databases; then you still have one ready to jump in immediately.

GoldenGate

If you need the Data Pump export to perform an initial load for GoldenGate, then you might want to control the SCN at which the export is happening. If so, cancel redo apply on the standby database and roll forward to the desired SCN:

alter database recover managed standby database cancel;
alter database recover managed standby database until change n;

Then convert the standby database into a snapshot standby:

alter database convert to snapshot standby;

Later on, when you need to start replication in GoldenGate you can use the following option:

START REPLICAT ATCSN n

It Must Be From Physical Standby

On Twitter I was made aware that AWS RDS does not allow snapshot standby. The user referred to a MOS note describing how to export from a physical standby database. You need a proxy database and a database link. You find the details in How To Use DataPump Export (EXPDP) To Export From Physical Standby Database (Doc ID 1356592.1).

I recommend using this method only as a last resort. It is more cumbersome and you don’t get the same performance as via a snapshot standby database. Using Data Pump over a database link limits certain parallel capabilities.

Conclusion

You can export from your Data Guard, if you convert it temporarily to a snapshot standby. Be prepared that a switchover or failover operation will take longer.

If you want to know more about Data Pump, you should check out our webinar Migration Strategies – Insights, Tips and Secrets.

And finally, speaking of Data Pump, remember to get the Data Pump bundle patch (bug 32551008) as Roy mentioned on Twitter.