What Is Data Pump Import Doing In PROCACT_SCHEMA

At a recent event, a customer asked me a question.

I’m doing a Data Pump import, and processing PROCACT_SCHEMA takes a lot of time. What’s going on?

I love meeting our customers and hearing about these cases. So, let’s dig into it.

Situation

Here is an extract from the Data Pump import log:

08-SEP-24 18:10:18.604: W-4 Processing object type DATABASE_EXPORT/SCHEMA/PROCACT_SCHEMA
08-SEP-24 22:08:24.804: W-6      Completed 78693 PROCACT_SCHEMA objects in 14283 seconds
08-SEP-24 22:08:24.805: W-6      Completed by worker 1 78692 PROCACT_SCHEMA objects in 14283 seconds
08-SEP-24 22:08:24.826: W-6 Processing object type DATABASE_EXPORT/EARLY_POST_INSTANCE_IMPCALLOUT/MARKER

A few observations:

  • Seeing a customer using logtime=all and metrics=yes in their Data Pump jobs is great. It adds valuable diagnostic information to the log file.
  • Almost four hours on PROCACT_SCHEMA.
  • Almost 80.000 schemas – that’s a lot.
  • Only one active worker.

Tracing

The customer sent me a 10046 trace for that period. They had the suspicion that Data Pump was doing something for each schema. I had the same thought, so I sorted the trace file:

sort CDB01_dw00_54345.trc > sort.txt

Using my favorite text editor, I could scroll through the content and find this repeating call:

sys.dbms_logrep_imp.instantiate_schema(schema_name=>SYS_CONTEXT('USERENV','CURRENT_SCHEMA'), export_db_name=>'...', inst_scn=>'...');COMMIT; END; 
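If you suspect a similar pattern in your own trace, a quick way to confirm it is to count how many times the call appears in the raw trace file (using the trace file name from the example above):

grep -c "instantiate_schema" CDB01_dw00_54345.trc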

There’s not much public information about this package, but it is a package used during import to move logical replication settings – used by Oracle Streams and Oracle GoldenGate.

Oracle Streams

I’m not a big Streams expert, but I do know that it is desupported in Oracle Database 19c. But could there be some leftovers in the dictionary that caused Data Pump to move the configuration?

I asked the customer to query many of the DBA_STREAMS_* views in the database. One of them returned a lot of rows:

select count(*) from dba_streams_columns;

  COUNT(*)
----------
     72636
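If you need to check the remaining views, here is a small sketch that generates the count statements for all DBA_STREAMS_* views instead of typing each one by hand; run the generated statements afterward:

select 'select count(*) from ' || view_name || ';' as stmt
from   dba_views
where  owner = 'SYS'
and    view_name like 'DBA_STREAMS%'
order  by view_name;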

Solution

The customer asked for advice on using the Data Pump import parameter STREAMS_CONFIGURATION=NO. Judging from the parameter name, it sounds like a good idea, but the documentation reads:

… STREAMS_CONFIGURATION parameter specifies whether to import any GoldenGate Replication metadata that may be present in the export dump file.

It states GoldenGate metadata – not Streams. But look at the 18c documentation:

… import any Streams metadata that may be present in the export dump file.

Streams and GoldenGate share a lot of the same architecture in the database because they serve the same purpose: Data replication. That’s why the parameter still exists even if Streams is desupported.

The customer didn’t use GoldenGate, so they decided to exclude the replication metadata during import:

impdp ... streams_configuration=no

The PROCACT_SCHEMA part of the import went from almost four hours to just 30 seconds!

Final Words

In this case, a full import, the issue happened during DATABASE_EXPORT/SCHEMA/PROCACT_SCHEMA. It could happen in a schema import as well. In that case, the phase would be SCHEMA_EXPORT/POST_SCHEMA/PROCACT_SCHEMA.

What is PROCACT? It is short for procedural actions. That phase handles things in the database that are not real objects but are nevertheless something you want to include. Data Pump calls out to other parts of the code to get the job done. That’s also why you typically see only one worker active. I’d like to write a separate blog post about that one day.

A full export/import is convenient, but it includes everything – even the things you didn’t know you had in the database, like leftovers or garbage data. A schema export/import is a cleaner approach but would still be affected by the above issue. If you’re short on time, consider a schema export/import and include just the objects that you really need. Something like:

impdp ... schemas=abc,def include=table,index,constraint,trigger,package

 

Happy data-pumping!

Faster Data Pump Import of LOBs Over Database Link

A colleague was helping a customer optimize an import using Data Pump via a database link that involved SecureFile LOBs.

Do you see a way to parallelize the direct import to improve performance and thus shorten the time it takes to import? Or is it not possible for LOB data?

Network mode imports are a flexible way of importing your data when you have limited access to the source system. However, they come at the price of restrictions. One of them being:

  • Network mode import does not use parallel query (PQ) child processes.

In Data Pump, one worker will process a table data object which is either a:

  • Table
  • Table partition
  • Table subpartition

So, for a regular table, this means just one worker is processing the table and it doesn’t use parallel query. That’s bound to be slow for larger data sets, but can you do something?

Starting Point

To illustrate my point, I’ll use a sample data set consisting of:

  • One schema (BLOBLOAD)
  • With one table (TAB1)
  • Containing two columns
    • Number (ID)
    • BLOB (BLOB_DATA)
  • The table has around 16.000 rows
  • Size is 50 GB

Doing a regular Data Pump import over a database link is slow because there’s only one worker and no parallel query:

impdp ... \
   network_link=srclnk \
   schemas=blobload \
   parallel=4

...

21-OCT-24 05:30:36.813: Job "SYSTEM"."SYS_IMPORT_SCHEMA_01" successfully completed at Mon Oct 21 05:30:36 2024 elapsed 0 00:11:50

Almost 12 minutes!

Partitioning

Since we know that multiple workers can process different partitions of the same table, let’s try to partition the source table. I’ll use hash partitioning and ensure my partitions are equally distributed:

alter table tab1 
modify partition by hash (id) 
partitions 32 online;

Repeat the import:

impdp ... \
   network_link=srclnk \
   schemas=blobload \
   parallel=4

...

21-OCT-24 09:08:00.897: Job "SYSTEM"."SYS_IMPORT_SCHEMA_01" successfully completed at Mon Oct 21 09:08:00 2024 elapsed 0 00:04:26

Just 4m 26s – that’s a huge improvement!

In the log file, you’ll see that multiple workers are processing partitions individually. So, even without parallel query, I get parallelism because of multiple workers on the same table – each on different partitions.

But partitioning is a separately licensed option.

Using QUERY Parameter and Multiple Data Pump Imports

I’ve previously blogged about do-it-yourself parallelism for Data Pump exports of BasicFile LOBs. Can I use the same approach here?

The idea is to start multiple Data Pump jobs importing the same table, but each working on a subset of the data.

  • First, import just the metadata
    impdp ... \
       network_link=srclnk \
       schemas=blobload \
       content=metadata_only
    
  • Next, start 4 concurrent imports that load just the rows. Each import works on a subset of the data using the QUERY parameter:
    impdp ... \
       network_link=srclnk \
       schemas=blobload \
       content=data_only \
       query="where mod(id, 4)=0"
    
    impdp ... \
       network_link=srclnk \
       schemas=blobload \
       content=data_only \
       query="where mod(id, 4)=1"
    
    impdp ... \
       network_link=srclnk \
       schemas=blobload \
       content=data_only \
       query="where mod(id, 4)=2"
    
    impdp ... \
       network_link=srclnk \
       schemas=blobload \
       content=data_only \
       query="where mod(id, 4)=3"
    

No – that’s not possible. During import, Data Pump uses the APPEND hint, which acquires an exclusive lock on the table being imported. This is from a trace of the imports:

INSERT /*+  APPEND  NESTED_TABLE_SET_REFS   PARALLEL(KUT$,1)   */ INTO "BLOBLOAD"."TAB1"  KUT$ ("ID", "BLOB_DATA")
SELECT /*+ NESTED_TABLE_GET_REFS  PARALLEL(KU$,1)  */ "ID", "BLOB_DATA" FROM "BLOBLOAD"."TAB1"@srclnk KU$ WHERE mod(id, 4)=1

If you try to start multiple imports into the same table, you get an error:

ORA-02049: timeout: distributed transaction waiting for lock

So, let’s prevent that by adding data_options=disable_append_hint to each Data Pump import job.
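As an illustration, one of the four concurrent jobs would then look like this (same parameters as before, plus the extra option):

impdp ... \
   network_link=srclnk \
   schemas=blobload \
   content=data_only \
   query="where mod(id, 4)=1" \
   data_options=disable_append_hint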

Now, multiple Data Pump jobs may work on the same table, but it doesn’t scale linearly.

  • One concurrent job: Around 12 minutes
  • Four concurrent jobs: Around 8 minutes
  • Eight concurrent jobs: Around 7 minutes

It gives a performance benefit, but probably not as much as you’d like.

Two-Step Import

If I can’t import into the same table, how about starting four simultaneous Data Pump jobs using the do-it-yourself approach above, but importing into separate tables and then combining all the tables afterward?

I’ll start by loading 1/4 of the rows (notice the QUERY parameter):

impdp ... \
   network_link=srclnk \
   schemas=blobload \
   query=\(blobload.tab1:\"WHERE mod\(id, 4\)=0\"\)

While that runs, I’ll start three separate Data Pump jobs that each work on a different 1/4 of the data. I’m remapping the table into a new table to avoid the locking issue:

impdp ... \
   network_link=srclnk \
   schemas=blobload \
   include=table \
   remap_table=tab1:tab1_2 \
   query=\(blobload.tab1:\"WHERE mod\(id, 4\)=1\"\)

In the remaining two jobs, I’ll slightly modify the QUERY and REMAP_TABLE parameters:

impdp ... \
   network_link=srclnk \
   schemas=blobload \
   include=table \
   remap_table=tab1:tab1_3 \
   query=\(blobload.tab1:\"WHERE mod\(id, 4\)=2\"\)
impdp ... \
   network_link=srclnk \
   schemas=blobload \
   include=table \
   remap_table=tab1:tab1_4 \
   query=\(blobload.tab1:\"WHERE mod\(id, 4\)=3\"\)

Now, I can load the rows from the three staging tables into the real one:

ALTER SESSION FORCE PARALLEL DML;
INSERT /*+ APPEND PARALLEL(a) */ INTO "BLOBLOAD"."TAB1" a 
SELECT /*+ PARALLEL(b)  */ * FROM "BLOBLOAD"."TAB1_2" b;
commit;

INSERT /*+ APPEND PARALLEL(a) */ INTO "BLOBLOAD"."TAB1" a 
SELECT /*+ PARALLEL(b)  */ * FROM "BLOBLOAD"."TAB1_3" b;
commit;

INSERT /*+ APPEND PARALLEL(a) */ INTO "BLOBLOAD"."TAB1"  a
SELECT /*+ PARALLEL(b)  */ * FROM "BLOBLOAD"."TAB1_4" b;
commit;

This approach took around 7 minutes (3m 30s for the Data Pump jobs, and 3m 30s to load the rows into the real table). Slower than partitioning but still faster than the starting point.

This approach is complicated; the more data you have, the more you need to consider things like transaction size and index maintenance.

Conclusion

Network mode imports have many restrictions, which also affect performance. Partitioning is the easiest and fastest improvement, but it requires the appropriate license option. The final resort is to perform some complicated data juggling.

Alternatively, abandon network mode imports and use dump files. In dump file mode, one worker can use parallel query during export and import, which is also fast.

Thanks

I used an example from oracle-base.com to generate the test data.

VIEWS_AS_TABLES – A Hidden Data Pump Gem

VIEWS_AS_TABLES is a neat feature in Oracle Data Pump that allows you to export the contents of a view and import them as tables.

The idea is to export a view as if it were a table. The dump file contains a table definition:

  • With the same name as the view
  • With the same columns as the view
  • With the same data as the view

Show Me

  1. In the source database, you create a view, or you can use an existing view:
SQL> create view sales as select * from all_objects;
  2. Next, you export that view as a table:
$ expdp ... views_as_tables=sales
  3. In the target database, you import:
$ impdp ...
  4. The view is now a table:
SQL> select object_type from all_objects where object_name='SALES';

OBJECT_TYPE
-----------
TABLE

When To Use It

  • Faster import of data over a database link when using the QUERY parameter. Normally, the predicate in the QUERY parameter is evaluated on the target database, so during a Data Pump import over a database link, all rows are retrieved from the source database. Then, the QUERY parameter is applied to filter the rows. This is inefficient if you select a smaller portion of a larger table. By using VIEWS_AS_TABLES, the filtering happens on the source database, which might speed up the import dramatically (see the sketch after this list).

  • Customized data export. Another case I worked on involved a system where the user must be able to extract certain data in a format of their choosing. The user could define a view, export it, and import it into their local database for further processing. The view could:

    • Include various columns and the user can decide the ordering and column names.
    • Join tables to create a more complete data set.
    • Translate columns with domain values to text (like 1 being NEW, 2 being IN PROGRESS and so forth).
    • De-normalize data to make it more human-readable.
    • Format dates and numbers according to the user’s NLS settings.
  • Transform tables as part of a migration. I’ve also seen some customers perform powerful transformations to data while the data was migrated. There are a lot of transformations already in Data Pump, but in these cases, the customer had more advanced requirements.
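Here is a minimal sketch of the first use case above. The view name and predicate are made up for illustration, and srclnk is an assumed database link pointing at the source database. First, define the filter in a view on the source:

create view sales_subset as
   select * from sales
   where  object_id > 1000;

Then, import the view directly as a table over the link:

impdp ... \
   network_link=srclnk \
   views_as_tables=sales_subset

The predicate is evaluated inside the view on the source database, so only the matching rows travel over the database link.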

The Details

  • You can use VIEWS_AS_TABLES in all modes: full, tablespace, schema, and table.
  • The table has the same name as the view. But you can also use the REMAP_TABLE option in Data Pump to give it a new name.
  • During export, Data Pump:
    1. Creates an empty table with the same structure as the view (select * from <view> where rownum < 1).
    2. Exports the table metadata
    3. Unloads data from the view
    4. Drops the interim table
  • Data Pump also exports objects that depend on the view, like grants. On import, Data Pump adds those grants to the table.

Conclusion

A powerful feature that might come in handy one day to transform your data or boost the performance of network link imports.

Leave a comment and let me know how you used the feature.

How to Speed up Data Pump Imports Using NOVALIDATE Constraints

If you want to save time during a Data Pump import, you can transform constraints to NOT VALIDATED. Regardless of the constraint state in the source database, Data Pump will create the constraint using the novalidate keyword.

This can dramatically reduce the time it takes to import. But be aware of the drawbacks.

The Problem

Here is an example from a big import:

01-JAN-24 08:19:00.257: W-38 Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
01-JAN-24 18:39:54.225: W-122      Completed 767 CONSTRAINT objects in 37253 seconds
01-JAN-24 18:39:54.225: W-122      Completed by worker 1 767 CONSTRAINT objects in 37253 seconds

There is only one worker processing constraints, and it took more than 10 hours to add 767 constraints. Ouch!

A Word About Constraints

Luckily, most databases use constraints extensively to enforce data quality. A constraint can be:

  • VALIDATED
    • All data in the table obeys the constraint.
    • The database guarantees that data is good.
  • NOT VALIDATED
    • All data in the table may or may not obey the constraint.
    • The database does not know if the data is good.

When you create a new constraint using the VALIDATE keyword (which is also the default), the database runs a recursive query that full scans the entire table to ensure the existing data is good. If you add more constraints, the database full scans each time. Since full table scans rarely make it into the buffer cache, each new constraint causes a lot of physical reads.

How Does Data Pump Add Constraints

Data Pump adds the constraints in the same state as in the source. As mentioned above, the constraints are most likely VALIDATED.

During import, Data Pump:

  1. Creates an empty table
  2. Loads data
  3. Adds dependent objects, like constraints

It could look like this in a simplified manner:

Example of Data Pump importing a table

Each of the alter table ... add constraint commands triggers a full table scan because of the validate keyword. For a large database, this really hurts, especially because the full scan does not run in parallel.

The Solution

The idea is to add the constraints as NOT VALIDATED but still ENABLED.

  • NOT VALIDATED means the database doesn’t check the existing data
  • ENABLED means the database enforces the constraints for new data

In Data Pump, there is a simple transformation:

impdp ... transform=constraint_novalidate:y

Data Pump adds all constraints using the novalidate keyword regardless of the state in the source database.

Adding constraints using NOVALIDATE keyword

Instead of a full table scan for each new constraint, the alter table ... add constraint command is instant. It’s just a short write to the data dictionary, and that’s it. No full table scan.
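To illustrate the difference, here is a sketch of the two variants of the DDL (the table, column, and constraint names are made up):

-- Default behavior: validate existing rows, which full scans the table
alter table app.orders add constraint orders_amount_ck
   check (amount >= 0) enable validate;

-- With transform=constraint_novalidate:y: a dictionary-only change, no scan
alter table app.orders add constraint orders_amount_ck
   check (amount >= 0) enable novalidate;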

This transformation requires Oracle Database 19c, Release Update 23 with the Data Pump Bundle Patch.

Update: A few readers ran into the following error when using the feature:

ORA-39001: invalid argument value
ORA-39042: invalid transform name CONSTRAINT_NOVALIDATE

Unfortunately, you need to add patch 37280692 as well. It’s included in 19.27.0 Data Pump Bundle Patch.

Is It Safe To Use?

Yes. There is no chance that this feature corrupts your data. Further, you know that data was good in the source, so it will be good in the target database as well.

However, take care when you change data on import. The changed data might no longer satisfy a constraint, and you won’t know until you eventually perform the validation. The data is still perfectly fine; however, the constraint would need to be altered to match the new data.

Imagine the following:

  • You are importing into a different character set – from singlebyte to Unicode.
  • One of your constraints checks the length of a text using byte semantics with the function LENGTHB.
  • After import into the Unicode database, some characters may take up two bytes or more.
  • The result of the LENGTHB function would change and you would need to update the constraint definition. Either by allowing more bytes or using LENGTH or LENGTHC.

Let me give you an example:

  • In my singlebyte database (WE8MSWIN1252), I have a table with these two rows:
    • ABC
    • ÆØÅ (these are special Danish characters)
  • In singlebyte all characters take up one byte, so
    • LENGTHB('ABC') = 3
    • LENGTHB('ÆØÅ') = 3
  • I migrate to Unicode, and now the special Danish characters expand. They take up more space in AL32UTF8:
    • LENGTHB('ABC') = 3
    • LENGTHB('ÆØÅ') = 6
  • If I have a check constraint using the LENGTHB function, I would need to take this into account (see the sketch below). Plus, there are other similar functions that work on bytes instead of characters, like SUBSTRB.
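Here is a small sketch of such a constraint; the table, column, and constraint names are made up:

-- Source database (WE8MSWIN1252): 'ÆØÅ' is 3 bytes, so this validates fine
alter table app.customers add constraint cust_code_len_ck
   check (lengthb(cust_code) <= 3);

-- Target database (AL32UTF8): 'ÆØÅ' is now 6 bytes, so validating the
-- imported constraint fails. Drop it and re-add it, for example with
-- character semantics:
alter table app.customers drop constraint cust_code_len_ck;
alter table app.customers add constraint cust_code_len_ck
   check (length(cust_code) <= 3);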

It’s probably rare to see check constraints using byte semantic functions, like LENGTHB and SUBSTRB. But I’ve seen that in some systems that had to integrate with other systems.

You can end up in a similar situation if you:

  • use the remap_data option to change data
  • perform other kinds of data transformation

Since the constraint is still enabled, the database still enforces the constraint for new data after the import.

What’s the Catch?

Validated constraints are very useful because they enable the optimizer to perform query rewrite and potentially improve query performance. Also, with a validated constraint, an index access path might become available instead of a full table scan.

You want to get those constraints validated. But you don’t have to do it during the import. Validating an enabled, not validated constraint does not require a lock on the table. Thus, you can postpone the validation to a later time in your maintenance window, and you can perform other activities at the same time (like backup). Perhaps you can validate constraints while users are testing the database. Or wait until the next maintenance window.
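To find everything that still needs validation after the import, a sketch like this generates the statements (adjust the owner filter; APP is a made-up schema):

select 'alter table "' || owner || '"."' || table_name ||
       '" modify constraint "' || constraint_name || '" validate;' as stmt
from   dba_constraints
where  status    = 'ENABLED'
and    validated = 'NOT VALIDATED'
and    owner     = 'APP'
order  by table_name, constraint_name;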

Further, Data Pump always adds validated constraints in these circumstances:

  • On DEFAULT ON NULL columns
  • Used by a reference partitioned table
  • Used by a reference partitioned child table
  • Table with Primary key OID
  • Used as clustering key on a clustered table

What About Rely

After import, you could manually add the rely clause:

alter table ... modify constraint ... rely;

Rely tells the database that you know the data is good. The optimizer still doesn’t trust you until you set the parameter QUERY_REWRITE_INTEGRITY to TRUSTED. Then, the optimizer can benefit from some query rewrite options, but not all of them.
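For example (the table and constraint names are hypothetical):

alter table app.orders modify constraint orders_fk rely;
alter session set query_rewrite_integrity = trusted;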

Nothing beats a truly validated constraint!

Validate Constraints Using Parallel Query

Since you eventually want to validate the constraints, Connor McDonald made a video showing how you can do that efficiently using parallel query:

Changing the default parallel degree (as shown in the video) might be dangerous in a running system.

  • All other queries will also run in parallel
  • ALTER TABLE might lead to cursor invalidation

So, here’s a better approach (thanks Connor):

alter session force parallel query;
alter table ... modify constraint ... enable validate;

The validation happens:

  • Without table lock
  • In parallel
  • And with no cursor invalidation

Nice!

Final Words

If you’re short on time, consider adding constraints as not validated.

In the above case, more than 10 hours were spent adding validated constraints; with novalidate constraints, that could have been just a few seconds. That’s a huge difference in a time-critical migration.

Don’t forget to validate them at some point, because validated constraints are a benefit to the database.

Check my previous blog post for further details on constraint internals.

Appendix

Autonomous Database

The fix is also available in Autonomous Database, both 19c and 23ai.

Zero Downtime Migration

If you import via Zero Downtime Migration (ZDM) you need to add the following to your ZDM response file:

DATAPUMPSETTINGS_METADATATRANSFORMS-1=name:CONSTR_NOVALIDATE,value:1

You might have to change METADATATRANSFORMS-1 to METADATATRANSFORMS-<n> if you have additional transformations (where <n> is an incrementing number).

How to Trace Oracle Data Pump

If you ever encounter problems with Oracle Data Pump, you can use this recipe to get valuable tracing.

Over the years, I’ve helped many customers with Data Pump issues. The more information you have about a problem, the sooner you can come up with a solution. Here’s my list of things to collect when tracing a Data Pump issue.

Daniel’s Tracing Recipe

1. AWR

  • Be sure you have a proper license to use AWR.

  • Set the snapshot interval to 15 minutes and create a new snapshot:

    exec dbms_workload_repository.modify_snapshot_settings(null, 15);
    exec dbms_workload_repository.create_snapshot;
    
  • If you are on Multitenant, do so in the root container and in the PDB.

2. SQL Trace

  • Depending on the nature of the problem, you can enable SQL trace of the Data Pump processes:

    alter system set events 'sql_trace {process: pname = dw | process: pname = dm} level=8';
    
    • You can change the trace level as required.
    • dm is the Data Pump control process, dw are worker processes.
  • If you already know the SQL ID causing problems, you can enable tracing for just that SQL:

    alter system set events 'sql_trace[SQL: <sql-id>]';
    
    • Replace <sql-id> with the offending SQL ID.

3. Start Data Pump

  • Start the Data Pump job that you want to trace:
    expdp ... metrics=yes logtime=all trace=<trace-setting>
    impdp ... metrics=yes logtime=all trace=<trace-setting>
    

4. AWR

  • Be sure you have a proper license to use AWR.

  • When the Data Pump job completes, or after you have stopped it, reset the snapshot interval to the original value and create a new AWR snapshot:

    exec dbms_workload_repository.modify_snapshot_settings(null, <original-value>);
    exec dbms_workload_repository.create_snapshot;
    
  • Create an AWR report spanning the entire period:

    @?/rdbms/admin/awrrpt
    
    • If needed, you can later on create AWR reports for a shorter period.
  • If you are on Multitenant, do so in the root container and in the PDB.

Get the Information

Collect the following information:

  1. The Data Pump log file.
  2. AWR reports – on CDB and PDB level
  3. Data Pump trace files
    • Stored in the database trace directory
    • Control process file name: *dm*
    • Worker process file names: *dw*

This should be a great starting point for diagnosing your Data Pump problem.

What Else

  • Remember, you can use the Data Pump Log Analyzer to quickly generate an overview and to dig into the details.

  • Regarding the Data Pump parameters metrics=yes and logtime=all: you should always have those in your Data Pump jobs. They add very useful information at no extra cost. In Oracle, we are discussing whether these should be the default in a coming version of Data Pump.

Leave a comment and let me know your favorite way of tracing Oracle Data Pump.

Sure, Let Me Analyze This 200.000-Line Log File

Imagine importing a large database using Oracle Data Pump. In the end, Data Pump tells you success/failure and the number of errors/warnings encountered. You decide to have a look at the log file. How big is it?

$ du -h import.log
 29M   import.log

29 MB! How many lines?

$ wc -l import.log
  189931 import.log

Almost 200.000 lines!

How on earth can you digest that information and determine whether you can safely ignore the errors/warnings recorded by Data Pump?

Data Pump Logfile Analyzer

This is where Data Pump Logfile Analyzer (DPLA) can help you.

DPLA can summarize the log file into a simple report.

It can give you an overview of each type of error.

It can tell you where Data Pump spent the most time.

It can produce an interactive HTML report.

And so much more. It’s a valuable companion when you use Oracle Data Pump.

Tell Me More

DPLA is not an official Oracle tool.

It is a tool created by Marcus Doeringer. Marcus works for Oracle and is one of our migration superstars. He’s been involved in the biggest and most complicated migrations and knows the pain of digesting a 200.000-line log file.

He decided to create a tool to assist in the analysis of Data Pump log files. He made it available for free on his GitHub repo.

Give It a Try

Next time you have a Data Pump log file, give the tool a try. It’s easy to use, and the instructions come with good examples.

If you like it, be sure to star his repo. ⭐

If you can make it better, I’m sure Marcus would appreciate a pull request.

Thanks, Marcus, good job! 💪

Does Data Pump Export Automatic Data Optimization Policies and Heat Maps?

During our recent webinar on cross-platform migration, an attendee asked a good question:

Does Data Pump in a Full Transportable Export/Import move ADO policies and heat map information?

Let’s find out.

What Is It?

ADO stands for Automatic Data Optimization and is a feature that:

… automate the compression and movement of data between different tiers of storage within the database.

ADO uses heat maps that:

… provides data access tracking at the segment-level and data modification tracking at the segment and row level.

In other words,

  • You might have slower and faster storage.
  • You define ADO policies on tables and indexes (including partitions and subpartitions).
  • The policies define in which storage the database stores the rows.
  • The heat maps tell which objects you access.
  • Based on the heat maps, a job automatically moves rows between storage tiers and/or compresses them, depending on usage.

What Happens in Data Pump

There are three pieces of essential information:

  1. ADO policies. Defined on objects, they govern how the database should store data.
  2. Heat map information. The database records your usage of the data and stores the information in heat maps. Later on, the database uses the heat maps to determine what to do with your data.
  3. ILM settings. Tells the database how to perform the ILM (Information Lifecycle Management) maintenance. For example, you can define the parallel degree that the database uses during maintenance or the number of concurrent maintenance jobs.

You can find a test case at the end of the blog post. I’m using test data and queries from oracle-base.com.

ADO Policies

Data Pump transfers the ADO policies in all modes:

  • Full transportable
  • Full
  • Schema
  • Tablespace
  • Table

You can verify it:

SELECT   policy_name,
         object_owner,
         object_name,
         object_type,
         inherited_from,
         enabled,
         deleted
FROM     dba_ilmobjects
ORDER BY 1;

Heat Map Information

Data Pump does not transfer any heat map information. I didn’t find any way to move the heat map information to the new database.

ILM Settings

Data Pump does not transfer ILM settings – not even in full modes. You must manually move the settings.

  • You can find the current ILM settings:

    SELECT * FROM DBA_ILMPARAMETERS;
    
  • And change the settings in the target database:

    EXEC DBMS_ILM_ADMIN.CUSTOMIZE_ILM(DBMS_ILM_ADMIN.DEG_PARALLEL , 4);
    
    • You must translate the name of the setting to the corresponding PL/SQL constant. You change the setting DEGREE OF PARALLELISM by using DBMS_ILM_ADMIN.DEG_PARALLEL.
  • The documentation holds a complete list of ILM settings.

Appendix

Further Reading

Test Data

I am using our hands-on lab, which you can run on Oracle LiveLabs. Again, thanks to oracle-base.com for test data and queries.

CONN / AS SYSDBA
--Turn on heat map tracking
ALTER SYSTEM SET HEAT_MAP=ON SCOPE=BOTH;

--Customize ILM settings
BEGIN
  DBMS_ILM_ADMIN.CUSTOMIZE_ILM(DBMS_ILM_ADMIN.RETENTION_TIME, 42);
  DBMS_ILM_ADMIN.CUSTOMIZE_ILM(DBMS_ILM_ADMIN.JOBLIMIT, 42);
  DBMS_ILM_ADMIN.CUSTOMIZE_ILM(DBMS_ILM_ADMIN.ABS_JOBLIMIT, 42);
  DBMS_ILM_ADMIN.CUSTOMIZE_ILM(DBMS_ILM_ADMIN.DEG_PARALLEL , 4);
END;
/
SELECT * FROM DBA_ILMPARAMETERS;

--Create tablespaces for ILM policy
CREATE TABLESPACE FAST_STORAGE_TS DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
CREATE TABLESPACE MEDIUM_STORAGE_TS DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
CREATE TABLESPACE SLOW_STORAGE_TS DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
CREATE USER TEST IDENTIFIED BY TEST QUOTA UNLIMITED ON USERS;
GRANT CREATE SESSION, CREATE TABLE TO TEST;
ALTER USER TEST QUOTA UNLIMITED ON FAST_STORAGE_TS;
ALTER USER TEST QUOTA UNLIMITED ON MEDIUM_STORAGE_TS;
ALTER USER TEST QUOTA UNLIMITED ON SLOW_STORAGE_TS;

CONN TEST/TEST

--Create table where we can track usage
CREATE TABLE t1 (
  id          NUMBER,
  description VARCHAR2(50),
  CONSTRAINT t1_pk PRIMARY KEY (id)
);
INSERT INTO t1
SELECT level,
       'Description for ' || level
FROM   dual
CONNECT BY level <= 10;
COMMIT;

--Generate "usage"
SELECT *
FROM   t1;
SELECT *
FROM   t1
WHERE  id = 1;

--Create table for ILM policy
CREATE TABLE invoices (
  invoice_no    NUMBER NOT NULL,
  invoice_date  DATE   NOT NULL,
  comments      VARCHAR2(500)
)
PARTITION BY RANGE (invoice_date)
(
  PARTITION invoices_2016_q1 VALUES LESS THAN (TO_DATE('01/04/2016', 'DD/MM/YYYY')) TABLESPACE slow_storage_ts,
  PARTITION invoices_2016_q2 VALUES LESS THAN (TO_DATE('01/07/2016', 'DD/MM/YYYY')) TABLESPACE slow_storage_ts,
  PARTITION invoices_2016_q3 VALUES LESS THAN (TO_DATE('01/09/2016', 'DD/MM/YYYY')) TABLESPACE medium_storage_ts
    ILM ADD POLICY TIER TO slow_storage_ts READ ONLY SEGMENT AFTER 6 MONTHS OF NO ACCESS,
  PARTITION invoices_2016_q4 VALUES LESS THAN (TO_DATE('01/01/2017', 'DD/MM/YYYY')) TABLESPACE medium_storage_ts
    ILM ADD POLICY TIER TO slow_storage_ts READ ONLY SEGMENT AFTER 6 MONTHS OF NO ACCESS,
  PARTITION invoices_2017_q1 VALUES LESS THAN (TO_DATE('01/04/2017', 'DD/MM/YYYY')) TABLESPACE fast_storage_ts
    ILM ADD POLICY TIER TO medium_storage_ts READ ONLY SEGMENT AFTER 3 MONTHS OF NO ACCESS,
  PARTITION invoices_2017_q2 VALUES LESS THAN (TO_DATE('01/07/2017', 'DD/MM/YYYY')) TABLESPACE fast_storage_ts
    ILM ADD POLICY TIER TO medium_storage_ts READ ONLY SEGMENT AFTER 3 MONTHS OF NO ACCESS
)
ILM ADD POLICY ROW STORE COMPRESS BASIC SEGMENT AFTER 3 MONTHS OF NO ACCESS;

Test Case

###############
# FULL EXPORT #
###############

export ORAENV_ASK=NO
export ORACLE_SID=UP19
. oraenv
export ORAENV_ASK=YES
rm /tmp/expdat.dmp
sql / as sysdba<<EOF
   create or replace directory expdir as '/tmp';
EOF
expdp system/oracle full=y directory=expdir job_name=fullexp

export ORAENV_ASK=NO
export ORACLE_SID=CDB2
. oraenv
export ORAENV_ASK=YES
sql / as sysdba<<EOF
    startup
    alter pluggable database pdb1 close immediate;
    drop pluggable database pdb1 including datafiles;
    create pluggable database pdb1 admin user admin identified by admin;
    alter pluggable database all open;
    alter session set container=pdb1;
    create tablespace users DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
    CREATE TABLESPACE fast_storage_ts DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
    CREATE TABLESPACE medium_storage_ts DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
    CREATE TABLESPACE slow_storage_ts DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
    create or replace directory expdir as '/tmp';
EOF

impdp system/oracle@pdb1 directory=expdir

sql / as sysdba<<EOF
    alter session set container=pdb1;
    select * from DBA_ILMPARAMETERS;
    SELECT policy_name,
       object_owner,
       object_name,
       object_type,
       inherited_from,
       enabled,
       deleted
    FROM   dba_ilmobjects
    ORDER BY 1;
    SELECT track_time,
        owner,
        object_name,
        segment_write,
        full_scan,
        lookup_scan
    FROM   dba_heat_map_seg_histogram
    ORDER BY 1, 2, 3;
EOF

#################
# SCHEMA EXPORT #
#################

export ORAENV_ASK=NO
export ORACLE_SID=UP19
. oraenv
export ORAENV_ASK=YES
rm /tmp/expdat.dmp
sql / as sysdba<<EOF
   create or replace directory expdir as '/tmp';
EOF
expdp system/oracle schemas=test directory=expdir job_name=schemaexp

export ORAENV_ASK=NO
export ORACLE_SID=CDB2
. oraenv
export ORAENV_ASK=YES
sql / as sysdba<<EOF
    startup
    alter pluggable database pdb1 close immediate;
    drop pluggable database pdb1 including datafiles;
    create pluggable database pdb1 admin user admin identified by admin;
    alter pluggable database all open;
    alter session set container=pdb1;
    create tablespace users DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
    CREATE TABLESPACE fast_storage_ts DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
    CREATE TABLESPACE medium_storage_ts DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
    CREATE TABLESPACE slow_storage_ts DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
    create or replace directory expdir as '/tmp';
EOF

impdp system/oracle@pdb1 directory=expdir

sql / as sysdba<<EOF
    alter session set container=pdb1;
    select * from DBA_ILMPARAMETERS;
    SELECT policy_name,
       object_owner,
       object_name,
       object_type,
       inherited_from,
       enabled,
       deleted
    FROM   dba_ilmobjects
    ORDER BY 1;
    SELECT track_time,
        owner,
        object_name,
        segment_write,
        full_scan,
        lookup_scan
    FROM   dba_heat_map_seg_histogram
    ORDER BY 1, 2, 3;
EOF

################
# TABLE EXPORT #
################

export ORAENV_ASK=NO
export ORACLE_SID=UP19
. oraenv
export ORAENV_ASK=YES
rm /tmp/expdat.dmp
sql / as sysdba<<EOF
   create or replace directory expdir as '/tmp';
EOF
expdp system/oracle tables=test.t1,test.invoices directory=expdir job_name=tabexp

export ORAENV_ASK=NO
export ORACLE_SID=CDB2
. oraenv
export ORAENV_ASK=YES
sql / as sysdba<<EOF
    startup
    alter pluggable database pdb1 close immediate;
    drop pluggable database pdb1 including datafiles;
    create pluggable database pdb1 admin user admin identified by admin;
    alter pluggable database all open;
    alter session set container=pdb1;
    create tablespace users DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
    CREATE TABLESPACE fast_storage_ts DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
    CREATE TABLESPACE medium_storage_ts DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
    CREATE TABLESPACE slow_storage_ts DATAFILE SIZE 1M AUTOEXTEND ON NEXT 1M;
    create or replace directory expdir as '/tmp';
    CREATE USER TEST IDENTIFIED BY TEST QUOTA UNLIMITED ON USERS;
    GRANT CREATE SESSION, CREATE TABLE TO TEST;
    ALTER USER TEST QUOTA UNLIMITED ON FAST_STORAGE_TS;
    ALTER USER TEST QUOTA UNLIMITED ON MEDIUM_STORAGE_TS;
    ALTER USER TEST QUOTA UNLIMITED ON SLOW_STORAGE_TS;
EOF

impdp system/oracle@pdb1 directory=expdir

sql / as sysdba<<EOF
    alter session set container=pdb1;
    select * from DBA_ILMPARAMETERS;
    SELECT policy_name,
       object_owner,
       object_name,
       object_type,
       inherited_from,
       enabled,
       deleted
    FROM   dba_ilmobjects
    ORDER BY 1;
    SELECT track_time,
        owner,
        object_name,
        segment_write,
        full_scan,
        lookup_scan
    FROM   dba_heat_map_seg_histogram
    ORDER BY 1, 2, 3;
EOF

How to Upgrade Oracle Database and Replace the Operating System

A reader asked me for advice on upgrading Oracle Database and replacing the underlying operating system.

  • Currently on Oracle Database 12.1.0.2
  • Currently on Windows Server 2012
  • Upgrade to Oracle Database 19c
  • Move to new servers with Microsoft Windows Server 2022

What’s the recommended approach for transitioning to Oracle 19c on Windows 2022?

Oracle Data Guard

My first option is always Oracle Data Guard. It is often a superior option. You move the entire database, and the only interruption is a Data Guard switchover.

In this case, where the reader needs to replace the operating system, the first thing to check is platform certifications. Always check platform certifications on My Oracle Support. It has the most up-to-date information.

Here is an overview of the platform certification for Oracle Database 12.1.0.2 and 19c.

Oracle Database 12.1.0.2      Oracle Database 19c
Windows Server 2008
Windows Server 2008 R2
Windows Server 2012
Windows Server 2012 R2        Windows Server 2012 R2
                              Windows Server 2016
                              Windows Server 2019
                              Windows Server 2022

Oracle Database 19c does not support the current platform, Windows Server 2012. Thus, the reader can’t set up a standby database on the new servers and transition via a regular switchover.

Windows Server 2012 R2

Let’s imagine the current servers were using Windows Server 2012 R2. Both database releases support this platform. I would recommend this approach:

  1. Upgrade to Oracle Database 19c on current servers.
  2. Set up new servers with Windows Server 2022.
  3. Create standby database on new server.
  4. Transition to the new servers with a regular Data Guard switchover.

This approach requires two maintenance windows. Yet, it is still my favorite because it is very simple.

RMAN Backups

You could also use RMAN and incremental backups. You don’t need much downtime – just the time necessary for a final incremental backup and restore. Like with Data Guard, you bring over the entire database.

RMAN can restore backups from a previous version, and you can use that to your advantage.

  1. Provision new servers with just Oracle Database 19c.
  2. Back up the database on 12.1.0.2.
  3. Restore and recover the database on the new servers with Oracle Database 19c binaries.
  4. After the final incremental backup, open the new database in upgrade mode and perform the upgrade.

We covered this approach in one of our webinars; you can also find details in this blog post.

Move Storage

You can also unmount the storage from the old server, and attach it to the new server.

  1. Run AutoUpgrade in analyze mode to determine upgrade readiness.
  2. Downtime starts.
  3. Run AutoUpgrade in fixup mode to fix any issues preventing the upgrade from starting.
  4. Cleanly shut down the source database.
  5. Move the storage to the new server.
  6. Start the database on the new server in upgrade mode.
  7. Start AutoUpgrade in upgrade mode to complete the upgrade.

This is just a high-level overview. For a real move, there are many more intermediate steps.
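As a rough sketch of steps 1, 3, and 7, an AutoUpgrade config file and the corresponding commands could look like this. The homes, SID, and file names are made up; you run analyze and fixups on the old server and the upgrade phase on the new server after moving the storage:

# db.cfg (hypothetical paths and SID)
global.autoupg_log_dir=C:\autoupgrade\logs
upg1.source_home=C:\oracle\product\12.1.0.2\dbhome_1
upg1.target_home=C:\oracle\product\19.0.0\dbhome_1
upg1.sid=ORCL

# Steps 1 and 3, on the old server
java -jar autoupgrade.jar -config db.cfg -mode analyze
java -jar autoupgrade.jar -config db.cfg -mode fixups

# Step 7, on the new server
java -jar autoupgrade.jar -config db.cfg -mode upgrade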

Be sure to have a proper rollback plan. You are reusing the data files, and AutoUpgrade in upgrade mode does not create a guaranteed restore point.

Data Pump

Data Pump is also a viable option, especially for smaller, less complex databases. It also enables you to restructure your database, for example:

  • Transform old BasicFile LOBs to SecureFile
  • Implement partitioning
  • Exclude data (for archival)
  • You can import directly into a higher release and even directly into a PDB.

But – the larger the database, the longer the downtime (generally speaking).

When you use Data Pump for upgrades, I recommend using a full database export.

Full Transportable Export/Import

You can also use transportable tablespaces for upgrades. You can even migrate directly into a PDB on Oracle Database 19c.

The downside of transportable tablespaces is that you must copy the data files to the new system.

But often, you can unmount the storage and mount it on the new servers, avoiding the cumbersome process of copying the data files.

Another approach is to combine transportable tablespaces with incremental backups, if you want to lower the downtime needed. This approach leaves the original database untouched, leaving you with a perfect rollback option.

Oracle GoldenGate

You could also use Oracle GoldenGate. But for most upgrades, it is overkill, partly because of the restrictions and considerations. I see this as a sensible option only if you have very strict downtime or fallback requirements.

Conclusion

What is the best option?

It depends…

This post helps you make the best decision for your organization.

How to Fix ORA-14519 Conflicting Tablespace Blocksizes During Oracle Data Pump Import

Recently, our team has received multiple questions about an error occurring during an Oracle Data Pump import:

ORA-14519: Conflicting tablespace blocksizes for table : Tablespace <string> block size <number> [partition specification] conflicts with previously specified/implied tablespace <string> block size <number> [object-level default]

Data Pump can’t import a partitioned table unless the table itself and all its partitions are stored in tablespaces with the same block size.

What is the Problem?

In almost all cases, you must store a partitioned table and all its partitions in tablespaces with the same block size. The documentation states:

Use caution when creating partitioned objects in a database with tablespaces of different block sizes. The storage of partitioned objects in such tablespaces is subject to some restrictions. Specifically, all partitions of the following entities must reside in tablespaces of the same block size:

  • Conventional tables
  • Indexes

You find the same restriction in Common Questions and Errors on Partition Operations Involving Tablespaces of Different Block Sizes (Doc ID 1401064.1):

Q: Can I store different partitions of the same table in tablespaces of different block sizes?
A: No. For each conventional table (except index-organized tables), all partitions of that table must be stored in tablespaces with the same block size.

In the cases we looked at, Data Pump import recreates the partitioned table as it was defined in the source database. But the definition of the partitioned table is invalid – it spans tablespaces of different block size. The target database rejects the CREATE TABLE statement.

Solution

Fix in Source Database

In My Oracle Support note IMPORT OF PARTITIONED TABLE IN NON DEFAULT BLOCKSIZE TABLESPACE FAILS WITH ORA-14519 (Doc ID 272229.1), the specified solution is to fix the invalid table in the source database. Often, users don’t want to change the source database. Such a change might have to go through a change management process, or the user might not be willing to take the risk of changing things in the source database.

The solution works, but it’s not ideal.

Fix in Target Database

My colleague, Klaus Gronau, came up with a better solution. Fix it during import in the target database. The example below uses a schema-based Data Pump job.

  1. Extract the definition of the offending table and schema:
    impdp ... include="table:""in('MYPARTTAB1')""" \
              include="user:""in('APPUSER')""" \
              include=tablespace_quota \
              sqlfile=myparttab1.sql
    
    • The SQL file also contains objects depending on the table like constraints and indexes.
  2. Edit the SQL file:
    vi myparttab1.sql
    
    • Ensure that the tablespace of the table itself matches the tablespaces used by the partitions. Check the tablespace clauses.
    • Optionally, you can move the constraint and index creation out of the file. After the next Data Pump import, you can add the constraints and indexes. This might speed up the process. You might even change the parallel degree on the index creation to create it faster.
  3. Create the user, quotas, partitioned table and dependent objects using the SQL file.
    @myparttab1.sql
    
  4. Perform the Data Pump import:
    impdp ... table_exists_action=truncate \
              data_options=trust_existing_table_partitions
    
    • The table_exists_action tells Data Pump to truncate any existing tables. The partitioned table is already empty since we just created it.
    • The data_options parameter instructs Data Pump to trust that the definition of the partitioned table matches that stored in the dump file. Data Pump does not perform any validation of the table, which is fine because we just created the table using the definition from the dump file.

Word of advice:

  • This method uses the exclude and include clauses to filter out the offending table. If the dump file contains a table of the same name, but in a different schema, then the filter applies to both those tables. The filter does not take the schema into account. It will match all tables, in any schema, with the name myparttab1.

Transportable Tablespace

If you move data with transportable tablespaces, there is no way to change the table definition on import. You must fix the problem in the source database.

Appendix

Thanks

A big shoutout to my colleague, Klaus Gronau, who provided the test case and solution for this issue.

How Can It Happen?

  • Observe the following test case:

    alter system set db_32k_cache_size=30M scope=both;
    create tablespace users_8 datafile size 50m blocksize 8k;
    create tablespace users_32 datafile size 50m blocksize 32k;
    
    create user appuser identified by "appuser" default tablespace users_8;
    grant connect, create table to appuser;
    alter user appuser quota unlimited on users_8;
    alter user appuser quota unlimited on users_32;
    
    connect appuser/appuser
    
  • The following create table statement fails because of the different tablespace block sizes:

    create table myparttab1 (
       col1 number
    ) tablespace users_8 
    partition by range (col1) (
       partition myparttab1_1 values less than (100001) tablespace users_32 ,
       partition myparttab1_2 values less than (maxvalue) tablespace users_32
    );
    
    ERROR at line 5:
    ORA-14519: Conflicting tablespace blocksizes for table : Tablespace USERS_32
    block size 32768 [partition specification] conflicts with previously
    specified/implied tablespace USERS_8 block size 8192 [object-level default]
    
  • But if you remove the table tablespace specification, it works. See how I removed tablespace users_8. However, the user default tablespace is users_8. The table segment of the partitioned table inherits the user default. The table definition is now unsupported:

    create table myparttab1 (
       col1 number
    ) 
    partition by range (col1) (
       partition myparttab1_1 values less than (100001) tablespace users_32 ,
       partition myparttab1_2 values less than (maxvalue) tablespace users_32
    );
    
    Table created.
    
  • You can check the table and partition definition in the data dictionary:

    select 'TABLE' as type, 
           table_name as segment_name, 
           nvl(tablespace_name, (select default_tablespace from user_users))  as tablespace 
    from user_tables 
    where table_name='MYPARTTAB1'
    union all
    select 'PARTITION', 
           partition_name, 
           tablespace_name 
    from user_tab_partitions 
    where table_name='MYPARTTAB1';
    
    TYPE	  SEGMENT_NAME	  TABLESPACE
    --------- --------------- ---------------
    TABLE	  MYPARTTAB1	  USERS_8
    PARTITION MYPARTTAB1_1	  USERS_32
    PARTITION MYPARTTAB1_2	  USERS_32
    
  • We filed a bug to tighten the syntax validation in the create table statement. For now, store the table segment in the same tablespace as the partitions using the tablespace clause on the create table statement.
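Until then, a corrected version of the statement above, with the table-level tablespace spelled out so it matches the partitions, would be:

create table myparttab1 (
   col1 number
) tablespace users_32
partition by range (col1) (
   partition myparttab1_1 values less than (100001) tablespace users_32,
   partition myparttab1_2 values less than (maxvalue) tablespace users_32
);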

How to Trace Oracle Data Pump

Data Pump is hanging!

If you ever think the above, the answer is probably:

No, it’s not, it’s working…

It might be working in a suboptimal way, but nevertheless, it’s working. Here’s how you can tell what’s going on.

My Ultimate Tracing Guide

I use this approach every time I work with Oracle Data Pump and need to see what happens underneath the hood.

Before Data Pump

  • I always ensure dictionary and fixed objects statistics are current before starting Data Pump:

    begin
       dbms_stats.gather_schema_stats('SYS');
       dbms_stats.gather_schema_stats('SYSTEM');
       dbms_stats.gather_fixed_objects_stats;
    end;
    /
    
    • I usually don’t use gather_dictionary_stats. I prefer gathering schema stats instead. I’ve seen a few edge cases with gather_dictionary_stats over the years, but mostly it works fine. But you know, old dog and new tricks…
    • You should always follow this advice even if you are not tracing a specific problem.
  • I change the AWR snapshot interval to 15 minutes to get better granularity and manually create an AWR snapshot:

    begin
       dbms_workload_repository.modify_snapshot_settings(null, 15);
       dbms_workload_repository.create_snapshot;
    end;
    /
    
    
  • Optionally, if I need to diagnose a performance problem, I enable SQL trace for Data Pump processes:

    alter system set events 'sql_trace {process: pname = dw | process: pname = dm} level=8';
    
    • Or a specific SQL ID:
    alter system set events 'sql_trace[SQL: <sql-id>]';
    

Starting Data Pump

  • I add diagnostic information (metrics and logtime) to the log file and turn on tracing:
    expdp ... metrics=yes logtime=all trace=1FF0300
    impdp ... metrics=yes logtime=all trace=1FF0300
    
    • 1FF0300 turns on tracing for more or less everything in Data Pump, but there are other trace levels.

After Data Pump

  • I create a new AWR snapshot:
    begin
       dbms_workload_repository.create_snapshot;
    end;
    /
    
    • You might also want to reset the AWR snap interval to the previous setting.
  • Plus, an AWR report spanning the entire period.
    @?/rdbms/admin/awrrpt
    
  • Data Pump writes the tracing into the process trace files. You can find them in the database trace directory:
    • Control process trace files named *dm*.trc.
    • Worker process trace files named *dw*.trc.

Happy Debugging

Normally, the above information is enough to figure out the underlying problem. If you supply the above to Oracle Support when you open the next case, they’ll be super excited.

Further Reading