Oracle Data Pump and BFILEs

If you need to move data using Oracle Data Pump and have BFILEs in your database, what do you need to be aware of?

What Is a BFILE?

From the documentation:

BFILEs are data objects stored in operating system files, outside the database tablespaces. Data stored in a table column of type BFILE is physically located in an operating system file, not in the database. The BFILE column stores a reference to the operating system file. BFILEs are read-only data types. The database allows read-only byte stream access to data stored in BFILEs. You cannot write to or update a BFILE from within your application.

They are sometimes referred to as external LOBs.

You can store a BFILE locator in the database and use the locator to access the external data:

To associate an operating system file to a BFILE, first create a DIRECTORY object that is an alias for the full path name to the operating system file. Then, you can initialize an instance of BFILE type, using the BFILENAME function in SQL or PL/SQL …

In short, it is stuff stored outside the database that you can access from inside the database. Clearly, this requires special attention when you want to move your data.

How Do I Move It?

There are three things to consider:

  1. The file outside the database – in the operating system.
  2. The directory object.
  3. The BFILE locator stored in the table.

Table and Schema Mode Export

  1. You must copy the file in the operating system. Since a BFILE is read-only, you can copy the file before you perform the actual export.
  2. You must create the directory object. Directory objects are system-owned objects and not part of a table or schema mode export.
  3. Data Pump exports a BFILE locator together with the table. It exports the BFILE locator just like any other column. On import, Data Pump inserts the BFILE locator but performs no sanity checking. The database will not throw an error if the file is missing in the OS or if the directory is missing or erroneous.

Full Export

  1. Like table and schema mode, you must copy the file.
  2. Directory objects are part of a full export. On import, Data Pump creates a directory object with the same definition. If you place the external files in a different location in the target system, you must update the directory object.
  3. Like table and schema mode. Data Pump exports the BFILE locator as part of the table.

Do I Have BFILEs in My Database?

You can query the data dictionary and check if there are any BFILEs:

SQL> select owner, table_name 
     from dba_tab_cols 
     where data_type='BFILE';

Further Reading

New Webinars Coming Up

I am really excited to announce two new webinars:

  • Data Pump Best Practices and Real World Scenarios April 5, 2023, 16:00 CET
  • Release and Patching Strategies for Oracle Database 23c May 10, 2023, 16:00 CET

Oracle Database 19c Upgrade Virtual Classroom

You can sign up here.

The entire team, Roy, Mike, Bill, Rodrigo, and myself, are working hard to polish all the details.

Data Pump Best Practices and Real World Scenarios

In short: It’s all the stuff we couldn’t fit into our last Data Pump webinar.

Here’s the full abstract: > We promised to share more information in our last Data Pump Deep Dive With Development seminar. And here we are back again. Data Pump best practices is the topic we would like to emphasize on today. This will include some common tips and tricks but target especially parallel optimizations and transformation. It is quite common that you restructure objects and types when you use Data Pump for a migration. So we will give a detailed overview on the most common scenarios. This will guide us directly to real world scenarios where we’ll demonstrate several of those best practices used by customers.

Release and Patching Strategies for Oracle Database 23c

This is a revamped version of our very first webinar, Release and Patching Strategy. It’s updated to reflect the latest changes, and we have included even more details and demos.

Last time, the interest for this webinar was huge, and we ended up maxing out of Zoom capacity. A lot of you couldn’t get it. So, you better be ready on time, or you might miss your seat.

The full abstract: > This is a session every Oracle customer needs to attend to. Oracle Database 23c, the next long-term support release will be available sometime this year. Now it is time to refresh your knowledge about the best and most efficient strategies for your future release planning. Are there changes to the release numbering? Are there important changes regarding database patching? We will give you a complete overview on the available patch bundles and recent and future changes. We’ll discuss and showcase why a proper patching strategy is of vital importance – and how you can automate and optimize certain essential tasks.

But I Can’t Make It

Don’t worry. As usual, we will publish the recording on our YouTube channel and share the slides with you. Keep an eye out on my Webinars page.

But it’s better to watch it live. The entire team will be there, and we will answer all your questions. I promise you; we won’t leave until all questions have been answered.

All Tech, No Marketing

Remember, our mantra is: All tech, no marketing.

These webinars are technical. This is the place for you if you want all the gory details and cool demos.

I hope to see you there

Oracle Data Pump and Compression – Also Without a License

Whenever you use Data Pump to export from Oracle Database, you should use compression. It’s conveniently built into Data Pump.

Pros:

  • The dump file is much smaller:
    • Less disk space is needed.
    • Easier to transfer over the network.
  • Often it is faster to use compression when you measure the entire workflow (export, transfer, and import).
  • Imports are often faster because less data needs to be written from disk.

Cons:

How Do I Enable Data Pump Compression

You simply set COMPRESSION option:

$ expdp ... compression=all

You use COMPRESSION option only for exports. When you import, Data Pump handles it automatically.

You only need a license for Advanced Compression Option when you use compression during export. You don’t need a license to import a compressed dump file.

Medium Is a Good Compression Algorithm

I recommend you use the medium compression algorithm:

$ expdp ... compression=all compression_algorithm=medium

Our experience and tests show that it best balances between compression ratio and CPU.

Here are the results of a test my team did:

Algorithm File Size (MB) Compression Ratio Elapsed Time
NONE 5.800 1,0 2m 33s
BASIC 705 8,2 3m 03s
LOW 870 6,6 3m 11s
MEDIUM 701 8,2 3m 01s
HIGH 509 11,3 12m 16s

I would recommend high algorithm only if you need to transfer over a really slow network.

But I Don’t Have a License

gzip

You can still compress the dump file but not using Data Pump. Use OS utilities. In this case, I recommend splitting the dump file into pieces. It is easier to handle, and you can start transferring the dump files as they are compressed:

$ expdp ... filesize=5G dumpfile=myexp%L.dmp
$ gzip -r /u01/app/oracle/dpdir

Now, you transfer the files, uncompress and import:

[target]$ gunzip -r /u01/app/oracle/dpdir
[target]$ impdp ...

rsync

Another option is to use rsync. It has the option to compress the dump file over the network only:

$ expdp ... filesize=5G dumpfile=myexp%L.dmp
$ rsync -z ...

Cheatsheet

If you have the proper license, use Data Pump compression during export:

$ expdp ... compression=all compression_algorithm=medium

If you don’t have a license, compress the dump file over the wire only:

$ rsync -z ....

Don’t combine Data Pump compression and gzip/rsync! Compressing compressed stuff is not a good idea.

Why Does It Take so Long to Import Check Constraints?

Short answer: If the data was validated originally, it must be validated (again) during import. If you are short on time, you can add the constraints as NOT VALIDATED instead. But be aware of the drawbacks.

The Problem

I got a question from a colleague importing customer data as part of a migration. Data Pump was spending a lot of time in TABLE_EXPORT/TABLE/CONSTRAINT/CONSTRAINT.

We attached to the Data Pump session and used the status command:

$ impdp attach=<job_name>

Import> status
...
Worker 1 Status:
  ...
  State: EXECUTING                      
  Object Schema: SH
  Object Name: C_CUST2_EMAIL
  Object Type: TABLE_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
  Worker Parallelism: 1

In another session, we ran a new Data Pump import, but this time into a SQL file so we could examine the metadata for the object C_CUST2_EMAIL:

$ impdp parfile=<par_file_name> sqlfile=ddl.sql

$ cat ddl.sql | grep C_CUST2_EMAIL
ALTER TABLE "SH"."CUSTOMERS2" ADD CONSTRAINT "C_CUST2_EMAIL" CHECK (cust_email like '%@%') ENABLE;

Data Pump is adding a check constraint and enables it with the ENABLE clause. But there is a missing detail – the constraint is also validated even though there is no VALIDATE clause. It is the default.

What Happens

In the source database, the check constraint was enabled and validated. During import, Data Pump adds the constraint with the same attributes. The only way to add a validated constraint – is to validate it.

But if the constraint was validated during export, then we know data in the dump file is validated as well. When importing data that we know is validated, why do we need to validate it again?

Because we know, but the database doesn’t know it. In order for a check constraint to be marked as validated, the database must ensure that it is truly the case. It trusts no one and insists on validating the data.

And when you import data from a dump file, can you actually be sure all data is good? What if …

  • someone tampered with the dump file
  • data was changed during export using the remap option
  • data was changed as part of character set conversion
  • data was changed as part of timezone file conversion

Why Does It Takes So Long To Validate A Constraint

To answer that, let’s trace:

SQL> ALTER SESSION SET TRACEFILE_IDENTIFIER = "CONST";
SQL> ALTER SESSION SET EVENTS '10046 trace name context forever, level 4';
SQL> ALTER TABLE sh.customers2 ADD CONSTRAINT c_cust2_valid 
     CHECK (cust_valid in ('I', 'A')) 
     ENABLE VALIDATE;

Run the trace file through tkprof and have a look at the first recursive statement after the ALTER TABLE ... ADD CONSTRAINT statement:

select /*+ all_rows ordered dynamic_sampling(2) */ A.rowid, :1, :2, :3 
from "SH"."CUSTOMERS2" A 
where not ( cust_email like '%@%')

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1     49.03      66.61    3103448    3103510          1           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3     49.03      66.62    3103448    3103510          1           0

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100     (recursive depth: 3)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  TABLE ACCESS FULL CUSTOMERS2 (cr=3103510 pr=3103448 pw=0 time=66618486 us starts=1 cost=2 size=39 card=1)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file scattered read                      24274        0.20         22.45
  PGA memory operation                            1        0.00          0.00

The database checks the table for any existing data that violates the constraint. It does so by issuing a SELECT using the inverse function of the check constraint. This results in a full table scan, and that is what takes so long.

If you have multiple check constraints on the same table, Data Pump adds them one by one, and that results in multiple full table scans.

Since full table scans rarely populate the buffer cache, you end up with a lot of disk reading as well. You can verify that in the example above in the row source part. Physical reads (pr) are almost the same as consistent gets (cr). Almost every read requires a physical read.

But I Know Data Is Good

If you are sure that the data is good and you want to avoid the validation of data, you can create the check constraints as enabled but not validated. Existing data is not checked, but new data will be validated.

There are two ways of doing that:

1. Change Constraints in Source

Before exporting the data, in the source database, change all validated check constraints to NOVALIDATE:

ALTER TABLE "SH"."CUSTOMERS2" 
MODIFY CONSTRAINT "C_CUST2_EMAIL" 
ENABLE NOVALIDATE;

But be aware that you are changing the source database. If you need to use it for other purposes or as fallback, this might not be a good idea.

2. Change Constraints During Import

First, import the data, but exclude the constraints. This will be much faster because no time is spent on validating constraints:

$ impdp ... exclude=constraint

Next, extract the definition of all constraints from the Data Pump dump file and into a SQL file:

$ impdp ... include=constraint sqlfile=ddl.sql

Now, add the NOVALIDATE clause to all the statements:

ALTER TABLE "SH"."CUSTOMERS2" 
ADD CONSTRAINT "C_CUST2_EMAIL" 
CHECK (cust_email like '%@%') 
ENABLE NOVALIDATE;

But There Is a Catch!

Regardless of which method you choose, there is a catch. Validated constraints are very useful to the database because it enables the optimizer to perform query rewrite and potentially improve query performance. Also, index access method might become available instead of full table scans.

You want to get those constraints validated. But you don’t have to do it during the import. Validating an enabled, not validated constraint does not require a lock on the table, so you can do it while the database is open for business.

What About Rely

If you follow option 2 and create the constraints after import as enabled and not validated, you could also add the rely clause:

ALTER TABLE "SH"."CUSTOMERS2" 
ADD CONSTRAINT "C_CUST2_EMAIL" 
CHECK (cust_email like '%@%') 
RELY ENABLE NOVALIDATE;

Rely tells the database that you know the data is good. The optimizer still doesn’t trust you until you set the parameter QUERY_REWRITE_INTEGRITY to TRUSTED. Now, the optimizer can now benefit from some query rewrite options, but not all of them.

Nothing beats a truly validated constraint!

Conclusion

Importing validated check constraints takes time. If you want to save time in your maintenance window, instruct Data Pump to exclude the constraints, and add them yourself as enabled and not validated. This will save a lot of time that Data Pump otherwise would have to spend on validating the constraints.

The database benefits from validated constraints, so you should validate them. The database can validate constraints without a lock on the table if the constraints are already enabled. Thus, you can postpone the validation to a later time in your maintenance window, and you can perform other activities at the same time. Perhaps you can validate constraints while users are testing the database. Or wait until the next maintenance window. Just be aware that you might see some SQLs degrade until all constraints are validated.

XTTS: Testing the Procedure Multiple Times Using Read-Only Tablespaces

You can perform multiple tests of the procedure for cross-platform transportable tablespaces (XTTS) and incremental backups. You don’t have to restore and recover the data files every time. It makes your testing effort a lot more efficient.

Default Behaviour

When you perform a Data Pump import of transportable tablespaces, the data files plugs into the target database, and the tablespaces turns into read write mode immediately. It happens in the phase DATABASE_EXPORT/PLUGTS_FULL/PLUGTS_BLK.

Dump Pump must turn the tablespaces into read write mode for several reasons:

  • Suppose you have tables with columns to type timestamp with timezone information (TSTZ) and there is a different in the database time zone file version. In that case, all TSTZ columns must be updated to reflect the new timezone conventions.
  • The data file bitmap of free space must be updated in case you have segments in your tablespaces that are not imported.

When the tablespaces switches to read write mode the data file headers are updated. Now, the data files changes in such a way that they belong to that database (I imagine it has something to do with DBID, SCN, and so forth).

This also means that you can only import the tablespaces one time. If you want to repeat the process, you need to restore and recover the data files. Not even Flashback Database can save you.

TRANSPORTABLE=KEEP_READ_ONLY

A new option was added to Data Pump in Oracle Database 19c to solve this: TRANSPORTABLE=KEEP_READ_ONLY. Here is what the documentation says:

If specified, then data files remain in read-only mode. As a result of this setting, the tables containing TSTZ column data cannot be updated, and are dropped from the import. In addition, data file bitmaps cannot be rebuilt.

Keeping the tablespaces read only and leaving the data files untouched sounds good. But there are some restrictions:

  • If you have TSTZ columns, they can’t be checked and updated. This means that the entire table with a TSTZ column is skipped during import. You will see this error in the Data Pump log file: ORA-39339: Table "SCHEMA"."TABLE" was skipped due to transportable import and TSTZ issues. To solve it, you need to manually import the table afterward using a Data Pump table mode export and import.
  • If you have any segments in your data files that no longer belong to an object (e.g. if that specific object was not imported), the free space bitmap can’t be updated, and you have free space that can’t be used. You can solve this later on when the tablespaces are in read write mode using dbms_space_admin.tablespace_rebuild_bitmaps.

If you can live with these restrictions, you can use this feature.

Testing

You can use TRANSPORTABLE=KEEP_READ_ONLY when testing in the following way:

  1. Perform the backups on the source database using $ORACLE_HOME/perl/bin/perl xttdriver.pl --backup.
  2. Transfer res.txt and restore/recover the data files using $ORACLE_HOME/perl/bin/perl xttdriver.pl --restore
  3. Now you want to test the migration procedure. Set the tablespaces in the source database in read only mode, do an incremental backup and perform the Data Pump export. Optionally, use a temporarily activated standby database to avoid interruptions on the primary production database.
  4. Set a guaranteed restore point in the target database.
  5. After recovering the data files on the target system, you perform the Data Pump import. You must set the Data Pump parameter TRANSPORTABLE=KEEP_READ_ONLY to leave the data files untouched.
  6. When you are done with your tests, you either flash back to the restore point or delete the data in your database. The latter would be a series of DROP commands (schema, roles etc.) followed by DROP TABLESPACE commands. To preserve the data files, be sure to use DROP TABLESPACE ... INCLUDING CONTENTS KEEP DATAFILES. The KEEP clause is especially vital on ASM and with OMF.

    When you specify INCLUDING CONTENTS, the KEEP DATAFILES clause lets you instruct the database to leave untouched the associated operating system files, including Oracle Managed Files. You must specify this clause if you are using Oracle Managed Files and you do not want the associated operating system files removed by the INCLUDING CONTENTS clause.

  7. You can now continue to recover the data files on your target system with new incremental backups from the source database. You can roll forward the data files even though we had them plugged into a database. This is really cool. This allows you to repeat the test with new fresh production data without the tedious task of completely restoring and recovering the data files.
  8. Repeat the test cycle as many times as you want. If the data files are left untouched using TRANSPORTABLE=KEEP_READ_ONLY and you don’t turn them into read write mode manually, you can repeat the process.

Production Migration

Can you use TRANSPORTABLE=KEEP_READ_ONLY for the production migration? Yes, you can. But eventually, you will need to turn the tablespaces into read write mode.

I would not usually recommend it. But recently, I was involved in a specific case where it was helpful.

During testing, a customer ran into an issue where Data Pump hung during import. They killed the import and tried to repeat the Data Pump import. Unfortunately, Data Pump imports of transportable tablespaces are not resumable until Oracle Database 21c. So they had to start Data Pump all over. But the data files had already been touched by the target database. Now Data Pump refused to progress with the import because the data files were not as expected. The customer was advised on how to avoid the Data Pump hang. But they were still a little concerned about their upcoming production migration. They would like to keep the tablespaces in read only mode. Just in case something similar would happen. In that case, they could easily start all over because the data files were untouched.

Conclusion

During a migration project, you should use the Data Pump feature TRANSPORTABLE=KEEP_READ_ONLY to ease your testing efforts. In some rare cases, it might also be helpful for production migrations.

Other Blog Posts in This Series

Oracle Database 21c Is Here

Last week Oracle released Oracle Database 21c for additional platforms: Linux and Exadata. Other platforms will follow. You should keep an eye out for Release Schedule of Current Database Releases (Doc ID 742060.1) for further information.

Things to Notice

In my part of the Oracle Database, there are things to notice. I want to highlight:

To get all the details, visit the Upgrade and Utilities part of the new features documentation. There are some good examples of how the features can be used.

Behaviour Changes

Read-Only Oracle Home (ROOH) is now the default. Be sure to set the following environment variables to control the location of these directories:

  • ORACLE_BASE_HOME
  • ORACLE_BASE_CONFIG

I like ROOH, but it takes some time to get used to. For instance, network/admin files (tnsnames, sqlnet) and dbs files (pfile, spfile) are now in a new location.

The Pre-Upgrade Information Tool or preupgrade.jar is removed and replaced by AutoUpgrade. A few new parameters have been introduced to make the transition easier.

Innovation Release

Remember, 21c is an innovation release, which means a shorter support window than Long Term Releases such as Oracle Database 19c. If you adopt Innovation Releases, you should be prepared to upgrade to the next database release within one year after the next database release ships.

I would not recommend that you upgrade your production systems to Oracle Database 21c due to the limited support period. Not unless you are prepared to upgrade the database soon again – when support runs out. Oracle Database 19c is the current Long Term Support release. I recommend that for production databases.

Different release types for Oracle Database - innovation vs long term support

To learn more about innovation release and our release model, have a look at our slide deck. We discuss it in the first chapter.

New Features

I want to mention a few new features. They haven’t attracted as much attention as the marque features, but they are still cool.

Expression based init.ora parameters make it possible to base database parameters (init.ora) on calculations made on the system’s configuration. For example, setting the database parameter CPU_COUNT to half the number of CPUs (Windows):

alter system set cpu_count='$NUMBER_OF_PROCESSORS/2';

For more details, check out my video on YouTube.

Placeholders in SQL DDL Statements can improve application security because sensitive information, like passwords, doesn’t need to be hardcoded in SQL DDL. Example: You can make this statement:

CREATE USER :!username IDENTIFIED BY :!password ...

And Oracle Call Interface programs can substitute the placeholders into:

CREATE USER "DANIEL" IDENTIFIED BY "MyS3cr3tP!d" ...

This is similar to data binding but occurs in Oracle Client.

Conclusion

The complete 21c documentation is online, so I suggest that you head on over there and have a look. In the upgrade guide, you can find the list of behavior changes and also deprecated and desupported functionality. And finally, but most interesting perhaps, is Learning Database New Features.

Try it out in Always Free ADB or explore the Oracle LiveLabs.

Zero Downtime Migration – Logical Online Migration of Very Large Databases

Things always work on PowerPoint slides. And they almost always work in our lab environments as well. And recorded demos are also pretty bullet-proof.

But what happens when you have a huge database to migrate? A Very Large Database (VLDB). First, things get more exciting. Second, there are more knobs to turn and the smallest thing might become a big thing. Here are some things to consider when you migrate VLDBs using the Logical Online method.

Existing Data Guard No need to worry about your existing on-prem Oracle Data Guard environment. The Logical Migration uses Oracle Data Pump and Oracle GoldenGate. Both tools can be used on a database in a Data Guard and does not interfere with Data Guard operations.

But no switchovers or failovers are allowed during replication. GoldenGate can be configured to fully support a Data Guard environment and seamlessly reconnect in case of a Data Guard switchover. But it requires advanced configuration of GoldenGate. Zero Downtime Migration (ZDM) does the GoldenGate configuration but does so in a basic way that does not support Data Guard switchovers.

Data Pump Data Pump is used for the initial load of data.

The initial load happens before downtime, so generally you shouldn’t worry too much about the time it takes to perform the Data Pump export and import. However, the longer it takes, the bigger a replication will GoldenGate need to catch up. If the gap is too big, it might become a problem.

Apply a proper degree of parallelism. Rule-of-thumb:

On-prem: 2 x number of physical cores OCI: Number of OCPUs You can change the Data Pump parallel setting using the ZDM response file parameters DATAPUMPSETTINGS_DATAPUMPPARAMETERS_EXPORTPARALLELISMDEGREE and DATAPUMPSETTINGS_DATAPUMPPARAMETERS_IMPORTPARALLELISMDEGREE

When you are importing in OCI, consider scaling up on CPUs. More CPUs, more Data Pump parallel, faster import. Bare Metal and Exadata DB Systems scales online, whereas Virtual Machines needs around 10 minutes of downtime.

ZDM applies Data Pump compression automatically. Remember you don’t need a license for Advanced Compression Option to use compression when you are migrating with ZDM. Our experience is that MEDIUM most of the time is the best alternative. It provides a good compression ratio at a low CPU footprint. Although HIGH can compress better, it often comes at a much higher CPU footprint. ZDM uses MEDIUM compression algorithm. It’s not possible to change the Data Pump compression algorithm. The interested reader may find a benchmark below, that compares the different compression algorithms:

ZDM automatically transforms BasicFile LOBs into SecureFile LOBs – which is very good. SecureFile LOBs are superior in many ways, also when it comes to importing. Based on a customer case, we did a benchmark that measures the time it takes to import LOB data. The customer saw a 3x improvement during by transforming to SecureFile LOBs – and got the benefits of SecureFile LOBs afterwards. ZDM transforms to SecureFile LOBs automatically. Don’t turn it off. Here are the log entries from the two imports. Notice the elapsed time:

BasicFile LOBs

10-OCT-20 21:43:21.848: W-3 . . imported "SCHEMA"."TABLE" 31.83 GB 681025 rows in 804 seconds using direct_path

SecureFile LOBs

15-OCT-20 18:16:48.663: W-13 . . imported "SCHEMA"."TABLES" 31.83 GB 681025 rows in 261 seconds using external_table Don’t use import over network link. Use dump files. Performance-wise there are some restrictions when you import over network link. These restrictions really hurt on a VLDB.

The Data Pump works will automatically be started on all nodes, if the target database is a RAC database. That should allow for the fastest possible import. If needed, you can force all workers on to the same node using DATAPUMPSETTINGS_DATAPUMPPARAMETERS_NOCLUSTER

But when you export to dump files, remember to ensure you have adequate disk space on the source database host to hold the dump file. Use mount points that can deliver good write performance and ideally, they should be separated from the storage that holds the database data files.

A Data Pump export does not affect your existing backup operations. But the export and the backup will fight over the same resources, so I recommend suspending major backup operations (level 0+1) until after the export.

Speaking of fighting over resources. You should run the export at off peak hours. That will allow the export to use many more resources. The export is not a fully consistent export. Each table is consistent, but ZDM does not use FLASHBACK_TIME or FLASHBACK_SCN to make the entire export consistent. This helps avoid ORA-01555 snapshot too old during the export. The SCN of each table is recorded in the dump file and GoldenGate later on uses that information to start replication on each table individually from the appropriate SCN.

GoldenGate Replication How much data will GoldenGate need to transfer from source database to GoldenGate hub and from GoldenGate hub to target database? GoldenGate stores the replication data in trail files. The trail files are smaller than redo logs. The size of trail files is typically 30-40 % of the size of the redo logs. Imagine a database generating 15 TB of redo a day. The size of the trail files will be 4,5-6 TB. If you further apply compression on the trail files, you can typically reduce the size to around 1/8. Using our example, the trail files are now 550-750 GB. Which is significantly less than the initial 15 TB.

In the ZDM response file there is a parameter called GOLDENGATESETTINGS_EXTRACT_PERFORMANCEPROFILE. The default value is MEDIUM, but you have the option of changing it do HIGH. This should increase the performance of the extract process.

ZDM configure GoldenGate to use Parallel Replicat. This gives you a few options. First, you can control the number of mappers by using the parameter GOLDENGATESETTINGS_REPLICAT_MAPPARALLELISM (default is 4). Also, apply parallelism is auto-tuned. You can set a minimum and maximum value using GOLDENGATESETTINGS_REPLICAT_MINAPPLYPARALLELISM and GOLDENGATESETTINGS_REPLICAT_MAXAPPLYPARALLELISM=50. Defaults are 4 and 50 so I don’t think it will be necessary to change.

By default, ZDM does not configure GoldenGate with DDL replication. Replicating without DDL gives the best performance. If you must have DDL replication in your migration project, you can look at the ZDM response file parameter GOLDENGATESETTINGS_REPLICATEDDL. But be prepared for a performance degradation.

In our webinar, Migrating Very Large Databases my team and I discuss what to consider when migrating databases with GoldenGate.

Last, you can find more advice in Administering Oracle GoldenGate 19.1, Tuning the Performance of Oracle GoldenGate.

GoldenGate Health Check Oracle GoldenGate comes with health check scripts that you can use on the source and target database. This will help you monitor and troubleshoot the process. Here is information on how to install and use the health check scripts. You can also find information in Administering Oracle GoldenGate 19.1, Using Healthcheck Scripts to Monitor and Troubleshoot.

Backup of Target Database When you use logical online migration in ZDM, you create the target database in advance. In contrast to physical migration, the target database is not overwritten. This means that you can configure automatic backup of your target OCI database before you complete the migration and switch over to OCI. You will even have time to and possibility of testing the backup.

I recommend that you configure automatic backup after the Data Pump initial load. This way there is less archive logs to back up as well.

Data Guard on Target Database Similar to automatic backup, you can also create your Data Guard association before the switchover. Likewise, wait until after the initial load has completed.

Conclusion Even huge Oracle Databases can be migrated to OCI using Zero Downtime Migration. You might need to make a few adjustments from the standard flow, but it is absolutely doable. Your database can be protected by backup and Data Guard from the very second you switch over to OCI. Besides the actual migration, you should also do your best to ensure performance stability once the database is open for business. For that purpose, you should have a look at our webinar Performance Stability, Tips and Tricks and Underscores.

Other Blog Posts in This Series Introduction Install And Configure ZDM Physical Online Migration Physical Online Migration to DBCS Physical Online Migration to ExaCS Physical Online Migration and Testing Physical Online Migration of Very Large Databases Logical Online Migration Logical Online Migration to DBCS Logical Offline Migration to Autonomous Database Logical Online Migration and Testing Logical Online Migration of Very Large Databases Logical Online and Sequences Logical Offline Migration How To Minimize Downtime Logical Migration and Statistics Logical Migration and the Final Touches Create GoldenGate Hub Monitor GoldenGate Replication The Pro Tips

Zero Downtime Migration – Logical Online Migration to DBCS

Let me show you how you can migrate an Oracle Database into OCI. My source database is a PDB running on 12.1.0.2. I want to migrate it directly into a 19c CDB. I will show you:

  • import via dump file and object storage
  • import via database link

This procedure can be used to migrate to:

  • VM DB Systems
  • Bare Metal DB Systems
  • Exadata DB System (ExaCS)

Prerequisites

  1. I need a ZDM service host. I already blogged about how to create such. The computer is called zdmhost and the user is called zdmuser.
  2. My source database is a PDB called srcpdb, the CDB is called srccdb, and the host is named srchost. All my data is in the schema SH.
  3. I have already created a GoldenGate hub as described in this blog post. It is called ogg19cora.
  4. I want to migrate to a VM DB System, and it is already created. I name it tgthost, the CDB is named tgtcdb and the PDB is named tgtpdb. You can name it whatever you want. Be sure to use ASM as your storage management software.

Overview of the components in this demo

Prepare Source

My source database is a PDB. I need to prepare the PDB and CDB$ROOT. If you have a non-CDB perform all the steps in your non-CDB unless otherwise written.

First, I prepare the database for GoldenGate replication:

alter session set container=CDB$ROOT;
alter database force logging;
alter database add supplemental log data;
alter system set enable_goldengate_replication=true scope=both;
--At least 2GB
alter system set streams_pool_size=2g scope=both;
alter system set global_names=false;

Create a common user for GoldenGate in CDB$ROOT. Skip this step for non-CDBs:

alter session set container=CDB$ROOT;
create user c##ggadmin identified by <my_secret_pwd> default tablespace users temporary tablespace temp;
grant connect, resource to c##ggadmin;
grant unlimited tablespace to c##ggadmin;
grant select any dictionary to c##ggadmin;
grant create view to c##ggadmin;
grant execute on dbms_lock to c##ggadmin;
exec dbms_goldengate_auth.grant_admin_privilege('c##ggadmin',container=>' all');

Create a user for GoldenGate:

alter session set container=SRCPDB;
create user ggadmin identified by <my_secret_pwd> default tablespace users temporary tablespace temp;
grant connect, resource to ggadmin;
grant unlimited tablespace to ggadmin;
grant select any dictionary to ggadmin;
grant create view to ggadmin;
grant execute on dbms_lock to ggadmin;
exec dbms_goldengate_auth.grant_admin_privilege('ggadmin');

Generate a list of tablespaces that must exist in the target database. Add all schemas to the in-list:

alter session set container=SRCPDB;
select distinct tablespace_name
from (
      select distinct tablespace_name from dba_segments where owner in ('SH')
      union
      select distinct default_tablespace from dba_users where username in ('SH')
      union
      select distinct tablespace_name from dba_ts_quotas where dropped = 'NO' and username in ('SH')
      );

Optional. The best starting point for Data Pump is good and accurate dictionary statistics. This helps Data Pump extract the information as fast as possible:

exec dbms_stats.gather_dictionary_stats;

Via Dump File

I need to create a directory that can be used by Data Pump:

[oracle@srchost]$ mkdir -p /u01/app/oracle/datapump/mydirsrc

Prepare Target

The target database must use a timezone file version that is equal to or higher than the source:

alter session set container=tgtpdb;
select * from v$timezone_file;

Prepare the database for GoldenGate replication:

alter session set container=CDB$ROOT;
alter system set enable_goldengate_replication=true scope=both;

Create a user for GoldenGate:

alter session set container=tgtpdb;
create user ggadmin identified by <my_secret_pwd> default tablespace users temporary tablespace temp;
grant connect, resource to ggadmin;
grant unlimited tablespace to ggadmin;
grant select any dictionary to ggadmin;
grant create view to ggadmin;
grant execute on dbms_lock to ggadmin;
exec dbms_goldengate_auth.grant_admin_privilege('ggadmin');

Allow the GoldenGate user to perform DML (see appendix A):

alter session set container=tgtpdb;
grant insert any table to ggadmin;
grant update any table to ggadmin;
grant delete any table to ggadmin;

I ensure that the list of tablespaces I found in the source database, exist in the target database. If some are missing, I create them. Also, I will extend the data files in advance, so the size is pretty close to the expected end size. For new tablespaces I set the initial size of the data files. If the data files are too small, I will waste a lot of time during import to auto-extend. If I make them too big, I potentially waste space:

alter database datafile '...' resize 100G;
create tablespace ... size 100G ...;

Via Dump File

I need to create a directory that can be used by Data Pump. It will hold the Data Pump dump files and logs:

[oracle@tgthost]$ mkdir -p /u01/app/oracle/datapump/mydirtgt

Via DB Link

When I use a DB link, ZDM insists on using the built-in directory called DATA_PUMP_DIR. I ensure that the database directory exist. Also, I verify that the directory exist in the file system:

select directory_path from dba_directories where directory_name='DATA_PUMP_DIR';

I will do the import directly over a database link. In that case, the target host must be able to resolve the host name of my source database host:

[root@tgthost]$ echo "<source IP address> srchost" >> /etc/hosts

Prepare ZDM

The ZDM service host must be able to resolve the host names of the source and target database host and the GoldenGate hub:

[root@zdmhost]$ echo "<source IP address> srchost" >> /etc/hosts
[root@zdmhost]$ echo "<target IP address> tgthost" >> /etc/hosts
[root@zdmhost]$ echo "<GoldenGate IP address> ogg19cora" >> /etc/hosts

I put my private SSH keys to the source and target database host into ~/.ssh directory. Ensure permissions are set properly:

[zdmuser@zdmhost]$ cp srchost_key_file ~/.ssh
[zdmuser@zdmhost]$ chmod 400 ~/.ssh/srchost_key_file 
[zdmuser@zdmhost]$ cp tgthost_key_file ~/.ssh
[zdmuser@zdmhost]$ chmod 400 ~/.ssh/tgthost_key_file 

Test the connection. I connect as opc, but you might have a different user. Read more about access to the database host in the documentation:

[zdmuser@zdmhost]$ ssh -i ~/.ssh/srchost_key_file opc@srchost date
[zdmuser@zdmhost]$ ssh -i ~/.ssh/tgthost_key_file opc@tgthost date

If you have configured a proper certificate on your GoldenGate hub, you can jump to the next chapter. If not, read on.

When deployed the GoldenGate hub comes with a self-signed certificate. ZDM will complain about it, because it is considered insecure. For migrations that involve real data, you should use a proper certificate on your GoldenGate hub to ensure your migration is secure. But for tests and demos you might want to disregard the security warning. If so, you can add the self-signed certificate as a trusted one to the Java JDK’s certificate store ($ZDM_HOME/jdk/jre/lib/security/cacerts). You find instructions on how to do so in the MOS support note Zero Downtime Migration – GoldenGate Hub Certificate Known Issues (Doc ID 2768483.1). Use the same host that you added to /etc/hosts.

For your reference, this is the error that ZDM will throw if the certificate is not trusted:

Verifying status of Oracle GoldenGate Microservices at URL "https://..."
PRGZ-1136 : failed to verify configuration and status of Oracle GoldenGate Microservices at URL "https://..."
PRGG-1008 : failed to retrieve detailed information for the health of the Service Manager on Oracle GoldenGate hub "https://..."
PRGG-1001 : HTTP GET request "https://.../services/v2/config/health" failed.
javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
unable to find valid certification path to requested target

Prepare Response File

I will use a template response file as the basis for my response file:

[zdmuser@zdmhost]$ cp $ZDM_HOME/rhp/zdm/template/zdm_logical_template.rsp ~/logical_online.rsp
[zdmuser@zdmhost]$ cmod 700 ~/logical_online.rsp

Via Dump Files

This is the response file that I end up with (see appendix E for an explanation):

MIGRATION_METHOD=ONLINE_LOGICAL
DATA_TRANSFER_MEDIUM=OSS

SOURCEDATABASE_ADMINUSERNAME=SYSTEM
SOURCEDATABASE_GGADMINUSERNAME=GGADMIN
SOURCEDATABASE_CONNECTIONDETAILS_HOST=srchost
SOURCEDATABASE_CONNECTIONDETAILS_PORT=1521
SOURCEDATABASE_CONNECTIONDETAILS_SERVICENAME=srcpdb....oraclevcn.com
SOURCECONTAINERDATABASE_ADMINUSERNAME=SYSTEM
SOURCECONTAINERDATABASE_GGADMINUSERNAME=C##GGADMIN
SOURCECONTAINERDATABASE_CONNECTIONDETAILS_HOST=srchost
SOURCECONTAINERDATABASE_CONNECTIONDETAILS_PORT=1521
SOURCECONTAINERDATABASE_CONNECTIONDETAILS_SERVICENAME=SRCCDB_fra3dd....oraclevcn.com

TARGETDATABASE_OCID=ocid1.database.oc1.eu-frankfurt-1....
TARGETDATABASE_ADMINUSERNAME=SYSTEM
TARGETDATABASE_GGADMINUSERNAME=GGADMIN
TARGETDATABASE_CONNECTIONDETAILS_HOST=tgthost
TARGETDATABASE_CONNECTIONDETAILS_PORT=1521
TARGETDATABASE_CONNECTIONDETAILS_SERVICENAME=tgtpdb....oraclevcn.com

OCIAUTHENTICATIONDETAILS_USERPRINCIPAL_TENANTID=ocid1.tenancy.oc1....
OCIAUTHENTICATIONDETAILS_USERPRINCIPAL_USERID=ocid1.user.oc1....
OCIAUTHENTICATIONDETAILS_USERPRINCIPAL_FINGERPRINT=58:b9:...
OCIAUTHENTICATIONDETAILS_USERPRINCIPAL_PRIVATEKEYFILE=/home/zdmuser/.oci/oci_api_key.pem
OCIAUTHENTICATIONDETAILS_REGIONID=eu-frankfurt-1

GOLDENGATEHUB_ADMINUSERNAME=oggadmin
GOLDENGATEHUB_URL=https://ogg19cora....oraclevcn.com
GOLDENGATEHUB_SOURCEDEPLOYMENTNAME=Source
GOLDENGATEHUB_TARGETDEPLOYMENTNAME=Target
GOLDENGATEHUB_COMPUTEID=ocid1.instance.oc1.eu-frankfurt-1....

DATAPUMPSETTINGS_JOBMODE=SCHEMA
DATAPUMPSETTINGS_DATAPUMPPARAMETERS_IMPORTPARALLELISMDEGREE=2
DATAPUMPSETTINGS_DATAPUMPPARAMETERS_EXPORTPARALLELISMDEGREE=2
DATAPUMPSETTINGS_IMPORTDIRECTORYOBJECT_NAME=MYDIRTGT
DATAPUMPSETTINGS_IMPORTDIRECTORYOBJECT_PATH=/u01/app/oracle/datapump/mydirtgt
DATAPUMPSETTINGS_EXPORTDIRECTORYOBJECT_NAME=MYDIRSRC
DATAPUMPSETTINGS_EXPORTDIRECTORYOBJECT_PATH=/u01/app/oracle/datapump/mydirsrc
DATAPUMPSETTINGS_DATABUCKET_BUCKETNAME=zdm-staging
DATAPUMPSETTINGS_DATABUCKET_NAMESPACENAME=oradbclouducm
INCLUDEOBJECTS-1=owner:SH

Via DB Link

This is the response file that I end up with (see appendix E for an explanation):

MIGRATION_METHOD=ONLINE_LOGICAL
DATA_TRANSFER_MEDIUM=DBLINK

SOURCEDATABASE_ADMINUSERNAME=SYSTEM
SOURCEDATABASE_GGADMINUSERNAME=GGADMIN
SOURCEDATABASE_CONNECTIONDETAILS_HOST=srchost
SOURCEDATABASE_CONNECTIONDETAILS_PORT=1521
SOURCEDATABASE_CONNECTIONDETAILS_SERVICENAME=srcpdb....oraclevcn.com
SOURCECONTAINERDATABASE_ADMINUSERNAME=SYSTEM
SOURCECONTAINERDATABASE_GGADMINUSERNAME=C##GGADMIN
SOURCECONTAINERDATABASE_CONNECTIONDETAILS_HOST=srchost
SOURCECONTAINERDATABASE_CONNECTIONDETAILS_PORT=1521
SOURCECONTAINERDATABASE_CONNECTIONDETAILS_SERVICENAME=SRCCDB_fra3dd....oraclevcn.com

TARGETDATABASE_OCID=ocid1.database.oc1.eu-frankfurt-1....
TARGETDATABASE_ADMINUSERNAME=SYSTEM
TARGETDATABASE_GGADMINUSERNAME=GGADMIN
TARGETDATABASE_CONNECTIONDETAILS_HOST=tgthost
TARGETDATABASE_CONNECTIONDETAILS_PORT=1521
TARGETDATABASE_CONNECTIONDETAILS_SERVICENAME=tgtpdb....oraclevcn.com

OCIAUTHENTICATIONDETAILS_USERPRINCIPAL_TENANTID=ocid1.tenancy.oc1....
OCIAUTHENTICATIONDETAILS_USERPRINCIPAL_USERID=ocid1.user.oc1....
OCIAUTHENTICATIONDETAILS_USERPRINCIPAL_FINGERPRINT=58:b9:...
OCIAUTHENTICATIONDETAILS_USERPRINCIPAL_PRIVATEKEYFILE=/home/zdmuser/.oci/oci_api_key.pem
OCIAUTHENTICATIONDETAILS_REGIONID=eu-frankfurt-1

GOLDENGATEHUB_ADMINUSERNAME=oggadmin
GOLDENGATEHUB_URL=https://ogg19cora....oraclevcn.com
GOLDENGATEHUB_SOURCEDEPLOYMENTNAME=Source
GOLDENGATEHUB_TARGETDEPLOYMENTNAME=Target
GOLDENGATEHUB_COMPUTEID=ocid1.instance.oc1.eu-frankfurt-1....

DATAPUMPSETTINGS_JOBMODE=SCHEMA
DATAPUMPSETTINGS_DATAPUMPPARAMETERS_IMPORTPARALLELISMDEGREE=2
INCLUDEOBJECTS-1=owner:SH

Perform Evaluation

I am now ready to perform a migration evaluation. It is a dry run of the migration and performs various sanity checks. Nothing is changed during the evaluation:

[zdmuser@zdmhost]$ $ZDM_HOME/bin/zdmcli migrate database \
   -rsp /home/zdmuser/logical_online.rsp \
   -sourcenode srchost \
   -sourcedb SRCCDB_fra3dd \
   -srcauth zdmauth \
   -srcarg1 user:opc \
   -srcarg2 identity_file:/home/zdmuser/.ssh/srchost_key_file \
   -srcarg3 sudo_location:/usr/bin/sudo \
   -targetnode tgthost \
   -tgtauth zdmauth \
   -tgtarg1 user:opc \
   -tgtarg2 identity_file:/home/zdmuser/.ssh/tgthost_key_file \
   -tgtarg3 sudo_location:/usr/bin/sudo \
   -eval

A few comments:

  • sourcenode and targetnode are the host names of the source and target database host. Those names must be resolvable which I why I put them into /etc/hosts already.
  • sourcedb is the DB_UNIQUE_NAME of the source database.
  • srcarg1 is the name of the user that I connect as to the source database host. You might need to change that.
  • srcarg2 and tgtarg2 is the location of the private key file that I use to connect via SSH.

Next, I am prompted for various passwords and I also get a job ID. When submitting a migration job you are prompted for various passwords

The migration evaluation is then started, and I can use the job ID to monitor it:

[zdmuser@zdmhost]$ $ZDM_HOME/bin/zdmcli query job -jobid <job ID>

Output from zdmcli query job command

My colleague Sinan Petrus Toma showed how to loop:

[zdmuser@zdmhost]$ while :; do $ZDM_HOME/bin/zdmcli query job -jobid <job ID>; sleep 10; done

However, I prefer to get more details, so I tail the ZDM log file instead. This little one-liner finds the newest file and tails it:

[zdmuser@zdmhost]$ tail -n 50 -f "`ls -td /u01/app/oracle/chkbase/scheduled/*log | head -1`"

And I get this level of detail:

zdmhost: 2021-05-28T06:25:25.518Z : Executing phase ZDM_PRE_MIGRATION_ADVISOR
zdmhost: 2021-05-28T06:25:31.883Z : Source PDB name : SRCPDB
zdmhost: 2021-05-28T06:25:32.410Z : Running CPAT (Cloud Premigration Advisor Tool) on the source node srchost ...
zdmhost: 2021-05-28T06:25:38.533Z : Premigration advisor output:
Cloud Premigration Advisor Tool Version 21.0.0
Cloud Premigration Advisor Tool completed with overall result: WARNING
Cloud Premigration Advisor Tool generated report location: /u01/app/oracle/zdm/zdm_SRCCDB_fra3dd_6/out/premigration_advisor_report.json

Start Migration

When the evaluation passes, I can start the real migration. I am re-using the same command line, but I have removed the -eval option. Instead – and this is important – I am using -pauseafter to tell ZDM to pause the migration, just before the switchover takes place. Downtime has not started yet. The database is still open for business, but ZDM will copy the data and start the GoldenGate replication:

[zdmuser@zdmhost]$ $ZDM_HOME/bin/zdmcli migrate database \
   -rsp /home/zdmuser/logical_online.rsp \
   -sourcenode srchost \
   -sourcedb SRCCDB_fra3dd \
   -srcauth zdmauth \
   -srcarg1 user:opc \
   -srcarg2 identity_file:/home/zdmuser/.ssh/srchost_key_file \
   -srcarg3 sudo_location:/usr/bin/sudo \
   -targetnode tgthost \
   -tgtauth zdmauth \
   -tgtarg1 user:opc \
   -tgtarg2 identity_file:/home/zdmuser/.ssh/tgthost_key_file \
   -tgtarg3 sudo_location:/usr/bin/sudo \
   -pauseafter ZDM_MONITOR_GG_LAG

Again, I have to input the passwords as in eval mode. Use the job ID to monitor the progress or tail the log file. Note down the job ID. I need it later on to resume the migration.

When ZDM completes the phase ZDM_MONITOR_GG_LAG it will pause and wait. I can verify it with zdmcli query job: ZDM is currently paused

My data has been imported to the target database in OCI, GoldenGate has been configured, and my changes are being replicated. I can monitor the replication by logging on to the GoldenGate hub.

Important: Now, I strongly recommend that you take care of the statistics in your target database:

  1. Gather dictionary statistics (DBMS_STATS.GATHER_DICTIONARY_STATS). After the import the data dictionary is now full of a lot of new data – the metadata about your tables and other objects. To ensure a smooth running database, gather dictionary statistics.
  2. Decide on what to do about statistics on user objects as described in this blog post.

Complete Migration

Now it is time to finalize the migration and switch over to the OCI target database. All I need to do, is to resume the paused ZDM job. I use the job ID that was created when I started the migration:

[zdmuser@zdmhost]$ $ZDM_HOME/bin/zdmcli resume job -jobid <job ID>

ZDM will now ensure that all changes are replicated before switching over to the OCI target database. Again, I use the zdmcli query job command to monitor the progress and I can tail the log file. After a short while the migration completes. ZDM migration completed

That’s it. I have now migrated into OCI! And my new database is running Oracle Database 19c in addition.

I have a few post-migration tasks to carry out:

  • I prefer to shut down the source database to ensure that use it anymore.
  • ZDM will remove the GoldenGate configuration, but I need to disable the replication in the database: alter system set enable_goldengate_replication=false scope=both;.

Finally, I would recommend that you take a look at Logical Migration and the Final Touches. A few pointers that can make your migration even more successful.

Other Blog Posts in This Series

Appendix

A – GoldenGate user privileges

When GoldenGate is replicating changes into the target database, it does so as a regular database user. Thus, it must have privileges to perform DML and DDL on the appropriate schemas. There is a number of ways to do that.

I granted INSERT ANY, UPDATE ANY, and DELETE ANY which will allow me to perform DML in any schema. But that does not cover DDL. If I want to cover that as well there is a number of similar ANY privileges that I would need to grant, like CREATE ANY TABLE, CREATE ANY INDEX and so forth.

But it is discouraged to perform DDL statements during the replication. By default, ZDM does not configure DDL replication. If you want to replicate DDL you need to use the response file parameter GOLDENGATESETTINGS_REPLICATEDDL=true.

Other customers prefer an easier solution and grant PDB_DBA role (or DBA in a non-CDB). Connor McDonald also blogged about a procedure to perform schema grant. If you need inspiration you can have a look at the GGADMIN user in an Autonomous Database. It has PDB_DBA role amongst others.

In the end it comes down to preferences and security regulations in your organization.

B – Update Cloud Premigration Advisor Tool

As part of the migration ZDM will use the Cloud Premigration Advisor Tool (CPAT) to check your source database. The version of CPAT that ships with ZDM is not the latest one. So, you might be missing out on bug fixes and new recommendations. If you want to get the latest and greatest recommendations, I suggest that you manually update the CPAT tool in your ZDM home. You can find instructions in the MOS note Cloud Premigration Advisor Tool (CPAT) Analyzes Databases for Suitability of Cloud Migration (Doc ID 2758371.1)

C – Ignore Certain Data Pump Errors

If there is an error in either your Data Pump export or import, then ZDM will error out. But by default, ZDM will ignore the following errors:

  • ORA-31684: Object type string already exists
  • ORA-39111: Dependent object type string skipped, base object type string already exists
  • ORA-39082 Object type string created with compilation warnings

If you know that your data exports or imports with a specific error and you want to ignore it then you can add your own errors. Please get in touch with My Oracle Support for details.

D – RAC Databases

If your source or target database is a RAC database, then you only need access to one of the nodes. Select the node that you want to use and use the host name and SSH keys to that host.

E – Response File

Allow me to put some comments on the values:

Setting Comment
DATAPUMPSETTINGS_DATABUCKET_BUCKETNAME Name of the bucket in OCI Object Storage that will be used as a staging area. I recommend using a separate and dedicated bucket for each migration
DATAPUMPSETTINGS_DATABUCKET_NAMESPACENAME This is your Object Storage Namespace. You find it in the OCI Console, Administration > Tenancy Details
DATAPUMPSETTINGS_DATAPUMPPARAMETERS_EXPORTPARALLELISMDEGREE For on-prem databases set to number of physical cores x 2. For OCI databases set to number of OCPUs. Standard Edition does not allow parallel, so set to 1
DATAPUMPSETTINGS_DATAPUMPPARAMETERS_IMPORTPARALLELISMDEGREE Set to the number of OCPUs. Standard Edition does not allow parallel, so set to 1
DATAPUMPSETTINGS_JOBMODE I recommend using SCHEMA. Each schema to import is specified in a separate parameter, INCLUDEOBJECTS-n
GOLDENGATEHUB_ADMINUSERNAME Remember the usernames in GoldenGate are case sensitive
GOLDENGATEHUB_SOURCEDEPLOYMENTNAME The name of the source deployment in GoldenGate hub. If you went with the default value, it should be Source
GOLDENGATEHUB_TARGETDEPLOYMENTNAME The name of the target deployment in GoldenGate hub. If you went with the default value, it should be Target
GOLDENGATEHUB_URL Construct the URL by using the Fully Qualified Domain Name (FQDN), e.g. https://ogg19cora.subx.oraclevcn.com
INCLUDEOBJECTS-n Specify each schema in a separate parameter, example: INCLUDEOBJECTS-1=owner:SH and INCLUDEOBJECTS-2=owner:OE
OCIAUTHENTICATIONDETAILS_USERPRINCIPAL_FINGERPRINT This is the fingerprint of the API keys that you used to configure OCI CLI on the ZDM service host. You can find the fingerprint in the OCI Console. Go to Identify > Users > User Details > API Keys
OCIAUTHENTICATIONDETAILS_USERPRINCIPAL_PRIVATEKEYFILE This is the private key file that you used when you configured OCI CLI on the ZDM service host
SOURCECONTAINERDATABASE_CONNECTIONDETAILS_SERVICENAME This is a service that connects directly into CDB$ROOT. I normally use lsnrctl status to find the name of the service
SOURCECONTAINERDATABASE_CONNECTIONDETAILS_SERVICENAME This is a service that connects directly into CDB$ROOT. I normally use lsnrctl status to find the name of the service
SOURCECONTAINERDATABASE_ The settings apply to my source CDB. If you have a non-CDB simply remove these settings
SOURCEDATABASE_CONNECTIONDETAILS_SERVICENAME This is a service that connects directly into the source PDB (or non-CDB). I normally use lsnrctl status to find the name of the service
SOURCEDATABASE_ These settings apply to my source PDB. Or, to your non-CDB
TARGETDATABASE_CONNECTIONDETAILS_SERVICENAME This is a service that connects directly into the target PDB. I normally use lsnrctl status to find the name of the service

F – Troubleshooting

If you forget to set global_names=false you might run into this error:

PRGD-1019 : creation of Data Pump job "..." with operation tyoe "IMPORT_ONLINE" failed
PRGD-1016 : stored procedure "DBMS_DATABASE.OPEN" execution as user ...
ORA-20000: Datapump: Unexpected error
ORA-06512: at line 1
ORA-39006: internal error
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
ORA-06512: at "SYS.DBMS_DATAPUMP", line 4932
ORA-06512: at "SYS.DBMS_DATAPUMP", line 6844
ORA-06512: at line 1

Zero Downtime Migration – Logical Online Migration

You can migrate a database with Zero Downtime Migration (ZDM) using the Logical Online method. The migration will happen with a combination of Data Pump and Oracle GoldenGate. This way even very large databases can be migrated with very little downtime.

In the introduction blog post I covered the basic characteristics of this approach and explained that Oracle GoldenGate can be used at no extra license cost and it is really simple to use.

Migrate your Oracle Database with Zero Downtime Migration ZDM using Data Pump and GoldenGate

With the Logical Online method, you can even target Autonomous Databases. Both Oracle Autonomous Transaction Processing (ATP) and Oracle Autonomous Data Warehouse (ADW) and also shared and dedicated. When you target an Autonomous Database, you don’t have access to the underlying database host, so instead of SSH access, ZDM will require SQL*Net connectivity.

Benefits

  • First and most important, you don’t need a license for Oracle GoldenGate to use this approach. Yes, that’s right! You are allowed to use a marketplace edition of Oracle GoldenGate 19c for 183 days to migrate your database with ZDM.

Oracle GoldenGate for Oracle – Database Migrations can be used for 183 days to perform migrations into Oracle databases located in Oracle Cloud Infrastructure using the following tools: Oracle Zero Downtime Migration and Oracle Cloud Infrastructure Database Migration

  • Even if you don’t have any GoldenGate experience, you can still use this approach. For example, you don’t need to install and configure GoldenGate yourself. The marketplace image creates the GoldenGate installation for you. Afterwards, ZDM configures GoldenGate and will eventually clean up as well.
  • You can migrate directly into a higher release. For example, if your source database is 12.1.0.2, you can migrate into 19c directly. This avoids the extra downtime normally needed to perform an upgrade.
  • You can migrate directly into a PDB. For example, if you source database is a non-CDB, you can migrate directly into a PDB. This avoids the extra downtime normally needed to convert the database into a PDB using the script noncdb_to_pdb.sql.
  • You can use the full power of Data Pump and GoldenGate to transform your data. By default, ZDM will convert any old BasicFile LOBs into SecureFile LOBs during import. But you can apply other transformations as well. You can partition the table in the target database, you can change the tablespace layout, you can apply compression and anything else that is possible with Data Pump.
  • You can migrate into a different character set (Data Pump and GoldenGate restrictions apply).
  • Your Data Pump export is automatically compressed if the source database is Enterprise Edition. Normally, Data Pump compression would require a license for the Advanced Compression Option. But when you migrate with ZDM to OCI, Oracle allows you to use Data Pump compression without having the Advanced Compression Option.
  • Your Data Pump export is automatically encrypted if the source database is Enterprise Edition. Normally, Data Pump encryption would require a license for the Advanced Security Option. But when you migrate with ZDM to OCI, Oracle allows you to use Data Pump encryption without having the Advanced Security Option.
  • You get recommendations on the migration before it is actually started. ZDM includes the Cloud Premigration Advisor Tool (CPAT) which analyzes your database and give you specific recommendations that matches the target database. For example, if you target an Autonomous Database, CPAT will warn if you have unsupported objects.
  • You can migrate a SE2 database into OCI using this approach as well.
  • You can build your target database in advance and configure backup upfront as well. You can test your backup/recovery strategy before you complete the migration.
  • You can build a standby database in advance as well. Your new OCI target database can be protected by Data Guard from the very second you perform the switch-over. I would recommend doing the initial Data Pump load first, and then build the standby database. This ensures that a minimum of changes has to be applied via redo.

Considerations

  • The time zone file in the OCI target database must be higher than that in the source database. Typically, this is not a problem if you provision a brand-new DB System. But if you plan to use an existing CDB in OCI do check the time zone file upfront:
SQL> select * from v$timezone_file;
  • In My Oracle Support you can find a list of recommended patches to apply on the source database when using Oracle GoldenGate. There is one for 11g and one for 12c and newer. You don’t have to apply those patches, but it is a recommendation. However, if you run into issues, you potentially have to patch your source database.
  • Most databases can be migrated with Oracle GoldenGate. Only very exotic data types are not supported. Be sure to check the documentation.
  • When you use Data Pump to migrate you will lose a lot of the information that is stored in the data dictionary, like
    • AWR
    • SQL Plan Baselines
    • SQL Profiles
    • SQL Patches
  • You can still transfer that information, but you need to use other tools. If this is relevant to you, I suggest that you watch our webinar Performance Stability, Tips and Tricks and Underscores where it is covered in detail.
  • Depending on whether you use a full or schema mode export you need to take care of your public objects afterwards. It could be synonyms or database links.
  • For a huge database you should be prepared for the time it takes to perform the Data Pump export and import. No need to worry too much because GoldenGate will keep track of the changes. But for a 100 TB database you should not start the job the day before the switchover is supposed to take place.
  • It is not recommended to execute DDL statements when GoldenGate is replicating changes. The extract and replicat processes that are created by GoldenGate does not replicate DDL statements. If you for example add a table after the initial Data Pump load, it will not get replicated by GoldenGate. However, GoldeGate is a very powerful tool and you can configure it to replicate DDL as well. But you will need to adjust the GoldenGate user privileges and change the parameter files. In the documentation you can read more about the possibilities of DDL replication and how to configure it.

Fallback

Using this approach there is no straight forward way of falling back to the source on-prem database. You could use GoldenGate to reverse the replication when you switch over to the OCI target database. However, it is not configured automatically by ZDM. So, this is very you would need to know GoldenGate and do the required configuration.

Conclusion

You get a lot more options when you use the Logical Online method to migrate your database. Especially, the possibility of migrating directly into a higher release or into a PDB is very attractive. Plus, you can use GoldenGate during the migration at no extra license cost.

However, the flexibility comes at a cost. There is a little more configuration and it is not as straight-forward as the Physical Online method. But it is not complicated.

Want to Know More

If you want to know more about using Data Pump and GoldenGate for database migrations, I suggest that you take a look at our webinar Migration Strategies – Insights, Tips and Secrets

In addition, these links contain additional useful information:

Other Blog Posts in This Series

DataPump Export from Data Guard

Exporting data from your production database can be a pain for several reasons:

  • You probably want a consistent export so you set the Data Pump parameter FLASHBACK_TIME. That requires a lot of undo space. If there is heavy activity on the database, then you risk running into ORA-01555.
  • You want export the data quickly using the parallel options in Data Pump. The quicker you export, the less susceptible you also are to ORA-01555. But parallel uses more resources.

If you have an Active Data Guard, you can export your data from the standby database. It should be straightforward, right? An export is after all just reading data…

But Data Pump is not just a simple export tool. It is quite advanced – and to support all that functionality it does require a master table to be created which can be used to coordinate the operation. Further, it uses Advanced Queueing (AQ) for communicating between the processes. And AQ also requires a writable database.

But how about DML Redirect that was introduced in Oracle Database 19c? Can it be used to handle those few DMLs? No, because creation of the master table is a DDL – not a DML. If you try, you will fail:

$  expdp system@localhost:1521/pdb1 schemas=SH directory=mydir

Export: Release 19.0.0.0.0 - Production on Mon Apr 12 18:19:22 2021
Version 19.10.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Password:

Connected to: Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production
ORA-31626: job does not exist
ORA-31633: unable to create master table "SYSTEM.SYS_EXPORT_SCHEMA_05"
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 95
ORA-06512: at "SYS.KUPV$FT", line 1163
ORA-16000: database or pluggable database open for read-only access
ORA-06512: at "SYS.KUPV$FT", line 1056
ORA-06512: at "SYS.KUPV$FT", line 1044

Solution

I turned on my bat signal and got help from my fellow Product Manager, Pieter Van Puymbroeck. I turned on the bat signal and got help Photo by Ali Kokab on Unsplash

Use a snapshot standby, he said. Of course – Snapshot Standby is part of Enterprise Edition, so if you have Data Guard, you also have Snapshot Standby. If you need a quick recap on what a snapshot standby database is, you should check out this video.

First, I convert my physical standby database (named CDB19_fra3zt) into a snapshot standby database:

DGMGRL> convert database 'CDB19_fra3zt' to snapshot standby;

Next, create a directory object that can be used with Data Pump:

SQL> create directory mydir as '/tmp';

Finally, start an export. Notice that I haven’t set FLASHBACK_TIME. When exporting from a snapshot standby there are probably no other users on the database, so you don’t need to worry about consistency of the export:

$ expdp system schemas=sales directory=mydir

My data has now been exported, and it is time to convert my snapshot standby database back into a physical standby database to properly protect my precious database:

DGMGRL> convert database 'CDB19_fra3zt' to physical standby;

But

Will this jeopardize my primary database. No, but ….

Your primary database is still protected, because log files are still sent to the standby database, but not applied. What will then happen if I need to switch over or fail over?

  1. Convert the snapshot standby database back into a physical standby database
  2. That will implicitly issue a Flashback Database and revert all the changes made
  3. Then catch up with the primary by applying all the log files
  4. Finally, complete the switchover or failover

Bottomline, it will take longer! The flashback operation is really fast and should be done within minutes. After all, Data Pump isn’t making that many changes. Only the master table and some AQ stuff. How much time then to do log apply? That depends entirely on how big the apply lag is.

So, if you choose to go down this path be prepared that in the event of a switchover or a failover, it will take a little longer until the standby is open for business. Unless, you have two standby databases, then you will still have run ready to jump in immediately.

GoldenGate

If you need the Data Pump export to perform an initial load for GoldenGate, then you might want to control the SCN at which the export is happening. If so, cancel redo apply on the standby database and roll forward to the desired SCN:

alter database recover managed standby database cancel;
alter database recover managed standby database until change n;

Before converting the standby database to snapshot standby:

alter database convert to snapshot standby;

Later on, when you need to start replication in GoldenGate you can use the following option:

START REPLICAT ATCSN n

It Must Be From Physical Standby

On Twitter I was made aware that AWS RDS does not allow snapshot standby. The user referred to a MOS note describing how to export from a physical standby database. You need a proxy database and a database link. You find the details in How To Use DataPump Export (EXPDP) To Export From Physical Standby Database (Doc ID 1356592.1).

I recommend using this method only as a last resort. It is more cumbersome and you don’t get the same performance as via a snapshot standby database. Using Data Pump over a database link limits certain parallel capabilities.

Conclusion

You can export from your Data Guard, if you convert it temporarily to a snapshot standby. Be prepared that a switchover or failover operation will take longer.

If you want to know more about Data Pump, you should check out our webinar Migration Strategies – Insights, Tips and Secrets.

And finally, speaking of Data Pump, remember to get the Data Pump bundle patch (bug 32551008) as Roy mentioned on Twitter.