Troubleshooting Rabbit Hole: From Data Guard to Data Integrity Checks

I always fear the worst when I get a TNS error; it’s not my area of expertise. A TNS error was exactly what I got while configuring a Data Guard environment. Redo Transport didn’t work; the redo logs never made it to the standby database.

The Error

I took a look in the alert log on the primary database and found this error:

2022-05-10T08:25:28.739917+00:00
"alert_SALES2.log" 5136L, 255034C
        TCP/IP NT Protocol Adapter for Linux: Version 12.2.0.1.0 - Production
  Time: 10-MAY-2022 18:09:02
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12650

TNS-12650: No common encryption or data integrity algorithm
    ns secondary err code: 0
    nt main err code: 0
    nt secondary err code: 0
    nt OS err code: 0

A little further in the alert log, I found proof that the primary database could not connect to the standby database:

2022-05-10T18:09:02.991061+00:00
Error 12650 received logging on to the standby
TT04: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (12650)
TT04: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
2022-05-10T18:09:02.991482+00:00
Errors in file /u01/app/oracle/diag/rdbms/sales2_fra3cx/SALES2/trace/SALES2_tt04_75629.trc:
ORA-12650: No common encryption or data integrity algorithm
Error 12650 for archive log file 1 to '...'

The Investigation

As always: google it! Although I have used DuckDuckGo instead of Google for privacy reasons for many years, I still say google it, which is fairly annoying.

The search revealed this MOS note: ORA-12650: No Common Encryption Or Data Integrity Algorithm When Using SQLNET.CRYPTO_CHECKSUM_TYPES_SERVER=sha256 (Doc ID 2396891.1). Although it is fairly old, it led me to look for issues with the data integrity checks defined in sqlnet.ora.

The primary database had the following defined in sqlnet.ora:

SQLNET.CRYPTO_CHECKSUM_SERVER=REQUIRED
SQLNET.CRYPTO_CHECKSUM_TYPES_SERVER=(SHA1)
SQLNET.CRYPTO_CHECKSUM_CLIENT=REQUIRED
SQLNET.CRYPTO_CHECKSUM_TYPES_CLIENT=(SHA1)

The above means that any connection made to or from this database must use data integrity checks; CRYPTO_CHECKSUM_SERVER and CRYPTO_CHECKSUM_CLIENT define that. Also, the database will only accept connections that use the SHA1 algorithm.

Then I looked in sqlnet.ora on the standby database:

SQLNET.CRYPTO_CHECKSUM_CLIENT=ACCEPTED
SQLNET.CRYPTO_CHECKSUM_SERVER=ACCEPTED
SQLNET.CRYPTO_CHECKSUM_TYPES_CLIENT=(SHA256,SHA384,SHA512,SHA1)
SQLNET.CRYPTO_CHECKSUM_TYPES_SERVER=(SHA256,SHA384,SHA512)

This database does not require data integrity checks. But if the other party requests or requires it, then the server is fine with it. That’s the meaning of ACCEPTED. But look at the allowed algorithms. When acting as server (i.e., receiving connections from someone else), it does not allow the SHA1 algorithm, the only one allowed by the counterpart. No common algorithm, hence the error.

The Solution

I decided to remove all instances of SHA1 because:

  • It is an old algorithm
  • Any 12c database or client supports newer algorithms
  • In this environment, I don’t have any old 11g servers or clients

I added all the SHA-2 algorithms as supported algorithms. Now, sqlnet.ora in both databases looks like this:

SQLNET.CRYPTO_CHECKSUM_TYPES_CLIENT=(SHA256,SHA384,SHA512)
SQLNET.CRYPTO_CHECKSUM_TYPES_SERVER=(SHA256,SHA384,SHA512)

This solved the problem, and redo transport now worked fine.
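
If you want to confirm what a connection actually negotiated, the banner information in v$session_connect_info can help. A minimal sketch (run from the session you want to inspect; the filter is only there to narrow the output):

select network_service_banner
from   v$session_connect_info
where  sid = sys_context('USERENV', 'SID')
and    lower(network_service_banner) like '%checksum%';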

If I wanted maximum security, I would allow only the SHA512 algorithm in both sqlnet.ora files:

SQLNET.CRYPTO_CHECKSUM_TYPES_CLIENT=(SHA512)
SQLNET.CRYPTO_CHECKSUM_TYPES_SERVER=(SHA512)

And force both databases to always use data integrity checks:

SQLNET.CRYPTO_CHECKSUM_SERVER=REQUIRED
SQLNET.CRYPTO_CHECKSUM_CLIENT=REQUIRED

Security

Some questions I asked myself while reading the Security Guide 19c.

Why do you want data integrity checks on your connections?

To protect against two types of attack:

  1. Data modification attack: An unauthorized party intercepting data in transit, altering it, and retransmitting it is a data modification attack. For example, intercepting a $100 bank deposit, changing the amount to $10,000, and retransmitting the higher amount is a data modification attack.
  2. Replay attack: Repetitively retransmitting an entire set of valid data is a replay attack, such as intercepting a $100 bank withdrawal and retransmitting it ten times, thereby receiving $1,000.

Can I do more to strengthen security in sqlnet.ora?

Yes. You should definitely also take a look at network encryption to protect data in transit. See Configuring Oracle Database Native Network Encryption and Data Integrity in the Security Guide 19c. These four parameters are of interest: SQLNET.ENCRYPTION_SERVER, SQLNET.ENCRYPTION_CLIENT, SQLNET.ENCRYPTION_TYPES_SERVER, and SQLNET.ENCRYPTION_TYPES_CLIENT.
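
As a hedged illustration only (the values are one reasonable choice, not a recommendation for your environment), enabling native network encryption in sqlnet.ora could look like this:

SQLNET.ENCRYPTION_SERVER=REQUIRED
SQLNET.ENCRYPTION_CLIENT=REQUIRED
SQLNET.ENCRYPTION_TYPES_SERVER=(AES256)
SQLNET.ENCRYPTION_TYPES_CLIENT=(AES256)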

Also, reading Securing the Oracle Database – A technical primer can inspire you.

What’s wrong with SHA-1?

It’s old, and advances in computing power have made it insecure. From Wikipedia:

In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographically broken but still widely used hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest – typically rendered as a hexadecimal number, 40 digits long. It was designed by the United States National Security Agency, and is a U.S. Federal Information Processing Standard.

Since 2005, SHA-1 has not been considered secure against well-funded opponents; as of 2010 many organizations have recommended its replacement. NIST formally deprecated use of SHA-1 in 2011 and disallowed its use for digital signatures in 2013. As of 2020, chosen-prefix attacks against SHA-1 are practical. As such, it is recommended to remove SHA-1 from products as soon as possible and instead use SHA-2 or SHA-3. Replacing SHA-1 is urgent where it is used for digital signatures.

XTTS: Testing the Procedure On Activated Standby Database

While preparing for the production migration of your Oracle Database using cross-platform transportable tablespaces (XTTS) and incremental backups, you should test the procedure.

It is often impossible to get a full copy of the production database or a comparable test system. If you still want to test the migration procedure, you need to take the backups on the production database, which might have an impact on it. You could offload the backups to a standby database, but you must still periodically set tablespaces to read-only mode to perform the Data Pump exports. This means downtime for your application.

But there is a way to do all the work on the standby database. This will allow you to test the migration procedure without affecting the primary production database.

You can also read about the procedure in the MOS note Using XTTs in a Data Guard Environment.

How To

You should use the procedure described in a previous blog post about backups on a standby database. When you reach the final incremental backup, you need to follow the information below.

I assume you have conducted a level 0 and a number of incremental backups on the standby database. Now it is time for the final incremental backup. We will do this on the standby database as well. You don’t need to touch the primary database except for a few minor changes (archive current log and defer redo log transport).
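
The MOS note has the authoritative steps. As a rough sketch only (assuming you create a guaranteed restore point so the standby can be flashed back afterwards; destination numbers and names are illustrative), the flow looks like this:

-- On the primary: archive the current log and defer redo transport
alter system archive log current;
alter system set log_archive_dest_state_2=defer;

-- On the standby: stop redo apply, create a guaranteed restore point,
-- then activate and open the database
alter database recover managed standby database cancel;
create restore point before_xtts_test guarantee flashback database;
alter database activate standby database;
alter database open;
alter tablespace ... read only;

-- Run the final incremental backup (xttdriver.pl --backup) and the
-- Data Pump export (expdp ... transportable=always) against this database

-- Afterwards: flash the database back and convert it to a physical standby again
shutdown immediate
startup mount
flashback database to restore point before_xtts_test;
alter database convert to physical standby;
shutdown immediate
startup mount
alter database recover managed standby database disconnect from session;

-- And on the primary: re-enable redo transport
alter system set log_archive_dest_state_2=enable;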

Now you have a consistent backup of the data files and a corresponding Data Pump export; both components were taken from a standby database. Transfer the files to your target database and test the procedure.

Production Migration

The heading says you can use this procedure for testing. How about the production migration?

Well, it could work, but I would not recommend it.

When you use this approach, you leave the primary database open for business while you take the final incremental backup. This opens the door to a potential catastrophe: users entering data into the open primary database after you have taken the last incremental backup. That would mean data loss!

Using the regular standby database approach, you are sure that no more data is entered into the source database when you perform the final migration. You ensure this by setting the tablespaces read-only in the primary database. This is why I do not recommend the activated standby database approach for your production migration.

Conclusion

You should test your migration procedure and get comfortable with your runbook. You can use a temporarily activated standby database if you don’t have a dedicated test environment. This allows you to perform realistic testing without interrupting the primary database and your users.

Other Blog Posts in This Series

XTTS: Target Database and Data Guard

Often when you migrate an Oracle Database using cross-platform transportable tablespaces (XTTS) and incremental backups, it is a big database. Big databases must often be protected by Data Guard.

How do you ensure that Data Guard protects the target database at the end of the migration?

Build After Migration

A simple solution is to build the standby database after the migration. There are some downsides to this approach:

  • It takes time. In many situations, the business requires that a standby database is in place before the database can go live. Having to build a standby database will prolong the downtime.
  • It puts a load on the primary database. Right after the migration, you don’t have your backups in place yet, so you will need to build the standby database directly from the primary database. That requires a lot of I/O and network traffic. You might want to use those resources for other activities, like taking a level 0 backup or regathering statistics.
  • It can become even more complicated if you migrate into the multitenant architecture. Rebuilding the standby database of an already running, active CDB might not be an option. The other PDBs in the CDB are left unprotected while the standby is rebuilt.

Restore Data Files to Standby Host

Here is an approach that offers much less downtime. As part of the migration, restore the data files onto the standby host as well. During the Data Pump import, the plug-in propagates to the standby database via redo apply.

When migrating with XTTS and the Perl script, you are familiar with running the backup on the source database (xttdriver.pl --backup). Also, you know how to restore and recover the data files on the target system (xttdriver.pl --restore).

Now, the idea is that you restore and recover the data files on the primary host and also on the standby host. During the Data Pump import, the data files are plugged into the primary database. The plug-in commands are propagated to the standby database via redo apply. The standby database can then plug in the data files if they are located in the same place as on the primary database. The rest of the Data Pump import propagates as well, and in the end you will have a working Data Guard environment.

You must recover the data files on the primary and standby database to the exact same SCN. Be sure to restore all backups to both the primary and standby database.

Overview of the process

How To

The following assumes that your target databases are using ASM. Also, I am migrating directly into a PDB. But the procedure is similar for migrations into a non-CDB database. Use the procedure described in my previous blog post or MOS note V4 Reduce Transportable Tablespace Downtime using Cross Platform Incremental Backup (Doc ID 2471245.1). Adjust the procedure according to the following.

  • Prepare the target database and build a standby database before starting the migration.
  • In xtt.properties set destconnstr to point to the target primary database.
  • Copy xtt.properties to the target primary host and target standby host.
  • On target standby host, change xtt.properties and set destconnstr to point to the target standby database.
  • Create the cross-platform backups on the source database using xttdriver.pl --backup in the usual way.
  • Now copy the file res.txt to both the target primary and target standby database.
  • The backups in the scratch location should also be available to both the target primary and standby databases.
  • Now restore or recover the backups on the target primary and target standby database using xttdriver.pl --restore.
  • Repeat the backup/restore/recover process as many times as needed. Keep recovering the target primary and target standby database. It is very important that the data files on both the target primary and target standby database are recovered to the exact same SCN.
  • On the target databases the data files will be restored into ASM. The data file itself is created with an OMF file name:
    +<disk_group>/<db_unique_name>/<source_guid>/DATAFILE/<omf_name>
    
    It could be something like this:
    +DATA/CDB1_FRA356/86D5DC2587337002E0532AB2A8C0A57C/DATAFILE/ACCOUNT.281.1099469863
    
    • The disk group is what you specified in xtt.properties in dest_datafile_location
    • DB_UNIQUE_NAME differs between the primary and standby database.
    • To comply with the OMF standard, the database must restore the data files into a folder corresponding to the PDB they belong to. However, currently, the data files do not belong to a PDB; we haven’t done the Data Pump import plug-in operation yet. Instead, the database creates a folder with the GUID of the source database. If you are interested, you can get the GUID from the source database using select guid from v$containers.
    • The last part of the OMF file name is the tablespace name, followed by numbers representing the file ID and a uniqueness counter. This part differs between the primary and standby database as well.
  • We now know that the data file name is different on the primary and standby database. Previously, it was stated that it is important that the data files are stored in the same location and have the same name. This is a problem! But the Perl script solves it by creating ASM aliases.
  • The aliases will be created in the location specified by dest_datafile_location in xtt.properties. Use asmcmd to verify it. The column SYS (System-generated) is N, meaning this is not a proper OMF file. Also, we can see in the Name column that it is an alias:
    ASMCMD> ls -l +DATA
    Type      Redund  Striped  Time             Sys  Name
    DATAFILE  UNPROT  COARSE   MAR 16 08:00:00  N    account_25.dbf => +DATA/CDB1_FRA2KR/86D5DC2587337002E0532AB2A8C0A57C/DATAFILE/ACCOUNT.282.1099469855
    DATAFILE  UNPROT  COARSE   MAR 16 08:00:00  N    accountidx_26.dbf => +DATA/CDB1_FRA2KR/86D5DC2587337002E0532AB2A8C0A57C/DATAFILE/ACCOUNTIDX.280.1099469855
    ...
    
  • If you look at the directory where the alias is pointing to, you can see that the files are proper OMF files – real data files. Column SYS is Y, and Name does not contain the alias pointer =>:
    ASMCMD> cd +DATA/CDB1_FRA2KR/86D5DC2587337002E0532AB2A8C0A57C/DATAFILE
    ASMCMD> ls -l 
    Type      Redund  Striped  Time             Sys  Name
    DATAFILE  UNPROT  COARSE   MAR 16 08:00:00  Y    ACCOUNT.282.1099469855
    DATAFILE  UNPROT  COARSE   MAR 16 08:00:00  Y    ACCOUNTIDX.280.1099469855
    ...
    
  • Thus, the aliases hide the fact that the data files have different names on the target primary and target standby database.
  • When you prepare the parameter file for the Data Pump import, be sure to reference the aliases – not the OMF named data files. The aliases have the same name on both the target primary and target standby database:
    $ cat import.par
    
    transport_datafiles=+DATA/account_25.dbf
    transport_datafiles=+DATA/accountidx_26.dbf
    transport_datafiles=...
    
  • Then start the Data Pump import. The data files are plugged into the primary database during DATABASE_EXPORT/PLUGTS_FULL/PLUGTS_BLK. When the redo containing that information is applied on the standby database, the standby database will plug in the data files as well.
  • When the Data Pump import completes, you can verify that the standby database survived and is still applying redo. A switch-over is also a good way of testing it.
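
To check that the standby is still healthy after the import, a couple of simple queries along these lines (a sketch, not part of the official procedure) can be run on the target standby database:

select process, status, sequence# from v$managed_standby;
select thread#, max(sequence#) from v$archived_log where applied='YES' group by thread#;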

Conclusion

You can prepare the target standby database in advance. This enables the target database to be protected by Data Guard as soon as the Data Pump import completes.

Restore and recover the data files to the target primary and target standby database. If they are recovered to the exact same SCN the plug-in of the data files propagates to the standby database via redo apply. After the Data Pump import, your target database has a fully functional standby database.

Further Reading

Other Blog Posts in This Series

XTTS: Backup on Standby Database

When doing a cross-platform migration with transportable tablespaces and incremental backup, is it possible to perform backups on a standby database? If so, you could offload the work from a primary database.

The short answer is yes. You can use a physical standby database but not a snapshot standby database.

Using a standby database for the backups is an advantage in some situations:

  • The primary database does not have the capacity to perform the extra backups (CPU, disk space, or I/O).
  • You want the primary database to be completely unaffected by the migration.

How To

You follow the regular procedure described in the MOS note V4 Reduce Transportable Tablespace Downtime using Cross Platform Incremental Backup (Doc ID 2471245.1) with a few changes. The changes are described in Using XTTs in a Data Guard Environment.

Be Careful

You must have an Active Data Guard license if you:

  • Open a physical standby database and enable redo apply.
  • Enable block change tracking on the standby database (for faster incremental backups).
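
If you are in doubt whether block change tracking is already enabled on the standby, a quick check (just a sanity query, not part of the procedure) is:

select status, filename from v$block_change_tracking;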

Procedure

The primary database is called SRCPRI and runs on the host src-pri. The standby is called SRCSTDBY and runs on the host src-stdby.

  • Ensure allowstandby=1 in xtt.properties.

  • Whenever you need to perform a backup on the physical standby database – level 0 or incremental:

    • Cancel redo apply and open the physical standby database:
    SRCSTDBY SQL> alter database recover managed standby database cancel;
    SRCSTDBY SQL> alter database open;
    
    • Perform the backup
    [oracle@src-stdby]$ $ORACLE_HOME/perl/bin/perl xttdriver.pl --backup
    
    • Bring the standby database back in MOUNT mode and re-enable redo apply:
    SRCSTDBY SQL> shutdown immediate
    SRCSTDBY SQL> startup mount
    SRCSTDBY SQL> alter database recover managed standby database disconnect from session;
    
  • When it is time to perform the final incremental backup:

    • Set tablespaces in READ ONLY mode on the primary database:
    SRCPRI SQL> alter tablespace ... read only;
    
    • Archive current log and ensure it is applied on the standby database:
    SRCPRI SQL> alter system archive log current;
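    -- (sketch) one way to confirm on the standby that this sequence has been applied:
    SRCSTDBY SQL> select thread#, max(sequence#) from v$archived_log where applied='YES' group by thread#;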
    
    • When you have confirmed the redo is applied on the standby database, cancel redo apply and open it:
    SRCSTDBY SQL> alter database recover managed standby database cancel;
    SRCSTDBY SQL> alter database open;
    
    • Verify tablespaces are READ ONLY:
    SRCSTDBY SQL> select tablespace_name, status from dba_tablespaces;
    
    • Perform the final incremental backup
    [oracle@src-stdby]$ $ORACLE_HOME/perl/bin/perl xttdriver.pl --backup
    
    • Perform the Data Pump export on the primary database:
    [oracle@src-pri]$ expdp system ... full=y transportable=always ...
    

Active Data Guard

If you have a license for Active Data Guard, you can simply keep the physical standby database in OPEN WITH REDO APPLY mode. You don’t need to switch back and forth between OPEN and MOUNT mode.

Conclusion

Being able to perform the backups on a standby database is a huge advantage. But it makes the process slightly more complicated, so I would recommend it only if really needed. In addition, if you don’t follow the procedure strictly, you might accidentally use a feature that requires Active Data Guard.

The MOS note V4 Reduce Transportable Tablespace Downtime using Cross Platform Incremental Backup (Doc ID 2471245.1) has some additional details about using a standby database. You can get those details by opening a Service Request and asking for them.

Other Blog Posts in This Series

Rolling Upgrades of Oracle Database on Exadata Cloud Service

If you want to minimize the downtime needed to upgrade your Oracle Database 19c on Exadata Cloud Service, one of the options is to use rolling upgrades. Our good friends in the Maximum Availability Architecture (MAA) team recently posted a very good MOS note with step-by-step instructions.

Exadata Cloud Database 19c Rolling Upgrade With DBMS_ROLLING (Doc ID 2832235.1)

What Is Rolling Upgrade?

A rolling upgrade uses a standby database called a logical standby database.

A logical standby database is initially created as an identical copy of the primary database, but it later can be altered to have a different structure. The logical standby database is updated by executing SQL statements. The flexibility of a logical standby database lets you upgrade Oracle Database software (patch sets and new Oracle Database releases) and perform other database maintenance in rolling fashion with almost no downtime.

Logical Standby Databases, Data Guard Concepts and Administration 21c

"With almost no downtime" means the time it takes to perform a Data Guard switchover. Typically, that is a few minutes. If your application is configured properly, the downtime can be hidden from the end user. The downtime will appear as a brown-out where the session waits for the database to complete the switchover before continuing.

Rodrigo explains how the process works in this video from our webinar How Low Can You Go? Zero Downtime Operations.

Pro tip: The process uses a so-called Transient Logical Standby or TLS, so you can find additional information by searching for that term as well.

Can I use rolling upgrade on my database?

Rolling upgrades using DBMS_ROLLING require the Active Data Guard option, which is included in your ExaCS license.

In addition, there are requirements on the data types in your database and a few other prerequisites. Check the documentation for details.
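
A quick first impression of data type readiness can be had with a query like this (a sketch only; the documentation and the MOS note have the authoritative checks). Objects listed here are not supported by the logical standby used during the rolling upgrade:

select distinct owner, table_name
from   dba_logstdby_unsupported
order  by owner, table_name;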

In this video, Roy explains how to determine the readiness of your Oracle Database. The video is also from our webinar How Low Can You Go? Zero Downtime Operations.

Other Options for Minimizing Downtime During Upgrades

If your database is not capable of performing rolling upgrades, you can still do something to minimize downtime. You can try to tune the upgrade itself or use Oracle GoldenGate. Both options are described in detail in our webinar How Low Can You Go? Zero Downtime Operations. You can flip through the slides or watch the complete recording.

If you decide to go with Oracle GoldenGate, and since your database is in OCI, you can benefit from the OCI GoldenGate service. Now, you may think: GoldenGate, that sounds expensive! But it is not. The new OCI GoldenGate service comes with a completely new price model, which is very attractive for shorter use cases like upgrades and migrations. You don’t pay a license for the source and target CPUs; instead, you pay for usage of CPUs on the GoldenGate hub itself. And you only pay by the hour.

Conclusion

Finally, just kudos to my colleague Sebastian Alasino for putting together a very good, easy-to-follow MOS note.

Here is a demo of rolling upgrades (not on ExaCS, but a lot is identical).

Happy upgrading!

Zero Downtime Migration – Physical Online Migration of Very Large Databases

Following the blog post on migrating Very Large Databases (VLDBs) using Logical Online method, let’s touch upon the Physical Online method as well.

Existing Data Guard

VLDBs are very often protected by Data Guard with one or more standby databases. When you start a migration with Zero Downtime Migration (ZDM), you don’t want to compromise your existing Data Guard setup. If something happens during the preparation of the migration to OCI, you still want to be able to switch over to an on-prem standby database. Something like this:

Setup with on-prem Oracle Database Data Guard and standby database in OCI

When ZDM configures the OCI standby database, it is added as a manually managed database, i.e., it does not use Data Guard broker. It does not interfere with your existing Data Guard broker setup, and that setup remains valid. A few things to observe:

  • When ZDM is working, no switch-overs are allowed; a switch-over will cause the process to error out.
  • By working I mean from the second you start the ZDM migration until it is paused at ZDM_CONFIGURE_DG_SRC.
  • When ZDM is paused, you can do as many switch-overs as you like. Just ensure that the original source database becomes the primary database again as soon as possible.
  • When you need to complete the migration, the source database must be the primary database and no switch-overs are allowed, which does make sense because in this last phase ZDM switches over to the OCI target database.
  • What about fail-overs? A fail-over means loss of data, and to accept that you must open the database with RESETLOGS. This causes all sorts of havoc in your Data Guard setup. You are back to square one.

At which phases can you switch over to your on-prem standby database without jeopardizing the ZDM migration?

Important

You must remove your on-prem Data Guard setup before you complete the migration with ZDM, i.e., before you resume the paused job. ZDM will not be able to finish properly if there is a conflicting Data Guard configuration. To remove your existing Data Guard setup:

  1. The ZDM migration is currently paused at ZDM_CONFIGURE_DG_SRC.
  2. Connect to the source on-prem database (primary) using Data Guard CLI (dgmgrl) and remove the configuration:
     remove configuration
  3. Connect to the source on-prem database (primary) using SQL*Plus (sqlplus) and remove the Data Guard broker configuration:
     alter system set dg_broker_start=FALSE scope=memory;
     alter system reset dg_broker_start;
     alter system reset dg_broker_config_file1;
     alter system reset dg_broker_config_file2;
  4. Check log_archive_config. It must contain the DB_UNIQUE_NAME of the source on-prem database (primary) and the OCI target database – nothing else. Using the above example, it should be:
     alter system set log_archive_config='dg_config=(SALES_1,SALES_oci)' scope=both;
  5. Check the log_archive_dest_n parameters. Ensure that Data Guard broker removed the correct ones. There should only be one remote redo log destination – the OCI target database; all others should be removed. If you need to remove one, use these commands (remember to adjust the suffix):
     alter system set log_archive_dest_2='' scope=memory;
     alter system reset log_archive_dest_2;
  6. Ensure that log_archive_config is set correctly (like #4) in the OCI target database:
     alter system set log_archive_config='dg_config=(SALES_1,SALES_oci)' scope=both;
  7. Finally, complete the migration:
     $ZDM_HOME/bin/zdmcli resume job -jobid <id>

The Backup

ZDM needs a full backup that can be restored on your target.

  • DBCS: If your target database is one of the OCI Cloud Services (Virtual Machine, Bare Metal or Exadata DB System), ZDM will need to take a new full backup. Existing backups can’t be used.
  • ExaCC or Exadata on-prem: You can either take a new full backup or use an existing backup that is made available on disk. In addition, if you are so fortunate as to have a Zero Data Loss Recovery Appliance (ZDLRA), you don’t need to take a backup. ZDM can just restore directly from ZDLRA.

If you are targeting DBCS, DATA_TRANSFER_MEDIUM is set to OSS (Object Storage Service). The backup is stored in Object Storage using the Oracle Database Cloud Backup Module for OCI. The backup on the source database and the restore on the target database happen via a special SBT channel that streams the backup directly to and from Object Storage. This means:

  • The backup never hits the disk, so you don’t need additional disk space to hold the backup.
  • The duration of the backup depends on your network speed to OCI. Since the backup is streamed directly to OCI, the network can become a bottleneck. If you have a slow connection to OCI, the backup will run equally slowly. The same applies to the restore; however, the target database is already in OCI and has a good connection to Object Storage.

ZDM will by default use 10 RMAN channels for the restore and the backup. With your knowledge of the source database, you might know better. You can tweak the number of channels in the response file. Look for the parameters SRC_RMAN_CHANNELS and TGT_RMAN_CHANNELS.

Based on your knowledge or testing, you can determine which RMAN compression algorithm gives the best benefit on your database. You can adjust the compression algorithm in the response file using the parameter ZDM_RMAN_COMPRESSION_ALGORITHM. The default is MEDIUM, which normally gives the best balance between compression ratio and CPU time. And remember, RMAN compression normally requires a license for the Advanced Compression Option, but when you migrate with ZDM, you can use it for free.
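
For illustration, the relevant response file entries could look like this (the channel counts are purely an example, not a recommendation):

SRC_RMAN_CHANNELS=15
TGT_RMAN_CHANNELS=15
ZDM_RMAN_COMPRESSION_ALGORITHM=MEDIUM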

While ZDM is taking a backup of the source database, no other backups should be running. Be sure to put your regular backups, including archive backups, on hold.

Redo Apply

On the source database, you should keep archive logs on disk until the target database has been restored, Data Guard has been configured, and the target database has caught up with redo apply. If you have a slow network connection and a huge database, it can take days until the backup has completed, the restore has completed, and redo apply has caught up.

  • Imagine you start the backup at sequence 100.
  • The restore of the target database finishes two days later. The source database is now at sequence 200.
  • ZDM configures Data Guard and starts redo transport and redo apply. The source database is now at sequence 220.
  • Sequences 100-220 must be available on disk on the source database host, so the source (primary) database can transfer them to the target (standby) database.

It is not uncommon for VLDBs to generate double-digit terabytes of redo per day. Just the other day, I talked to a customer whose database generated 15 TB of archive logs a day.

First, you must be able to transfer the redo from the source database (primary) to the target database (standby). This is simple math: if you have 15 TB of redo a day, you need roughly a 1.5 Gbps connection to keep up (amount of redo / 24 / 60 / 60 * 8; 15 TB over 86,400 seconds is about 1.4 Gbps). If transferring redo becomes a problem, you can look into using redo transport compression. This can reduce the amount of data that must be transferred at the cost of CPU cycles. You can read more about it in the MOS note Redo Transport Compression in a Data Guard Environment (Doc ID 729551.1). I learned from colleagues in the Maximum Availability Architecture (MAA) team that TDE Tablespace Encryption and redo transport compression don’t play very well together. If your source database is encrypted, you should not expect that much benefit from redo transport compression.

Next, redo must be applied on the target database. Is the target database capable of applying redo that fast? On Exadata the answer is most likely: YES, but as always in IT, it depends.

Redo Apply benchmark of Oracle Database Data Guard

The numbers in the graph above come from Redo Apply Best Practices – Oracle Data Guard and Active Data Guard. Based on your database release and the type of workload in the database, you can see the amount of redo that can be applied daily (in TB). The last two columns use Multi-Instance Redo Apply (MIRA) with either two or four active RAC nodes. The numbers were generated on an Exadata.

Backup of Target Database

How do you backup your target database in OCI? You want to have a valid backup from the very second that you switch over to the target database.

The target placeholder database that you originally created will be overwritten by ZDM. This means that you can’t configure and enable automatic backup in OCI in advance. You must wait until the migration has completed before you enable automatic backup.

You could:

  • Extend the downtime window to allow automatic backup to be configured after the migration. Also, allow enough time for the first backup to complete.
  • Or, do your own backup in OCI. This is more cumbersome but will allow you to open the database for business immediately after the switchover. But you are in charge of the backup now. All the various bits and pieces are available:
    • Original backup is still in object storage.
    • Archive logs are on disk – you can back them up manually.
    • Perform incrementals if needed – put them somewhere safe.
    • In case of emergency – glue it all together
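
A minimal sketch of what such an interim, manual backup could look like in RMAN (assuming you have somewhere safe to store the backup pieces):

RMAN> backup archivelog all not backed up;
RMAN> backup incremental level 1 database;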

Data Guard on Target Database

Similar to automatic backup, you can’t create a Data Guard Association until ZDM has completed the migration. The cloud tooling does not support creating the standby database through a cascading standby. This means that you can’t build your OCI standby database until after ZDM has completed the migration – and the OCI database is the primary database. You can start to create the OCI Data Guard as soon as ZDM is done, but

  • You must tolerate that the OCI Data Guard is missing for a while
  • Or, take downtime

Conclusion

Even huge Oracle Databases can be migrated to OCI using Zero Downtime Migration. You might need to make a few adjustments to the standard flow, but it is absolutely doable. Automatic backup and Data Guard can’t be created in OCI until after the migration. This might force you to take downtime. Besides the actual migration, you should also do your best to ensure performance stability once the database is open for business. For that purpose, have a look at our webinar Performance Stability, Tips and Tricks and Underscores.

Other Blog Posts in This Series

Zero Downtime Migration – Physical Online Migration and Testing

Testing is an essential part of any migration project of your Oracle Database. With Zero Downtime Migration (ZDM) and the Physical Online method it has become a lot easier. Before going live (i.e. doing the Data Guard switchover) you can test on your production data on your future production system – the OCI database. That’s cool.

Concept

For the duration of your test, convert the OCI target database into a snapshot standby database. A short recap on snapshot standby databases:

  • A snapshot standby database is a type of updatable standby database that provides full data protection for a primary database.
  • A snapshot standby database receives and archives, but does not apply, redo data from its primary database. Redo data received from the primary database is applied when a snapshot standby database is converted back into a physical standby database, after discarding all local updates to the snapshot standby database.
  • A snapshot standby database diverges from its primary database over time because redo data from the primary database is not applied as it is received. Local updates to the snapshot standby database cause additional divergence. The data in the primary database is fully protected however, because a snapshot standby can be converted back into a physical standby database at any time, and the redo data received from the primary is then applied.

The plan is:

  1. Build the standby database in OCI.
  2. Migration is currently paused at ZDM_CONFIGURE_DG_SRC (-pauseafter ZDM_CONFIGURE_DG_SRC).
  3. Convert the OCI target database – which is a physical standby – into a snapshot standby.
  4. Do your test. Do whatever you want. The database is protected by Flashback Database, so you can insert and delete data, truncate, add tablespaces, you name it.
  5. When you are done with your tests, convert the OCI target database back into a physical standby database. Implicitly, the database is flashed back, and now the redo is getting applied again.
  6. After a little while, the target database is back in sync again.
  7. Complete migration at your will.

How

You find the steps needed to convert to a snapshot standby and back again below. You can repeat the process as many times as you want.

Manual

To convert the standby database to snapshot standby:

alter database recover managed standby database cancel;
shutdown immediate
startup mount
alter database convert to snapshot standby;
alter database open;

Now, the database is opened in READ WRITE mode and you can use it for testing. To convert back to a physical standby database:

shutdown immediate
startup mount
alter database convert to physical standby;
shutdown immediate
startup
alter database recover managed standby database disconnect from session;

Manual on RAC

To convert the standby database to snapshot standby:

sqlplus / as sysdba <<EOF
   alter database recover managed standby database cancel;
EOF
srvctl stop database -d $ORACLE_UNQNAME
srvctl start database -d $ORACLE_UNQNAME -o mount
sqlplus / as sysdba <<EOF
   alter database convert to snapshot standby;
   alter database open;
EOF

Now, the database is opened in READ WRITE mode and you can use it for testing. The database is only opened on one node. You can open the other nodes by executing the following on each of the nodes:

alter database open;

To convert back to a physical standby database:

srvctl stop database -d $ORACLE_UNQNAME
sqlplus / as sysdba <<EOF
   startup mount
   alter database convert to physical standby;
   shutdown immediate
EOF
srvctl start database -d $ORACLE_UNQNAME
sqlplus / as sysdba <<EOF
   alter database recover managed standby database disconnect from session;
EOF

Data Guard Broker

To convert the standby database to a snapshot standby database using broker (CDB19_fra3zt is the target DB_UNIQUE_NAME):

convert database 'CDB19_fra3zt' to snapshot standby;

Now, the database is opened in READ WRITE mode and you can use it for testing. To convert back to a physical standby database:

convert database 'CDB19_fra3zt' to physical standby;

Conclusion

One of the really cool features of ZDM is that you can use your standby database for testing in OCI before you decide to do the switchover. You can achieve this by converting to a snapshot standby database.

Finally, a thank you to my colleague Jose Bennani Pareja for helping out with RAC database information.

Other Blog Posts in This Series

Zero Downtime Migration – Physical Online Migration to ExaCS

This will be an easy blog post. To migrate your Oracle Database to Exadata DB System (ExaCS), just follow this procedure from the DBCS blog post. Plus, execute these two commands on the target after the migration:

[root@tgthost]$ dbaascli registerdb prereqs --dbname [db_name] --db_unique_name [db_unique_name]
[root@tgthost]$ dbaascli registerdb begin --dbname [db_name] --db_unique_name [db_unique_name]

More Details, Please

Granted – the above statement is bold, and it is almost true. There are a few important details to share. First, have a look at Additional Information for Migrating to Exadata Cloud Service, which you find in the Release Notes.

Target Environment

To get the full benefits of Exadata you should be running RAC databases. Exadata and RAC are a perfect match, but it is up to you to decide. If your source database is already a RAC database (or RAC One Node), you must migrate to a RAC database. However, if your source database is a single instance, you have the option to either stay single instance or go RAC. If you go RAC, just create the target placeholder database as a RAC database, and everything else will happen automatically.

You must create a placeholder database on the target system. The placeholder database gets overwritten by ZDM during the migration, but it is initially used by ZDM to get information on how you want to configure your target database in OCI. For example, the migrated database will be placed in the same Oracle Home as the target placeholder database. Also, the architecture is determined this way. In other words, if you create the target placeholder database as a RAC database, then your source database is automatically converted to a RAC database during migration. If you create a single instance placeholder database, you get a single instance database.

Just like any other migration, when creating the placeholder database there are some things you should be aware of. On the OCI webpage you have to:

  • Set Database name to the DB_NAME of the source database.
  • Set Database version to the same as the source database.
  • Ensure the patch level of the Oracle Home matches that of the source system – or is higher.
  • Ensure that the Password matches the SYS password of the source database.

When using Zero Downtime Migration (ZDM) to migrate to Exadata DB System (ExaCS) be sure to create the target placeholder database in the correct manner.

You can choose your own DB_UNIQUE_NAME – it should differ from the source database. Select an Oracle Home that has the same or higher patch level than your source database. I recommend always migrating to the latest Release Update. If necessary, ZDM will automatically invoke datapatch after the switchover. The other parameters don’t matter – the database gets overwritten anyway by ZDM. Also, be aware that when using the OCI webpage you get a RAC database. There is no option to change it. But it is, after all, the perfect match for Exadata anyway.

Now, if you want more advanced options – like creating a single instance database – you can’t use the OCI webpage. You will have to use either dbaasapi or dbaascli. That gives you full control over the options, but they are not as easy to use as the webpage.

It’s A Wrap

I have created a video on YouTube that demos a migration to Exadata DB System.

Speaking of YouTube, I suggest that you subscribe to the Oracle Database Upgrades and Migrations YouTube channel so you never miss anything.

The Exadata Cloud Service is an awesome platform and it is really easy to migrate to it using Zero Downtime Migration. And converting to RAC is even easier.

Other Blog Posts in This Series

Zero Downtime Migration – Physical Online Migration to DBCS

Let me show you how you can migrate an Oracle Database into OCI. My source database is a CDB running on 12.1.0.2. I want to migrate to the same version using the Physical Online method which uses Data Guard.

This procedure can be used to migrate to:

  • VM DB Systems
  • Bare Metal DB Systems
  • Exadata DB System (ExaCS)

If you want to migrate to Exadata Cloud at Customer or Exadata on-prem, you can use a lot from this post, but there are some differences which I will not cover.

Prerequisites

  1. I need a ZDM service host. It is called zdmhost and the user is called zdmuser.
  2. My source database is a CDB called SALES (DB_NAME), and the host is named srchost.
  3. I want to migrate to a VM DB System, and it is already created. I name it tgthost. I use the same name for the database, SALES. I must use ASM as storage management software.

Overview of the components in this demo

Prepare Source

Ensure the database is in ARCHIVELOG mode:

select log_mode from v$database;
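
If it is not, and assuming a single instance database where you can afford a short outage, enabling it is roughly a matter of (a sketch):

shutdown immediate
startup mount
alter database archivelog;
alter database open;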

If the database version is 12.2 or higher, a TDE keystore must be present – even if the database is not encrypted. A license is not needed to create a TDE keystore – not until data is actually encrypted. The keystore must be OPEN, and the type must be either AUTOLOGIN, LOCAL_AUTOLOGIN or PASSWORD. In a CDB, this applies to CDB$ROOT and all PDBs:

SELECT con_id, status, wallet_type FROM v$encryption_wallet;

If the status is OPEN_NO_MASTER_KEY, it means that no TDE master encryption key has been created, and I need to create one. Instructions are also in the documentation.
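
Creating a key could look like this (a sketch only; the keystore password is a placeholder, and the exact syntax depends on your keystore setup and version):

administer key management set key force keystore identified by "<keystore-password>" with backup;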

Ensure that the source host can resolve the network name of the target host. It is important to add two entries – one with the host name and one with the SCAN name (they should both point to the target host):

[root@srchost]$ echo -e "[ip address] tgthost" >> /etc/hosts
[root@srchost]$ echo -e "[ip address] tgthost-scan" >> /etc/hosts

Test connection over SQL*NET to the target:

[oracle@srchost]$ sqlplus system@tgthost-scan/[target-cdb-service-name]

Prepare Target

I need a target placeholder database. The placeholder target database is overwritten during migration, but it retains the overall configuration.

  • VM DB System: When I create a new system, a database is automatically created. This database is my target placeholder database.
  • Bare Metal and Exadata DB Systems: Create a new database on an existing system.

In any case:

  • DB_NAME must exactly match that of the source database – also in the same case.
  • DB_UNIQUE_NAME must be different.
  • Use the same SYS password as the source database.
  • Any database parameters for the target database, including SGA parameters, are maintained during the migration, and the migrated database runs with this same configuration.
  • Do not enable automatic backup (until after the migration has completed).
  • The architecture (single instance or RAC) of the target placeholder database determines the architecture of the OCI database. If I want my OCI database to be a RAC database, I simply create the target placeholder database as a RAC database, and the end result will be a RAC database.
  • COMPATIBLE must match that of the source database.
  • The patch level of the target database must be equal to or higher than the source database. If they differ, ZDM will automatically execute datapatch as part of the migration. Use $ORACLE_HOME/OPatch/opatch lsinventory.
  • The source and target database must use the same time zone file version. If they don’t match, I must upgrade the time zone file in the source database. If I have a lot of data of type TIMESTAMP WITH TIME ZONE and I can’t afford the downtime it takes to do the upgrade, then consider using the logical online method. Check the version with: select * from v$timezone_file;

There are a few other requirements listed in the document, but these are the most important ones.

Like the source host, the target host must be able to resolve the network name of its counterpart:

[root@tgthost]$ echo -e "[ip address] srchost" >> /etc/hosts
[root@tgthost]$ echo -e "[ip address] srchost-scan" >> /etc/hosts

And I test the connection:

[oracle@tgthost]$ sqlplus system@srchost-scan/[source-cdb-service-name]

Prepare ZDM

The ZDM service host must be able to resolve the host names of the source and target database host:

[root@zdmhost]$ echo "<source IP address> srchost" >> /etc/hosts
[root@zdmhost]$ echo "<target IP address> tgthost" >> /etc/hosts

I put my private SSH keys for the source and target database hosts into the ~/.ssh directory. Ensure permissions are set properly:

[zdmuser@zdmhost]$ cp srchost_key_file ~/.ssh
[zdmuser@zdmhost]$ chmod 400 ~/.ssh/srchost_key_file 
[zdmuser@zdmhost]$ cp tgthost_key_file ~/.ssh
[zdmuser@zdmhost]$ chmod 400 ~/.ssh/tgthost_key_file 

Test the connection. I connect as opc, but you might have a different user. Read more about access to the database host in the documentation:

[zdmuser@zdmhost]$ ssh -i ~/.ssh/srchost_key_file opc@srchost date
[zdmuser@zdmhost]$ ssh -i ~/.ssh/tgthost_key_file opc@tgthost date

Prepare Response File

I will use a template response file as the basis for my response file:

[zdmuser@zdmhost]$ cp $ZDM_HOME/rhp/zdm/template/zdm_template.rsp ~/physical_online.rsp
[zdmuser@zdmhost]$ chmod 700 ~/physical_online.rsp

This is the response file that I end up with (see appendix B for an explanation):

TGT_DB_UNIQUE_NAME=SALES_fra2t4
MIGRATION_METHOD=ONLINE_PHYSICAL
DATA_TRANSFER_MEDIUM=OSS
PLATFORM_TYPE=VMDB
HOST=https://swiftobjectstorage.[region].oraclecloud.com/v1/[my-tenancy-object-storage-namespace]
OPC_CONTAINER=zdm-staging
SKIP_FALLBACK=TRUE
SHUTDOWN_SRC=TRUE

Perform Evaluation

I am now ready to perform a migration evaluation. It is a dry run of the migration and performs various sanity checks. Nothing is changed during the evaluation:

[zdmuser@zdmhost]$ $ZDM_HOME/bin/zdmcli migrate database \
   -rsp /home/zdmuser/physical_online.rsp \
   -sourcenode srchost \
   -sourcedb SALES_fra3fw \
   -srcauth zdmauth \
   -srcarg1 user:opc \
   -srcarg2 identity_file:/home/zdmuser/.ssh/srchost_key_file \
   -srcarg3 sudo_location:/usr/bin/sudo \
   -targetnode tgthost \
   -tgtauth zdmauth \
   -tgtarg1 user:opc \
   -tgtarg2 identity_file:/home/zdmuser/.ssh/tgthost_key_file \
   -tgtarg3 sudo_location:/usr/bin/sudo \
   -targethome /u01/app/oracle/product/12.1.0.2/dbhome_1 \
   -backupuser "daniel.overby.hansen@oracle.com" \
   -eval

A few comments:

  • sourcenode and targetnode are the host names of the source and target database hosts. Those names must be resolvable, which is why I put them into /etc/hosts already.
  • sourcedb is the DB_UNIQUE_NAME of the source database. My database is on ASM. If I didn’t use ASM, I would use sourcesid instead and specify the database SID.
  • srcarg1 is the name of the user that I connect as to the source database host. You might need to change that.
  • srcarg2 and tgtarg2 are the locations of the private key files for the source and target database hosts, respectively. The private key files must be usable by the user specified in srcarg1 and tgtarg1, which in this case is opc.

I am prompted for the SYS password of the source database and also the password for the backupuser (which is my OCI user). For the latter, please note that this is not my user password; it is an auth token. I also get a job ID.

When using OCI Object Storage this is NOT your user password, but an auth token

The migration evaluation is then started, and I can use the job ID to monitor it:

[zdmuser@zdmhost]$ $ZDM_HOME/bin/zdmcli query job -jobid <job ID>

Output from zdmcli query job command

My colleague Sinan Petrus Toma showed how to loop:

[zdmuser@zdmhost]$ while :; do $ZDM_HOME/bin/zdmcli query job -jobid <job ID>; sleep 10; done

However, I prefer to get more details, so I tail the ZDM log file instead. This little one-liner finds the newest file and tails it:

[zdmuser@zdmhost]$ tail -n 50 -f "`ls -td /u01/app/oracle/chkbase/scheduled/*log | head -1`"

And I get this level of detail:

zdmhost: 2021-06-03T07:19:50.317Z : Starting zero downtime migrate operation ...
zdmhost: 2021-06-03T07:19:53.445Z : Executing phase ZDM_GET_SRC_INFO
zdmhost: 2021-06-03T07:19:53.446Z : Retrieving information from source node "srchost" ...
zdmhost: 2021-06-03T07:19:53.447Z : retrieving information about database "SALES_fra3fw" ...
zdmhost: 2021-06-03T07:20:02.743Z : Execution of phase ZDM_GET_SRC_INFO completed
zdmhost: 2021-06-03T07:20:02.826Z : Executing phase ZDM_GET_TGT_INFO
zdmhost: 2021-06-03T07:20:02.827Z : Retrieving information from target node "tgthost" ...
zdmhost: 2021-06-03T07:20:12.197Z : Determined value for parameter TGT_DATADG is '+DATA'
zdmhost: 2021-06-03T07:20:12.198Z : Determined value for parameter TGT_REDODG is '+RECO'
zdmhost: 2021-06-03T07:20:12.198Z : Determined value for parameter TGT_RECODG is '+RECO'
zdmhost: 2021-06-03T07:20:12.203Z : Execution of phase ZDM_GET_TGT_INFO completed

Eventually, I end up with a successful evaluation:

When the status of an evaluation is SUCCEEDED

Start Migration

When the evaluation passes, I can start the real migration. I am re-using the same command line, but I have removed the -eval option. Instead – and this is important – I am using -pauseafter to tell ZDM to pause the migration, just before the switchover takes place. Downtime has not started yet. The database is still open for business, but ZDM will copy the data and configure Data Guard:

[zdmuser@zdmhost]$ $ZDM_HOME/bin/zdmcli migrate database \
   -rsp /home/zdmuser/physical_online.rsp \
   -sourcenode srchost \
   -sourcedb SALES_fra3fw \
   -srcauth zdmauth \
   -srcarg1 user:opc \
   -srcarg2 identity_file:/home/zdmuser/.ssh/srchost_key_file \
   -srcarg3 sudo_location:/usr/bin/sudo \
   -targetnode tgthost \
   -tgtauth zdmauth \
   -tgtarg1 user:opc \
   -tgtarg2 identity_file:/home/zdmuser/.ssh/tgthost_key_file \
   -tgtarg3 sudo_location:/usr/bin/sudo \
   -targethome /u01/app/oracle/product/12.1.0.2/dbhome_1 \
   -backupuser "daniel.overby.hansen@oracle.com" \
   -pauseafter ZDM_CONFIGURE_DG_SRC

Again, I have to input the passwords as in eval mode. Use the job ID to monitor the progress or tail the log file. Note down the job ID. I need it later on to resume the migration.

When ZDM completes the phase ZDM_CONFIGURE_DG_SRC, it will pause and wait. I can verify it with zdmcli query job:

ZDM is currently paused

Now the standby database has been built in OCI. Redo gets transferred from my source database to the target database in OCI and is applied (see appendix D for monitoring queries). Also, this is a good time to test your new database.

Complete Migration

Now it is time to finalize the migration and switch over to the OCI target database. All I need to do, is to resume the paused ZDM job. I use the job ID that was created when I started the migration:

[zdmuser@zdmhost]$ $ZDM_HOME/bin/zdmcli resume job -jobid <job ID>

ZDM will now ensure that all redo is sent and applied before switching over to the OCI target database. Again, I use the zdmcli query job command to monitor the progress, and I can tail the log file. After a short while, the migration completes.

ZDM migration completed

That’s it. I have now migrated into OCI!

I have a few post-migration tasks to carry out:

  • Configure automatic backup
  • Protect my new OCI database with a Data Guard in OCI
  • Take a peek at the alert log (just be sure)

Other Blog Posts in This Series

Appendix

A – RAC Databases

If your source or target database is a RAC database, then you only need access to one of the nodes. Select the node that you want to use and use the host name and SSH keys to that host. Also, ensure that the IP address you are using is not the floating one (VIP). In OCI that is referred to as the private IP.

B – Response File

Allow me to put some comments on the values:

  • DATA_TRANSFER_MEDIUM – When migrating to a DB System in OCI, OSS is the only option. It is also the default value, so I could omit the parameter. The other options are applicable for Exadata Cloud at Customer and Exadata on-prem.
  • HOST – Getting the right URL for the HOST parameter might be a little tricky, but check the documentation. In my demo I use the Frankfurt data center, and, thus, the region is set to eu-frankfurt-1. Visit the API documentation for a list of regions. Use OCI CLI and oci os ns get to find the tenancy object storage namespace. Alternatively, in the OCI Console (the web page) open the Profile menu and click Tenancy: <your tenancy name>. The namespace string is listed under Object Storage Settings. To read more, have a look at the OCI documentation. Thanks to Bartlomiej Sowa for putting in a comment with this information – much appreciated!
  • OPC_CONTAINER – The OCI Object Storage bucket that will be used as a staging area for the backup of the source database. I recommend using a separate bucket for each migration. It makes it a lot easier to clean up afterwards.
  • PLATFORM_TYPE – VMDB covers Virtual Machine and Bare Metal DB Services. It is also the default value, so I could omit the parameter.
  • SHUTDOWN_SRC – I choose to shut down the source database to ensure no one uses it unintentionally after the migration.
  • SKIP_FALLBACK – To make my demo simple, I choose not to configure fallback. Remember, falling back to the source database requires a license for the Advanced Security Option.
  • TGT_DB_UNIQUE_NAME – The DB_UNIQUE_NAME of the target placeholder database. It must be different than that of the source database.

C – Backup Strategy

During a migration with ZDM, you should keep your regular backup strategy. Keep doing the same backups as you did before. But avoid having ZDM backups and regular backups run at the same time. Also, if you are dealing with a RAC database be sure to put the snapshot control file on shared storage. Otherwise, you might get ORA-00245 errors during backups.
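
You can check and adjust the snapshot controlfile location in RMAN like this (the disk group and file name below are just an example):

RMAN> show snapshot controlfile name;
RMAN> configure snapshot controlfile name to '+RECO/snapcf_SALES.f';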

D – Monitoring

Use these queries to monitor the redo apply. On source/primary database:

SELECT 
   host_name, instance_name, db_unique_name, status, database_role, open_mode 
FROM 
   v$database, v$instance;
SELECT thread#, max(sequence#) FROM v$archived_log GROUP BY thread#;

Target/standby database:

SELECT 
   host_name, instance_name, db_unique_name, status, database_role, open_mode 
FROM 
   v$database, v$instance;
SELECT thread#, max(sequence#) FROM v$archived_log WHERE applied='YES' GROUP BY thread#;
--MRP process should be 'APPLYING_LOG'
SELECT process, status, sequence# FROM v$managed_standby;
SELECT * FROM v$archive_gap;
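
In addition to the queries above, the transport and apply lag can be checked on the standby database with v$dataguard_stats:

SELECT name, value, time_computed FROM v$dataguard_stats WHERE name IN ('transport lag', 'apply lag');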

E – Troubleshooting

PRCZ-4001

If you run into this error:

PRCZ-4001 : failed to execute command "/bin/uname" using the privileged execution plugin "zdmauth" on nodes "doverbyh-zdm-tgt" within 120 seconds
PRCZ-2006 : Unable to establish SSH connection to node "doverbyh-zdm-tgt" to execute command "/bin/uname"
No more authentication methods available

Check your key files. They must be in RSA/PEM format (the private key must start with -----BEGIN RSA PRIVATE KEY-----).
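
If the private key was generated in the newer OpenSSH format, it can be converted to PEM in place with ssh-keygen (take a copy first, since this rewrites the key file):

[zdmuser@zdm]$ cp <specified-key-file> <specified-key-file>.bak
[zdmuser@zdm]$ ssh-keygen -p -m PEM -f <specified-key-file>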

ZDM_OBC_INSTALL_CREDENTIALS_INVALID

If you run into this error:

srchost: 07:08:00.000: Validating object store credentials..
srchost: <ERR_FILE><Facility>PRGO</Facility><ID>ZDM_OBC_INSTALL_CREDENTIALS_INVALID</ID></ERR_FILE>

Check your credentials to OCI. Remember, when prompted for the password of your OCI account, you must supply an auth token, not your password (even though the prompt text is misleading).
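
If you don't have an auth token already, you can create one in the OCI Console (Profile menu, then Auth Tokens) or with OCI CLI (a sketch; the user OCID is a placeholder):

oci iam auth-token create --user-id ocid1.user.oc1..<unique_id> --description "zdm-migration"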

ZDM_GET_SRC_INFO

If the phase ends in PRECHECK_FAILED and there is no real clue about the error, ensure that the source database host is added to the known_hosts file on the ZDM service host. Also, you can verify connectivity by trying to log on via SSH:

[zdmuser@zdm]$ ssh -i <specified-key-file> opc@<name-of-source-host>
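
To add the host to known_hosts, you can use ssh-keyscan (or simply answer yes to the host key prompt the first time you log on):

[zdmuser@zdm]$ ssh-keyscan <name-of-source-host> >> ~/.ssh/known_hosts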

ZDM_GET_TGT_INFO

If the phase ends in PRECHECK_FAILED and there is no real clue about the error, ensure that the target database host is added to the known_hosts file on the ZDM service host. Again, you can verify connectivity by trying to log on via SSH:

[zdmuser@zdm]$ ssh -i <specified-key-file> opc@<name-of-target-host>

ZDM_SWITCHOVER_SRC

If the migration is stuck at the phase ZDM_SWITCHOVER_SRC and you see the following in the ZDM log file:

zdmhost213: 2022-05-10T18:01:22.565Z : Executing phase ZDM_SWITCHOVER_SRC
zdmhost213: 2022-05-10T18:01:22.566Z : Switching database SALES2_fra3cx on the source node srchost to standby role ...
zdmhost213: 2022-05-10T18:01:22.566Z : checking if source database is ready for switching role...
srchost: 2022-05-10T18:01:35.340Z : Validating database SALES2_fra3cx role is PRIMARY...
srchost: 2022-05-10T18:01:35.742Z : Validating database SALES2_fra3cx is in open mode...
srchost: 2022-05-10T18:01:36.144Z : Waiting for SALES2_fra3cx to catch up, this could take a few minutes...

Then proceed to the primary and standby databases and investigate whether there are any log shipping or log apply problems. The root cause can be many different things, for instance conflicts when negotiating network security options.
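
On the primary database, the status and error (if any) of the redo transport destination can be checked like this (assuming the standby is served by LOG_ARCHIVE_DEST_2; adjust dest_id to your configuration):

SELECT dest_id, status, error FROM v$archive_dest WHERE dest_id = 2;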

PRGZ-3420 / ZDM_MANIFEST_TO_CLOUD

Check the ZDM log file stored on the ZDM service host in $ZDM_BASE/chkbase/scheduled. If you find this error:

zdmhost213: 2022-05-10T18:20:34.330Z : Executing phase ZDM_MANIFEST_TO_CLOUD
zdmhost213: 2022-05-10T18:20:34.331Z : registering database "SALES2_OCI" to Oracle Cloud
tgthost: 2022-05-10T18:20:50.895Z : Updating DBAAS wallet keys ...
####################################################################
PRGZ-3420 : failed to modify keystore entry "oracle.security.client.password1" in wallet "/var/opt/oracle/dbaas_acfs/sales2/db_wallet".

Then proceed to the target host and find the log file from that phase. It is located in $ORACLE_BASE/zdm/zdm_<target_db_unique_name>_<zdm_job_id>/zdm/log. Do a grep:

cd /u02/app/oracle/zdm/zdm_SALES2_OCI_12/zdm/log
grep "cloud registry" zdm_manifest_to_cloud_*.log

If you find the following lines in the log file:

grep "Error: Missing cloud registry" zdm_manifest_to_cloud_*.log
zdm_manifest_to_cloud_337115.log:Error: Missing cloud registry file SALES2.ini
zdm_manifest_to_cloud_337115.log:Error: Missing cloud registry file SALES2.ini

You have most likely used the wrong case for DB_NAME on the target database. You choose that in the OCI Console in the field Database name. In this case, DB_NAME in the source database is SALES2, but I had incorrectly used sales2 (lowercase) for the target database. Normally, DB_NAME is case insensitive (at least to the database itself), but the OCI cloud tooling is case sensitive. Looking in /var/opt/oracle/creg on the target database, I can find a file called sales2.ini, but ZDM is looking for SALES2.ini.

How do you solve this? If this is a test database, scratch everything and start all over, but bear in mind that the switchover has already taken place, and the OCI database is now the primary database. For a production database this is tricky. The database has been migrated to OCI; it works, and all your data is fine on the target database. But the OCI cloud tooling does not work properly, so you need to open an SR and get assistance from support.

Zero Downtime Migration – Physical Online Migration

You can migrate a database with Zero Downtime Migration (ZDM) using the Physical Online method. A standby database is built in OCI and kept in sync via redo apply. When you are ready, a switchover is all it takes to complete the migration. This way, even very large databases can be migrated with little or no downtime.

Concept of physical online migration

With the Physical Online method, you can target:

  • Virtual Machine DB System
  • Bare Metal DB System
  • Exadata DB System (ExaCS)
  • Exadata Cloud at Customer (ExaCC)
  • Exadata (on-prem)

Since the method uses Data Guard as the migration vehicle, it can only be used with Enterprise Edition databases.

Benefits

  • Personally, I like this approach because it builds on technology that most of us know already. Data Guard is heavily used in most organizations. Although ZDM does all the heavy lifting, it is nice to know what happens under the hood.

  • To complete the migration, all you need is a regular Data Guard switchover (which ZDM also takes care of). A switchover operation ensures that there is no data loss at all. In addition, a properly configured application won't experience downtime, just a brief brown-out while the switchover takes place.

  • This method has an excellent fallback possibility. When you switch over to the OCI database, the redo flow is reversed. Now, the on-prem database is a standby database, and it is kept in sync via redo from the OCI database. However, this option requires a license for Advanced Security Option on your on-prem system. Read more about it later on.

  • You migrate the entire database. All the internals are brought to the cloud as well. This includes:

    • AWR
    • SQL Plan Baseline
    • SQL Profiles
    • Public objects
    • Etc.
  • The initial backup of the source, on-prem database will use RMAN Compression, which drastically reduces the size of the backup and thus the amount of data that you must transport to the cloud. And the really good thing: you are allowed to use RMAN Compression with ZDM for migration purposes without paying a license for the Advanced Compression Option.

  • You can customize the RMAN backup. You know your system best, so it is possible to tweak the number of RMAN channels used and the compression algorithm applied (see the sketch after this list). The defaults are 10 channels and medium compression.

  • When ZDM takes the full backup of your source database (Exadata on-prem and ExaCC excluded), it uses Oracle Database Cloud Backup Module for OCI to allocate a special sbt channel. The backup is sent directly to OCI Object Storage. Thus, you don’t need any additional disk space on your source system to hold the backup. The same applies on the target system.

  • When migrating to ExaCC or Exadata on-prem you can use an existing backup. Use the response file option DATA_TRANSFER_MEDIUM=EXTBACKUP. Also, if you have a Zero Data Loss Recovery Appliance (ZDLRA) you can restore directly from it, without taking a full backup first. If you are migrating to any other platform, ZDM will need to take a full backup as part of the workflow.

  • The standby database is kept in sync via redo apply. This means that there are no restrictions in supported data types. Further, DDL in any form is supported and even with a heavy workload on the primary database, your standby database should be able to keep up.

  • You can migrate to a higher patch level. ZDM automatically invokes datapatch for you on the OCI database, but you must handle the on-prem database yourself. After the switchover to the OCI database and once datapatch has been executed, you should patch the on-prem Oracle Home to the same patch level.
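
As an example of the RMAN customization mentioned above, the number of channels is typically controlled through the response file. The parameter names below are assumptions from memory, so verify them against the response file template shipped with your ZDM release:

# Assumed parameter names - check your ZDM response file template
ZDM_SRC_RMAN_CHANNELS=10
ZDM_TGT_RMAN_CHANNELS=10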

Considerations

  • You can migrate to the same version only. This is a restriction of Data Guard. If you also need to upgrade the database, then you must do that after the migration. But it will incur additional downtime.
  • The same applies to PDB conversion. ZDM can run the noncdb_to_pdb.sql script for you, but that will incur downtime as well. If you decide to convert to PDB, you need an alternative fallback plan (such as Data Pump or Transportable Tablespaces), because the PDB conversion is irreversible and your on-prem database will be useless as a fallback option.
  • It is not possible to migrate between editions.
  • The entire database is migrated. I listed this as a benefit, but it has a flipside as well. It is not possible to perform any transformation during the migration, e.g. converting old BasicFile LOBs to SecureFile LOBs. Also, you bring over any old baggage in your database. Sometimes it is nice to start from scratch, because garbage tends to accumulate in a database over time.
  • If you have a Standard Edition database, you can't use the Physical Online method. Only the Physical Offline method is supported.
  • You can’t configure Automatic backup on your target database until after the migration has been completed.
  • If you need to protect your OCI database with Data Guard, then you must build the standby database after the migration has been completed. Normally, you would use a cascading standby database to keep the target database protected by Data Guard even after the migration has been completed. But currently the OCI tooling does not support that option.
  • If your source database is not encrypted, you must create a keystore that can be used later on for TDE Tablespace Encryption (see the sketch after this list). The keystore is created in the source database with no downtime, and you don't need a separate license just to create a keystore. You do not have to actually encrypt your source database until it reaches the cloud.
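
A minimal sketch of creating such a keystore on a 19c source database using WALLET_ROOT; the directory and password are placeholders, and you should check the ZDM prerequisites for the exact steps required for your version:

-- Requires an existing directory and an instance restart
ALTER SYSTEM SET wallet_root='/u01/app/oracle/admin/SALES2/tde' SCOPE=SPFILE;
-- After the restart, use a file-based keystore
ALTER SYSTEM SET tde_configuration='KEYSTORE_CONFIGURATION=FILE' SCOPE=BOTH;
-- Create and open the keystore (this does not encrypt any data)
ADMINISTER KEY MANAGEMENT CREATE KEYSTORE IDENTIFIED BY "<keystore-password>";
ADMINISTER KEY MANAGEMENT SET KEYSTORE OPEN IDENTIFIED BY "<keystore-password>";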

Fallback

Using this approach, you have a great fallback option. When you switch over to the OCI database, the redo flow is reversed and the source, on-prem database is now the standby database. If you need to fall back, simply issue another switchover (you could call that a switchback) and start to use the on-prem database again – with no data loss. See step 5 in MAA Practices for Cloud Migration Using ZDM (Doc ID 2562063.1).
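
With the Data Guard broker configuration still in place, the switchback itself is a one-liner (a sketch; the connect string and the DB_UNIQUE_NAME SALES2_fra3cx are placeholders from this demo):

[oracle@tgthost]$ dgmgrl sys@SALES2_OCI
DGMGRL> SWITCHOVER TO 'SALES2_fra3cx';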

There is a catch, however. The OCI database is encrypted using TDE Tablespace Encryption, and any redo generated for an encrypted tablespace is also encrypted. This means that the on-prem database must be able to decrypt the redo before it can be applied, and that requires a license for the Advanced Security Option. When you are migrating your database with ZDM, you can get a 1-year limited license for the Advanced Security Option. This allows you to decrypt the encrypted redo on the source database, thus providing you with a great fallback option.

Conclusion

Using the Physical Online method in ZDM is a straightforward way of migrating your database to OCI. It uses Data Guard, which is very familiar to most of us. The method does, however, have some limitations, and you can't target Autonomous Databases.

Want to Know More

If you want to know more about migrations in general, I suggest that you take a look at our webinar Migration Strategies – Insights, Tips and Secrets.

In addition, these links contain additional useful information:

Other Blog Posts in This Series