Zero Downtime Migration – Logical Online Migration of Very Large Databases

Things always work on PowerPoint slides. And they almost always work in our lab environments as well. And recorded demos are also pretty bullet-proof.

But what happens when you have a huge database to migrate? A Very Large Database (VLDB). First, things get more exciting. Second, there are more knobs to turn and the smallest thing might become a big thing. Here are some things to consider when you migrate VLDBs using the Logical Online method.

Existing Data Guard

No need to worry about your existing on-prem Oracle Data Guard environment. The Logical Migration uses Oracle Data Pump and Oracle GoldenGate. Both tools can be used on a database in a Data Guard configuration and do not interfere with Data Guard operations.

But no switchovers or failovers are allowed during replication. GoldenGate can be configured to fully support a Data Guard environment and seamlessly reconnect in case of a Data Guard switchover. But it requires advanced configuration of GoldenGate. Zero Downtime Migration (ZDM) does the GoldenGate configuration but does so in a basic way that does not support Data Guard switchovers.

Data Pump

Data Pump is used for the initial load of data.

The initial load happens before downtime, so generally you shouldn’t worry too much about the time it takes to perform the Data Pump export and import. However, the longer it takes, the bigger a replication gap GoldenGate will need to catch up on. If the gap is too big, it might become a problem.

Apply a proper degree of parallelism. Rule-of-thumb:

- On-prem: 2 x number of physical cores
- OCI: Number of OCPUs

You can change the Data Pump parallel settings using the ZDM response file parameters DATAPUMPSETTINGS_DATAPUMPPARAMETERS_EXPORTPARALLELISMDEGREE and DATAPUMPSETTINGS_DATAPUMPPARAMETERS_IMPORTPARALLELISMDEGREE.
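As an illustration only, assuming a source host with 16 physical cores and a target DB system with 16 OCPUs (both numbers are made up), the rule-of-thumb translates to these response file entries:

DATAPUMPSETTINGS_DATAPUMPPARAMETERS_EXPORTPARALLELISMDEGREE=32
DATAPUMPSETTINGS_DATAPUMPPARAMETERS_IMPORTPARALLELISMDEGREE=16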

When you are importing in OCI, consider scaling up on CPUs. More CPUs means more Data Pump parallelism and a faster import. Bare Metal and Exadata DB Systems scale online, whereas Virtual Machines need around 10 minutes of downtime.

ZDM applies Data Pump compression automatically. Remember, you don’t need a license for the Advanced Compression Option to use compression when you are migrating with ZDM. ZDM uses the MEDIUM compression algorithm, and it’s not possible to change the Data Pump compression algorithm. In our experience, MEDIUM is most of the time the best alternative anyway: it provides a good compression ratio at a low CPU footprint. Although HIGH can compress better, it often comes at a much higher CPU footprint. The interested reader may find a benchmark below that compares the different compression algorithms.

ZDM automatically transforms BasicFile LOBs into SecureFile LOBs – which is very good. SecureFile LOBs are superior in many ways, including when it comes to importing. Based on a customer case, we did a benchmark that measures the time it takes to import LOB data. The customer saw a 3x improvement by transforming to SecureFile LOBs – and got the benefits of SecureFile LOBs afterwards. ZDM transforms to SecureFile LOBs automatically; don’t turn it off. Here are the log entries from the two imports. Notice the elapsed time:

BasicFile LOBs

10-OCT-20 21:43:21.848: W-3 . . imported "SCHEMA"."TABLE" 31.83 GB 681025 rows in 804 seconds using direct_path

SecureFile LOBs

15-OCT-20 18:16:48.663: W-13 . . imported "SCHEMA"."TABLES" 31.83 GB 681025 rows in 261 seconds using external_table

Don’t use import over network link. Use dump files. Performance-wise there are some restrictions when you import over network link. These restrictions really hurt on a VLDB.
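The choice of initial load method is controlled by the DATA_TRANSFER_MEDIUM response file parameter (it also comes up in the comments below). A minimal sketch, assuming the dump files are transferred via OCI Object Storage rather than over a database link:

DATA_TRANSFER_MEDIUM=OSS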

The Data Pump workers will automatically be started on all nodes if the target database is a RAC database. That should allow for the fastest possible import. If needed, you can force all workers onto the same node using DATAPUMPSETTINGS_DATAPUMPPARAMETERS_NOCLUSTER.
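A minimal sketch of that response file entry (the TRUE value is an assumption; check the ZDM documentation for the exact accepted values):

DATAPUMPSETTINGS_DATAPUMPPARAMETERS_NOCLUSTER=TRUE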

But when you export to dump files, remember to ensure you have adequate disk space on the source database host to hold the dump files. Use mount points that can deliver good write performance and, ideally, they should be separated from the storage that holds the database data files.

A Data Pump export does not affect your existing backup operations. But the export and the backup will fight over the same resources, so I recommend suspending major backup operations (level 0+1) until after the export.

Speaking of fighting over resources: you should run the export at off-peak hours. That will allow the export to use many more resources. The export is not a fully consistent export. Each table is consistent, but ZDM does not use FLASHBACK_TIME or FLASHBACK_SCN to make the entire export consistent. This helps avoid ORA-01555 snapshot too old errors during the export. The SCN of each table is recorded in the dump file, and GoldenGate later uses that information to start replication for each table individually from the appropriate SCN.

GoldenGate Replication

How much data will GoldenGate need to transfer from the source database to the GoldenGate hub and from the GoldenGate hub to the target database? GoldenGate stores the replication data in trail files. The trail files are smaller than the redo logs; the size of the trail files is typically 30-40% of the size of the redo logs. Imagine a database generating 15 TB of redo a day. The size of the trail files will be 4.5-6 TB. If you further apply compression on the trail files, you can typically reduce the size to around 1/8. Using our example, the trail files are now 550-750 GB, which is significantly less than the initial 15 TB.

In the ZDM response file there is a parameter called GOLDENGATESETTINGS_EXTRACT_PERFORMANCEPROFILE. The default value is MEDIUM, but you have the option of changing it to HIGH. This should increase the performance of the extract process.
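A one-line sketch of that response file entry, using the HIGH value mentioned above:

GOLDENGATESETTINGS_EXTRACT_PERFORMANCEPROFILE=HIGH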

ZDM configures GoldenGate to use Parallel Replicat. This gives you a few options. First, you can control the number of mappers using the parameter GOLDENGATESETTINGS_REPLICAT_MAPPARALLELISM (default is 4). Also, apply parallelism is auto-tuned. You can set minimum and maximum values using GOLDENGATESETTINGS_REPLICAT_MINAPPLYPARALLELISM and GOLDENGATESETTINGS_REPLICAT_MAXAPPLYPARALLELISM. The defaults are 4 and 50, so I don’t think it will be necessary to change them.
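For reference, a sketch of the relevant response file entries with the default values listed above:

GOLDENGATESETTINGS_REPLICAT_MAPPARALLELISM=4
GOLDENGATESETTINGS_REPLICAT_MINAPPLYPARALLELISM=4
GOLDENGATESETTINGS_REPLICAT_MAXAPPLYPARALLELISM=50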

By default, ZDM does not configure GoldenGate with DDL replication. Replicating without DDL gives the best performance. If you must have DDL replication in your migration project, you can look at the ZDM response file parameter GOLDENGATESETTINGS_REPLICATEDDL. But be prepared for a performance degradation.
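If you do need DDL replication, a minimal sketch of the entry (the boolean value format is an assumption; check the ZDM documentation for the exact syntax):

GOLDENGATESETTINGS_REPLICATEDDL=TRUE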

In our webinar Migrating Very Large Databases, my team and I discuss what to consider when migrating databases with GoldenGate.

Last, you can find more advice in Administering Oracle GoldenGate 19.1, Tuning the Performance of Oracle GoldenGate.

GoldenGate Health Check

Oracle GoldenGate comes with health check scripts that you can use on the source and target database. This will help you monitor and troubleshoot the process. Here is information on how to install and use the health check scripts. You can also find information in Administering Oracle GoldenGate 19.1, Using Healthcheck Scripts to Monitor and Troubleshoot.

Backup of Target Database

When you use logical online migration in ZDM, you create the target database in advance. In contrast to physical migration, the target database is not overwritten. This means that you can configure automatic backup of your target OCI database before you complete the migration and switch over to OCI. You will even have the time and the opportunity to test the backup.

I recommend that you configure automatic backup after the Data Pump initial load. That way there are also fewer archive logs to back up.

Data Guard on Target Database

Similar to automatic backup, you can also create your Data Guard association before the switchover. Likewise, wait until after the initial load has completed.

Conclusion

Even huge Oracle Databases can be migrated to OCI using Zero Downtime Migration. You might need to make a few adjustments from the standard flow, but it is absolutely doable. Your database can be protected by backup and Data Guard from the very second you switch over to OCI. Besides the actual migration, you should also do your best to ensure performance stability once the database is open for business. For that purpose, you should have a look at our webinar Performance Stability, Tips and Tricks and Underscores.

Other Blog Posts in This Series

- Introduction
- Install And Configure ZDM
- Physical Online Migration
- Physical Online Migration to DBCS
- Physical Online Migration to ExaCS
- Physical Online Migration and Testing
- Physical Online Migration of Very Large Databases
- Logical Online Migration
- Logical Online Migration to DBCS
- Logical Offline Migration to Autonomous Database
- Logical Online Migration and Testing
- Logical Online Migration of Very Large Databases
- Logical Online and Sequences
- Logical Offline Migration How To Minimize Downtime
- Logical Migration and Statistics
- Logical Migration and the Final Touches
- Create GoldenGate Hub
- Monitor GoldenGate Replication
- The Pro Tips

14 thoughts on “Zero Downtime Migration – Logical Online Migration of Very Large Databases”

  1. Hi Daniel. We have an active DB that produces around 600 GB per week; it uses a lot of partitions and subpartitions, MVs and stuff like that. It’s around 230 TB on AIX. We tried a month later to install GG, and the initial load via expdp was a nightmare. The Data Pump jobs were cancelled because of the impact on production, and the client wasn’t happy at all. The thing is, for a VLDB this big, do you still agree that ZDM logical migration with an expdp initial load is the way to go?


    1. Hi,
      That’s a big database with huge data growth as well. A migration like this will always be challenging and there will be many bumps in the road and things to sort out. Depending on the requirements for downtime, other options (like cross-platform transportable tablespaces) might be worth investigating.
      I would need more details to answer your question properly. Feel free to e-mail additional details to daniel.overby.hansen@oracle.com and I will see what I can find for you.
      Regards,
      Daniel


  2. Yes, that’s exactly what we’re testing. For now we are using the XTTS v4 Perl/RMAN tooling with just 2 tablespaces to gain know-how of the procedure; the QA DB is 16 TB. And yes, the plan for production is to move to OCI on ExaCS. Yes, we’re using FastConnect between on-prem and OCI.


  3. Hi Daniel,

    Any reason why DDL replication is slow with ZDM? I understand ZDM is a wrapper on top of GoldenGate; is it a limitation of GoldenGate or of ZDM?

    Regards
    Baiju.R


    1. Hi,
      This is a limitation of GoldenGate DDL replication. It’s described in the ZDM documentation:
      “During the migration job it is recommended that your database avoid Data Definition Language (DDL) operations to provide the most optimal environment for fast database replication. When DDL is replicated, Oracle GoldenGate Replicat serializes data to ensure that there are no locking issues between DML and DDL on the same objects.”
      https://docs.oracle.com/en/database/oracle/zero-downtime-migration/21.4/zdmug/preparing-logical-database-migration1.html#GUID-004B4498-49B9-45A8-BD1B-5449857139A0

      Regards,
      Daniel


  4. Hi Daniel,

    Many thanks. What I have noticed is that ZDM creates just one replicat. Can we create multiple replicats while using ZDM? If we do so, we can put schemas/tables with huge numbers of DDL on those replicats so they don’t impact the overall performance.

    Regards
    Baiju.R


    1. Yes, you can do that. But now you’re on your own, so to say. ZDM configures GoldenGate in the defined way. If you want to utilize many more GoldenGate capabilities, you can do that. But then you are in charge of the GoldenGate configuration.

      Regards,
      Daniel


  5. I will explore this option. If we pause/stop the ZDM job, split the replicat into multiple replicats, and restart the ZDM job, would ZDM be able to identify the newly created replicat jobs and resume from that point onwards?

    ZDM_ADD_CHECKPOINT_TGT ……………. COMPLETED –> Pause here and add more replicats, then resume
    ZDM_CREATE_GG_REPLICAT_TGT ………… COMPLETED –> Can ZDM identify and integrate the newly created replicat ?
    ZDM_MONITOR_GG_LAG ……………….. COMPLETED
    ZDM_RM_GG_EXTRACT_SRC …………….. PENDING

    Regards
    Baiju.R


  6. Hi Daniel, I have a scenario where I need to migrate an 11.2.0.4 database on Solaris SPARC to an Exadata X8M x86-64, keeping the database at the same version. We used Data Pump to perform the initial load, but when we started applying the captured transactions, we received the error
    ORA-19202: Error occurred in XML processing
    LPX-00283: document encoding is UTF-16-based but default input encoding is not.
    There are several schemas with tables that have CLOB and XMLTYPE columns. After researching a little, we identified that Solaris uses UTF-16BE, while the target database uses UTF-8.
    We tried unsuccessfully to use CHARMAP on GoldenGate to convert the encoding, but the error continues to occur.
    Do you have any advice or documentation that we can rely on to get around this conversion?


  7. Hi Daniel,

    Where exactly are the GoldenGate trail files stored? Are they stored on the GG hub filesystem? Can we change the location to NFS?

    Could you please check if my below understanding is correct?

    For an online migration with ZDM using NFS, the ZDM host connects to the source database, takes an export backup, and writes it to the NFS filesystem (if we configure NFS as DATA_TRANSFER_MEDIUM). The ZDM host then uses the backups from NFS and imports them into the target database. I assume that NFS needs to be mounted only on the source and target, and is not required on the ZDM host.

    From the GG point of view, I assume the GG hub connects to the source database, creates the extract trail files, and writes them to the GG hub filesystem; later, these files are written directly into the target database by the Replicat microservice. I assume the extract trail files are not transferred to the target system, as the GG hub (Replicat) writes data directly to the target database.

    In a nutshell:
    NFS needs to be mounted on the source and target database hosts
    GG trail files will be stored only on the GG hub filesystem

    Regards
    Baiju.R


    1. Hi,

      GoldenGate stores the trail files on the hub itself in the “Data home”. Normally, there is no need to change the location. If you need to store a large number of trail files, you can simply extend the file system.

      I’ve never tried to move the “Data home” somewhere else, but it appears to be doable:
      https://www.dbasolved.com/2019/12/updating-the-trail-file-location-for-oracle-goldengate-microservices/

      Regards,
      Daniel

