Files to Move During Oracle Grid Infrastructure Out-Of-Place Patching

Like with Oracle Database, I strongly recommend patching Oracle Grid Infrastructure using the out-of-place method. It has many advantages over in-place patching

A while ago, I wrote about the files you must move during Oracle Database out-of-place patching. There are quite many files to consider. A blog reader left a comment asking about a similar blog post for Oracle Grid Infrastructure, so here it is.

Listener

The listener runs out of the GI home. When you patch out-of-place, the listener must also start in the new GI home.

If you patch using any of the below methods, it will move the listener for you:

The two methods also move listener.ora as part of the process.

If you use opatchauto you should note that the tool moves listener.ora when preparing the new GI using opatchauto apply ... -prepare-clone. You can run that command hours or days before you move the listener. If you add things to listener.ora in between, you must also add it to listener.ora in the new GI home.

Conclusion

There is really nothing to worry about when you patch Oracle Grid Infrastructure out-of-place. The above-mentioned two tools will take care of it for you.

Happy Patching

How to Apply Patches Out-of-place to Oracle Grid Infrastructure and Oracle Data Guard Using Standby-First

I strongly recommend that you always patch out-of-place. Here’s an example of how to do it on Oracle Grid Infrastructure (GI) and Oracle Data Guard using Standby-First Patch Apply.

Standby-First Patch Apply allows you to minimize downtime to the time it takes to perform a Data Guard switchover. Further, it allows you to test the apply mechanism on the standby database by temporarily converting it into a snapshot standby database.

The scenario:

  • Oracle Grid Infrastructure 19c and Oracle Database 19c
  • Patching from Release Update 19.17.0 to 19.19.0
  • Vertical patching – GI and database at the same time
  • Data Guard setup with two RAC databases
    • Cluster 1: copenhagen1 and copenhagen2
    • Cluster 2: aarhus1 and aarhus2
    • DB_NAME: CDB1
    • DB_UNIQUE_NAME: CDB1_COPENHAGEN and CDB1_AARHUS
  • Using Data Guard broker
  • Patching GI using SwitchGridHome method

Let’s get started!

Step 1: Prepare

I can make the preparations without interrupting the database.

  • I ensure my environment meets the requirements for Standby-First Patch Apply.

  • I deploy new GI homes to all four hosts.

    • I use the SwitchGridHome method.
    • Very important: I only perform step 1 (Prepare a New Grid Home).
    • I apply the Release Update 19.19.0 as part of the deployment using gridSetup.sh ... -applyRU ... -applyOneOffs as described in the blog post.
  • I deploy new database homes to all four hosts.

  • I also recompile invalid objects. This can make it easier for Datapatch later in the process:

    PRIMARY SQL> @?/rdbms/admin/utlrp
    

Step 2: Restart Standby in New Oracle Homes

Now, I can move the standby database to the new GI and database homes.

  • On the standby hosts, aarhus1 and aarhus2, I first move the database configuration files from the old database home to the new one.

  • I change the database configuration in GI. Next time the database restarts, it will be in the new Oracle Home:

    [oracle@aarhus1]$ $OLD_ORACLE_HOME/bin/srvctl modify database \
       -db $ORACLE_UNQNAME \
       -oraclehome $NEW_ORACLE_HOME
    
  • I switch to the new GI on all standby hosts, aarhus1 and aarhus2, by executing step 2 (Switch to the new Grid Home) of the SwitchGridHome method.

    • It involves running gridSetup.sh ... -switchGridHome and root.sh.
    • You can perform the switch in a rolling manner or all at once.
    • The switch restarts the standby database instance. The standby database instance restarts in the new Oracle Home.
    • If the profile of grid (like .bashrc) sets the ORACLE_HOME environment variable, I ensure to update it.
  • If I had multiple standby databases, I would process all standby databases in this step.

Step 3: Test Standby Database

This is an optional step, but I recommend that you do it.

  • I convert the standby database (CDB1_AARHUS) to a snapshot standby database:
    DGMGRL> convert database CDB1_AARHUS to snapshot standby;
    
  • I test Datapatch on the standby database. It is important that I run the command on the standby database:
    [oracle@aarhus1]$ $ORACLE_HOME/OPatch/datapatch -verbose
    
  • I can also test my application on the standby database.
  • At the end of my testing, I revert the standby database to a physical standby database. The database automatically reverts all the changes made during testing:
    DGMGRL> convert database CDB1_AARHUS to physical standby;
    

Step 4: Switchover

I can perform the previous steps without interrupting my users. This step requires a maintenance window because I am doing a Data Guard switchover.

  • I check that my standby database is ready to become primary. Then, I start a Data Guard switchover:
    DGMGRL> connect sys/<password> as sysdba
    DGMGRL> validate database CDB1_AARHUS;
    DGMGRL> switchover to CDB1_AARHUS;
    

A switchover does not have to mean downtime.

If my application is configured properly, the users will experience a brownout; a short hang, while the connections switch to the new primary database.

Step 5: Restart New Standby in New Oracle Homes

Now, the primary database runs on aarhus1 and aarhus2. Next, I can move the new standby hosts, copenhagen1 and copenhagen2, to the new GI and database homes.

  • I repeat step 2 (Restart Standby In New Oracle Homes) but this time for the new standby hosts, copenhagen1 and copenhagen2.

Step 6: Complete Patching

Now, both databases in my Data Guard configuration run out of the new Oracle Homes.

Only proceed with this step once all databases run out of the new Oracle Home.

I need to run this step as fast as possible after I have completed the previous step.

  • I complete the patching by running Datapatch on the primary database (CDB1_AARHUS). I add the recomp_threshold parameter to ensure Datapatch recompiles all objects that the patching invalidated:

    [orale@aarhus1]$ $ORACLE_HOME/OPatch/datapatch \
       -verbose \
       -recomp_threshold 10000
    
    • I only need to run Datapatch one time. On the primary database and only on one of the instances.
  • I can run Datapatch while users are connected to my database.

  • Optionally, I can switch back to the original primary database on copenhagen1 and copenhagen2, if I prefer to run it there.

That’s it. Happy patching!

Appendix

Further Reading

Other Blog Posts in This Series

Patching Oracle Grid Infrastructure And Oracle Data Guard

How do you patch Oracle Grid Infrastructure 19c (GI) when Oracle Data Guard protects your Oracle Database?

I had a talk with Ludovico Caldara, the product manager for Oracle Data Guard, about it:

To provide more details, I will use the following setup as an example:

  • Data Guard setup with two databases.
  • Each database is a 2-node RAC database.
  • Sites are called copenhagen and aarhus.

Patching Oracle Grid Infrastructure Only

  1. Prepare new GI homes on all nodes in both sites (copenhagen and aarhus).
  2. Disable Fast-Start Failover (FSFO) for the reasons described below. You can leave the observer running.
  3. Start with the standby site, aarhus.
  4. Complete the patching process by switching to the new GI home in a rolling manner on all nodes at aarhus site.
  5. If you use Active Data Guard and have read-only sessions in your standby database, you should ensure that instances are properly drained before restarting the GI stack (via root.sh).
  6. Proceed with the primary site, copenhagen.
  7. Complete the patching process by switching to the new GI home in a rolling manner on all nodes at copenhagen site.
  8. Be sure to handle draining properly to ensure there are no interuptions.
  9. Re-enable FSFO.

Later, when you want to patch the database, you can follow up the standby-first method described in Oracle Patch Assurance – Data Guard Standby-First Patch Apply (Doc ID 1265700.1). If the database patches you install are RAC Rolling Installable (like Release Updates), you should choose option 1 in phase 3 to avoid any downtime or brownout.

Alternative Approach

If you have many nodes in your cluster and an application that doesn’t behave well during draining, consider switching over to the standby site instead of patching the primary site in a rolling manner. When you switch over, there is only one interruption, whereas many interruptions in a rolling patch apply.

  1. Patch standby site, aarhus.
  2. Switch over to aarhus.
  3. Patch former primary, copenhagen.

What If You Want to Patch the Database At the Same Time?

Out-of-place SwitchGridHome

You get complete control over the process with Out-of-place SwitchGridHome. It is my preferred method. There are more commands to execute, but it doesn’t matter if you automate it.

Here is an overview of the process. You can use many of the commands from this blog post:

  1. Prepare new GI homes using gridSetup. Be sure to apply the needed patches. Do it on one node in both primary (copenhagen) and standby site (aarhus). The process will copy the new GI home to all other nodes in the cluster. Do not execute root.sh.
  2. Prepare new database homes. Be sure to apply the needed patches. Here is an example. Do it on one node in both primary (copenhagen) and standby site (aarhus). The process will copy the new database home to all other nodes in the cluster. Remember to execute root.sh.
  3. Disable FSFO.
  4. Start with the standby site, aarhus.
  5. Configure the standby database to start in the new database home:
    $ $OLD_ORACLE_HOME/bin/srvctl modify database \
         -db $STDBY_ORACLE_UNQNAME \
         -oraclehome $NEW_ORACLE_HOME
    
  6. If you use Active Data Guard and have read-only sessions connected, drain the instance.
  7. Switch to the new GI home using gridSetup.sh -switchGridHome ... and root.sh.
    1. root.sh restarts the entire GI stack. When it restarts the database, the database instance runs in the new database home.
    2. Repeat the process on all nodes in the standby site (aarhus).
  8. Proceed with the primary site, copenhagen.
  9. Configure the primary database to start in the new database home:
    $ $OLD_ORACLE_HOME/bin/srvctl modify database \
         -db $PRMY_ORACLE_UNQNAME \
         -oraclehome $NEW_ORACLE_HOME
    
  10. Be sure to drain the instance.
  11. Switch to the new GI home using gridSetup.sh -switchGridHome ... and root.sh.
    1. root.sh restarts the entire GI stack. When it restarts the database, the database instance runs in the new database home.
    2. Repeat the process on all nodes in the primary site (copenhagen).
  12. Execute datapatch -verbose on one of the primary database instances to finish the patch apply.
  13. Re-enable FSFO.

Out-of-place OPatchAuto

Out-of-place OPatchAuto is a convenient way of patching because it also automates the database operations. However, I still recommend using Out-of-place SwitchGridHome method because it gives you more control over draining.

Here is an overview of the process:

  1. Deploy new GI and database homes using opatchauto apply ... -prepare-clone. Do it on all nodes in both primary (copenhagen) and standby site (aarhus). Since you want to patch GI and database homes, you should omit the -oh parameter.
  2. Disable FSFO.
  3. Start with the standby site, aarhus.
  4. Complete patching of all nodes in the standby site (aarhus) using opatchauto apply -switch-clone.
    1. When OPatchAuto completes the switch on a node, it takes down the entire GI stack on that node, including database instance.
    2. GI restarts using the new GI home. But the database instance still run on the old database home.
    3. On the last node, after the GI stack has been restarted, all database instances restart again to switch to the new database home. This means that each database instance will restart two times.
  5. Proceed with the primary site, copenhagen.
  6. Complete patching of all nodes in the primary site (copenhagen) using opatchauto apply -switch-clone.
    1. The procedure is the same as on the standby site.
    2. In addition, OPatchAuto executes Datapatch to complete the database patching.
  7. Re-enable FSFO.

Fast-Start Failover

When you perform maintenance operations, like patching, consider what to do about Fast-Start Failover (FSFO).

If you have one standby database

  • Single instance standby I recommend disabling FSFO. If something happens to the primary database while you are patching the standby site, you don’t want to switch over or fail over automatically. Since the standby site is being patched, the standby database might restart shortly. You should evaluate the situation and determine what to do rather than relying on FSFO handling it.
  • RAC standby I recommend disabling FSFO for the same reasons as above. Now, you could argue that the standby database is up all the time if you perform rolling patching. That’s correct, but nodes are being restarted as part of the patching process, and services are being relocated. Having sessions switching over or failing over while you are in the middle of a rolling patch apply is a little delicate situation. Technically, it works; the Oracle stack can handle it. But I prefer to evaluate the situation before switching or failing over. Unless you have a super-cool application that can transparently handle it.

Nevertheless, leaving FSFO enabled when you patch GI or a database is fully supported.

If you have more standby databases

I recommend keeping FSFO enabled if you have multiple standby databases.

When you patch one standby database, you can set FastStartFailoverTarget to the other standby database. When patching completes, you can set FastStartFailoverTarget to the first standby database and continue patching the second standby database. This keeps your primary database protected at all times.

The Easy Way

As shown above, you can patch Oracle Grid Infrastructure even when you have Oracle Data Guard configured. But why not take the easy way and use Oracle Fleet Patching and Provisioning (FPP)?

FPP automatically detects the presence of Data Guard and executes the commands in the appropriate order, including invoking Datapatch when needed.

If you need to know more, you can reach out to Philippe Fierens, product manager for FPP. He is always willing to get you started.

Happy Patching

Appendix

Other Blog Posts in This Series

Pro Tips

Here’s a collection of good tips and tricks I found while writing this series of blog posts.

Pro Tip #1: How Do You Determine Grid Infrastructure Patch Level?

To determine the GI patch level:

[grid@copenhagen1]$ $ORACLE_HOME/OPatch/opatch lspatches | grep "OCW"

34444834;OCW RELEASE UPDATE 19.17.0.0.0 (34444834)

The inventory registers the GI Release Updates as OCW RELEASE UPDATE. In this example, GI is running on 19.17.0.

Sometimes critical one-off patches are delivered as merge patches with the GI Release Update. It can mess up the patch description. This example is from a Base Database Service in OCI:

[grid@copenhagen1]$ $ORACLE_HOME/OPatch/opatch lspatches | grep "OCW"

34122773;OCW Interim patch for 34122773

The patch description no longer contains the name of the Release Update. In this case, you can trawl through MOS to find the individual patches in the merge patch to identify which Release Update it contains. Or, you can often look at the ACFS patch instead:

[grid@copenhagen1]$ $ORACLE_HOME/OPatch/opatch lspatches | grep "ACFS"

34139601;ACFS RELEASE UPDATE 19.16.0.0.0 (34139601)

Pro Tip #2: Where Can You Find the Log Files?

Logging happens in different places depending on which method you use. Here are a few locations to browse when there are problems:

  • $GRID_HOME/install
  • $GRID_HOME/cfgtoollogs
  • $GRID_BASE/crsdata/<node>/crsconfig
  • /u01/app/oraInventory/logs

Pro Tip #3: Where Can You Find Information On Troubleshooting?

A few good MOS notes:

OPatchAuto enables you to control the logging granularity. If you run into problems, increase the logging level to get more information:

$ORACLE_HOME/OPatch/opatchauto ... -logLevel FINEST

In addition, OPatchAuto can resume a broken session. Fix the issue and restart OPatchAuto. It will pick up from where it left off:

$ORACLE_HOME/OPatch/opatchauto resume

Pro Tip #4: How Can I Install Patches Manually?

If you don’t want to use the automation tools (like OPatchAuto), you can install the patches manually using OPatch.

The details are in Supplemental Readme – Grid Infrastructure Release Update 12.2.0.1.x / 18c /19c (Doc ID 2246888.1).

The GI patch bundle contains several sub patches that must be installed in the correct order using opatch apply.

Pro Tip #5: How Do You Roll Back A Patch?

In-place OPatchAuto

You can find patch rollback (or deinstallation) instructions in the patch readme file. In short, you execute the following command:

$ORACLE_HOME/OPatch/opatchauto \
   rollback <unzipped_patch_location>/<patch_dir>

Note you might need to reboot the server.

Out-of-place OPatchAuto

You find rollback instructions in MOS note Grid Infrastructure Out of Place ( OOP ) Patching using opatchauto (Doc ID 2419319.1). In short, you execute the following command:

$NEW_ORACLE_HOME/OPatch/opatchauto \
   rollback \
   -switch-clone

Out-of-place SwitchGridHome

Check out this blog post.

Zero Downtime Oracle Grid Infrastructure Patching

You find rollback instructions in MOS note Step by Step Zero Downtime Oracle Grid Infrastructure Patching in Silent Mode (Doc ID 2865083.1). You need to execute a few commands. Check the MOS note for details.

Pro Tip #6: The FAQ

On My Oracle Support there is an extensive FAQ. Bookmark it: RAC: Frequently Asked Questions (RAC FAQ) (Doc ID 220970.1)

Appendix

Other Blog Posts in This Series

Which Method Should I Choose When Patching Oracle Grid Infrastructure 19c

I have shown you a few ways to patch Oracle Grid Infrastructure 19c (GI). Which one should you choose? Here’s an overview of the pros and cons of each method.

Just Grid Infrastructure Or Also The Database

You can patch:

  • Just GI and later on the database
  • Or GI and the database at the same time

If possible, I recommend patching GI and database in a separate maintenance operation. Proceed with the database when you are confident the new GI runs fine. If you do it a week apart, you should have enough time to kick the tires on the new GI.

I like to keep things separate. If there is a problem, I can quickly identify whether the GI or database patches are causing problems. The more patches you put in simultaneously, the more changes come in, and the harder it is to keep things apart.

The downside is that you now have two maintenance operations; one for GI and one for the database. But if your draining strategy works and/or you are using Application Continuity, you can complete hide the outage from your end users.

If you have a legacy application or draining is a nightmare for you, then it does make sense to consider patching GI and database at the same time.

In-place vs. Out-of-place

In-place OPatchAuto Out-of-place OPatchAuto Out-of-place SwitchGridHome Out-of-place ZDOGIP
Space usage Just for the new patches A new GI home and the new patches A new GI home and the new patches A new GI home and the new patches
Additional system resources No No No Yes
Node downtime Significant Short Short None (1)
Install multiple patches in one go Yes Yes Yes Yes
Grid Home change                      No                                                                                                   Yes. New Grid Home location. Scripts and profiles must be updated. New Grid Home to maintain and monitor Yes. New Grid Home location. Scripts and profiles must be updated. New Grid Home to maintain and monitor Yes. New Grid Home location. Scripts and profiles must be updated. New Grid Home to maintain and monitor
Grid Home origin N/A Existing Grid Home is cloned Fresh Grid Home from base release Fresh Grid Home from base release
Rollback complexity Significant Simple Simple Simple
Rollback node outage Significant Short Short None (1)

Notes:

  1. If you are using ACFS or ASM Filter Driver, you must restart the GI stack at one point in time. The node is down for a short period.

I recommend out-of-place patching. There are multiple ways of doing that. Choose the one that suits you best. My personal favorite is the SwitchGridHome method.

Happy Patching

There are even more methods than I have shown in this blog post series. I have demonstrated the methods that most people would consider. Evaluate the pros and cons yourself and choose what works best for you.

What’s your favorite? Why did you choose a specific method? Leave a comment and let me know.

Appendix

Further Reading

OPatchAuto Out-of-place

If you decide to patch GI and database at the same time, be aware of the following. The database instance will need to restart two times. First, each instance goes does to switch to the new GI. The second time is when you switch on the last node. Then all database instances are brought down again in a rolling manner and restarted in the new Oracle Home. If you want to control draining yourself, don’t use this method. The second database restarts happens completely automated one after the other. Without any possibility for you to intervene to control draining.

Other Blog Posts in This Series

My Best Advice on Patching Oracle Grid Infrastructure

I started this blog post series to learn about patching Oracle Grid Infrastructure 19c (GI). After spending quite some time patching GI, I got a pretty good feeling about the pros and cons. Here’s my advice on patching Oracle Grid Infrastructure.

Daniel’s Top Tips

  1. Always use the latest version of OPatch. Even if the patch you install does not require it, it is always a good idea to get the latest version of OPatch.

  2. Use out-of-place patching. Out-of-place patching allows you to prepare in advance and keep node downtime minimal. In addition, it makes fallback so much easier. Using the SwitchGridHome method, you can install multiple patches in one operation.

  3. Apply the latest Release Update or the second-latest Release Update (N-1 approach). Apply the latest Monthly Recommended Patches (MRP) on top.

  4. Use Application Continuity. It is such a cool feature. You can completely hide node outages from your end users. No more late-night patching. Patch your systems when it suits you.

  5. Patch GI and database in separate maintenance operations. My personal preference is to keep things separate. First, you patch GI. When you know it works fine, you proceed with the database, for instance, the week after.

  6. Keep GI and database patch level in sync. This means that you must patch GI and your Oracle Database at the same cadence. Ideally, that cadence is quarterly. If you follow advice no. 5, your patch levels will be out of sync for a week or so, but that’s perfectly fine as long as you patch GI first.

  7. Complete a rolling patch installation as quick as possible. Regardless of whether you install patches to GI or the database, you should complete the rolling patch operation as quick as possible. I have heard of customers that patch one node a day. In an 8-node RAC it will take more than a week to complete the patch operation. When in a rolling state certain things are disabled in GI. I strongly recommend that you complete the patch on all nodes as soon as possible.

Special Guest Star – Anil Nair

I had a chat with Mr. RAC himself, Anil Nair. Anil is Distinguished Product Manager at Oracle and responsible for RAC.

Anil’s top tips for patching Oracle Grid Infrastructure

  1. Use out-of-place patching
  2. Use Cluster Verification Utility (CVU) before and after patching
  3. Understand and set drain timeout
  4. Use Oracle Fleet Patching and Provisioning

Happy Patching

Thanks

I made it really far, and I learned a lot while writing this blog post series. I received good feedback on Twitter and LinkedIn, plus a lot of comments on my blog. Thank you so much. This feedback is really helpful, and I can use it to make the content so much better.

Also, a big thank-you to my colleagues that answered all my questions and helped me on my learning experience.

Appendix

Other Blog Posts in This Series

How to Patch Oracle Grid Infrastructure 19c Using Zero Downtime Oracle Grid Infrastructure Patching

Let me show you how I patch Oracle Grid Infrastructure 19c (GI) using Zero Downtime Oracle Grid Infrastructure Patching (ZDOGIP).

My demo system:

  • Is a 2-node RAC
  • Runs Oracle Linux
  • Is currently on 19.16.0, and I want to patch to 19.17.0
  • Uses neither ACFS nor ASM Filter Driver

I patch only the GI home. If I want to patch the database as well, I must do it separately.

I suggest you read A Word about Zero Downtime Oracle Grid Infrastructure Patching, especially if your system uses ACFS or ASM Filter Driver.

Preparation

I need to download:

  1. Download the base release of Oracle Grid Infrastructure (LINUX.X64_193000_grid_home.zip) from oracle.com or Oracle Software Delivery Cloud.
  2. Latest OPatch from My Oracle Support (6880880).
  3. The 19.17.0 Release Update for Oracle Grid Infrastructure from My Oracle Support. I will use the combo patch (34449117).

You can use AutoUpgrade to easily download GI patches.

I place the software in /u01/software.

How to Patch Oracle Grid Infrastructure 19c

1. Prepare a New Grid Home

I can do this in advance. It doesn’t affect my current environment and doesn’t cause any downtime.

  1. I need to create a folder for the new Grid Home. I must do this as root on all nodes in my cluster (copenhagen1 and copenhagen 2):

    [root@copenhagen1]$ mkdir -p /u01/app/19.17.0/grid
    [root@copenhagen1]$ chown -R grid:oinstall /u01/app/19.17.0
    [root@copenhagen1]$ chmod -R 775 /u01/app/19.17.0
    
    [root@copenhagen2]$ mkdir -p /u01/app/19.17.0/grid
    [root@copenhagen2]$ chown -R grid:oinstall /u01/app/19.17.0
    [root@copenhagen2]$ chmod -R 775 /u01/app/19.17.0
    
  2. I switch to the Grid Home owner, grid.

  3. I ensure that there is passwordless SSH access between all the cluster nodes. It is a requirement for the installation, but sometimes it is disabled to strengthen security:

    [grid@copenhagen1]$ ssh copenhagen2 date
    
    [grid@copenhagen2]$ ssh copenhagen1 date
    
  4. I extract the base release of Oracle Grid Infrastructure into the new Grid Home. I work on one node only:

    [grid@copenhagen1]$ export NEWGRIDHOME=/u01/app/19.17.0/grid
    [grid@copenhagen1]$ cd $NEWGRIDHOME
    [grid@copenhagen1]$ unzip -oq /u01/software/LINUX.X64_193000_grid_home.zip
    
  5. I update OPatch to the latest version:

    [grid@copenhagen1]$ cd $NEWGRIDHOME
    [grid@copenhagen1]$ rm -rf OPatch
    [grid@copenhagen1]$ unzip -oq /u01/software/p6880880_190000_Linux-x86-64.zip
    
  6. Then, I check the Oracle Grid Infrastructure prerequisites. I am good to go, if the check doesn’t write any error messages to the console:

    [grid@copenhagen1]$ export ORACLE_HOME=$NEWGRIDHOME
    [grid@copenhagen1]$ $ORACLE_HOME/gridSetup.sh -executePrereqs -silent
    
  7. I want to apply the 19.17.0 Release Update while I install the Grid Home. To do that, I must extract the patch file:

     [grid@copenhagen1]$ cd /u01/software
     [grid@copenhagen1]$ unzip -oq p34449117_190000_Linux-x86-64.zip -d 34449117
    
    • The combo patch contains the GI bundle patch which consists of:
      • OCW Release Update (patch 34444834)
      • Database Release Update (34419443)
      • ACFS Release Update (34428761)
      • Tomcat Release Update (34580338)
      • DBWLM Release Update (33575402)
    • I will apply all of them.
  8. Finally, I can install the new Grid Home:

    • I need to update the environment variables.
    • CLUSTER_NODES is a comma-separated list of all the nodes in my cluster.
    • The parameter -applyRU must point to the directory holding the OCW Release Update.
    • The parameter -applyOneOffs is a comma-separated list of the paths to each of the other Release Updates in the GI bundle patch.
    [grid@copenhagen1]$ export ORACLE_BASE=/u01/app/grid
    [grid@copenhagen1]$ export ORA_INVENTORY=/u01/app/oraInventory
    [grid@copenhagen1]$ export CLUSTER_NAME=$(olsnodes -c)
    [grid@copenhagen1]$ export CLUSTER_NODES=$(olsnodes | tr '\n' ','| sed 's/,\s*$//')
    [grid@copenhagen1]$ cd $ORACLE_HOME
    [grid@copenhagen1]$ ./gridSetup.sh -ignorePrereq -waitforcompletion -silent \
       -applyRU /u01/software/34449117/34449117/34416665 \
       -applyOneOffs /u01/software/34449117/34449117/34419443,/u01/software/34449117/34449117/34428761,/u01/software/34449117/34449117/34580338,/u01/software/34449117/34449117/33575402 \
       -responseFile $ORACLE_HOME/install/response/gridsetup.rsp \
       INVENTORY_LOCATION=$ORA_INVENTORY \
       ORACLE_BASE=$ORACLE_BASE \
       SELECTED_LANGUAGES=en \
       oracle.install.option=CRS_SWONLY \
       oracle.install.asm.OSDBA=asmdba \
       oracle.install.asm.OSOPER=asmoper \
       oracle.install.asm.OSASM=asmadmin \
       oracle.install.crs.config.ClusterConfiguration=STANDALONE \
       oracle.install.crs.config.configureAsExtendedCluster=false \
       oracle.install.crs.config.clusterName=$CLUSTER_NAME \
       oracle.install.crs.config.gpnp.configureGNS=false \
       oracle.install.crs.config.autoConfigureClusterNodeVIP=false \
       oracle.install.crs.config.clusterNodes=$CLUSTER_NODES
    
    • Although the script says so, I don’t run root.sh yet.
    • I install it in silent mode, but I could use the wizard instead.
    • You need to install the new GI home in a way that matches your environment.
    • For inspiration, you can check the response file used in the previous Grid Home on setting the various parameters.
    • If I have one-off patches to install, I can add them to the -applyOneOffs parameter.

2. Switch to the new Grid Home

Now, I can complete the patching process by switching to the new Grid Home. I do this one node at a time. Since I am using ZDOGIP there is no downtime.

  1. First, on copenhagen1, I switch to the new Grid Home:
    [grid@copenhagen1]$ export ORACLE_HOME=/u01/app/19.17.0/grid
    [grid@copenhagen1]$ export CURRENT_NODE=$(hostname)
    [grid@copenhagen1]$ $ORACLE_HOME/gridSetup.sh \
       -silent -switchGridHome \
       oracle.install.option=CRS_SWONLY \
       ORACLE_HOME=$ORACLE_HOME \
       oracle.install.crs.config.clusterNodes=$CURRENT_NODE \
       oracle.install.crs.rootconfig.executeRootScript=false
    
  2. Then, I run the root.sh script as root:
    • I must use the -transparent and -nodriverupdate parameters.
    • This step restarts the entire GI stack, but the resources it manages (databases, listener, etc.) stay up.
    • This step also restarts the ASM instance. Database instances on this node will switch to a remote ASM instance (Flex ASM). The database instances do not switch back to the local ASM instance after the GI restart.
    • In that period, the services stay ONLINE on this node.
    • The database instance on this node stays up all the time.
    • If my database listener runs out of the Grid Home, GI will move it to the new Grid Home, including copying listener.ora.
    [root@copenhagen1]$ /u01/app/19.17.0/grid/root.sh \
       -transparent \
       -nodriverupdate
    
  3. I update any profiles (e.g., .bashrc) and other scripts referring to the Grid Home.
  4. I connect to the other node, copenhagen2, and repeat steps 1-3. I double-check that the CURRENT_NODE environment variable gets updated to copenhagen2.

That’s it! I have now patched my Grid Infrastructure deployment.

Later on, I can patch my databases as well.

A Word about the Directory for the New Grid Home

Be careful when choosing a location for the new Grid Home. The documentation lists some requirements you should be aware of.

In my demo environment, the existing Grid Home is:

/u01/app/19.0.0.0/grid

Since I am patching to 19.17.0, I think it makes sense to use:

/u01/app/19.17.0/grid

If your organization has a different naming standard, that’s fine. Just ensure you comply with the requirements specified in the documentation.

Don’t Forget to Clean Your Room

At a future point, I need to remove the old Grid Home. I use the deinstall tool in the Grid Home. I execute the command on all nodes in my cluster:

$ export OLD_GRID_HOME=/u01/app/19.0.0.0/grid
$ export ORACLE_HOME=OLD_GRID_HOME
$ $ORACLE_HOME/deinstall/deinstall -local

I will wait until:

  • I have seen the new Grid Home run without problems for a week or two.
  • I have patched my Oracle Databases managed by GI.
  • I have seen my Oracle Databases run without GI-related problems for a week or two.

Happy Patching!

Appendix

Important Patches

  • Bug 37033171 : ZERO DOWNTIME PATCHING HANGS ON ROOT SCRIPT EXECUTION.
  • Bug 37827355 : ZERO DOWNTIME SWITCH GRID HOME HANGS ON ‘CRSCTL START CRS -WAIT -TGIP’ WITH BUG 37033171 FIX APPLIED

Further Reading

Other Blog Posts in This Series

A Word about Zero Downtime Oracle Grid Infrastructure Patching

In previous blog posts, I have shown you how to patch Oracle Grid Infrastructure 19c (GI) in a rolling manner. However, all those methods require a shutdown of the GI stack on each node. The database remains up all the time, but individual nodes will go down for a period.

You can change that with Zero Downtime Oracle Grid Infrastructure Patching (ZDOGIP):

Zero-downtime Oracle Grid Infrastructure patching enables patching of Oracle Grid Infrastructure without interrupting database operations. Patches are applied out-of-place and in a rolling fashion, with one node being patched at a time, while the database instances on this node remain operational. Zero-downtime Oracle Grid Infrastructure patching supports Oracle Real Application Clusters (Oracle RAC) databases on clusters with two or more nodes.

ZDOGIP achieves this with a bit of trickery:

When using Zero Downtime Patching, only the binaries in the Oracle Grid Infrastructure user space are patched. Additional Oracle Grid Infrastructure OS system software, kernel modules and system commands including ACFS, AFD, OLFS, and OKA, are not updated. These commands continue to run the version previous to the patch version.

Questions and Answers

Several questions came to my mind when I read about ZDOGIP. Here’s a summary:

How Can the Database Survive Without the ASM Instance?

ZDOGIP uses out-of-place patching and switches to a new, patched home. The database remains up while the GI stack restarts. The ASM instance restarts as well. How can the database survive that? The answer is Oracle Flex ASM. While the GI stack restarts, the database can access an ASM instance on another hub and access the shared storage directly.

This is an extract of the alert log. It shows how a database switches to a remote ASM instance (+ASM2) during a zero downtime patching session:

2023-02-24T09:18:03.663616+00:00
ALTER SYSTEM RELOCATE CLIENT TO '+ASM2'
2023-02-24T09:18:09.855879+00:00
NOTE: ASMB (9427) relocating from ASM instance +ASM1 to +ASM2 (User initiated)
NOTE: ASMB (index:0) registering with ASM instance as Flex client 0x1409d79b6620c71c (reg:163770215) (startid:1129292367) (reconnect)
NOTE: ASMB (index:0) (9427) connected to ASM instance +ASM2, osid: 53135 (Flex mode; client id 0x1409d79b6620c71c)
NOTE: ASMB (9427) rebuilding ASM server state for all pending groups
NOTE: ASMB (9427) rebuilding ASM server state for group 2 (RECO)

What Are the Minimum Requirements?

  • Source GI must be on 19.16.0 or above.
  • Important fixes have been added in later Release Updates, so I recommend that the source GI is on 19.19.0 or above.
  • The databases managed by GI must be on Oracle Database 19c. If the database is older, then it must restart during patching.

Can I Use It with ASM Filter Driver and ASM Cluster File System?

Yes, but if you are using ASM Filter Driver (AFD) or ASM Cluster File System (ACFS) and the patch you are applying updates these components, special attention is needed. You can’t update the kernel drivers when GI is running. However, the kernel drivers must be updated. With ZDOGIP you can postpone the update of the kernel drivers, and thus postpone the restart of the entire GI stack including the database that it manages.

In short, if you are using AFD or ACFS, you more of less lose the benefit of ZDOGIP because the database must restart anyway (although you can postpone it to a later time).

This feature is recommended for the configurations that do not have (ACFS/AFD/OKA/OLFS).

GI installs ACFS by default and that doesn’t prevent the use of ZDOGIP. You should interpret the above quote as when you have configured or use an ACFS volume.

You should expect that every Release Update contains patches for AFD and ACFS.

If you use AFD or ACFS, I recommend relying on rolling patch installation instead and investing your time in Application Continuity.

The quotes are from MOS note Zero-Downtime Oracle Grid Infrastructure Patching (ZDOGIP). (Doc ID 2635015.1).

How Do I Use Zero Downtime Oracle Grid Infrastructure Patching Together with ASM Filter Driver and ASM Cluster File System?

After patching with ZDOGIP, you must restart the entire GI stack, including the local database instance. You must do this shortly after the patch apply.

The procedure involves executing root.sh -updateosfiles. You will find the full details in the documentation.

How Can I Tell Whether ASM Filter Driver or ACFS Is Installed?

To see whether your system uses AFD:

$ORACLE_HOME/bin/asmcmd afd_state

To see whether your system uses ACFS:

$ORACLE_HOME/bin/crsctl query driver activeversion -all

GI installs ACFS by default and I don’t recommend trying to remove it. You can still use ZDOGIP and avoid a restart even when you have ACFS installed. This changes, however, when you configure or use an ACFS volume. Only then, do you need to pay extra care.

Does It Work for Single Instance Databases as Well?

No, this feature is for Oracle RAC databases only.

Can I Use Zero Downtime Oracle Grid Infrastructure Patching with Oracle Fleet Patching and Provisioning

Yes, you can. There is a simple command line parameter that you can use, which tells Oracle Fleet Patching and Provisioning (FPP) to use ZDOGIP:

rhpctl move gihome ... -tgip

Patching can becomes slightly more complicated when you use ZDOGIP. You can alleviate that complexity by using FPP.

What Do All the Abbreviations Mean?

When you read the documentation and the MOS notes, you will come across several abbreviations. Here’s a handy list of some of them:

Abbreviation Meaning
ACFS ASM Cluster File System
ADVM ASM Dynamic Volume Manager
AFD ASM Filter Driver
OKA OS Kernel extensions
OLFS Oracle Layered File System

Appendix

Further Reading

Other Blog Posts in This Series

Why You Need to Use Oracle Fleet Patching and Provisioning

First, what is Oracle Fleet Patching and Provisioning (FPP)?

Oracle Fleet Patching & Provisioning (formerly known as Oracle Rapid Home Provisioning) is the recommended solution for performing lifecycle operations (provisioning, patching & upgrades) across entire Oracle Grid Infrastructure and Oracle RAC Database fleets and the default solution used for Oracle Database Cloud services.

Oracle Fleet Patching and Provisioning automates your lifecycle activities

In my own words, a central server that provision, patch, and upgrade your Oracle Databases, Oracle Grid Infrastructure, and Oracle Exadata stack.

With FPP, the lifecycle operations are automated and handled centrally. The more systems you have, the greater the benefit of FPP.

If you are constantly busy trying to keep up with all the patching, or struggling to maintain your tooling, then it is time to look into FPP!

Many Good Reasons To Use Oracle Fleet Patching and Provisioning

Oracle Fleet Patching and Provisioning:

  • Uses a gold image approach to provision new Oracle Database and Grid Infrastructure homes.
  • Manages your software centrally. Install a new database or GI home based on a gold image centrally with one command.
  • Patches your database or GI home, including running datapatch.
  • Uses out-of-place patching to minimize downtime.
  • Makes it easy to use Zero Downtime Oracle Grid Infrastructure Patching (ZDOGIP).
  • Patches your entire Oracle Exadata stack, including the nodes, storage cells, and RoCE and Infiniband network.
  • Upgrades your databases using different approaches. Of course, FPP uses AutoUpgrade underneath the hood.
  • Creates new databases.
  • Adds nodes to your Oracle RAC cluster.
  • Detects missing bugfixes. If prior to patching database or GI home doesn’t have the expected patches, you are notified. Optionally, you can add the missing fixes to your gold image.
  • Checks for configuration drift. Compares the bug fixes in a gold image to the homes deployed.
  • Adheres to the MAA best practices. Provides options to control session draining, and perfectly integrates with Transparent Application Continuity. It will always use the latest HA and MAA features.
  • Does it the same way every time. No more human errors.

All of the above is done centrally from the FPP server.

Imagine how much time you spend on patching your Oracle stack. Now, take into consideration that you are doing it every quarter. Now, take Monthly Recommended Patches into account. Now, take all the other tasks into account. You can really save a lot of time using Oracle Fleet Patching and Provisioning.

But …

I need to learn a new tech!

You’re right. But there are so many resources to get you started:

It Requires a License

True, but it comes included in your Oracle RAC license. If you don’t have that, you must license the Enterprise Manager Lifecycle Management pack. Check the license guide for details.

My Database Is On Windows

OK, bummer. Then you can’t use FPP. It supports Linux, Solaris, and AIX only.

What’s the Result?

Settling the score is easy. There are many more pros than cons.

I recommend using Oracle Fleet Patching and Provisioning when you manage more than a few systems. It will make your life so much easier.

I interviewed Philippe Fierens, the product manager for Oracle Fleet Patching and Provisioning, on the benefits of using FPP.

Appendix

Further Reading

Other Blog Posts in This Series

Use Cluster Verification Utility (cluvfy) and Avoid Surprises

I guess no one really wants nasty surprises. That’s also true when you patch Oracle Grid Infrastructure 19c (GI). Luckily, you can prepare yourself using the Cluster Verification Utility and its oddly sounding command line interface, cluvfy (I pronounce it cluffy).

The Grid Infrastructure Release Update readme says:

2.1.3 Patch Installation Checks The Cluster Verification Utility (CVU) command line interface (CLUVFY) may be used to verify the readiness of the Grid Home to apply the patch. … The CLUVFY command line for patching ensures that the Grid Home can receive the new patch and also ensures that the patch application process completed successfully leaving the home in the correct state.

That sounds like a good idea. You can check the GI home before and after patching.

Where Is Cluster Verification Utility?

Although Cluster Verification Utility (CVU) is present in my Grid Infrastructure (GI) home, I always get the latest version from My Oracle Support. I find it via patch 30839369.

CVU is backward compatible, so I download the release Oracle Database 21.0.0.0.0.

  1. I must install CVU on one node only. CVU connects to the other nodes and checks them as well. I log on as grid:
    [grid@copenhagen1]$ export CVUHOME=/u01/app/grid/cvu
    [grid@copenhagen1]$ mkdir $CVUHOME
    
  2. I extract the zip file I downloaded from My Oracle Support patch 30839369:
    [grid@copenhagen1]$ cd $CVUHOME
    [grid@copenhagen1]$ unzip /u01/app/grid/patches/cvupack_linux_ol7_x86_64.zip   
    
  3. I check that it works:
    [grid@copenhagen1]$ export PATH=$CVUHOME/bin:$PATH
    [grid@copenhagen1]$ cluvfy -version
    Version 21.0.0.0.0 Build 011623x8664
    Full version 21.9.0.0.0
    

Check Before Patching

Before I start patching GI, I check my cluster with cluvfy. I log on as grid:

[grid@copenhagen1]$ export PATH=$CVUHOME/bin:$PATH
[grid@copenhagen1]$ cluvfy stage -pre patch

cluvfy prints the report to the screen. In this case, all is good, no problems were found:

Performing following verification checks ...

  cluster upgrade state ...PASSED
  OLR Integrity ...PASSED
  Hosts File ...PASSED
  Free Space: copenhagen2:/ ...PASSED
  Free Space: copenhagen1:/ ...PASSED
  OPatch utility version consistency ...PASSED
  Software home: /u01/app/19.0.0.0/grid ...PASSED
  ORAchk checks ...PASSED

Pre-check for Patch Application was successful.

Here are a few examples of problems detected by cluvfy:

  • No SSH connection between the nodes:
    User Equivalence ...FAILED (PRVG-2019, PRKC-1191)
    PRVF-4009 : User equivalence is not set for nodes: copenhagen2
    Verification will proceed with nodes: copenhagen1
    
  • Different versions of OPatch on the individual nodes:
    Performing following verification checks ...
    
       cluster upgrade state ...PASSED
       OLR Integrity ...PASSED
       Hosts File ...PASSED
       Free Space: copenhagen2:/ ...PASSED
       Free Space: copenhagen1:/ ...PASSED
       OPatch utility version consistency ...WARNING (PRVH-0668)
    

Check After Patching

The patch readme advises me to rerun cluvfy after patching:

[grid@copenhagen1]$ export PATH=$CVUHOME/bin:$PATH
[grid@copenhagen1]$ cluvfy stage -post patch

Luckily, I patched GI without any problems. cluvfy tells me all is good:

Performing following verification checks ...

  cluster upgrade state ...PASSED

Post-check for Patch Application was successful.

Happy Cluvfy’ing!

Appendix

Further Reading

Cluster Verification Utility in Your Grid Infrastructure Home

I can also find Cluster Verification Utility (CVU) in my GI home.

[grid@copenhagen1]$ cd $ORACLE_HOME/bin
[grid@copenhagen1]$ ls -l cluvfy
-rwxr-xr-x 1 root oinstall 10272 Feb 20 11:47 cluvfy

The tool gets updated with Release Updates. However, when I try to use it, it prints this warning:

[grid@copenhagen1]$ cd $ORACLE_HOME/bin
[grid@copenhagen1]$ ./cluvfy stage -pre patch
This software is "223" days old. It is a best practice to update the CRS home by downloading and applying the latest release update. Refer to MOS note 756671.1 for more details.

Even if I patch my GI home with the newest Release Update, I am not guaranteed to get the latest version of CVU.

Thus, I recommend always getting the latest version of CVU from My Oracle Support.

Other Blog Posts in This Series