Short answer: Yes!
Recently, I have been asked a few similar questions:
- What happens if my SSH session with AutoUpgrade is lost? (see appendix)
- What happens if AutoUpgrade crashes?
- What happens if I exit the console by mistake?
First, don’t panic. Second, just restart AutoUpgrade using the same command line. During startup, AutoUpgrade will figure out that it should recover the lost session, and will restart the upgrades.
When AutoUpgrade dies or is terminated, the database upgrades that it started die with it. This can happen if you lose your SSH session. The database upgrade stops, but the database is still running, most likely in UPGRADE mode.
If you exit AutoUpgrade by mistake (typing exit in the job console), it will first stop the upgrade, and then shut down the database.
In any case, when you afterwards restart AutoUpgrade, it will figure out that a previous AutoUpgrade session was running. It will recover information from the previous session, and if needed restart the database. After that, it will restart the upgrade. If the previous database upgrade was at phase 54, AutoUpgrade will restart from phase 54. This means that all previous work in the upgrade is preserved, and you can resume as if nothing had happened.
Don’t Recover Previous Session
Sometimes you don’t want AutoUpgrade to recover the previous session. Let’s imagine that AutoUpgrade crashed, and you decided to restore the database. Now you want to start all over. In that case, you need to clear the recovery data; otherwise, AutoUpgrade will get confused.
You can read more about the parameters in the documentation.
The Little Hammer (Preferred)
You can clear the recovery data for specific jobs by adding -clear_recovery_data on the command line and using the -jobs parameter to specify exactly which jobs should have their recovery data cleared.
$ java -jar autoupgrade.jar -config PROD.cfg -mode analyze -clear_recovery_data -jobs 100,101,102
Now, AutoUpgrade will start right from the beginning again but only for the specified jobs.
The Big Hammer
If you don’t specify the -jobs parameter, AutoUpgrade will clear recovery data for all jobs:
$ java -jar autoupgrade.jar -config PROD.cfg -mode analyze -clear_recovery_data
Be advised that this applies to all the upgrades that are specified in the config file. Remember that one of the big benefits of AutoUpgrade is that one config file can be used to upgrade tens or hundreds of databases.
I recommend the previous hammers, so use this approach only as a last resort: delete all files that are used by AutoUpgrade.
First, delete the directory specified in global.autoupg_log_dir. Next, delete the directory specified in <prefix>.log_dir. Typically and by default, the second directory is a subdirectory of the first one, so in most cases you just have to delete the first directory. If you have multiple upgrades specified in the same config file, you potentially need to delete multiple directories, one for each prefix (prefix2.log_dir and so forth).
Be aware that you are clearing out all information that is used by AutoUpgrade. If you use the same global logging directory for multiple AutoUpgrade sessions (which I would not recommend), then you will be seriously messing things up. But if you are only upgrading this specific database on the server, then you can safely delete the directories to start all over.
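Before deleting anything, it can help to see which directories the config file actually points at. The following is a minimal sketch, not part of AutoUpgrade: list_log_dirs is a hypothetical helper, and it assumes the config file uses plain name=value lines. It only prints the directories and deletes nothing.

```shell
#!/bin/sh
# Hypothetical helper (not part of AutoUpgrade): print every directory named by
# global.autoupg_log_dir or a <prefix>.log_dir entry in the given config file.
# Assumes plain "name=value" lines. Nothing is deleted here.
list_log_dirs() {
    # Keep only the log directory parameters ...
    grep -E '^[[:space:]]*[A-Za-z0-9_]+\.(autoupg_)?log_dir[[:space:]]*=' "$1" \
        | sed -e 's/^[^=]*=[[:space:]]*//' -e 's/[[:space:]]*$//'
    # ... and strip the parameter name and surrounding whitespace, leaving the path.
}
```

You could call it as list_log_dirs PROD.cfg, review the output, and then remove the directories by hand once you are sure no other AutoUpgrade session uses them.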
Restoring a Test Database – Starting All Over
Very often a test database is upgraded multiple times. Even after a successful upgrade, you might want to retry the upgrade with different settings. If you use AutoUpgrade, you must clear the recovery data as specified above. AutoUpgrade doesn’t know that you have restored the database. For all it knows, the previous upgrade was successful.
Resuming an AutoUpgrade session is very simple. Just start AutoUpgrade with the same command line. It identifies the previous AutoUpgrade session, and resumes automatically. All the previous work is recovered, and the upgrade will resume from where it was stopped.
Lost SSH Session
I have heard from several people that their SSH session timed out because AutoUpgrade didn’t produce any screen output while the upgrade took place. We have put it into our plans to produce some sort of regular screen output, so this should be avoided in the future.
Until that is implemented, I would suggest that you look at the keep alive options in SSH:
$ man ssh
Personally, I always start SSH this way, and you can put it into your SSH config:
ssh -o ServerAliveInterval=300
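As a sketch of the SSH config variant, the same setting could go into ~/.ssh/config. The Host * pattern is an assumption that applies the setting to every host; narrow it to your database servers if you prefer.

```text
# ~/.ssh/config
# Send a keep alive message after 300 seconds without traffic from the server.
Host *
    ServerAliveInterval 300
```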