If Importing Statistics Using DBMS_STATS Is Slow

When migrating Oracle Databases, you often transport the statistics using dbms_stats. Our team got involved in an interesting case the other day, and I would like to share the solution with you.

Problem

A customer migrated to Oracle Database 19c and decided to move the statistics using dbms_stats.import_schema_stats. They noticed that the procedure started to burn a lot of CPU, and after 38 hours, they gave up.

A SQL Monitor report showed one statement which spent almost all the time. An insert statement into sys.dbms_stats_id_map_tab with a subselect referencing the staging table. The staging table is the one you specify in the argument stattab in the call to dbms_stats.import_schema_stats. The staging holds the optimizer statistics in a portable format. From the staging table, the procedure can insert it into the data dictionary in the proper format.

Obviously, they could see already that the procedure would use far too much time.

Diagnosing

The first shot was to look at the code behind dbms_stats. But this specific code hadn’t changed since its introduction many releases ago.

Recursive statements that touch the data dictionary immediately brought our attention to dictionary statistics. But the customer told us that they were not stale.

The customer ran SQL Tuning Advisor on the offending statement, and one of the findings was about stale statistics. SQL Tuning Advisor recommended gathering statistics on the staging table and corresponding indexes.

One of our performance experts looked at the execution plan and found a pattern he had seen before. He tried to disable Join Predicate Push Down in the session. It helped, but this was just a workaround. We wanted to find the root cause.

Solution

The SQL Tuning advisor came up with the real problem. Stale statistics on the staging table and corresponding indexes. Once the customer gathered statistics on the staging table and indexes, the import of statistics finished in 2 hours and 27 minutes which was acceptable.

We also discovered that the dictionary statistics were not as accurate as the customer had initially concluded. In fact, by mistake, they had misinformed us. A fresh run of dbms_stats.gather_dictionary_stats gave a slight performance improvement as well.

Recommendations

These recommendations are now part of our best practices for migrations.

  1. Before importing statistics using dbms_stats.import_schema_stats, gather stats on the staging table.

  2. Immediately after importing a lot of data, and before you import statistics with dbms_stats or anything else, you should gather dictionary statistics. This applies to regular Data Pump imports and transportable tablespace metadata imports.

The run book should look like this:

$ impdp parfile=import.par ...
$ sqlplus / as sysdba

SQL> exec dbms_stats.gather_dictionary_stats; 
SQL> exec dbms_stats.gather_table_stats(
		ownname=>'SYSTEM',
		tabname=>'MY_STG_TAB_1',
		cascade=>TRUE);
SQL> exec dbms_stats.import_schema_stats(
		ownname=>'SYSTEM', 
		stattab=>'MY_STG_TAB_1', 

In the above example, the staging table is called SYSTEM.MY_STG_TAB_1.

The recommendation applies as well if you are using the procedures dbms_stats.import_database_stats or dbms_stats.import_table_stats.

Dictionary Statistics

Why is it important to gather dictionary statistics immediately after the import?

When you import data or use transportable tablespaces, you often have a brand new, empty database. Then you import a lot of objects. Those objects are represented as rows in the data dictionary. For instance, the tables you import now appear as rows in SYS.TAB$, the partitions in SYS.TABPART$, the indexes in SYS.IND$, and so forth.
Those internal tables were almost empty before – there were not a lot of tables. Now they have a lot of rows. This means that the statistics are stale. When you start to use functionality in the database, like importing statistics, recursive queries using the internal tables will be executed. With stale statistics on the dictionary, you can have suboptimal execution plans and bad performance. Gathering dictionary statistics can fix this for you.

Conclusion

Statistics are always vital, whether optimizer statistics on user data or internally in the data dictionary. Be sure to verify the accuracy of your statistics when you have problems.

Also, SQL Tuning Advisor is a great tool. It can quickly come up with suggestions for fixing problems. Use the recommendations as input to your troubleshooting. SQL Tuning Advisor also works on internal tables.

Additional Information

We have a few videos on our YouTube channel which have more information about transporting statistics with dbms_stats.

Leave a comment