A misleading ORA-16047
Posted by Harald van Breederode on February 19, 2009
Last week I was teaching a Data Guard course and ran a demo how to setup a physical standby database on a Real Application Clusters (RAC) database. Normally all my demos run smoothly but not this time. First I received an ORA-16052 error:
SQL> alter system set log_archive_dest_2 2 = 'SERVICE=tutor1d_nledu02 LGWR SYNC AFFIRM'; alter system set log_archive_dest_2 * ERROR at line 1: ORA-02097: parameter cannot be modified because specified value is invalid ORA-16052: DB_UNIQUE_NAME attribute is required
The error message seemed pretty clear and was quickly corrected:
SQL> alter system set log_archive_dest_2 2 = 'SERVICE=tutor1d_nledu02 LGWR SYNC AFFIRM db_unique_name=tutor1d_nledu02'; System altered.
Note: Nothing was changed on my demo environment except upgrading from 126.96.36.199.0 to 188.8.131.52.0 But that isn’t considered a change is it?
Secondly by the time the standby should receive redo from the primary nothing happened. No matter how often I switched redo logs on the primary, the standby did not receive a single redo log entry. Looking in the alert.log on the standby showed:
Sun Feb 15 16:22:57 2009 Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST Redo Shipping Client Connected as PUBLIC -- Connected User is Valid RFS: Assigned to RFS process 25545 RFS: Identified database type as 'physical standby' Primary database is in MAXIMUM PERFORMANCE mode RFS LogMiner: Client disabled from further notification
This seems to indicate that the primary database connects successfully to the standby and redo should come in. Looking in the alert.log on the primary however revealed:
Sun Feb 15 16:22:50 2009 ALTER SYSTEM SET log_archive_dest_2='SERVICE=tutor1d_nledu02 LGWR SYNC AFFIRM db_unique_name=tutor1d_nledu02' SCOPE=BOTH; Sun Feb 15 16:22:53 2009 ****************************************************************** LGWR: Setting 'active' archival for destination LOG_ARCHIVE_DEST_2 ****************************************************************** Sun Feb 15 16:22:54 2009 LNSb started with pid=41, OS id=5228 Errors in file /home/tutor1d/diag/rdbms/tutor1d/tutor1d1/trace/tutor1d1_lgwr_30073.trc: ORA-16047: DGID mismatch between destination setting and standby LGWR: Error 16047 creating archivelog file 'tutor1d_nledu02' LGWR: Failed to archive log 2 thread 1 sequence 35 (16047) Thread 1 advanced to log sequence 35 (LGWR switch) Current log# 2 seq# 35 mem# 0: /sd01/oradata/TUTOR1D/onlinelog/o1_mf_2__bbb38e65.log Thread 1 cannot allocate new log, sequence 36 Checkpoint not complete Current log# 2 seq# 35 mem# 0: /sd01/oradata/TUTOR1D/onlinelog/o1_mf_2__bbb38e65.log Sun Feb 15 16:23:02 2009 ARC0: Archivelog destination LOG_ARCHIVE_DEST_2 disabled: Data Guard configuration identifier mismatch
Thus the standby tells me that the primary connected successfully whilst the primary tells me it can not connect to the standby due to an ORA-16047 error. Here is the description of an ORA-16047:
$ oerr ora 16047 16047, 00000, "DGID mismatch between destination setting and standby" // *Cause: The DB_UNIQUE_NAME specified for the destination does not match // the DB_UNIQUE_NAME at the destination. // *Action: Make sure the DB_UNIQUE_NAME specified in the LOG_ARCHIVE_DEST_n // parameter defined for the destination matches the DB_UNIQUE_NAME // parameter defined at the destination.
You already saw that I specified tutor1d_nledu02 as the
LOG_ARCHIVE_DEST_2 and here is the prove that my standby database has indeed the correct
SQL> select db_unique_name from v$database; DB_UNIQUE_NAME ------------------------------ tutor1d_nledu02
As you can imagine I was completely puzzled by this conflicting information and having students awaiting your actions doesn’t give you much time for troubleshooting. So I decided to continue my demo and see what would happen if the Data Guard Broker was given controll over my setup. Much to my surprise redo started to flow once the broker was in charge! My first reaction was “What can the broker do what I can’t?” I did not have time to figure this out during class but I did during the weekend following.
The good news is that I was able to reproduce it but I still could not get redo flowing myself. I re-created my RAC database and suddenly my demo worked fine. The joy was only short because the second time I ran the demo the same problem showed up again. This led to the conclusion that it was probably database parameter related and peeking in the RAC spfile revealed that
log_archive_config was configured. I did not set it when I re-created my RAC database nor did I set it in my demo. .It turned out that it was set by the broker and that I did not clean things up after running the demo. Thus all I had to do was setup
log_archive_config in my demo to fix it permanently.
SQL> alter system set log_archive_config 2 = 'dg_config=(tutor1d,tutor1d_nledu02)'; System altered.
Had I paid more attention to the first ORA-16052 error it would have safe me a lot of time, because the error description is pretty clear:
$ oerr ora 16052 16052, 00000, "DB_UNIQUE_NAME attribute is required" // *Cause: The DB_UNIQUE_NAME attribute is required when DG_CONFIG is enabled. // *Action: Use the DB_UNIQUE_NAME attribute to specify a valid Data Guard // Name for the destination. The list of valid DB_UNIQUE_NAMEs can // be seen with the V$DATAGUARD_CONFIG view.
Sometimes things are not what they seems! In this case the primary database tried to connect to the standby database to verify its
DB_UNIQUE_NAME. But the missing
log_archive_config parameter on the standby database prevented the primary from logging in. Resulting in a misleading ORA-16047.
In order to prevent lengthy troubleshooting sessions one should either look error messages up or use the Data Guard Broker ;-)