The Dutch Prutser's Blog

By: Harald van Breederode

  • Disclaimer

    The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.
  • Subscribe

  • Enter your email address to subscribe to this blog and receive notifications of new posts by email.

    Join 243 other followers

Password file maintenance in a Data Guard environment

Posted by Harald van Breederode on June 13, 2011

In a previous posting I wrote about password file maintenance in a clustered ASM and RAC environment.
This article raised another question: Is there anything specific about password file maintenance in a Data Guard environment?

Yes, updating a password file in a Data Guard environment isn’t as straight forward as one might think. In this posting I will shed some light on this subject and show you how to properly update a password file in a Data Guard environment.

Let’s get started by taking a look at my Data Guard setup before diving into password file maintenance procedures.

DGMGRL> show configuration;

Configuration - PeppiEnKokki

  Protection Mode: MaxPerformance
  Databases:
    peppi - Primary database
    kokki - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

This output shows that I have a primary database called peppi, a physical standby database called kokki and that the overall status of my Data Guard configuration is healthy.

Note: peppi runs on host prutser and kokki runs on host el5.

Revealing the problem

The first question is: Are updates to the password file on the primary database propagated to the standby database(s)? We can easily figure this out by triggering an update to the password file on peppi followed by querying the password file on kokki.

SQL> connect sys/oracle@peppi as sysdba
Connected.
SQL> select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP SYSAS
------------------------------ ----- ----- -----
SYS                            TRUE  TRUE  FALSE

Currently there is only one entry in the password file on peppi. By granting SYSDBA to SYSTEM we can trigger an update to the password file as shown below:

SQL> grant sysdba to system;

Grant succeeded.

SQL> select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP SYSAS
------------------------------ ----- ----- -----
SYS                            TRUE  TRUE  FALSE
SYSTEM                         TRUE  FALSE FALSE

There are now two entries in the password file on the primary database. How many entries are there on the standby database?

SQL> connect sys/oracle@kokki as sysdba
Connected.
SQL> select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP SYSAS
------------------------------ ----- ----- -----
SYS                            TRUE  TRUE  FALSE

The output above makes clear that updates to the password file on the primary database are not propagated to the standby database.

Before continuing, we take away SYSDBA from SYSTEM because we don’t want SYSTEM to become too powerful, do we?

SQL> connect sys/oracle@peppi as sysdba
Connected.
SQL> revoke sysdba from system;

Revoke succeeded.

Does this affect Data Guard?

The next question is: Does this affect Data Guard? The primary database sends its redo to its standby database(s) and this redo transport is authenticated by the password of the SYS user, or another user if configured, of the primary database. That is, the primary database logs into a standby database by using the password stored in the password file for the user who ships the redo, which is most likely SYS. If the (encrypted) passwords of the primary database and the standby database(s) don’t match redo transport will (eventually) be in trouble.

To demonstrate this behavior we will change the SYS password on peppi and see if or how it affects kokki.

SQL> alter user sys identified by prutser;

User altered.
SQL> connect sys/prutser@peppi as sysdba
Connected.
SQL> connect sys/prutser@kokki as sysdba
ERROR:
ORA-01017: invalid username/password; logon denied


Warning: You are no longer connected to ORACLE.

The above output shows (again) that an update to the password file on the primary database is not automatically propagated to the standby database(s).

SQL> connect sys/oracle@kokki as sysdba
Connected.

As a matter of fact, we can still connect to kokki using the old SYS password as shown above.

The next question is: Is redo transport still possible now that the passwords are no longer the same? Let’s switch a few logs on the primary database and see what happens.

SQL> connect sys/prutser@peppi as sysdba
Connected.
SQL> alter system switch logfile;

System altered.

SQL> alter system switch logfile;

System altered.

SQL> alter system switch logfile;

System altered.

DGMGRL> show configuration;

Configuration - PeppiEnKokki

  Protection Mode: MaxPerformance
  Databases:
    peppi - Primary database
    kokki - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

According to the above output everything is still fine. Apparently the password change didn’t affect the redo transport. This is because the primary database was already logged into the standby database and as long as this connection remains open there is no need for the standby database to re-authenticate the incoming redo. However if we disable and re-enable redo transport it becomes clear that we indeed have a problem as shown below.

DGMGRL> edit database kokki set property LogShipping=off;
Property "logshipping" updated

DGMGRL> edit database kokki set property LogShipping=on;
Property "logshipping" updated

DGMGRL> show configuration;

Configuration - PeppiEnKokki

  Protection Mode: MaxPerformance
  Databases:
    peppi - Primary database
      Error: ORA-16778: redo transport error for one or more databases

    kokki - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
ERROR

It is pretty clear that redo transport ceased indicated by the ORA-16778. It is important to realize that a problem caused by updating the password file on the primary database doesn’t necessarily show up immediately but may show up much later in time.

What is an ORA-16778 anyway?

$ oerr ora 16778
16778, 00000, "redo transport error for one or more databases"
// *Cause:  The redo transport service was unable to send redo data to one
//          or more standby databases.
// *Action: Check the Data Guard broker log and Oracle alert log for
//          more details. Query the LogXptStatus property to see the
//          errors.

Because we just changed the SYS password on the primary database we already know what caused the ORA-16778. So the question is: How do we fix this?

In search of a solution

Maybe we can simply re-create the password file on kokki with the new SYS password? Let’s give it a try:

el5$ rm $ORACLE_HOME/dbs/orapwv1120

el5$ orapwd file=$ORACLE_HOME/dbs/orapwv1120 password=prutser

SQL> connect sys/prutser@kokki as sysdba
Connected.

That seems to work! The question is of course: Does Data Guard agree with my optimism?

SQL> connect sys/prutser@peppi as sysdba
Connected.
SQL> alter system switch logfile;

System altered.

SQL> alter system switch logfile;

System altered.

SQL> alter system switch logfile;

System altered.

DGMGRL> show configuration;

Configuration - PeppiEnKokki

  Protection Mode: MaxPerformance
  Databases:
    peppi - Primary database
      Error: ORA-16778: redo transport error for one or more databases

    kokki - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
ERROR

Hmmm, this doesn’t look too good, does it? Maybe we need to disable and re-enable redo shipment?

DGMGRL> edit database kokki set property LogShipping=off;
Property "logshipping" updated

DGMGRL> edit database kokki set property LogShipping=on;
Property "logshipping" updated

DGMGRL> show configuration;

Configuration - PeppiEnKokki

  Protection Mode: MaxPerformance
  Databases:
    peppi - Primary database
      Error: ORA-16778: redo transport error for one or more databases

    kokki - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
ERROR

No, this didn’t help either. Re-creating the password file on the standby database is not the solution. The reason for this is the way the passwords are encrypted in the password file. Even if the passwords are the same, the result of the encryption is not.

How about copying the password file from the primary database to the standby database?

The solution

We will copy the password file from peppi to kokki and see if it works:

$ scp $ORACLE_HOME/dbs/orapwv1120 el5:$ORACLE_HOME/dbs/orapwv1120
orapwv1120                                    100% 1536     1.5KB/s   00:00    

SQL> connect sys/prutser@kokki as sysdba
Connected.

Again we can connect to kokki with the updated SYS password, but is Data Guard happy now?

SQL> connect sys/prutser@peppi as sysdba
Connected.
SQL> alter system switch logfile;

System altered.

SQL> alter system switch logfile;

System altered.

SQL> alter system switch logfile;

System altered.

DGMGRL> show configuration;

Configuration - PeppiEnKokki

  Protection Mode: MaxPerformance
  Databases:
    peppi - Primary database
      Error: ORA-16778: redo transport error for one or more databases

    kokki - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
ERROR

Hmmm, again Data Guard doesn’t look like a happy camper. But things are not what they seem. If we wait long enough Data Guard will re-open its archive destinations and redo will automatically flow again. We can trigger this by disabling and re-enabling redo shipment:

DGMGRL> edit database kokki set property LogShipping=off;
Property "logshipping" updated

DGMGRL> edit database kokki set property LogShipping=on;
Property "logshipping" updated

DGMGRL> show configuration;

Configuration - PeppiEnKokki

  Protection Mode: MaxPerformance
  Databases:
    peppi - Primary database
    kokki - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

Finally Data Guard is back in business again because the primary database is able to login on its standby database(s) and thereby successfully ship redo to them.

In summary

In order to keep Data Guard going, we must copy the password file from the primary database to the standby database(s) after an update is made to the password file on the primary database. Happy Data Guarding ;-)
-Harald

Posted in Oracle | 19 Comments »

Clustered ASM and RAC password file maintenance

Posted by Harald van Breederode on January 14, 2011

A recurring question during Grid Infrastructure and RAC courses I teach is “How do you manage Oracle password files in a clustered environment?”. The answer isn’t as straight forward as you might think because there are significant differences between ASM and RAC (==clustered database) environments. Additionally, in recent releases changes were made concerning password file maintenance procedures. The goal of this posting is to shed some light on these matters.

Password file maintenance prior to 11gR2

Before discussing the recent changes, let us first see how Oracle password file maintenance is performed in a clustered environment. An Oracle password file is created by means of the orapwd utility, is named orapw$ORACLE_SID and is stored in the $ORACLE_HOME/dbs directory. Because an $ORACLE_HOME is usually not shared in a clustered environment, this results in each instance having its own password file. This may lead to inconsistencies between the password files.

Before demonstrating such possible inconsistencies, let us first take a look at what things look like when everything is as it should be:

SQL> connect sys/oracle@asm1 as sysasm
Connected.
SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
PL/SQL Release 11.1.0.7.0 - Production
CORE    11.1.0.7.0      Production
TNS for Linux: Version 11.1.0.7.0 - Production
NLSRTL Version 11.1.0.7.0 - Production

SQL> select * from gv$pwfile_users order by inst_id;

   INST_ID USERNAME                       SYSDB SYSOP SYSAS
---------- ------------------------------ ----- ----- -----
         1 SYS                            TRUE  TRUE  TRUE
         2 SYS                            TRUE  TRUE  TRUE

The output above, captured on a clustered ASM instance, shows that there is an Oracle 11.1.0.7.0 environment and that each instance has a password file entry for the SYS account. What happens when we add another user to the password file and grant him the SYSASM system privilege?

SQL> create user harald identified by prutser;

User created.

SQL> grant sysasm to harald;

Grant succeeded.

As shown above, the user harald was created successfully and was granted the SYSASM system privilege without problems. The question is however if this password file change was performed on each node.

SQL> select * from gv$pwfile_users order by inst_id;

   INST_ID USERNAME                       SYSDB SYSOP SYSAS
---------- ------------------------------ ----- ----- -----
         1 SYS                            TRUE  TRUE  TRUE
         1 HARALD                         FALSE FALSE TRUE
         2 SYS                            TRUE  TRUE  TRUE

It is clear that the password file was only updated on the cluster node at which the change was executed.

So not sharing Oracle password files between the nodes in a cluster will result in inconsistencies between the password files if a change is made at only one of the associated instances.

In order to avoid this you can either store the password file on a shared filesystem and create symbolic links from $ORACLE_HOME/dbs to this shared filesystem or execute the commands that update the password file at each instance as shown below:

SQL> connect sys/oracle@asm2 as sysasm
Connected.
SQL> create user harald identified by prutser;

User created.

SQL> grant sysasm to harald;

Grant succeeded.

Now that the user harald was also created and granted the SYSASM system privilege on the second instance everything should be consistent again.

SQL> select * from gv$pwfile_users order by inst_id;

   INST_ID USERNAME                       SYSDB SYSOP SYSAS
---------- ------------------------------ ----- ----- -----
         1 SYS                            TRUE  TRUE  TRUE
         1 HARALD                         FALSE FALSE TRUE
         2 SYS                            TRUE  TRUE  TRUE
         2 HARALD                         FALSE FALSE TRUE

As long as we remember to execute password file maintenance commands on each and every instance in a clustered environment, the password files remain consistent as shown below:

SQL> drop user harald;

User dropped.

SQL> connect sys/oracle@asm1 as sysasm
Connected.
SQL> drop user harald;

User dropped.

SQL> select * from gv$pwfile_users order by inst_id;

   INST_ID USERNAME                       SYSDB SYSOP SYSAS
---------- ------------------------------ ----- ----- -----
         1 SYS                            TRUE  TRUE  TRUE
         2 SYS                            TRUE  TRUE  TRUE

Password file maintenance in 11gR2

Wouldn’t it be nice if a password file change on one node would automatically be propagated to all other nodes within the same cluster?

YES it would be nice indeed and this is implemented in Oracle11gR2. So, lets have a look in a 11.2.0.1.0 clustered environment:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

SQL> select * from gv$pwfile_users order by inst_id;

   INST_ID USERNAME                       SYSDB SYSOP SYSAS
---------- ------------------------------ ----- ----- -----
         1 SYS                            TRUE  TRUE  TRUE
         2 SYS                            TRUE  TRUE  TRUE

This output was taken from a clustered ASM with local password files stored in $ORACLE_HOME/dbs and shows that each instance has one password file entry for the SYS account. Now see what happens if a new user is created and granted the SYSASM system privilege:

SQL> create user harald identified by prutser;

User created.

SQL> grant sysasm to harald;

Grant succeeded.

The creation of a new user followed by granting it a system privilege succeeds without warnings, but this time the password file change is automatically propagated to the other node as shown below:

SQL> select * from gv$pwfile_users order by inst_id;

   INST_ID USERNAME                       SYSDB SYSOP SYSAS
---------- ------------------------------ ----- ----- -----
         1 SYS                            TRUE  TRUE  TRUE
         1 HARALD                         FALSE FALSE TRUE
         2 SYS                            TRUE  TRUE  TRUE
         2 HARALD                         FALSE FALSE TRUE

Isn’t this a really cool feature? One might ask though what happens if updating a remote password file fails? This can easily be answered by removing the password file on one node and triggering a password file update from another node. Lets give it a try:

SQL> alter user harald identified by Prutser;
alter user harald identified by Prutser
                                *
ERROR at line 1:
ORA-15306: ASM password file update failed on at least one node

It is clear that updating the password file failed on at least one node, but it isn’t clear on which node. How do we figure this out? First lets check what the description is for an ORA-15306 error message:

$ oerr ora 15306
15306, 00000, "ASM password file update failed on at least one node"
// *Cause:  A CREATE USER, ALTER USER, DROP USER, GRANT, or REVOKE
//          command failed on at least one node of the Automatic Storage
//          Management (ASM) cluster.
// *Action: Check the ASM alert logs for more information.

So you need to check the alert log to find out on which node (or nodes) the password file update failed:

$ tail -3 alert_+ASM1.log
Thu Jan 13 14:43:51 2011
SUCCESS: ASM password file update succeeded on node 0
ERROR: ASM password file update failed on node 1

This shows that the password file update failed on node 1. Note: the nodes are numbered 0 and 1, while the instances are named +ASM1 and +ASM2 respectively.

If updating all password files succeeds without errors, this is what you will see in the alert log:

$ tail -3 alert_+ASM1.log
Thu Jan 13 14:44:19 2011
SUCCESS: ASM password file update succeeded on node 0
SUCCESS: ASM password file update succeeded on node 1

It goes without saying that updating password files automatically on remote nodes isn’t restricted to adding new entries, but also works for deleting (or changing) entries as shown below:

SQL> drop user harald;

User dropped.

SQL> select * from gv$pwfile_users order by inst_id;

   INST_ID USERNAME                       SYSDB SYSOP SYSAS
---------- ------------------------------ ----- ----- -----
         1 SYS                            TRUE  TRUE  TRUE
         2 SYS                            TRUE  TRUE  TRUE

RAC password file maintenance

Unfortunately the recent change in password file maintenance in a cluster is currently only implemented for clustered ASM and not (yet?) for RAC.

New password file naming conventions

Another change that was made in Oracle 11g release 2 is a new password file name convention. In prior releases the password file name was based on $ORACLE_SID including the node suffix, but since Oracle 11g release 2 the password file name is based on $ORACLE_SID without the node suffix. Thus orapw+ASM instead of orapw+ASM1 for an ASM instance or orapwORCL instead of orapwORCL_1 for a RAC instance. However, this change is only implemented for RAC databases that are policy based managed. Policy based managed instances have an underscore between the instance name and the node suffix.

In summary

For ASM releases prior to 11gR2 password files should either be placed on a shared filesystem (and symlinked from $ORACLE_HOME/dbs) or password file changes should be performed manually on all instances in order to keep the password file contents synchronized on all nodes.

For ASM releases from 11gR2 onwards there is no need to place the password files on a shared filesystem because password file changes are automatically propagated to remote instances.

For RAC environments password files should either be placed on a shared filesystem (and symlinked from $ORACLE_HOME/dbs) or password file changes should be performed manually on all instances in order to keep the password file contents synchronized on all nodes.

Hopefully you enjoyed reading this article and learned something new ;-)
-Harald

Posted in Oracle | 8 Comments »

Executing SQL statements from within the Data Guard broker (DGMGRL)

Posted by Harald van Breederode on December 4, 2010

While teaching the Oracle11g release 2 Data Guard course last week my students and I found a new feature of the Data Guard broker (DGMGRL). We discovered that it is now possible to execute SQL statements directly from within DGMGRL. This new feature became available after installing the first Oracle Database 11g release 2 patchset (11.2.0.2.0).

Before demonstrating this useful addition to DGMGRL let us have a look at my demo Data Guard setup:

$ dgmgrl /
DGMGRL for Linux: Version 11.2.0.2.0 - Production

Copyright (c) 2000, 2009, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected.
DGMGRL> show configuration;

Configuration - PeppiEnKokki

  Protection Mode: MaxAvailability
  Databases:
    peppi - Primary database
    kokki - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

The output above shows that I have a primary database called “peppi” and a physical standby database called “kokki”. Regular visitors of my blog recognize this setup from my previous Data Guard postings. Please note the version number (11.2.0.2.0) of DGMGRL.

Lets explore the new “sql” statement by issuing a “help sql” statement to DGMGRL:

DGMGRL> help sql

Executes a SQL statement

Syntax:

  SQL "<sql statement>";

As shown above the usage for the “sql” DGMGRL statement is very straightforward. So let’s give it a try by asking for a log switch on my primary database:

DGMGRL> sql "alter system switch logfile";
Succeeded.

And yes, it indeed works as advertised! This feature is not restricted to the primary database. We can also issue SQL statements against a standby database. Before issuing a SQL statement to “kokki” let’s have a look at its current state:

DGMGRL> show database kokki 

Database - kokki

  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds
  Apply Lag:       0 seconds
  Real Time Query: OFF
  Instance(s):
    v1120

Database Status:
SUCCESS

The output above shows that “kokki” is doing just fine and that Real Time Query (RTQ) is currently inactive. We can activate RTQ by issuing an “alter database open” statement to “kokki”. Up until version 11.2.0.1.0 this had to be done using SQL*Plus, or a similar tool, but starting from version 11.2.0.2.0 this can now be issued directly from within DGMGRL as shown below:

DGMGRL> connect sys/oracle@kokki
Connected.
DGMGRL> sql "alter database open";
Succeeded.

In order to issue a SQL statement to “kokki” we first need to connect to it by means of the DGMGRL “connect” statement after which we can issue the “alter database open” SQL statement. The output above shows that this succeeds without errors but we can verify this ourselves by executing a “show database” DGMGRL statement:

DGMGRL> show database kokki 

Database - kokki

  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds
  Apply Lag:       0 seconds
  Real Time Query: ON
  Instance(s):
    v1120

Database Status:
SUCCESS

My physical standby database is now in Real Time Query mode, which proves that the SQL statement indeed executed on “kokki”

Let’s see what happens if the SQL statement being executed encounters an error:

DGMGRL> sql "alter database open";
ORA-01531: a database already open by the instance

Failed.

The output we receive from DGMGRL is pretty clear that the SQL statement encountered an error by showing us the resulting error message.

So far everything behaved as I expected, but I was a little bit disappointed after attempting to perform a query from within DGMGRL:

DGMGRL> sql "select * from hr.employees";
ORA-24374: define not done before fetch or execute and fetch

Failed.

The above output makes it pretty clear that we cannot execute queries from within DGMGRL. Apparently this is because DGMGRL is not expecting results back from the database. I don’t know if this is by design or by mistake, but it seems to me that currently queries and DML statements are not supported by DGMGRL.

However being able to execute DDL statements directly from within DGMGRL is a real cool feature to have! Thanx Larry ;-)
-Harald

Posted in Oracle | 3 Comments »

 
Follow

Get every new post delivered to your Inbox.

Join 243 other followers