Thursday, December 13, 2018

Manual Physical Standby Failover – Activate Standby Database or DR sync with standby database and need to open database


One:

First Check database listener, it should be running:

[oracle@DRDG ~]$ lsnrctl status

Two:

Check database mode and role, It should be MOUNTED and PHYSICAL STANDBY:

SQL> select name,open_mode,database_role from v$database;

NAME OPEN_MODE DATABASE_ROLE
--------- -------------------- ----------------
ABC MOUNTED PHYSICAL STANDBY

Three:

Copy archive logs from primary database, those yet to apply on standby database before activate it.

SQL> recover standby database until cancel;
AUTO

Four:

Activate standby database as primary database:

[oracle@DRDG ~]$ sqlplus / as sysdba

SQL> alter database activate standby database;
Database altered.
Five:

Shutdown standby database and startup:

SQL> shut immediate;

ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.


SQL> startup
ORACLE instance started.
Total System Global Area 755769344 bytes
Fixed Size 2217184 bytes
Variable Size 448793376 bytes
Database Buffers 301989888 bytes
Redo Buffers 2768896 bytes
Database mounted.
Database opened.

 Standby database successfully activated as Primary database. Failover successful.

Post activity check, ensure database mode and role, it should be READ WRITE and PRIMARY.

SQL> select name,open_mode,database_role from v$database;

NAME OPEN_MODE DATABASE_ROLE
--------- -------------------- ----------------
ABC READ WRITE PRIMARY

Note: Please take full database backup after failover.

Note: Once old primary database is available for use then you can reconfigure standby database.





Monday, December 10, 2018

Oracle – Find sql Id from process id

If CPU usage high due to Oracle Process, then we can use respective PID to identify SQL id, SQL_text and much more.

Query-

SELECT SQL_TEXT
FROM v$sql
WHERE sql_id = (SELECT sql_id FROM v$session WHERE paddr = (SELECT addr
FROM v$process
WHERE spid = '&process_id'));


We can find the process id using either top (linux), topas -P (aix), or task manager (windows). Equally you can use Enterprise Manager to find this information, by looking under the performance monitor and any alerts regarding tuning / high CPU consumption.

To get more detail we can use below query-

SELECT    'USERNAME : ' || s.username  || CHR (10)   
        || 'SCHEMA : '  || s.schemaname || CHR (10) 
        || 'OSUSER : '  || s.osuser    || CHR (10)  
        || 'PROGRAM : ' || s.program   || CHR (10)  
        || 'MACHINE : ' || s.machine   || CHR (10)  
        || 'TERMINAL : ' || s.terminal  || CHR (10)  
        || 'SPID : '    || p.spid      || CHR (10)  
        || 'SID : '     || s.sid       || CHR (10)  
        || 'SERIAL# : ' || s.serial#   || CHR (10)  
        || 'TYPE : '    || s.TYPE      || CHR (10)  
        || 'SQL ID : '  || q.sql_id    || CHR (10)  
        || 'CHILD_NUMBER : '  || q.child_number    || CHR (10)  
        || 'SQL TEXT : ' || q.sql_text 
          RESULT
  FROM v$session s, v$process p, v$sql q
 WHERE s.paddr = p.addr AND s.sql_id = q.sql_id(+) AND p.spid = '&&MY_PID';

Tuesday, November 13, 2018

Drop all objects of a schema-


Pre-requisite-  (Very Important)

We have to login in particular schema of which we want to drop all objects. Otherwise it will drop all the objects of SYSTEM and other important table and views.


begin

for i in 1..3 loop
for r in (select object_name, object_type from user_objects
          order by object_type, object_name)
  loop
  begin
    if (r.object_type = 'MINING MODEL') then
     execute immediate ' begin dbms_data_mining.drop_model('||''''||
       r.object_name||''''||'); end;';
    elsif (r.object_type = 'TABLE') then
      execute immediate 'drop table "'||r.object_name
                    ||'" cascade constraints purge';
    else
       execute immediate 'drop '||r.object_type||' "'||r.object_name||'"';
    end if;
    exception when others then null;
  end;
  end loop;
end loop;
end;
/

Sunday, November 11, 2018

What happens at backend when we fire alter database commit to switchover to physical standby-

Below Process will take place in background

1.) Notifies the primary database that a switchover is about to occur

2.) Disconnect all users from the primary database

3.) Generate a special redo record that signals the End of Redo (EOR)

4.) Converts the primary database into a standby database

5.) Once the standby database applies the final EOR record, guaranteeing that no data loss has been lost, converts the standby database into the primary database.


The new standby database (old primary) starts to receive the redo records and continues process until we switch back again. It is important to remember that both databases receive the EOR record so both databases know the next redo that will be received.

Friday, June 22, 2018

Steps to create ASM DISK and adding it to ASM DISKGROUP

Step 1:

Login to the server host01 switch yourself to the root execute below commands.

/etc/init.d/oracleasm createdisk ASMDISK_VD046p1 '/dev/mapper/host1_VD046p1'
/etc/init.d/oracleasm createdisk ASMDISK_VD047p1 '/dev/mapper/host1_VD047p1'
/etc/init.d/oracleasm createdisk ASMDISK_VD048p1 '/dev/mapper/host1_VD048p1'


Step 2:

Login as grid

/etc/init.d/oracleasm listdisks


Step 3:

run scan as root
/etc/init.d/oracleasm scandisks

run as grid
/etc/init.d/oracleasm listdisks

Step 4:
Login to grid user setup ASM environment and login as “sys as sysasm” execute below command in SQL environment.

Sql> alter diskgroup DATA add disk 'ORCL:ASMDISK_VD046p1','ORCL:ASMDISK_VD046p1','ORCL:ASMDISK_VD046p1' rebalance power 10;

Step 5:

Sql>select *From v$asm_operation;

Note: Keep running until you see no rows.

ORACLE DATA GUARD INTERVIEW QUESTION - ANSWER

What is data guard?

Data Guard provides a comprehensive set of services that create, maintain, manage, and monitor one or more standby databases to enable production Oracle databases to survive disasters and data corruptions. Data Guard maintains these standby databases as copies of the production database. Data Guard can be used with traditional backup, restoration, and cluster techniques to provide a high level of data protection and data availability.

What are the advantages in using Oracle Data Guard?

Following are the different benefits in using Oracle Data Guard feature in your environment.
  • High Availability.
  • Data Protection.
  • Off loading Backup operation to standby database.
  • Automatic Gap detection and Resolution in standby database.
  • Automatic Role Transition using Data Guard Broker.

What are the Protection Modes in Dataguard?

Data Guard Protection Modes

This section describes the Data Guard protection modes.
In these descriptions, a synchronized standby database is meant to be one that meets the minimum requirements of the configured data protection mode and that does not have a redo gap.

Maximum Availability

This protectionmode provides the highest level of data protection that is possible without compromising the availability of a primary database. Transactions do not commit until all redo data needed to recover those transactions has been written to the online redo log and to at least one synchronized standby database. If the primary database cannot write its redo stream to at least one synchronized standby database, it operates as if it were in maximum performance mode to preserve primary database availability until it is again able to write its redo stream to a synchronized standby database.

This mode ensures that no data loss will occur if the primary database fails, but only if a second fault does not prevent a complete set of redo data from being sent from the primary database to at least one standby database.


Maximum Performance

This protectionmode provides the highest level of data protection that is possible without affecting the performance of a primary database. This is accomplished by allowing transactions to commit as soon as all redo data generated by those transactions has been written to the online log. Redo data is also written to one or more standby databases, but this is done asynchronously with respect to transaction commitment, so primary database performance is unaffected by delays in writing redo data to the standby database(s).

This protection mode offers slightly less data protection than maximum availability mode and has minimal impact on primary database performance.
This is the default protection mode.

Maximum Protection

This protection mode ensures that zero data loss occurs if a primary database fails. To provide this level of protection, the redo data needed to recover a transaction must be written to both the online redo log and to at least one synchronized standby database before the transaction commits. To ensure that data loss cannot occur, the primary database will shut down, rather than continue processing transactions, if it cannot write its redo stream to at least one synchronized standby database.
Because this data protection mode prioritizes data protection over primary database availability, Oracle recommends that a minimum of two standby databases be used to protect a primary database that runs in maximum protection mode to prevent a single standby database failure from causing the primary database to shut down.


What is the difference between Physical standby and Logical standby database?

Data Guard Apply process in standby database can apply redo information directly and in that case it will be called physical standby.
OR It can apply SQL and in that case it will be called Logical standby.

Physical Standby:

In this case standby database is an exact, block-by-block, physical replica of the primary database.
The change vectors received by RFS process are directly applied to the standby database by using media recovery.so here the apply process read data blocks, assemble redo changes from mappings, and then apply redo changes to data blocks directly.
Physical Standby is the best choice for disaster recovery (DR) based upon their simplicity, transparency, high performance, and good data protection.

Logical Standby:

In this case standby database uses SQL Apply method to “mine” the redo by converting it to logical change records, and then building SQL
transactions and applying SQL to the standby database.
As this process of replaying the workload is more complex than the Physical Standby’s process, so it requires more memory, CPU, and I/O.
One good advantage here is that a logical standby database can be opened read-write while SQL Apply is active which means you can update (create/insert/delete etc) local tables and schemas in the logical standby database.


Explain the Dataguard Architecture

Data Guard architecture incorporates the following items:

• Primary Database - A production database that is used to create standby databases. The archive logs from the primary database are transfered and applied to standby databases. Each standby can only be associated with a single primary database, but a single primary database can be associated with multiple standby databases.

• Standby Database - A replica of the primary database.

• Log Transport Services - Control the automatic transfer of archive redo log files from the primary database to one or more standby destinations.

• Network Configuration - The primary database is connected to one or more standby databases using      Oracle Net.

• Log Apply Services - Apply the archived redo logs to the standby database. The Managed Recovery      Process (MRP) actually does the work of maintaining and applying the archived redo logs.

• Role Management Services - Control the changing of database roles from primary to standby. The services include switchover, switchback and failover.

• Data Guard Broker - Controls the creation and monitoring of Data Guard. It comes with a GUI and command line interface.

Primary Database:
A Data Guard configuration contains one production database, also referred to as the primary database, that functions in the primary role. This is the database that is accessed by most of your applications.

Standby Database:
A standby database is a transactionally consistent copy of the primary database. Using a backup copy of the primary database, you can create up to nine standby databases and incorporate them in a Data Guard configuration. Once created, Data Guard automatically maintains each standby database by transmitting redo data from the primary database and then applying the redo to the standby database.
The types of standby databases are as follows:

Physical standby database:
Provides a physically identical copy of the primary database, with on disk database structures that are identical to the primary database on a block-for-block basis. The database schema, including indexes, are the same. A physical standby database is kept synchronized with the primary database, through Redo Apply, which recovers the redo data received from the primary database and applies the redo to the physical standby database.

Logical standby database:
Contains the same logical information as the production database, although the physical organization and structure of the data can be different. The logical standby database is kept synchronized with the primary database through SQL Apply, which transforms the data in the redo received from the primary database into SQL statements and then executes the SQL statements on the standby database.

What are the Steps to create Physical Standby database?

1.Take a full hot backup of Primary database

2. Enable force logging to the database

3. Prepare parameter file for primary database

4. Enable archiving

5.Create standby control file

6.Transfer full backup, init.ora, standby control file to standby node.

7.Modify init.ora file on standby node.

8.Restore database

9.Recover Standby database

10.Put Standby database in Managed Recover mode


What are the DATAGUARD PARAMETERS in Oracle?

Set Primary Database Initialization Parameters
----------------------------------------------
On the primary database, you define initialization parameters that control redo transport services while the database is in the primary role. There are additional parameters you need to add that control the receipt of the redo data and log apply services when the primary database is transitioned to the standby role.

DB_NAME=chicago
DB_UNIQUE_NAME=chicago
LOG_ARCHIVE_CONFIG='DG_CONFIG=(chicago,boston)'
CONTROL_FILES='/arch1/chicago/control1.ctl', '/arch2/chicago/control2.ctl'
LOG_ARCHIVE_DEST_1=
 'LOCATION=/arch1/chicago/
  VALID_FOR=(ALL_LOGFILES,ALL_ROLES)
  DB_UNIQUE_NAME=chicago'
LOG_ARCHIVE_DEST_2=
 'SERVICE=boston LGWR ASYNC
  VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)
  DB_UNIQUE_NAME=boston'
LOG_ARCHIVE_DEST_STATE_1=ENABLE
LOG_ARCHIVE_DEST_STATE_2=ENABLE
REMOTE_LOGIN_PASSWORDFILE=EXCLUSIVE
LOG_ARCHIVE_FORMAT=%t_%s_%r.arc
LOG_ARCHIVE_MAX_PROCESSES=30

Primary Database: Standby Role Initialization Parameters

FAL_SERVER=boston
FAL_CLIENT=chicago
DB_FILE_NAME_CONVERT='boston','chicago'
LOG_FILE_NAME_CONVERT=  '/arch1/boston/','/arch1/chicago/','/arch2/boston/','/arch2/chicago/'
STANDBY_FILE_MANAGEMENT=AUTO

Prepare an Initialization Parameter File for the Standby Database
-----------------------------------------------------------------
Create a text initialization parameter file (PFILE) from the server parameter file (SPFILE) used by the primary database; a text initialization parameter file can be copied to the standby location and modified. For example:
CREATE PFILE='/tmp/initboston.ora' FROM SPFILE;

Modifying Initialization Parameters for a Physical Standby Database.

DB_NAME=chicago
DB_UNIQUE_NAME=boston
LOG_ARCHIVE_CONFIG='DG_CONFIG=(chicago,boston)'
CONTROL_FILES='/arch1/boston/control1.ctl', '/arch2/boston/control2.ctl'
DB_FILE_NAME_CONVERT='chicago','boston'
LOG_FILE_NAME_CONVERT= '/arch1/chicago/','/arch1/boston/','/arch2/chicago/','/arch2/boston/'
LOG_ARCHIVE_FORMAT=log%t_%s_%r.arc
LOG_ARCHIVE_DEST_1=  'LOCATION=/arch1/boston/
VALID_FOR=(ALL_LOGFILES,ALL_ROLES)
DB_UNIQUE_NAME=boston'
LOG_ARCHIVE_DEST_2=  'SERVICE=chicago LGWR ASYNC  VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)   DB_UNIQUE_NAME=chicago'
LOG_ARCHIVE_DEST_STATE_1=ENABLE
LOG_ARCHIVE_DEST_STATE_2=ENABLE
REMOTE_LOGIN_PASSWORDFILE=EXCLUSIVE
STANDBY_FILE_MANAGEMENT=AUTO
FAL_SERVER=chicago
FAL_CLIENT=boston


What are the services required on the primary and standby database ?

The services required on the primary database are:

• Log Writer Process (LGWR) - Collects redo information and updates the online redo logs. It can also      create local archived redo logs and transmit online redo to standby databases.

• Archiver Process (ARCn) - One or more archiver processes make copies of online redo logs either locally or remotely for standby databases.

• Fetch Archive Log (FAL) Server - Services requests for archive redo logs from FAL clients running on multiple standby databases. Multiple FAL servers can be run on a primary database, one for each FAL request. .

The services required on the standby database are:

• Fetch Archive Log (FAL) Client - Pulls archived redo log files from the primary site. Initiates transfer of archived redo logs when it detects a gap sequence.

• Remote File Server (RFS) - Receives archived and/or standby redo logs from the primary database.

• Archiver (ARCn) Processes - Archives the standby redo logs applied by the managed recovery process (MRP).

• Managed Recovery Process (MRP) - Applies archive redo log information to the standby database.


What is RTS (Redo Transport Services) in Dataguard?

It controls the automated transfer of redo data from the production database to one or more archival destinations. The redo transport services perform the following tasks:

a) Transmit redo data from the primary system to the standby systems in the configuration.

b) Manage the process of resolving any gaps in the archived redo log files due to a network failure.

c) Automatically detect missing or corrupted archived redo log files on a standby system and automatically retrieve replacement archived redo log files from the
primary database or another standby database.

What is a Snapshot Standby Database?

Oracle 11g introduces the Snapshot Standby database which essentially is an updateable standby database which has been created from a physical standby database.

We can convert a physical standby database to a snapshot standby database, do some kind of testing on a database which is a read write copy of the current primary or production database and then finally revert it to  it’s earlier state as a physical standby database.

While the snapshot standby database is open in read-write mode, redo is being received from the primary database, but is not applied.

After converting it back to a physical standby database, it is resynchronized with the primary by applying the accumalated redo data which was earlier shipped from the primary database but not applied.

Using a snapshot standby, we are able to do real time application testing using near real time production data. Very often we are required to do production clones for the purpose of testing. But using snapshot standby databases we can meet the same requirement sparing the effort,time,resources and disk space.

A snapshot standby database is a fully updatable standby database that is created by converting a physical standby database into a snapshot standby database.

Like a physical or logical standby database, a snapshot standby database receives and archives redo data from a primary database. Unlike a physical or logical standby database, a snapshot standby database does not apply the redo data that it receives. The redo data received by a snapshot standby database is not applied until the snapshot standby is converted back into a physical standby database, after first discarding any local updates made to the snapshot standby database.

How to delay the application of logs to a physical standby? 

A standby database automatically applies redo logs when they arrive from the primary database. But in some cases, we want to create a time lag between the archiving of a redo log at the primary site, and the application of the log at the standby site.

Modify the LOG_ARCHIVE_DEST_n initialization parameter on the primary database to set a delay for the standby database.

Example: For 60min Delay:
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=stdby_srvc DELAY=60';
The DELAY attribute is expressed in minutes.
The archived redo logs are still automatically copied from the primary site to the standby site, but the logs are not immediately applied to the standby database. The logs are applied when the specified time interval expires.

What is the usage of DB_FILE_NAME_CONVERT parameter in Oracle Data Guard setup?

DB_FILE_NAME_CONVERT parameter is used in Oracle Data Guard setup that to in standby databases. DB_FILE_NAME_CONVERT parameter are used to update the location of data files in standby database. These parameter are used when you are using different directory structure in standby database compare to primary database data files location.

What is the usage of LOG_FILE_NAME_CONVERT parameter in Oracle Data Guard setup?

LOG_FILE_NAME_CONVERT parameter is used in Oracle Data Guard setup that to in standby databases. LOG_FILE_NAME_CONVERT parameter are used to update the location of redo log files in standby database. These parameter are used when you are using different directory structure in standby database compare to primary database redo log file location.

Your standby database was out of reach because of network issue. How will you synchronize it with primary database again?

Data Guard automatically resynchronizes the standby following network or standby outages using redo data that has been archived at the primary.


What is the difference between SYNC and ASYNC redo transport method?

Synchronous transport (SYNC)

Also known as a “zero data loss” redo transport menthod.

Below is how it works:

1) Log Network Server (LNS) reads redo information from the redo buffer in SGA of PRIMARY Database

2) Log Network Server (LNS) passes redo to Oracle Net Services for transmission to the STANDBY database

3) Remote File Server (RFS) records the redo information transmitted by the LNS at the STANDBY database

4) Remote File Server (RFS) writes it to a sequential file called a standby redo log file (SRL) at the STANDBY database

5) Remote File Server (RFS) transmits an acknowledgement back to the LNS process on the primary database

6) Log Network Server (LNS) notifies the LGWR that transmission is complete on the primary database.

7) Log Writer (LGWR) acknowledges the commit to the user.

Asynchronous transport (ASYNC)

Unlike SYNC, Asynchronous transport (ASYNC) eliminates the requirement that the LGWR wait for acknowledgement from the LNS. This removes the performance impact on the primary database irrespective of the distance between primary and standby locations. So if the LNS is unable to keep pace and the log buffer is recycled before the redo can be transmitted to the standby, the LNS automatically transitions to reading and sending from the Online Redo logs. Once the LNS is caught up, it automatically transitions back to reading & sending directly from the log buffer.

Below is how it works:

1) Log Network Server (LNS) reads redo information from the redo buffer in SGA of PRIMARY Database

2) Log Network Server (LNS) passes redo to Oracle Net Services for transmission to the STANDBY database

3) Remote File Server (RFS) records the redo information transmitted by the LNS at the STANDBY database

4) Remote File Server (RFS) writes it to a sequential file called a standby redo log file (SRL) at the STANDBY database

so step 5, 6 & 7 as discussed above for SYNC are not applicable here.

The only drawback of ASYNC is the increased potential for data loss. Say a failure destroyed the primary database before any transport lag was reduced to zero, this means any committed transactions that were a part of the transport lag will be lost. So it is highly advisable to have enough network bandwidth to handle peak redo
generation rates when using ASYNC method.

How Synchronous transport (SYNC) can impact the primary database performance?

SYNC guarantees protection for every transaction that the database acknowledges as having been committed but at the same time LGWR must wait for confirmation that data is protected at the standby before it can proceed with the next transaction. It can impact primary database performance and it depends on factors like
  • the amount of redo information to be written
  • available network bandwidth
  • round-trip network latency (RTT)
  • standby I/O performance writing to the SRL.
  • distance betweeen primary and standby databases as network RTT increases with distance.


What is Data Guard’s Automatic Gap Resolution?

Your database is using ASYNC transport method and the instance load is at the peak. The LNS is unable to keep pace and the log buffer is recycled before the redo can be transmitted to the standby, the LNS automatically transitions to reading and sending from the Online Redo logs. Once the LNS is caught up, it automatically transitions back to reading & sending directly from the log buffer.

Now in some cases there can be two or more log switches before the LNS has completed sending the redo information from online redo log files and in meantime if any such required online redo log files were archived then those redo information will be transmitted via Data Guard’s gap resolution process “Automatic Gap Resolution”.

OR

In some other case when your network or the standby database is down and your primary system is one busy system, so before the connection between the primary and standby is restored, a large log file gap will be formed.
Automatic Gap Resolution will take care of such scenarios by following below action plan:

1) ARCH process on the primary database continuously ping the standby database during the outage to determine its status.
2) As soon as the standby is restored, the ARCH ping process queries the standby control file (via its RFS process) to determine the last complete log file that the standby received from the primary database.
3) Data Guard determines which log files are required to resynchronize the standby database and immediately begins transmitting them using additional ARCH processes.
4) LNS process at primary database will also attempt and succeed in making a connection to the standby database and will begin transmitting current redo. So first all the ARCH files are applied and then current redo log.

The Data Guard architecture enables gaps to be resolved quickly using multiple background ARCH processes


How is Data Guard Apply process works if primary and secondary database involves Oracle RAC?

If Primary database is RAC but standby is Non-RAC:

Each primary Oracle RAC instance ships its own thread of redo that is merged by the Data Guard apply process at the standby and applied in SCN order to the standby database.

If both Primary and standby databases are RAC:

If the standby is also an Oracle RAC database, only one instance (the apply instance) will merge and apply changes to the standby database. If the apply instance fail for any reason, the apply process will automatically failover to a surviving instance in the Oracle RAC standby database when using the Data Guard broker.

What is Active Data Guard Option (Oracle Database 11g Enterprise Edition)?

For physical standby database, prior to 11g, the database would have to be in the mount state when media recovery was active which means you were not able to query the standby database during media recovery stage as there was no read-consistent view.

Active Data Guard 11g features solves the read consistency problem by use of a “query” SCN. The media recovery process on the standby database will advance the query SCN after all the changes in a transaction have been applied . The query SCN will appear to user as the CURRENT_SCN column in the V$DATABASE view on the standby database. So Read-only users will only be able to see data up to the query SCN, and hence guaranteeing the same read consistency as the primary database.
This enables a physical standby database to be open as read-only while media recovery is active, making it useful for doing read-only workloads.

Also, if you need read-write access to the standby database, you can use SQL Apply method of dataguard.

What are the important database parameters related to Data Guard corruption prevention?

On the primary database:

a) DB_ULTRA_SAFE

Values can be DATA_AND_INDEX or DATA_ONLY. Setting DB_ULTRA_SAFE at the primary will also automatically set DB_ LOST_WRITE_PROTECT=TYPICAL on the primary database.
In Oracle Database 11g Release 2 (11.2), the primary database automatically attempts to repair the corrupted block in real time by fetching a good version of the same block from a physical standby database.

On the standby database:

a) DB_BLOCK_CHECKSUM=FULL
DB_BLOCK_CHECKSUM detects redo and data block corruptions and detect corruptions on the primary database and protect the standby database. This parameter requires minimal CPU resources.

b) DB_LOST_WRITE_PROTECT=TYPICAL
A lost write can occur when an I/O subsystem acknowledges the completion of a write, while in fact the write did not occur in persistent storage.
This will create a stale version of the data block. When the DB_LOST_WRITE_PROTECT initialization parameter is set, the database records buffer cache block reads in the redo log, and this information is used to detect lost writes.
You set DB_LOST_WRITE_PROTECT to TYPICAL in both primary and standby databases.

What is Switchover event?

Switchover is useful for minimizing downtime during planned maintenance. It is a planned event in which Data Guard reverses the roles of the primary and a standby database.

The primary database runs unaffected while we are making the required changes on our standby database (e.g. patchset upgrades, full Oracle version upgrades, etc).

Once changes are complete, production is switched over to the standby site running at the new release.

This means regardless of how much time is required to perform planned maintenance, the only production database downtime is the time required to execute a switchover, which can be less than 60 seconds

Below operations happens when switchover command is executed:
1. primary database is notified that a switchover is about to occur.
2. all users are disconnected from the primary.
3. a special redo record is generated that signals the End Of Redo (EOR).
4. primary database is converted into a standby database.
5. the final EOR record is applied to standby database, this guarantees that no data has been lost, and it converts the standby to the primary role.

What is Failover event?

The Failover process is similar to switchover event except that the primary database never has the chance to write an EOR record as this is an unplanned event.
Whether or not a failover results in data loss depends upon the Data Guard protection mode:

a) Maximum Protection >> No Data Loss
b) Maximum Availability >> No Data Loss (except when there was a previous failure (e.g. a network failure) that had INTERRUPTED REDO TRANSPORT and allowed the primary database to move ahead of standby)

c) Maximum Performance (ASYNC) >> may lose any committed transactions that were not transmitted to the standby database before the primary database failed.

Failover event can be of two types:
1) Manual
Administrator have complete control of primary-standby role transitions. It can lengthen the outage by the amount of time required for the administrator to be notified and manual execution of command.
2) Automatic
It uses Data Guard’s Fast-Start Failover feature which automatically detects the failure, evaluates the status of the Data Guard configuration, and, if appropriate, executes the failover to a previously chosen standby database.

Which tools can be used for Data Guard Management?

1) SQL*Plus – traditional method, can prove most tedious to use

2) Data Guard broker – automates and centralizes the creation, maintenance, and monitoring of a Data Guard configuration. Simplifies and automates many administrative
tasks. It has its own command line (DGMGRL) and syntax.

3) Enterprise Manager – requires that the Data Guard broker be enabled. a GUI to the Data Guard broker, replacing the DGMGRL command line and interfacing directly with the broker’s monitor processes.

What is the difference between Recovery Point Objective(RPO) and Recovery Time Objective (RTO)?

A) Recovery Point Objective(RPO)
RPO concerns with data. It is the amount of data you are willing to lose when the failure occurs in your database system. Usually people define data loss in terms of time, so possible values can be 5 seconds of data loss, 2 hours of data loss etc.

Remember that each standby database has its own set of attributes and parameters. It means you can mix zero data loss standby databases with minimal data loss standby
databases in the same Data Guard configuration
If you have decided that you want to implement zero data loss strategy, then you should really focus on Networks and Data Loss

B) Recovery Time Objective (RTO)
RTO is defined as how fast you can get back up and running (whereas RPO is concerned with data loss)

So with your RPO strategy you lost say only about 6 seconds of data as you committed to your client but with RTO you need to formulate how fast clients can connect back to the database system after the data loss has occurred.

What are Standby Redo Log (SRL) files?

The SRL files are where the Remote File Server (RFS) process at your standby database writes the incoming redo so that it is persistent on disk for recovery. SRL files are important for better redo transport performance and data protection.

SRL are MUST in Maximum Availability or Maximum Protection mode and OPTIONAL (but recommended) in Maximum Performance mode.

If there are no Standby Redo Log (SRL) files, then at each log switch in the primary database, the RFS process on the standby database that is serving an asynchronous standby destination has to create an archive log of the right size. While the RFS is busy doing creating the archive log file, the LNS process at the primary database has to wait, getting further and further behind the LGWR (in case of Maximum Performance mode). That is why it recommended to have Standby Redo Log (SRL) files in Maximum Performance mode also.

We generally configure them on our primary database as well in preparation for a role transition b/w primary-standby.

Also, do not multiplex SRLs. Since Data Guard will immediately request a new copy of the archive log if an SRL file fails, there is no real need to have more than one copy of each.

Oracle DBA Interview Questions and Answers - ASM


Explain ASM Architecture

The three major components of ASM are the ASM instance, ASM disk groups, and ASM files.

The ASM instance is a special Oracle instance—it does not have its own data files like a regularOracle database does. A single ASM instance on a server can manage the ASM file systems for all the Oracle databases on that server. The ASM instance looks after disk groups and gives the database access to the ASM files. The database makes the initial contact with the ASM instance to get information on the data files, but it accesses those files directly. The ASM instance must be running for an Oracle database to use the ASM file system, and the ASM instance can’t be shut down while the other Oracle databases using ASM file systems are still running, since those databases will crash without the ASM instance.

ASM disk groups are somewhat analogous to logical volumes created by a Logical Volume Manager. Unlike the usual Oracle database files, you don’t access ASM files directly. Disks in an ASM context are rather loosely defined and can include a partition of a disk spindle or the entire disk spindle itself. This depends on how the storage system represents the logical unit number (LUN) to the operating system. Any LUN or a disk represented to the operating system is called a disk. Since each operating system could have a different disk-naming system, check your disk-naming system.

 ASM files are part of an ASM disk group, which contains all your database files. ASM manages a disk group consisting of several disk drives as a single unit, and it spreads the data evenly among all the disks in the group. You don’t have to change the management of your database if you want to switch to an ASM system, because you can use your operating system–based files with the new ASM files. Logical concepts such as extents, segments, and tablespaces work the same way under an ASM
system.

Here’s a summary of an ASM storage system:
• A database is allowed to have multiple disk groups.
• You can store all of your Oracle database files as ASM files because Oracle sets up a one-toone
mapping between an Oracle database file (data files and control files, for example) and
an ASM file.
• An ASM disk group comprises a set of disk drives.
• ASM disk groups are permitted to contain files from more than one disk.
• ASM files always spread over every disk in an ASM disk group and belong to one disk
group only.
• ASM allocates disk space in allocation units of 1MB.

What is ASM(Automatic Storage Management)

Oracle ASM is Oracle’s volume manager specially designed for Oracle database data. It is available since Oracle database version 10g and many improvements have been made in versions 11g release 1 and 2. 

ASM offers support for Oracle RAC clusters without the requirement to install 3rd party software, such as cluster aware volume managers or filesystems.

ASM is shipped as part of the database server software (Enterprise and Standard editions) and does not cost extra money to run.

ASM simplifies administration of Oracle related files by allowing the administrator to reference disk groups
rather than individual disks and files, which are managed by ASM.

The ASM functionality is an extention of the Oracle Managed Files (OMF) functionality that also includes striping and mirroring to provide balanced and secure storage. The new ASM functionality can be used in combination with existing raw and cooked file systems, along with OMF and manually managed files.


What are the key benefits of ASM

ASM provides automatic load balancing over all the available disks, thus reducing hot spots
in the file system.

• ASM prevents fragmentation of disks, so you don’t need to manually relocate data to tune
I/O performance.

• Adding disks is straightforward—ASM automatically performs online disk reorganization
when you add or remove storage.

• ASM uses redundancy features available in intelligent storage arrays.

• The ASM storage system stores all types of database files.

• ASM makes your file management tasks easier, because you will be dealing with just a few
groups of disks, rather than a multitude of database files. ASM automatically creates the
database files and places them in appropriate disk groups.

• ASM does mirroring and striping, which in turn increases reliability and performance. You
can select different reliability and performance characteristics for various types of data. For
example, you can use fine-grained striping for redo log files and a coarser-grained striping
for regular data files.

• ASM is free!



ASM and Cluster Synchronization Service

An ASM storage system requires the use of an additional specialized database instance called ASM, which will actually manage the storage for a set of Oracle databases. In order to use ASM storage for your Oracle databases, you must first ensure that you have Oracle’s Cluster Synchronization Service (CSS) running on your databases.

CSS is responsible for synchronizing ASM instances and your database instances, and it is
installed as part of your Oracle software. CSS also synchronizes recovery from an ASM instance failure. You can find out if the CSS service is running by using the following command:

$ ps -ef | grep css
oracle 5506 1 1 Apr 11 ? 630:05 /u03/app/oracle/bin/ocssd.bin
oracle 12791 10525 2 16:38:39 pts/11 0:00 grep css
$

The preceding ps -ef output shows that the CSS service is indeed running. If you get the following result instead, it means that your CSS service hasn’t been started:

$ ps -ef | grep css
oracle 2207 19736 0 18:12:39 pts/6 0:00 grep css
$

What are the init parameters related to ASM?

• INSTANCE_TYPE: In an Oracle Database 10g database, you have two types of Oracle instances:RDBMS and ASM. RDBMS, of course, refers to the normal Oracle databases, and ASM refers to the new ASM instance. Set the INSTANCE_TYPE parameter to ASM. This will implicitly set the DB_UNIQUE_NAME parameter to +ASM.

• ASM_POWER_LIMIT: This is the maximum speed of this ASM instance during a rebalance disk operation. This operation redistributes the data files evenly and balances I/O load across the disks. The default is 1 and the range is from 1 to 11 (1 is slowest and 11 is fastest).

• ASM_DISKSTRING: This is the location where Oracle should look during a disk-discovery
process. The format of the disk string may vary according to your operating system. You
can specify a list of values as follows; this example limits the ASM discovery to disks whose
names end in s1 and s2 only:
ASM_DISKSTRING = '/dev/rdsk/*s1', '/dev/rdsk/*s2'

• ASM_DISKGROUPS: Here you specify the name of any disk group that you want to mount automatically at instance startup; the default value for this parameter is NULL

What is rebalancing (or) what is the use of ASM_POWER_LIMIT?

ASM_POWER_LIMIT is dynamic parameter, which will be useful for rebalancing the data across disks.
Value can be 1(lowest) to 11 (highest).

What are different types of redundancies in ASM & explain?

External redundancy,
Normal redundancy,
High redundancy.

I’m going to do add disks to my ASM diskgroup, how long will this rebalance take?

Rebalance time is heavily driven by the three items:
1) Amount of data currently in the diskgroup
2) IO bandwidth available on the server
3) ASM_POWER_LIMIT or Rebalance Power Level

What is stripping and mirroring.

Striping is spreading data across multiple disks so that IO is spread across multiple disks and hence increase in throughput. It provides read/write performance but fail over support.
ASM offers two types of striping, with the choice depending on the type of database file. Coarse striping uses a stripe size of 1MB, and you can use coarse striping for every file in your database, except for the control files, online redo log files, and flashback files. Fine striping uses a stripe size of 128KB. You can use fine striping for control files, online redo log files, and flashback files.

 Mirroring means redundancy. It may add performance benefit for read operations but overhead for write operations. It's basic purpose is to provide fail over support.
There are three ASM mirroring options:

High Redundancy - In this configuration, for each primary extent, there are two mirrored extents. For Oracle Database Appliance this means, during normal operations there would be three extents (one primary and two secondary) containing the same data, thus providing “high” level of protection. Since ASM distributes the partnering extents in a way that prevents all extents to be unable due to a component failure in the IO path, this configuration can sustain at least two simultaneous disk failures on Oracle Database Appliance (which should be rare but is possible).

Normal Redundancy - In this configuration, for each primary extent, there is one mirrored (secondary) extent. This configuration protects against at least one disk failure. Note that in the event a disk fails in this configuration, although there is typically no outage or data loss, the system operates in a vulnerable state, should a second disk fail while the old failed disk replacement has not completed. Many Oracle Database Appliance customers thus prefer the High Redundancy configuration to mitigate the lack of additional protection during this time.

External Redundancy - In this configuration there are only primary extents and no mirrored extents. This option is typically used in traditional non-appliance environments when the storage sub-system may have existing redundancy such as hardware mirroring or other types of third-party mirroring in place. Oracle Database Appliance does not support External Redundancy.8.  What is a diskgroup?
A disk group consists of multiple disks and is the fundamental object that ASM manages. Each disk group contains the metadata that is required for the management of space in the disk group. The ASM instance manages the metadata about the files in a Disk Group in the same way that a file system manages metadata about its files. However, the vast majority of I/O operations do not pass through the ASM instance. In a moment we will look at how file
I/O works with respect to the ASM instance.

ASM background processes

ASM rebalance master (RBAL) and ASM rebalance(ARBn). The RBAL process coordinates disk activity, and the ARBn processes perform the rebalancing work, which can include moving data extents.

In addition to ASM’s RBAL and ARBn, any Oracle database instance that uses ASM will have
two ASM-related background processes: RBAL and ASM background (ASMB). RBAL performs global opens of the disks that are part of the ASM disk group, and ASMB connects to the ASM instance as a foreground process and links the ASM instance and your database instance, sending information such as notifications when a data file is created or deleted, and when statistics are updated.


What are Diskgroup’s and Failuregroups?

Diskgroup is a terminology used for logical structure which holds the database files. Each Diskgroup consists of Disks/Raw devices where the files are actually stored. Any ASM file is completely contained within a single disk group. However, a disk group might contain files belonging to several databases and a single database can use files from multiple disk groups.
Failuregroups are used when using Normal/High Redundancy. They contain the mirrored ASM extents and must be containing different disks and preferably on separate disk controller.

Can ASM be used as replacement for RAID?

ASM is supposed to stripe the data and also mirror the data (if Using Normal, High Redundancy). So this can be used as a alternative for RAID 0+1 solutions

How does ASM provides Redundancy?

When you create a disk group, you specify an ASM disk group type based on one of the following three redundancy levels:
Normal for 2-way mirroring - When ASM allocates an extent for a normal redundancy file, ASM allocates a primary copy and a secondary copy. ASM chooses the disk on which to store the secondary copy in a different failure group other than the primary copy.
High for 3-way mirroring. In this case the extent is mirrored across 3 disks.
External to not use ASM mirroring. This is used if you are using Third party Redundancy mechanism like RAID, Storage arrays.

I am unable to open the ASM instance. What is the reason?

ASM instance does not have open stage. It has got only two options
Nomount- This starts the ASM instance
Mount- At this stage, Diskgroup defined in ASM_DISKGROUPS parameter are mounted
When you try to open the ASM instance , you get following error
<span style="font-size: small; font-family: arial,helvetica,sans-serif;">SQL&gt; alter database open;
alter database open
*
ERROR at line 1:
ORA-15000: command disallowed by current instance type</span>

Can ASM instance and database (rdbms) be on different servers?

ASM instance and Database (rdbms) have to be present on same server. Otherwise it will not work.

Can we see the files stored in the ASM instance using standard unix commands.

No, you cannot see the files using standard unix commands like ls. You need to use utility called asmcmd to do this. This is present in 10.2 and above.e.g
/home/oracle>asmcmd
Asmcmd>
You can use help command to see the options.

What is ASMLIB?

ASMLIB is the support library for the ASM. ASMLIB allows an Oracle database using ASM more efficient and capable access to diskgroups. The purpose of ASMLIB, is to provide an alternative interface to identify and access block devices. Additionally, the ASMLIB API enables storage and operating system vendors to supply extended storage-related features.

What is SYSASM role?

Starting from Oracle 11g, SYSASM role can be used to administer the ASM instances. You can continue using SYSDBA role to connect to ASM but it will generate following warning messages at time of startup/shutdown, create Diskgroup/add disk ,etc
<span style="font-size: small; font-family: arial,helvetica,sans-serif;">Alert entry
WARNING: Deprecated privilege SYSDBA for command 'STARTUP'</span>

How can we copy the files from/to ASM?

You can use RMAN or DBMS_FILE_TRANSFER.COPY_FILE procedure to copy the files to/from ASM from/to Filesystem. Starting from Oracle 11g, you can use cp command in asmcmd to perform the same between ASM Diskgroups and also to OS Filesystem.

Is it mandatory to use disks of same size and characteristics for Diskgroups?

No, it is not mandatory to use the disks of same size and characteristics for Diskgroups though it is a Recommended Practice.
Same size disk for Failuregroups in Normal/High redundancy will prevent issues like ORA-15041 as the file extents needs to be mirrored across the disks. Also as Oracle distributes data based on capacity, so larger disk will have more data stored in it and which will result in higher I/O to disk and eventually can lead to sub-optimal performance.
Moreover having disks of different characteristic like varying disk speed can impact the performance.
When managing disks with different size and performance capabilities, best practice is to group them into disk groups according to their characteristics. So you can use higher speed disks for your database files while other disks can be part of Diskgroup used for Flash Recovery Area.

Do we need to install ASM and Oracle Database Software in different ORACLE_HOME?

No. Again installing ASM and Oracle Database Software in different ORACLE_HOME is not mandatory but a best practice. This is useful in cases when we need to have multiple databases using same ASM instance and you need to patch only one of them. E.g You need to apply a CBO patch to one of 10.2 database while your other 10.1 database using different installation does not require it. In this case having a ASM_HOME separate from 10.2 ORACLE_HOME will allow your 10.1 database to keep running. Thus this approach is useful for High Availability.

What is the maximum size of Disk supported by ASM?

ASM supports disks upto 2Tb, so you need to ensure that lun size should be less then 2Tb. 10.2.0.4 and 11g database will give error if you try to create a diskgroup with ASM disks having disk size >2Tb.

I have created Oracle database using DBCA and having a different home for ASM and Oracle 

Database. I see that listener is running from ASM_HOME. Is it correct?
This is fine. When using different home for ASM, you need to run the listener from ASM_HOME instead of ORACLE_HOME.