Wednesday, April 22, 2020

EM agent start fails With 'java.lang.OutOfMemoryError'

I will show how to solve EM Agent failing to startup related to java memory problems.

Suddenly i found my agent is not starting with error Failed. So now its time to find issue and need to fix.

I have checked multiple logfiles but found this error in emagent.nohup, there was an error related to java memory:

It was an "Out of Memory Exception". To solve it, I had to increase the java heap space allocated to emagent.

(Reference Oracle Doc Id- 1399201.1)

Edit '$AGENT_BASE/agent_inst/sysman/config/' and increase the memory allocated to 512M.


cat | grep agentJavaDefines

agentJavaDefines=-Xmx128M -XX:MaxPermSize=96M


cat | grep agentJavaDefines
agentJavaDefines=-Xmx512M -XX:MaxPermSize=96M

Now Try to restart agent using "emctl start agent".

emctl start agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
Starting agent ..................... started.

 if you still face any issues, you will need to clear the agent state and retry.
Do the following:
  1. Stop and kill any remaining process related to EM Agent that is still running (perl or java).
  2. Clean all files from '$AGENT_BASE/agent_inst/sysman/emd/state/*' folder (or move them to a temporary place).
  3. Clear agent state by running: emctl clearstate agent.
  4. Increase the memory allocated editing, as showed before, if not already done.
  5. Try to start the agent again.

Monday, March 30, 2020

Redo and Undo Generation 

To find sessions generating lots of redo, you can use either of the following 
methods. Both methods examine the amount of undo generated. When a transaction 
generates undo, it will automatically generate redo as well.

The methods are:
1) Query V$SESS_IO. This view contains the column BLOCK_CHANGES which indicates
   how much blocks have been changed by the session. High values indicate a 
   session generating lots of redo.

   The query you can use is:

       SQL> SELECT s.sid, s.serial#, s.username, s.program,
         2  i.block_changes
         3  FROM v$session s, v$sess_io i
         4  WHERE s.sid = i.sid
         5  ORDER BY 5 desc, 1, 2, 3, 4;

Run the query multiple times and examine the delta between each occurrence
   of BLOCK_CHANGES. Large deltas indicate high redo generation by the session.

2) Query V$TRANSACTION. This view contains information about the amount of
   undo blocks and undo records accessed by the transaction (as found in the 
   USED_UBLK and USED_UREC columns).

  The query you can use is:

      SQL> SELECT s.sid, s.serial#, s.username, s.program, 
        2  t.used_ublk, t.used_urec
        3  FROM v$session s, v$transaction t
        4  WHERE s.taddr = t.addr
        5  ORDER BY 5 desc, 6 desc, 1, 2, 3, 4;

   Run the query multiple times and examine the delta between each occurrence
   of USED_UBLK and USED_UREC. Large deltas indicate high redo generation by 
   the session.

You use the first query when you need to check for programs generating lots of 
redo when these programs activate more than one transaction. The latter query 
can be used to find out which particular transactions are generating redo.

Friday, March 27, 2020

Unregistered Database From RMAN Catalog

If we need to unregister database from recovery catalog so we have two option-

1> Need to connect from target server and using RMAN easily we can unregister.

2> Suppose in a case if we dont have database(someone dropped) and we need to clear this from             rman catalog server then


Connect to rman catalog server, then connect to rman user or whatever is catalog user we have then do needful as below.

SQL>  Conn rman/rman

       Below is the query by which we will get command which we need to execute, so we need to    pass DB_NAME in my case DB_NAME is TEST.

SQL> select 'EXECUTE dbms_rcvcat.unregisterdatabase('||db_key||','||dbid||');' FROM                           rc_database WHERE name in('TEST') ;

EXECUTE dbms_rcvcat.unregisterdatabase(36631376,3102016920);

SQL> select db_key,DBID,NAME from rc_database where name='TEST';

    DB_KEY       DBID NAME
---------- ---------- --------
  36631376 3102016920 TEST

SQL>EXECUTE dbms_rcvcat.unregisterdatabase(36631376,3102016920);

PL/SQL procedure successfully completed.

Wednesday, July 17, 2019

How to understand AWR report in Oracle11g?

Let we discuss about How to understand AWR report in oracle11g?
* We already know the steps to generate AWR reports , but after you did this how can you read it .
To understand AWR report:
To check "DB Time"

Snap Id Snap Time Sessions Cursors/Session
Begin Snap: 1235 23 May 2017 10:30 191 6.7
End Snap: 1236 23 May 2017  11:31    173 7.4
Elapsed: 59.23 (mins)
DB Time: 710.73 (mins) 

In above scenario " DB time " metric is higher than elapsed time so it meaning session are waiting for something
Note:if the Dbtime is higher than elapsed time so the session must be waiting for something

To View Instance Efficiency

 Instance Efficiency Percentages (Target 100%)
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 98.67 In-memory Sort %: 100.00
Library Hit %: 98.60 Soft Parse %: 99.69
Execute to Parse %: 5.26 Latch Hit %: 99.31
Parse CPU to Parse Elapsd %: 12.78 %Non-Parse CPU: 99.10 
Note: As per the thumb rule, Instance Efficieny Percentages should be ideally above 90%.

To Review Shared Pool Statistics

 Shared Pool Statistics
Begin End
Memory Usage %: 83.49 80.93
% SQL with executions>1: 42.46 82.96
% Memory for SQL w/exec>1: 47.77 81.03 
In above the memory Usage of shared pool statistics has been shown here now it seems 83 %
If suppose its higher than 90 % the conflict in the shared pool

To View Top 5 Events

 Top 5 Timed Events
Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
db file sequential read 4,076,086 28,532 7 66.9 User I/O
CPU time 11,214 26.3
Backup: sbtbackup 4 4,398 1,099,452 10.3 Administrative
log file sync 37,365 2,421 65 5.7 Commit
log file parallel write 37,928 1,371 36 3.2 System I/O 
Here, the significant wait is the db file sequential read which contributes to 67% of DB Time.

Then , SQL Statistics can be checked.

SQL Statistics
SQL ordered by Elapsed Time
SQL ordered by CPU Time
SQL ordered by Gets
SQL ordered by Reads 
SQL Statistics section would have commonly the above four sections.
Each section shows the list of SQLs based on the order of the respective metric.
For example, SQL ordered by Elapsed Time section shows the list of SQLs in the order of the Elapsed Time.
High resource consuming SQLs can be spotted out and meant for tuning.
Note: All the above four sections of SQL Statistics show the list of SQLs in descending order.
i.e, For ex: Highest elapsed time is shown as first.

Then comes the IO Stats section.
This shows the IO Statistics for each tablespaces in the database.

Note:As the thumb rule, the Av Rd(ms) [Average Reads in milliseconds] should not cross beyond 30, add myself(not greater that 30) which is considered to be IO bottleneck.

Tablespace IO Stats
ordered by IOs (Reads + Writes) desc 

Tablespace Reads     Av Reads/s Av Rd(ms) Av Blks/Rd Writes Av Writes/s Buffer Waits Av Buf Wt(ms)
TEMP        3,316,082 933         4.91      1.00              28,840 8 0 0.00
DATA1      520,120     146 16.06 1.21 185,846 52 902 13.00
DATA3      93,411       26 42.82 2.98 13,442 4 16 23.13
DATA2      98,171       28 91.97 7.97 5,333 2 325 34.89 

In the above example, the Av Rd(ms) is high in all tablespaces indicating the IO contention.

To verfiy Advisory Statistics

This section shows the following:-

Buffer Pool Advisory
PGA Aggr Summary
PGA Aggr Target Stats
PGA Aggr Target Histogram
PGA Memory Advisory
Shared Pool Advisory
SGA Target Advisory
Streams Pool Advisory
Java Pool Advisory 
It is very commonly used to check the advisories for the most important SGA structures like shared pool, buffer cache etc and PGA.

Then finally, init.ora Parameters is shown which shows the list of parameters set at instance level.

init.ora Parameters
All the above said sections except the DB Time can be checked from Statspack report also.
The statspack snapshots are not generated automatically as in AWR.It has to be generated during the problem period as follows:-
Take 2 snapshots between 60 minutes interval during the problem and generate the statspack report

exec statspack.snap
wait for 60 minutes
exec statspack.snap

Please run $ORACLE_HOME/rdbms/admin/spreport.sql
and specify BEGIN and END ID's of the snapshots taken during the problem.

Tuesday, July 16, 2019

Performance Tuning In Oracle 11g - Part 1

 Performance Tuning is the improvement of system performance. The motivation for such activity is called a performance problem, which can be real or anticipated.
 - Most systems will respond to increased load with some degree of decreasing
 - Oracle Performance method  is iterative  and DBA Core business is to improve performance
 - Performance tuning comes with  experience and resolving real-time bottleneck issues    
 I) Types of Performance Tuning:
 *Proactive  Tuning 
 - In design, development, testing stage -In production => ADDM(Automatic Database Diagnostic Monitor) -In this proactive, we have to check the database design, product development and test that.
 - After the  product  launch if we have any issues in building level run the ADDM adviser it gives appropriate built-in issue and recommendations to  DBA what is going on database background  
 *Reactive Tuning
 - In production, the problem has occurred
II) Resolving Performance issues:
 - If any performance issue occurred it can be resolved  by below
1)Tuning outside Database (DBOUTSIDE)
 - if any issues happened in performance level first  we need to  identify    the os level, network, storage  levels everything is going  good or not
- Once  you  confirmed there are no issues on outside the DB, then go  to inside the database level 2
2)Database tuning (DBINSIDE)
*Again we adopt the top-down approach
   - Tune the design
   - Tune the SQL queries
   - Tune the various memory structure of instance level
III) Methodology
  - Tuning Methodology to identify  the performance (problem ) issue where it happened and resolve using the below methodology
    *Monitor   - To monitor the issue where the problem has occurred
    *Diagnose - To identify the root cause
    *Tune - To resolve the issue
1)Monitoring Tools
Montior tools  
     *user Feedbacks
       - To get a query about the user  performance issue
      - If user  raise the complaints that time only we know the issue
     *Dynamic Performance View
      - These views owned by SYS users
      - Reside in-memory  values
      - All reads on these views are current reads
      - These cumulative values since  instance startup
      - We can refer to these views "V$dollar view names"
     - By using this view we can collect  statistics  collect tune investigation
  1)Instance Activity
  The statistics which collect information about the instance.
  - Overall statistics are listed in V$STATNAME
    *Parse time CPU
    *Physical reads
    *User commits
  - Cumulative Instance activity
    *Session level
    *System level
2)Wait Events
  The event  for which process is waiting  to be over  before it processed  
  - Provides information about sessions that  had to wait or must wait for different reasons
  * Overall wait events
  * Instance Level wait event in system level
  * Session Level wait events waited in past
- V$session_event currently waiting
- V$session
 - Rate of changes in cumulative statistics
 - Computed real-time  in MMON
  *Alert Log
 - Time to perform archiving
 - Instance recovery start and complete time
 - Incomplete checkpoints -Checkpoints start and end time
  *Enterprise Manager
 - We can access the database through the browser
 - Start and stop the database  
  Let us see one example of performance tunning   logged on sys user and check the user, account status from dba_users tables
   SQL>select username ,account_status from dba_users;
here we can choose Scott and hr  users and then we can give the permission emp table of Scott into hr
   SQL> Grant select ,update on scott.emp to hr;
here we logged into Scott user and tried to update the sal column of EMP
  SQL>update emp set sal=sal+5 where empno=7900;
     1 row updated
after this update we won't give commit or rollback next we go to other sessions in that name of HR login as the same thing we do
  SQL>update emp set sal=sal+5 where empno=7900;
now HR is waiting for Scott once we get  problem issue from development side first we ensure who are all in on-line
    SQL> select sid,serial#,username from V$session;

   ---------- ---------- ------------------------------
         1          5 SYS
         2          1
         3          1
         4          1
         5          1
         6          1
         7          1
         8          1
         9          1
        10          1
        11          1
        12          1
        13          1
        14          1
        15          1
        16          1
        20          2
        22          2
        24          3
        25         99
        29          3
        30         12
        31         16
        32         50
        33          3 HR
        34         15 SCOTT
  S-2: to wait out both user what is the status  
  SQL> select sid,event,seconds_in_wait,wait_time,state from V$session where sid in('33','34');

       SID EVENT                                                            SECONDS_IN_WAIT  WAIT_TIME STATE
       ---------- ---------------------------------------------------------------- --------------- ---------- -------------------
        33 enq: TX - row lock contention                                                829       0 WAITING
        34 SQL*Net message from client                                                  914       0 WAITING

  here the problem is "ENQ: TX- row lock contention" means that sid is waiting for someone
 S-3: let us find  CPU usage in system level  
    SQL> select  event,time_waited from V$system_event where event='enq: TX - row lock contention';

    EVENT                                                            TIME_WAITED
    ---------------------------------------------------------------- -----------
    enq: TX - row lock contention                                         154390

     SQL> /

     EVENT                                                            TIME_WAITED
    ---------------------------------------------------------------- -----------
    enq: TX - row lock contention                                         158326

  S-4: find which session is waiting for session-level
   SQL> select sid,event,time_waited from V$session_event where event='enq: TX - row lock contention' and sid in ('33','34');

       SID EVENT                                                            TIME_WAITED
      ---------- ---------------------------------------------------------------- -----------
        33 enq: TX - row lock contention                                         175386
  S-5: find out how the user has been logged
     SQL> select event,service_name,total_waits,time_waited,average_wait,max_wait from V$service_event where event='enq: TX - row lock contention';
         EVENT                                                            SERVICE_NAME                  TOTAL_WAITS TIME_WAITED AVERAGE_WAIT    MAX_WAIT
    --------------------------------------------------------------- ---------------------------------------------------------------- ----------- ----------- ------------ ----------
           enq: TX - row lock contention                                    SYS$USERS                       1       190688       190688          0
S-6: to find the blocked  session user
        SQL> select  BLOCKING_SESSION_STATUS, BLOCKING_INSTANCE, BLOCKING_SESSION ,username from V$session where sid in('33','34');

          ----------- ----------------- ---------------- ------------------------------
           VALID                       1               34 HR
          NO HOLDER                                      SCOTT
S-7: to find the object _id because using this id we can find which  has been trouble
       SQL> select ROW_WAIT_OBJ#   "Object_id" ,ROW_WAIT_FILE# "realtive fileno", ROW_WAIT_BLOCK#   "Block Number" from V$session where sid in('33');

                   Object_id realtive fileno Block Number
                  ---------- --------------- ------------
                    73181               4          151
S-8: to find the table problem occur table and owner
      SQL> select owner,object_type,object_name,data_object_id from dba_objects where object_id='73181';

      OWNER                          OBJECT_TYPE         OBJECT_NAME                                                DATA_OBJECT_ID
     ------------------------------ ------------------- ---------------------------------------------------------------------------------------------------------------------------        ----- --------------
          SCOTT                          TABLE               EMP                                                         73181
S-9:   to find out user details

       SQL> select machine,osuser,username ,sid,service_name,logon_time from V$session where sid in ('33','34');

          MACHINE                                                          OSUSER                         USERNAME        SID SERVICE_NAME                                                                  LOGON_TIM
---------------------------------------------------------------- ------------------------------ ------------------------------ ---------- ---------------------------------------------------------------- ---------
         goldengatesource                                                 oracle                         HR               33 SYS$USERS                                                                  12-MAR-16
        goldengatesource                                                 oracle                         SCOTT            34 SYS$USERS 

S-10: to sql_id for SQL query which has been issued
     SQL> select sql_id from V$session where sid in ('33','34');

S-11:   finally we found the issue query which has been given hr user now we ensure whether user Scott is going to commit or kill that session get the conformation application team.
     SQL> select sql_fulltext from  V$sql where sql_id='1dc2pgg0uh57f';

      update scott.emp set sal=sal-3 where empno=7900 


Thursday, June 20, 2019

How Database Connections are Created When Using SCANs

Based on the environment, the following actions occur when you use a SCAN to connect to an Oracle RAC database using a service name.
The numbered actions correspond to the arrows shown in Load Balancing Actions for Oracle RAC Connections That Use SCAN:
  1. The LREG process of each instance registers the database services with the default listener on the local node and with each SCAN listener, which is specified by the REMOTE_LISTENER database parameter. The listeners are dynamically updated on the amount of work being handled by the instances and dispatchers.
  2. The client issues a database connection request using a connect descriptor of the form:

  1. Note:
    If you use the Easy Connect naming method, then ensure that the sqlnet.ora file on the client contains EZCONNECT in the list of naming methods specified by the NAMES.DIRECTORY_PATH parameter.
  2. The client uses DNS to resolve scan_name. After DNS returns the three addresses assigned to the SCAN, the client sends a connect request to the first IP address. If the connect request fails, then the client attempts to connect using the next IP address.
  3. When the connect request is successful, the client connects to a SCAN listener for the cluster that hosts the sales database and has an instance offering the webapp service, which in this example is sales1 and sales2. The SCAN listener compares the workload of the instances sales1 and sales2 and the workload of the nodes on which they run. If the SCAN listener determines that node2 is less loaded than node1, then the SCAN listener selects node2 and sends the address for the local listener on that node back to the client.
  4. The client connects to the local listener on node2. The local listener starts a dedicated server process for the connection to the database.
  5. The client connects directly to the dedicated server process on node2 and accesses the sales2 database instance.
Figure Load Balancing Actions for Oracle RAC Connections That Use SCAN

1. Public IP- Public IP is used for Direct connection to a particular instance only.
2. VIP- it's used for transferring the connection to another VIP in case of any failure.
3. Scan IP- its a new concept from 11g R2 onwards, wherein it has pre-information of which node has how much of load and it balances the connection between multiple instances.
     Scan IP max cab be 3 (and Oracle has tested it and as per there internal result, 3 is more than           enough for N instances.)

Remote_listener will have your SCAN_IP.
The local listener will have your VIP of a particular node.
VIP will do failover for your existing connection and incoming new request for that IP
SCAN IP will have re-information about load and will redirect your request to respective less load VIP
if you avoid VIP then you are left with public IP i.e your network card assigned to your server. That public IP is not able to failover in case of any failure.
Private IP, it's only for internal communication of RAC components maintained and managed by Oracle software itself