Exadata - Oracle DB Blocking Sessions in BRM application
Issue #:
Issue occurrences: FEB 10th, FEB 15th, FEB 19th
Time : After 5:15PM for all occurrences
SQL_ID :
Query -- blocking itself and other queries
Analysis #1:
Query is not performing bad all the times and every time situation is not same. We can have a call today, to explain you more details.
1. 10th feb – multiple select+update was running
2. 11th ,12th and 13th feb – query execution is less than 1 second
3. 14th Feb – execution of query around 2,00 times and it took 1.2 seconds per execution.
4. 15th Feb – not only this query but insert query on table also took more than 2,000 seconds. But nothing load or suspicious observed on DB level. No cpu or IO wait also observed.
5. 16th and 17th Feb – query took less than 1 second
6. 18th Feb – execution more than 3,000 times and took 1.32 seconds per execution
7. 19th Feb -- not only this query but insert query on same table also took more than 25 seconds. Stats gathering only observed
10th
Feb 5pm to 6pm:
|
Event |
Wait Time |
Summary Avg Wait Time (ms) |
|||||||||
I# |
Class |
Event |
Waits |
%Timeouts |
Total(s) |
Avg(ms) |
%DB time |
Avg |
Min |
Max |
Std Dev |
Cnt |
* |
Application |
enq: TX - row lock contention |
41,882 |
0.00 |
397,243.98 |
9484.84 |
99.70 |
9546.90 |
8651.16 |
10442.64 |
1266.77 |
2 |
* |
|
DB CPU |
|
|
962.47 |
|
0.24 |
|
|
|
|
2 |
* |
Network |
SQL*Net more data from client |
17,408 |
0.00 |
200.38 |
11.51 |
0.05 |
11.51 |
11.40 |
11.62 |
0.15 |
2 |
* |
User I/O |
cell smart table scan |
1,107,581 |
41.36 |
191.88 |
0.17 |
0.05 |
0.17 |
0.17 |
0.17 |
0.00 |
2 |
* |
User I/O |
cell single block physical read |
169,057 |
0.00 |
70.15 |
0.41 |
0.02 |
0.45 |
0.35 |
0.54 |
0.13 |
2 |
* |
User I/O |
Disk file Mirror Read |
136,496 |
0.00 |
42.24 |
0.31 |
0.01 |
0.31 |
0.30 |
0.32 |
0.01 |
2 |
* |
Cluster |
gc current block 2-way |
160,196 |
0.00 |
22.01 |
0.14 |
0.01 |
0.14 |
0.14 |
0.14 |
0.00 |
2 |
* |
Other |
Streams AQ: qmn coordinator waiting for slave to start |
3 |
100.00 |
16.89 |
5629.46 |
0.00 |
5629.46 |
5629.46 |
5629.46 |
|
2 |
* |
Cluster |
gc cr block 2-way |
118,706 |
0.00 |
16.22 |
0.14 |
0.00 |
0.14 |
0.14 |
0.14 |
0.00 |
2 |
* |
System I/O |
control file sequential read |
171,775 |
0.00 |
14.20 |
0.08 |
0.00 |
0.08 |
0.07 |
0.10 |
0.02 |
2 |
Statistic |
Owner |
Tablespace Name |
Object Name |
Subobject Name |
Obj. Type |
Value |
%Total |
%Capture |
buffer busy waits |
|
TABLE |
133 |
|
50.00 |
|||
buffer busy waits |
|
TABLE |
104 |
|
39.10 |
|||
gc buffer busy |
|
TABLE |
5 |
|
20.83 |
|||
physical writes |
|
TABLE |
2,726 |
|
16.89 |
|||
physical write requests |
|
TABLE |
2,365 |
1.67 |
17.00 |
|||
physical write requests |
|
INDEX PARTITION |
2,026 |
1.43 |
14.56 |
|||
row lock waits |
|
TABLE |
4,372 |
|
96.88 |
DML operations for 20th Feb 2020. Updates 3 times greater than inserts. No deletes.
Tables size is 3GB.
Table growth from last 30 days.
Table information:
Table record count:
<table information pulled>
Analysis #2:
Capturing Live blocking sessions information.
Analysis #3:
Created Health Checkup report for monitoring purpose.
Application Question:
We have created Oracle SR under Oracle in Application category and currently we are parallelly working on capturing Debug level logs to create the issue, working on suggestion from delivery and checking customer code for memory leaks and error handling.
But for the insert into account SQL which is getting blocked by select for update SQL.
Could you please check what can cause this and insert SQL taking so long and suggestions.
Answer:
Similar Application level debug enabling, we can enable debug on DB user session level. But we need core DBA support.
https://oracle-base.com/articles/misc/sql-trace-10046-trcsess-and-tkprof
@Core DBA: please can you provide any trace logs generated on 21st Feb between 5:00pm to 7:00pm.
Just tracked some more info
Session 5758 blocked other sessions. This session is inactive and no sql_id (means just simple idle and blocking other session)
<session information >
Before getting idle or inactive, previous sql_id
<sql text information>
@Core DBA:
After observing below session 5758 is waiting for data in UNDO segment (no sql and inactive) and this query is blocked other sessions today @5:35PM.
Moreover, I seen alert error on TEMP segment @10:16 am. Please have a look on my observations.
Please can you help us to provide trace files from 5pm to 6pm today. Temp segment error trace file also. So that we can look in detail about this session 5758.
Analysis #4:
*** 2020-02-21 17:26:25.751
Verified Hangs in the System
Root Chain Total Hang
Hang Hang Inst Root #hung #hung Hang Hang Resolution
ID Type Status Num Sess Sess Sess Conf Span Action
----- ---- -------- ---- ----- ----- ----- ------ ------ -------------------
260 HANG UNRSLVBL 1 16 2 3 LOW LOCAL Unresolvable:User
Hang Ignored Reason: This hang is probably the result of an application
problem. External invention is required. This hang will be ignored.
inst# SessId Ser# OSPID PrcNm Event
----- ------ ----- --------- ----- -----
1 3939 38255 171413 FG enq: TX - row lock contention
1 16 35377 148784 FG SQL*Net message from client
Victim Information
Ignored
HangID Inst# Sessid Ser Num OSPID Fatal BG Previous Hang Count
------ ----- ------ ------- --------- -------- ------------- -------
259 2 23 22497 8936 N Existing Hang 1
260 1 16 35377 148784 N New Hang 1
*** 2020-02-21 17:26:25.751
@App Team:
Point 1 and 2 are referred to deadlock sessions --> very less chances were DB will automatically kill any one session. But here that is also not happening.
Point 3 and 4 --> we already checked AWR report and we did not see much load.
Trace files are generated by DB code, beyond this needs to look then we need to contact Oracle support.
The following information will assist Oracle Support Services
in further analysis of the root cause of the hang.
*** 2020-02-21 17:26:25.751
Verified Hangs in the System
Root Chain Total Hang
Hang Hang Inst Root #hung #hung Hang Hang Resolution
ID Type Status Num Sess Sess Sess Conf Span Action
----- ---- -------- ---- ----- ----- ----- ------ ------ -------------------
260 HANG UNRSLVBL 1 16 2 3 LOW LOCAL Unresolvable:User
Hang Ignored Reason: This hang is probably the result of an application
problem. External invention is required. This hang will be ignored.
Seems oracle given script from doc id 2530284.1. below Doc id has given solution also.
Sessions Concurrency Observed in BRM Database (Doc ID 2530284.1)
If oracle suggest same thing then please provide below our observation.
IN above doc id, sql is blocking other sql but in given report, we can see blocker Sql is empty but same locked row# access from different instances.
Locked object : <table_name>, <schema_name>
Locked row# : AAAO7pAAAAAAH/8AAa[o#=61161, f#=1024, b#=32764, r#=26] --> same row# requested from different instance ( 1 and 2 )
Blocker SQL : [] --> empty
Solution:
Oracle BRM application SR team given configuration changes to fix blocking sessions issue.
Comments
Post a Comment