[20190328]简单探究sql语句相关mutexes.txt
--//摘要:http://www.askmaclean.com/archives/understanding-oracle-mutex.html
虽然Mutex中文翻译为互斥锁,但为了和OS mutex充分的区别,所以我们在本文里称Oracle Mutex为Mutex。
Oracle中的mutex,类似于Latch,是一种低级的串行机制,用以控制对SGA中部分共享数据结构的访问控制。
Oracle中的串行机制有不少,引入它们的目的是避免一个对象出现下述现象:
当某些进程在访问该对象时,该资源被重新分配
当某些进程在修改它时,被其他进程读取
当某些进程在修改它时,被其他进程修改
当某些进程在读取它时,被其他进程修改
不同于Latch,Mutex的使用更灵活,用途更多,例如:
哪些需要被mutex保护的共享数据结构可以有自己独立的mutex,即一个对象拥有自己独立的mutex,不像Latch往往一个需要保护大量
对象,举例来说,每一个父游标有其对应的mutex, 而每一个子游标也有其对应的mutex
每一个数据结构可能有一个或多个mutex保护,每一个mutex负责保护其结构的不同部分
当然一个mutex也可以用来保护多于一个的数据结构
理论上mutex即可以存放在其保护的结构本身中(其实是嵌入在结构里),也可以存放在其他地方。 一般情况下Mutex是在数据结构需要被
保护时动态创建出来的。 如是嵌在需要保护结构体内的mutex,则当 所依附的数据结构被清理时 该mutex也将被摧毁。
Mutex带来的好处
虽然mutex和latch都是Oracle中的串行机制,但是mutex具有一些latch没有的好处
更轻量级且更快
Mutex作为Latch的替代品,具有更快速获得,更小等优势。 获取一个mutex进需要大约30~35个指令, 而Latch则需要150~200个指令。一
个mutex结构的大小大约为16 bytes,而在10.2版本中一个latch需要112个bytes,在更早的版本中是200个bytes。 从200个bytes 精简到
112个是通过减少不必要的统计指标 SLEEP1~SLEEP11、WAITERS_WOKEN, WAITS_HOLDING_LATCH等从而实现的。今后我们将看到更多关于
Latch的代码优化。
减少伪争用
典型情况下一个Latch保护多个对象。 当一个Latch保护多个热对象时,并行地对这些对象的频繁访问让latch本身变成性能的串行点。
这也就是我们此处说的伪争用点, 因为争用是发生在这个串行保护的机制上,而不是进程去访问的对象本身。与latch不同, 使用mutex
的情况下Oracle开发人员可以为每一个要保护的数据结构创建一个独立的mutex。 这意味着Latch的那种伪争用将大大减少,因为每一个
对象均被自己独立拥有的mutex保护
--//我想通过测试说明问题.
1.环境:
--//session 1:
SCOTT@book> @ ver1
PORT_STRING VERSION BANNER
------------------- -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.4.0 Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
$ cat m2.txt
set verify off
host sleep $(echo &&3/50| bc -l )
insert into job_times values ( sys_context ('userenv', 'sid') ,dbms_utility.get_time ,'&&2') ;
commit ;
declare
v_id number;
v_d date;
begin
for i in 1 .. &&1 loop
--select 1 into v_id from dual ;
--select sysdate into v_d from dual ;
select deptno into v_id from dept where deptno=10;
end loop;
end ;
/
update job_times set time_ela = dbms_utility.get_time - time_ela where sid=sys_context ('userenv', 'sid') and method='&&2';
commit;
quit
$ seq 150 | xargs -I {} -P 150 bash -c "sqlplus -s -l scott/book @m2.txt 1e6 C2_150 {} >/dev/null"
--//执行以上命令后,由于大量的sql执行语句相同,出现cursor: pin S.可以执行多次,避免语句从共享池刷出.
--//注意oracle围绕mutex的视图很少,仅仅V$MUTEX_SLEEP,V$MUTEX_SLEEP_HISTORY,指令少结构体小,这样记录的诊断信息少.
2.首先看看mutex在哪里?
--//首先确定mutex_addr在那里.
--//sesson 1:
SCOTT@book> column LOCATION format a40
SCOTT@book> select * from V$MUTEX_SLEEP order by 3 ;
MUTEX_TYPE LOCATION SLEEPS WAIT_TIME
-------------------- -------------------------- -------- ----------
Library Cache kglget1 1 1 0
Library Cache kglini1 32 1 0
Library Cache kgldtin1 42 1 0
Library Cache kglati1 45 2 0
Library Cache kglnti1 46 3 0
Library Cache kglllal1 109 6 0
Library Cache kgllldl2 112 7 0
Library Cache kglllal3 111 10 0
Library Cache kglpnal1 90 17 0
Library Cache kgllkdl1 85 18 0
Library Cache kglpndl1 95 27 0
Library Cache kglget2 2 28 0
Library Cache kgllkc1 57 28 0
Library Cache kglpin1 4 29 0
Cursor Pin kkslce [KKSCHLPIN2] 130 0
Library Cache kglhdgn2 106 475 0
Cursor Pin kksfbc [KKSCHLFSP2] 649446 0
Cursor Pin kksLockDelete [KKSCHLPIN6] 1109496 0
18 rows selected.
--//仅仅知道MUTEX_TYPE=Cursor Pin ,LOCATION=kksLockDelete [KKSCHLPIN6],kksfbc [KKSCHLFSP2]的sleep很高,具体在那里,那个语
--//句引起的问题,明显诊断信息不足,
--//而仅仅mutex出现阻塞sleep后,oracle才会在视图V$MUTEX_SLEEP_HISTORY留下信息,不然很难定位.
--//session 2:
SYS@book> @ mutexy 3
old 21: ORDER BY sum_sleeps DESC ) where rownum<= &1
new 21: ORDER BY sum_sleeps DESC ) where rownum<= 3
HASH SUM_SLEEPS SUM_GETS LOCATION MUTEX_TYPE MUTEX_ADDR SQLID KGLNAOWN C100
---------- ---------- ---------- ------------------------------ -------------------- ---------------- ------------- -------- ---------------------------------------
1692266099 770212 1.8476E+10 kksLockDelete [KKSCHLPIN6] Cursor Pin 000000007DFACAB8 a31kd5tkdvvmm SELECT DEPTNO FROM DEPT WHERE DEPTNO=10
1692266099 408706 9792073012 kksfbc [KKSCHLFSP2] Cursor Pin 000000007DFACAB8 a31kd5tkdvvmm SELECT DEPTNO FROM DEPT WHERE DEPTNO=10
991821498 115 16076 kglhdgn2 106 Library Cache 000000007C638348 RECO.ORACLE.COM
--//MUTEX_ADDR = 000000007DFACAB8
SYS@book> @ fcha 000000007DFACAB8
Find in which heap (UGA, PGA or Shared Pool) the memory address 000000007DFACAB8 resides...
WARNING!!! This script will query X$KSMSP, which will cause heavy shared pool latch contention
in systems under load and with large shared pool. This may even completely hang
your instance until the query has finished! You probably do not want to run this in production!
Press ENTER to continue, CTRL+C to cancel...
old 14: to_number(substr('&1', instr(lower('&1'), 'x')+1) ,'XXXXXXXXXXXXXXXX')
new 14: to_number(substr('000000007DFACAB8', instr(lower('000000007DFACAB8'), 'x')+1) ,'XXXXXXXXXXXXXXXX')
old 32: to_number(substr('&1', instr(lower('&1'), 'x')+1) ,'XXXXXXXXXXXXXXXX')
new 32: to_number(substr('000000007DFACAB8', instr(lower('000000007DFACAB8'), 'x')+1) ,'XXXXXXXXXXXXXXXX')
old 50: to_number(substr('&1', instr(lower('&1'), 'x')+1) ,'XXXXXXXXXXXXXXXX')
new 50: to_number(substr('000000007DFACAB8', instr(lower('000000007DFACAB8'), 'x')+1) ,'XXXXXXXXXXXXXXXX')
LOC KSMCHPTR KSMCHIDX KSMCHDUR KSMCHCOM KSMCHSIZ KSMCHCLS KSMCHTYP KSMCHPAR
--- ---------------- ---------- ---------- ---------------- ---------- -------- ---------- ----------------
SGA 000000007DFAC3F0 1 1 KGLH0^64ddee73 4096 recr 4095 000000007E0AE4B0
--//KSMCHCOM=KGLH0^64ddee73 ,KSMCHCOM有信息KGLH0,可以猜测在sql语句的某个父或者子光标的堆0中.
SYS@book> @ sharepool/shp4 a31kd5tkdvvmm 0
old 20: WHERE kglobt03 = '&1' or kglhdpar='&1' or kglhdadr='&1' or KGLNAHSH= &2
new 20: WHERE kglobt03 = 'a31kd5tkdvvmm' or kglhdpar='a31kd5tkdvvmm' or kglhdadr='a31kd5tkdvvmm' or KGLNAHSH= 0
TEXT KGLHDADR KGLHDPAR C40 KGLHDLMD KGLHDPMD KGLHDIVC KGLOBHD0 KGLOBHD6 KGLOBHS0 KGLOBHS6 KGLOBT16 N0_6_16 N20 KGLNAHSH KGLOBT03 KGLOBT09
-------------- ---------------- ---------------- ---------------------------------------- ---------- ---------- ---------- ---------------- ---------------- ---------- ---------- ---------- --------- ---------- ---------- ------------- ----------
子游标句柄地址 000000007E1B07D8 000000007E3B7830 SELECT DEPTNO FROM DEPT WHERE DEPTNO=10 1 2 0 000000007E23F080 000000007DFACB60 4528 8088 3072 15688 15688 1692266099 a31kd5tkdvvmm 0
父游标句柄地址 000000007E3B7830 000000007E3B7830 SELECT DEPTNO FROM DEPT WHERE DEPTNO=10 1 0 0 000000007E0AE4B0 00 4720 0 0 4720 4720 1692266099 a31kd5tkdvvmm 65535
--//与父游标句柄地址的KGLOBHD0='000000007E0AE4B0',注意这个地址是堆描述符地址.也可以理解为指向堆0的指针,
--//也就是这个mutex结构体在父游标的堆0里面.
SYS@book> SELECT * FROM x$ksmsp WHERE TO_NUMBER ('000000007E0AE4B0', 'xxxxxxxxxxxxxxxx') between TO_NUMBER(KSMCHPTR,'xxxxxxxxxxxxxxxx') and TO_NUMBER(KSMCHPTR, 'xxxxxxxxxxxxxxxx')+KSMCHSIZ-1;
old 1: SELECT * FROM x$ksmsp WHERE TO_NUMBER ('&&1', 'xxxxxxxxxxxxxxxx') between TO_NUMBER(KSMCHPTR, 'xxxxxxxxxxxxxxxx') and TO_NUMBER(KSMCHPTR, 'xxxxxxxxxxxxxxxx')+KSMCHSIZ
new 1: SELECT * FROM x$ksmsp WHERE TO_NUMBER ('000000007E0AE4B0', 'xxxxxxxxxxxxxxxx') between TO_NUMBER(KSMCHPTR, 'xxxxxxxxxxxxxxxx') and TO_NUMBER(KSMCHPTR, 'xxxxxxxxxxxxxxxx')+KSMCHSIZ
ADDR INDX INST_ID KSMCHIDX KSMCHDUR KSMCHCOM KSMCHPTR KSMCHSIZ KSMCHCLS KSMCHTYP KSMCHPAR
---------------- ---------- ---------- ---------- ---------- ---------------- ---------------- ---------- -------- ---------- ----------------
00007F24581306C8 3192 1 1 1 KGLDA 000000007E0AE448 240 freeabl 0 00
3.人为加锁看看:
--//session 2:
SYS@book> @ pt 'select * from V$MUTEX_SLEEP_HISTORY where MUTEX_IDENTIFIER=1692266099 and rownum=1'
old 10: passing xmltype(cursor( &1 ))
new 10: passing xmltype(cursor( select * from V$MUTEX_SLEEP_HISTORY where MUTEX_IDENTIFIER=1692266099 and rownum=1 ))
ROW_NUM COL_NAME COL_VALUE
------- ------------------ --------------------------
1 MUTEX_IDENTIFIER 1692266099
SLEEP_TIMESTAMP 2019-03-28 15:36:51.649428
MUTEX_TYPE Cursor Pin
GETS 599013225
SLEEPS 25361
REQUESTING_SESSION 7
BLOCKING_SESSION 35
LOCATION kksLockDelete [KKSCHLPIN6]
MUTEX_VALUE 000000230000001E
P1 1
P1RAW 00
P2 0
P3 0
P4 0
14 rows selected.
--//MUTEX_VALUE=000000230000001E.0x00000023=35,表示阻塞的SESSION的sid.
--//session 1:
SCOTT@book> @ spid
SID SERIAL# PROCESS SERVER SPID PID P_SERIAL# C50
---------- ---------- ------------------------ --------- ------ ------- ---------- --------------------------------------------------
295 7 11295 DEDICATED 11296 21 4 alter system kill session '295,7' immediate;
--//sid=295 , 295=0x127.
SYS@book> oradebug setmypid
Statement processed.
SYS@book> oradebug poke 0x000000007DFACAB8 8 0x0000012700000127
BEFORE: [07DFACAB8, 07DFACAC0) = 00000000 00000000
AFTER: [07DFACAB8, 07DFACAC0) = 00000127 00000127
SCOTT@book> @ m2.txt 1 c1 0
1 row created.
Commit complete.
declare
*
ERROR at line 1:
ORA-04024: self-deadlock detected while trying to mutex pin cursor 0x07E1B07D8
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ORA-06512: at line 8
2 rows updated.
Commit complete.
--//注意看下划线,这是自锁.oracle设计够严谨的,什么情况下会出现ORA-04024错误.
--//这里的0x07E1B07D8是子游标句柄地址.
--//session 2:
SYS@book> @ spid
SID SERIAL# PROCESS SERVER SPID PID P_SERIAL# C50
---------- ---------- ------------------------ --------- ------ ------- ---------- --------------------------------------------------
1 31 11248 DEDICATED 11249 24 9 alter system kill session '1,31' immediate;
--//sid=1.
SYS@book> oradebug poke 0x000000007DFACAB8 8 0x0000000100000127
BEFORE: [07DFACAB8, 07DFACAC0) = 00000127 00000127
AFTER: [07DFACAB8, 07DFACAC0) = 00000127 00000001
--//注意一个细节,intel cpu系列的大小头问题. 4个4个字节颠倒的.
SYS@book> oradebug peek 0x000000007DFACAB8 8
[07DFACAB8, 07DFACAC0) = 00000127 00000001
--//session 1:
SCOTT@book> @ m2.txt 1 c1 0
1 row created.
Commit complete.
--//挂起!!
--//说明:一般测试环境没有用户登录的情况下,下次登录sid不会变化.
--//session 2:
SYS@book> @ wait
P1RAW P2RAW P3RAW P1 P2 P3 SID SERIAL# SEQ# EVENT STATUS STATE WAIT_TIME_MICRO SECONDS_IN_WAIT WAIT_CLASS
---------------- ---------------- ---------------- ---------- ---------- ---------- ---------- ---------- ---------- ------------- -------- ------- --------------- --------------- -----------
0000000064DDEE73 0000000100000127 0000000500000000 1692266099 4294967591 2.1475E+10 295 13 30 cursor: pin S ACTIVE WAITING 50465005 50 Concurrency
--//注意P2RAW.
SYS@book> oradebug poke 0x000000007DFACAB8 8 0x000000000000000
BEFORE: [07DFACAB8, 07DFACAC0) = 00000127 00000001
AFTER: [07DFACAB8, 07DFACAC0) = 00000000 00000000
--//session 1:
--//执行完成退出.
4.看看一些细节:
--//session 1:
SCOTT@book> @ spid
SID SERIAL# PROCESS SERVER SPID PID P_SERIAL# C50
---------- ---------- ------------------------ --------- ------ ------- ---------- --------------------------------------------------
295 17 11477 DEDICATED 11478 21 9 alter system kill session '295,17' immediate;
--//SPID=1147811358
--//session 2:
SYS@book> oradebug poke 0x000000007DFACAB8 8 0x0000000100000127
BEFORE: [07DFACAB8, 07DFACAC0) = 00000000 00000000
AFTER: [07DFACAB8, 07DFACAC0) = 00000127 00000001
--//打开新的终端执行如下命令:
$ strace -fTtt -p 11478 2>&1 | tee /tmp/m.txt
--//session 1:
SCOTT@book> @ m2.txt 1 c1 0
1 row created.
Commit complete.
--//再次挂起.等1分钟以上.
--//session 2:
SYS@book> oradebug poke 0x000000007DFACAB8 8 0x000000000000000
BEFORE: [07DFACAB8, 07DFACAC0) = 00000127 00000001
AFTER: [07DFACAB8, 07DFACAC0) = 00000000 00000000
--//分析/tmp/m.txt.
$ awk '{print $2}' /tmp/m.txt | uniq -c | egrep "semtimedop| getrusag"
...
1 getrusage(RUSAGE_SELF,
2 getrusage(RUSAGE_SELF,
7 getrusage(RUSAGE_SELF,
1 getrusage(RUSAGE_SELF,
117 semtimedop(309166080,
2 getrusage(RUSAGE_SELF,
181 semtimedop(309166080,
2 getrusage(RUSAGE_SELF,
182 semtimedop(309166080,
2 getrusage(RUSAGE_SELF,
182 semtimedop(309166080,
2 getrusage(RUSAGE_SELF,
....
182 semtimedop(309166080,
2 getrusage(RUSAGE_SELF,
167 semtimedop(309166080,
1 getrusage(RUSAGE_SELF,
4 getrusage(RUSAGE_SELF,
6 getrusage(RUSAGE_SELF,
2 getrusage(RUSAGE_SELF,
1 getrusage(RUSAGE_SELF,
2 getrusage(RUSAGE_SELF,
1 getrusage(RUSAGE_SELF,
1 getrusage(RUSAGE_SELF,
2 getrusage(RUSAGE_SELF,
3 getrusage(RUSAGE_SELF,
1 getrusage(RUSAGE_SELF,
2 getrusage(RUSAGE_SELF,
--//大约调用semtimedop 182次后(大约2秒),调用2次getrusage.截取其中1段.
16:04:56.410571 getrusage(RUSAGE_SELF, {ru_utime={0, 33994}, ru_stime={0, 33994}, ...}) = 0 <0.000020>
16:04:56.410676 getrusage(RUSAGE_SELF, {ru_utime={0, 33994}, ru_stime={0, 33994}, ...}) = 0 <0.000017>
16:04:56.410785 semtimedop(309166080, 0x7fff83068fd0, 1, {0, 10000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.010684>
16:04:56.421565 semtimedop(309166080, 0x7fff83068fd0, 1, {0, 10000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.010886>
16:04:56.432559 semtimedop(309166080, 0x7fff83068fd0, 1, {0, 10000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.010894>
...
16:04:58.368564 semtimedop(309166080, 0x7fff83068fd0, 1, {0, 10000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.010889>
16:04:58.379561 semtimedop(309166080, 0x7fff83068fd0, 1, {0, 10000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.010888>
16:04:58.390557 semtimedop(309166080, 0x7fff83068fd0, 1, {0, 10000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.010905>
16:04:58.401570 semtimedop(309166080, 0x7fff83068fd0, 1, {0, 10000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.010883>
16:04:58.412548 getrusage(RUSAGE_SELF, {ru_utime={0, 39993}, ru_stime={0, 38994}, ...}) = 0 <0.000019>
16:04:58.412648 getrusage(RUSAGE_SELF, {ru_utime={0, 39993}, ru_stime={0, 38994}, ...}) = 0 <0.000017>
--//大家可以man semtimedop或者man getrusage表示什么意思.摘要一段:
$ man semtimedop
NAME
semop, semtimedop - semaphore operations
int semtimedop(int semid, struct sembuf *sops, unsigned nsops, struct timespec *timeout);
semtimedop() behaves identically to semop() except that in those cases were the calling process would sleep, the
duration of that sleep is limited by the amount of elapsed time specified by the timespec structure whose address is
passed in the timeout parameter. If the specified time limit has been reached, semtimedop() fails with errno set to
EAGAIN (and none of the operations in sops is performed). If the timeout parameter is NULL, then semtimedop() behaves
exactly like semop().
--//翻译如下:
semtimedop()的行为与semop相同(),但在这些情况下,调用进程将休眠,睡眠的持续时间受时间规格结构指定的经过时间的限制,其地址
是在超时参数中传递。如果已达到指定的时间限制,则semtimedop()失败,errno设置为再次(不执行sop中的操作)。如果超时参数为空
,则semtimedop()行为就像semop()。
16:04:56.410785 semtimedop(309166080, 0x7fff83068fd0, 1, {0, 10000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.010684>
--//{0, 10000000} 是 timespec. 前面单位是秒,后面单位是纳秒(毫微秒) 1秒=10^9纳秒, 10000000/10^9 = .01.
--//这样每次调用semtimedop需要0.010xXX秒.
--//我的理解相当于不断spin,检查这个资源是否可用.2秒后调用getrusage.
--//session 2:
SYS@book> @ hide mutex
NAME DESCRIPTION DEFAULT_VALUE SESSION_VALUE SYSTEM_VALUE
------------------ ----------------- ------------- ------------- ------------
_mutex_spin_count Mutex spin count TRUE 255 255
_mutex_wait_scheme Mutex wait scheme TRUE 2 2
_mutex_wait_time Mutex wait time TRUE 1 1
5.修改这些隐含参数看看:
--//session 2:
SYS@book> alter system set "_mutex_wait_time"=100 scope=memory;
System altered.
--//session 1:
SCOTT@book> @ spid
SID SERIAL# PROCESS SERVER SPID PID P_SERIAL# C50
---------- ---------- ------------------------ --------- ------ ------- ---------- --------------------------------------------------
37 79 11990 DEDICATED 11991 26 19 alter system kill session '37,79' immediate
--//SID=37 ,37=0x25.
--//session 2:
SYS@book> oradebug poke 0x000000007DFACAB8 8 0x0000000100000025
BEFORE: [07DFACAB8, 07DFACAC0) = 00000000 00000000
AFTER: [07DFACAB8, 07DFACAC0) = 00000025 00000001
--//打开新的终端执行如下命令:
$ strace -fttT -p 11956 2>&1 | tee /tmp/m1.txt
16:45:21.034782 getrusage(RUSAGE_SELF, {ru_utime={0, 300954}, ru_stime={0, 31995}, ...}) = 0 <0.000034>
16:45:21.034936 semtimedop(309166080, 0x7fff3961cd50, 1, {1, 0}) = -1 EAGAIN (Resource temporarily unavailable) <1.000684>
16:45:22.035733 semtimedop(309166080, 0x7fff3961cd50, 1, {1, 0}) = -1 EAGAIN (Resource temporarily unavailable) <1.000669>
16:45:23.036517 getrusage(RUSAGE_SELF, {ru_utime={0, 300954}, ru_stime={0, 31995}, ...}) = 0 <0.000033>
16:45:23.036668 getrusage(RUSAGE_SELF, {ru_utime={0, 300954}, ru_stime={0, 31995}, ...}) = 0 <0.000031>
16:45:23.036819 semtimedop(309166080, 0x7fff3961cd50, 1, {1, 0}) = -1 EAGAIN (Resource temporarily unavailable) <1.000693>
16:45:24.037629 semtimedop(309166080, 0x7fff3961cd50, 1, {1, 0}) = -1 EAGAIN (Resource temporarily unavailable) <1.000844>
16:45:25.038587 getrusage(RUSAGE_SELF, {ru_utime={0, 301954}, ru_stime={0, 31995}, ...}) = 0 <0.000032>
16:45:25.038734 getrusage(RUSAGE_SELF, {ru_utime={0, 301954}, ru_stime={0, 31995}, ...}) = 0 <0.000035>
16:45:25.038883 semtimedop(309166080, 0x7fff3961cd50, 1, {1, 0}) = -1 EAGAIN (Resource temporarily unavailable) <1.000709>
16:45:26.039702 semtimedop(309166080, 0x7fff3961cd50, 1, {1, 0}) = -1 EAGAIN (Resource temporarily unavailable) <1.000822>
16:45:27.040680 getrusage(RUSAGE_SELF, {ru_utime={0, 301954}, ru_stime={0, 31995}, ...}) = 0 <0.000029>
16:45:27.040805 getrusage(RUSAGE_SELF, {ru_utime={0, 301954}, ru_stime={0, 31995}, ...}) = 0 <0.000017>
16:45:27.040909 semtimedop(309166080, 0x7fff3961cd50, 1, {1, 0}) = -1 EAGAIN (Resource temporarily unavailable) <1.000680>
16:45:28.041693 semtimedop(309166080, 0x7fff3961cd50, 1, {1, 0}) = -1 EAGAIN (Resource temporarily unavailable) <1.000819>
16:45:29.042627 getrusage(RUSAGE_SELF, {ru_utime={0, 301954}, ru_stime={0, 31995}, ...}) = 0 <0.000033>
16:45:29.042780 getrusage(RUSAGE_SELF, {ru_utime={0, 301954}, ru_stime={0, 31995}, ...}) = 0 <0.000034>
--//现在是间隔1秒调用semtimedop. 也就是改变_mutex_wait_time等待时间(单位cs).我的理解在spin 255次不成功sleep 1秒.
--//注:semtimedop , spin次数来源这里(_mutex_spin_count=255),然后sleep,再次唤醒.
--//session 2:
SYS@book> alter system set "_mutex_wait_time"=10 scope=memory;
System altered.
--//重新测试:
16:49:32.057570 getrusage(RUSAGE_SELF, {ru_utime={0, 29995}, ru_stime={0, 18997}, ...}) = 0 <0.000019>
16:49:32.057682 getrusage(RUSAGE_SELF, {ru_utime={0, 29995}, ru_stime={0, 18997}, ...}) = 0 <0.000019>
16:49:32.057796 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100747>
16:49:32.158640 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100834>
16:49:32.259582 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100928>
16:49:32.360618 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100856>
16:49:32.461584 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100890>
16:49:32.562583 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100924>
16:49:32.663615 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100858>
16:49:32.764582 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100928>
16:49:32.865619 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100855>
16:49:32.966582 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100909>
16:49:33.067606 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100853>
16:49:33.168602 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100809>
16:49:33.269536 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100879>
16:49:33.370541 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100854>
16:49:33.471520 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100872>
16:49:33.572516 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100915>
16:49:33.673559 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100849>
16:49:33.774533 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100899>
16:49:33.875556 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100842>
16:49:33.976519 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100887>
16:49:34.077528 getrusage(RUSAGE_SELF, {ru_utime={0, 30995}, ru_stime={0, 18997}, ...}) = 0 <0.000031>
16:49:34.077679 getrusage(RUSAGE_SELF, {ru_utime={0, 30995}, ru_stime={0, 18997}, ...}) = 0 <0.000031>
16:49:34.077827 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100601>
16:49:34.178538 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100878>
16:49:34.279542 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100854>
16:49:34.380520 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100891>
16:49:34.481551 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100866>
16:49:34.582536 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100886>
16:49:34.683545 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100869>
16:49:34.784537 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100874>
16:49:34.885552 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100868>
16:49:34.986539 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100874>
16:49:35.087538 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100855>
16:49:35.188519 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100890>
16:49:35.289535 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100901>
16:49:35.390564 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100880>
16:49:35.491562 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100836>
16:49:35.592519 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100912>
16:49:35.693555 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100861>
16:49:35.794537 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100892>
16:49:35.895553 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100843>
16:49:35.996518 semtimedop(309166080, 0x7fffbe3108a0, 1, {0, 100000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.100876>
16:49:36.097508 getrusage(RUSAGE_SELF, {ru_utime={0, 31995}, ru_stime={0, 19996}, ...}) = 0 <0.000030>
16:49:36.097635 getrusage(RUSAGE_SELF, {ru_utime={0, 31995}, ru_stime={0, 19996}, ...}) = 0 <0.000027>
--//变成了0.1秒.
--//注意一个细节,如果减去_mutex_wait_time时间,可以发现后面的时间都在0.0006xx-0.00090x之间.
--//取一段计算看看
0.100601+0.100878+0.100854+0.100891+0.100866+0.100886+0.100869+0.100874+0.100868+0.100874+0.100855+0.100890
+0.100901+0.100880+0.100836+0.100912+0.100861+0.100892+0.100843+0.100876= 2.017207
2.017207/20 = .10086035000000000000
--//取平均0.000860秒=860微秒.如果spin=255次的话.每次0.000860/255 = 0.0000034秒
--//另外我修改_mutex_spin_count参数,好像调用semtimedop时间不会变化,不知道我理解错误在哪里?
--//session 2:
SYS@book> alter system set "_mutex_wait_time"=1 scope=memory;
System altered.
SYS@book> alter system set "_mutex_wait_scheme"=0 scope=memory;
System altered.
$ awk '{print $2}' /tmp/m2.txt | uniq -c |egrep "sched_yield|select"|head
99 sched_yield()
1 select(0,
99 sched_yield()
1 select(0,
99 sched_yield()
1 select(0,
99 sched_yield()
1 select(0,
99 sched_yield()
1 select(0,
--//调用99次sched_yield,然后1次seelct.
--//截取一段
16:51:53.763611 sched_yield() = 0 <0.000025>
16:51:53.763710 sched_yield() = 0 <0.000020>
16:51:53.763797 sched_yield() = 0 <0.000025>
16:51:53.763896 sched_yield() = 0 <0.000023>
16:51:53.763993 sched_yield() = 0 <0.000023>
16:51:53.764089 sched_yield() = 0 <0.000023>
16:51:53.764206 select(0, [], [], [], {0, 1000}) = 0 (Timeout) <0.001109>
16:51:53.765424 sched_yield() = 0 <0.000024>
16:51:53.765530 sched_yield() = 0 <0.000023>
16:51:53.765624 sched_yield() = 0 <0.000022>
--//如果是_mutex_wait_scheme=0,调用是sched_yield 99次,然后1次select.
$ man sched_yield
SCHED_YIELD(2) Linux Programmer's Manual SCHED_YIELD(2)
NAME
sched_yield - yield the processor
SYNOPSIS
#include <sched.h>
int sched_yield(void);
DESCRIPTION
A process can relinquish the processor voluntarily without blocking by calling sched_yield(). The process will
then be moved to the end of the queue for its static priority and a new process gets to run.
Note: If the current process is the only process in the highest priority list at that time, this process will
continue to run after a call to sched_yield().
POSIX systems on which sched_yield() is available define _POSIX_PRIORITY_SCHEDULING in <unistd.h>.
--//对于select函数不是很理解.不过里面的时间单位是秒,微妙.
$ man select
NAME
select, pselect, FD_CLR, FD_ISSET, FD_SET, FD_ZERO - synchronous I/O multiplexing
DESCRIPTION
select() and pselect() allow a program to monitor multiple file descriptors, waiting until one or more of the
file descriptors become "ready" for some class of I/O operation (e.g., input possible). A file descriptor is
considered ready
if it is possible to perform the corresponding I/O operation (e.g., read(2)) without blocking.
The operation of select() and pselect() is identical, with three differences:
(i) select() uses a timeout that is a struct timeval (with seconds and microseconds), while pselect() uses a
struct timespec (with seconds and nanoseconds).
(ii) select() may update the timeout argument to indicate how much time was left. pselect() does not change
this argument.
(iii) select() has no sigmask argument, and behaves as pselect() called with NULL sigmask.
Three independent sets of file descriptors are watched. Those listed in readfds will be watched to see if
characters become available for reading (more precisely, to see if a read will not block; in particular, a file
descriptor is also ready on end-of-file), those in writefds will be watched to see if a write will not block,
and those in exceptfds will be watched for exceptions. On exit, the sets are modified in place to indicate which
file descriptors actually changed status. Each of the three file descriptor sets may be specified as NULL if no
file descriptors are to be watched for the corresponding class of events.
Four macros are provided to manipulate the sets. FD_ZERO() clears a set. FD_SET() and FD_CLR() respectively add
and remove a given file descriptor from a set. FD_ISSET() tests to see if a file descriptor is part of the set;
this is useful after select() returns.
nfds is the highest-numbered file descriptor in any of the three sets, plus 1.
timeout is an upper bound on the amount of time elapsed before select() returns. It may be zero, causing select()
to return immediately. (This is useful for polling.) If timeout is NULL (no timeout), select() can block
indefinitely.
sigmask is a pointer to a signal mask (see sigprocmask(2)); if it is not NULL, then pselect() first replaces the
current signal mask by the one pointed to by sigmask, then does the 'select' function, and then restores the
original signal mask.
...
The timeout
The time structures involved are defined in <sys/time.h> and look like
struct timeval {
long tv_sec; /* seconds */
long tv_usec; /* microseconds */
};
--//修改_mutex_wait_scheme=1看看.
--//session 2:
SYS@book> alter system set "_mutex_wait_scheme"=1 scope=memory;
System altered.
$ awk '{print $2}' /tmp/m3.txt | uniq -c
$ awk '{print $2}' /tmp/m3.txt | uniq -c | egrep "select|getrusage"
...
2 getrusage(RUSAGE_SELF,
7 getrusage(RUSAGE_SELF,
1 getrusage(RUSAGE_SELF,
1387 select(0,
2 getrusage(RUSAGE_SELF,
1684 select(0,
2 getrusage(RUSAGE_SELF,
1675 select(0,
2 getrusage(RUSAGE_SELF,
1635 select(0,
2 getrusage(RUSAGE_SELF,
1615 select(0,
2 getrusage(RUSAGE_SELF,
1617 select(0,
2 getrusage(RUSAGE_SELF,
1616 select(0,
2 getrusage(RUSAGE_SELF,
1615 select(0,
2 getrusage(RUSAGE_SELF,
1616 select(0,
2 getrusage(RUSAGE_SELF,
1619 select(0,
2 getrusage(RUSAGE_SELF,
1611 select(0,
2 getrusage(RUSAGE_SELF,
766 select(0,
1 getrusage(RUSAGE_SELF,
4 getrusage(RUSAGE_SELF,
--//截取一段:
17:06:45.648350 sched_yield() = 0 <0.000037>
17:06:45.648570 select(0, [], [], [], {0, 1000}) = 0 (Timeout) <0.001082>
17:06:45.649750 select(0, [], [], [], {0, 1000}) = 0 (Timeout) <0.001115>
17:06:45.650979 select(0, [], [], [], {0, 1000}) = 0 (Timeout) <0.001111>
...
17:06:47.290475 select(0, [], [], [], {0, 1000}) = 0 (Timeout) <0.001099>
17:06:47.291645 select(0, [], [], [], {0, 1000}) = 0 (Timeout) <0.001138>
17:06:47.292862 getrusage(RUSAGE_SELF, {ru_utime={0, 61990}, ru_stime={0, 67989}, ...}) = 0 <0.000021>
17:06:47.292968 getrusage(RUSAGE_SELF, {ru_utime={0, 61990}, ru_stime={0, 67989}, ...}) = 0 <0.000018>
17:06:47.293111 select(0, [], [], [], {0, 1000}) = 0 (Timeout) <0.001095>
--//还原:
--//session 2:
SYS@book> alter system set "_mutex_wait_scheme"=2 scope=memory;
System altered.
--//视乎_mutex_wait_scheme参数调整mutex的方式.
6.总结:
--//不小心写的太长.对于os,oracle等许多概念不是很清楚,做这探究难度太大...
--//网上找了一段资料:
* _mutex_spin_count (Integer)
- This sets the number of times to spin before yielding/waiting.
* _mutex_wait_scheme (Integer)
- In 11.2 this controls which wait scheme to use. It can be set to one
of the three wait schemes described above thus:
_mutex_wait_scheme = 0 – Always YIELD
_mutex_wait_scheme = 1 & _mutex_wait_time = t – Always SLEEP for t milli-seconds (default)
_mutex_wait_scheme = 2 & _mutex_wait_time = t – EXP BACKOFF with maximum sleep
--//注:_mutex_wait_scheme = 2,我并没有看到指数回退-每次迭代我们都会睡得更多的时间的情况.
7.附上测试的脚本:
$ cat mutexy.sql
column kglnaown format a20
column MUTEX_TYPE format a20
column kglnaobj format a100
column LOCATION format a30
select * from (
SELECT kglnahsh hash
,SUM (sleeps) sum_sleeps
,SUM (gets) sum_gets
,location
,mutex_type
,MUTEX_ADDR
,kglobt03 sqlid
,kglnaown
,replace(kglnaobj,chr(13)) c100
--,SUBSTR (kglnaobj, 1, 140) object
FROM x$kglob, x$mutex_sleep_history
WHERE kglnahsh = mutex_identifier
GROUP BY kglnaobj
,kglobt03
,kglnaown
,kglnahsh
,location
,mutex_type
,MUTEX_ADDR
ORDER BY sum_sleeps DESC ) where rownum<= &1;
--//查询x$mutex_sleep_history信息,主要v$mutex_sleep_history没有MUTEX_ADDR字段信息.
$ cat fcha.sql
--------------------------------------------------------------------------------
--
-- File name: fcha.sql (Find CHunk Address) v0.2
-- Purpose: Find in which heap (UGA, PGA or Shared Pool) a memory address resides
--
-- Author: Tanel Poder
-- Copyright: (c) http://blog.tanelpoder.com | @tanelpoder
--
-- Usage: @fcha <addr_hex>
-- @fcha F6A14448
--
-- Other: This would only report an UGA/PGA chunk address if it belongs
-- to *your* process/session (x$ksmup and x$ksmpp do not see other
-- session/process memory)
--
--------------------------------------------------------------------------------
prompt Find in which heap (UGA, PGA or Shared Pool) the memory address &1 resides...
prompt
prompt WARNING!!! This script will query X$KSMSP, which will cause heavy shared pool latch contention
prompt in systems under load and with large shared pool. This may even completely hang
prompt your instance until the query has finished! You probably do not want to run this in production!
prompt
pause Press ENTER to continue, CTRL+C to cancel...
select
'SGA' LOC,
KSMCHPTR,
KSMCHIDX,
KSMCHDUR,
KSMCHCOM,
KSMCHSIZ,
KSMCHCLS,
KSMCHTYP,
KSMCHPAR
from
x$ksmsp
where
to_number(substr('&1', instr(lower('&1'), 'x')+1) ,'XXXXXXXXXXXXXXXX')
between
to_number(ksmchptr,'XXXXXXXXXXXXXXXX')
and to_number(ksmchptr,'XXXXXXXXXXXXXXXX') + ksmchsiz - 1
union all
select
'UGA',
KSMCHPTR,
null,
null,
KSMCHCOM,
KSMCHSIZ,
KSMCHCLS,
KSMCHTYP,
KSMCHPAR
from
x$ksmup
where
to_number(substr('&1', instr(lower('&1'), 'x')+1) ,'XXXXXXXXXXXXXXXX')
between
to_number(ksmchptr,'XXXXXXXXXXXXXXXX')
and to_number(ksmchptr,'XXXXXXXXXXXXXXXX') + ksmchsiz - 1
union all
select
'PGA',
KSMCHPTR,
null,
null,
KSMCHCOM,
KSMCHSIZ,
KSMCHCLS,
KSMCHTYP,
KSMCHPAR
from
x$ksmpp
where
to_number(substr('&1', instr(lower('&1'), 'x')+1) ,'XXXXXXXXXXXXXXXX')
between
to_number(ksmchptr,'XXXXXXXXXXXXXXXX')
and to_number(ksmchptr,'XXXXXXXXXXXXXXXX') + ksmchsiz - 1
/
$ cat shp4.sql
column N0_6_16 format 99999999
SELECT DECODE (kglhdadr,
kglhdpar, '父游标句柄地址',
'子游标句柄地址')
text,
kglhdadr,
kglhdpar,
substr(kglnaobj,1,40) c40,
KGLHDLMD,
KGLHDPMD,
kglhdivc,
kglobhd0,
kglobhd6,
kglobhs0,kglobhs6,kglobt16,
kglobhs0+kglobhs6+kglobt16 N0_6_16,
kglobhs0+kglobhs1+kglobhs2+kglobhs3+kglobhs4+kglobhs5+kglobhs6+kglobt16 N20,
kglnahsh,
kglobt03 ,
kglobt09
FROM x$kglob
WHERE kglobt03 = '&1' or kglhdpar='&1' or kglhdadr='&1' or KGLNAHSH= &2;
$ cat spid.sql
column sid new_value newsid
column serial# new_value newserial
column spid new_value newspid
set verify off
/* Formatted on 2019/3/28 17:19:16 (QP5 v5.252.13127.32867) */
SELECT s.sid
,s.serial#
,s.process
,s.server
,p.spid
,p.pid
,p.serial# p_serial#
, 'alter system kill session '''
|| s.sid
|| ','
|| s.serial#
|| ''''
|| ' immediate;'
c50
FROM v$session s, v$process p
WHERE s.sid IN (SELECT sid FROM v$mystat WHERE ROWNUM = 1)
AND s.paddr = p.addr;
$ cat hide.sql
col name format a40
col description format a66
col session_value format a22
col default_value format a22
col system_value format a22
select
a.ksppinm name,
a.ksppdesc DESCRIPTION,
b.ksppstdf DEFAULT_VALUE,
b.ksppstvl SESSION_VALUE,
c.ksppstvl SYSTEM_VALUE
from x$ksppi a, x$ksppcv b, x$ksppsv c
where a.indx = b.indx
and a.indx = c.indx
and lower(a.ksppinm) like lower('%&1%')
order by 1;
$ cat pt.sql
--http://orasql.org/2013/04/02/sqlplus-tips-2/
-- show output
set termout on
-- but without echo
set echo off
-- without newpage on start:
set embedded on
-- scrolling control
set pause on
-- two lines between rows:
set newpage 2
-- text for prompt after each page:
set pause "Press Enter to view next row..."
-- new page on new "row_num"
break on row_num skip page
-- main query:
select *
from
xmltable( 'for $a at $i in /ROWSET/ROW
,$r in $a/*
return element ROW{
element ROW_NUM{$i}
,element COL_NAME{$r/name()}
,element COL_VALUE{$r/text()}
}'
passing xmltype(cursor( &1 ))
columns
row_num int
-- ,col_name varchar2(30)
,col_name varchar2(30)
,col_value varchar2(100)
);
-- disabling pause and breaks:
set pause off
clear breaks