




版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請(qǐng)進(jìn)行舉報(bào)或認(rèn)領(lǐng)
文檔簡(jiǎn)介
1、 ORACLE數(shù)據(jù)庫常見問題診斷方法 (OPS或RAC篇) ORALCE并行服務(wù)(即OPS)在ICD中應(yīng)有廣泛應(yīng)用,平臺(tái)涉及各主流硬件平臺(tái),包括IBM AIX、HP UX、SUN SOLARES、TRU UNIX、WINDOWS平臺(tái),版本也較多,包括7.3.X、8.0.X、8.1.X(8I),由于ORACLE OPS本身固有的特點(diǎn)及缺陷或使用不當(dāng),有時(shí)會(huì)產(chǎn)生較嚴(yán)重的性能問題,甚至導(dǎo)致數(shù)據(jù)庫掛起(HUNG)。本文檔就是分析、解決OPS的性能及OPS掛起問題。由于分析、解決OPS問題需要對(duì)ORACLE OPS的工作原理有較深刻理解,所以本文檔只是提供通用的分析思路及分析腳
2、本。該腳本主要用于ORACLE8I以下版本,對(duì)于ORACLE9I有些腳本可能不能執(zhí)行。一、 準(zhǔn)備工作 以sys用戶運(yùn)行腳本catparr.sql,該腳本位于$ORACLE_HOME/rdbms/admin目錄下。二、 緊急情況下的狀態(tài)備份如果數(shù)據(jù)庫掛起無法執(zhí)行任何命令,則只能執(zhí)行狀態(tài)備份命令,待重啟后再進(jìn)行分析。狀態(tài)備份命令如下(sys或internal用戶):SVRMGR> alter session set max_dump_file_size = 'UNLIMITED' SVRMGR> alter session set events 'immedia
3、te trace name systemstate level 10'對(duì)于8.1.7以上版本,執(zhí)行下列命令:SQL> oradebug setmypid SQL> oradebug unlimit SQL> oradebug setinst all SQL> oradebug -g def dump systemstate 10 三、 OPS設(shè)計(jì)、配置準(zhǔn)則為了充分利用OPS的特性,應(yīng)用設(shè)計(jì)及配置應(yīng)嚴(yán)格遵循相應(yīng)準(zhǔn)則,否則,采用OPS系統(tǒng)反而導(dǎo)致性能及安全性下降。OPS應(yīng)用應(yīng)遵循的幾個(gè)準(zhǔn)則:(1) 應(yīng)用嚴(yán)格分割:例如,不同的應(yīng)用分別運(yùn)行在不同的實(shí)例上;(2) 事務(wù)分
4、割:即一類事務(wù)分別運(yùn)行于同一個(gè)實(shí)例,例如,平臺(tái)的日結(jié)應(yīng)與平臺(tái)業(yè)務(wù)運(yùn)行在一個(gè)實(shí)例上;(3) 功能分割:與設(shè)計(jì)相關(guān),暫不考慮;(4) 表分區(qū):大表的數(shù)據(jù)分區(qū)ICD一般采用1、2、4方法。四、 OPS常見問題(1) OPS掛起(HANG);(2) 性能很低,有時(shí)一臺(tái)機(jī)器幾乎無相應(yīng);(3) 分布式鎖進(jìn)程LMD0占用資源很大;五、診斷分析步驟 1 OPS有時(shí)掛起(HANG) ORACLE數(shù)據(jù)庫掛起的原因很多,也不容易分析,一般情況下,除了ORACLE數(shù)據(jù)庫系統(tǒng)本身的BUG之外,可能應(yīng)用沒有按照第四章的準(zhǔn)則配置,下面是一些檢查步驟:1)按照第四章的準(zhǔn)則檢查配置2) 檢查初始化文件參數(shù)GC_FILES_TO
5、_LOCKS,保證該參數(shù)合理設(shè)置,gc_files_to_locks語法:GC_FILES_TO_LOCKS = " file_list=lock_count!blocksEACH:."具體設(shè)置時(shí)請(qǐng)參考第六章中問題12、13、14的分析結(jié)果,一般情況下,SYSTEM、ROLLBACK、有大量沖突目標(biāo)所在的文件必須重新設(shè)置。該參數(shù)給每一個(gè)數(shù)據(jù)文件給出一定數(shù)量的PCM鎖,這樣減少鎖資源的分配開銷并減少訪問沖突3)減少雙機(jī)之間的通訊,在初始化文件中設(shè)置: _lm_send_direct_send=lkmgr4)檢查L(zhǎng)M_RESS、LM_LOCKS參數(shù) 檢查V$resource_l
6、imit,保證LM_RESS、LM_LOCKS大于該視圖中的最大值,且lm_locks>=lm_ress 參照第六章的問題8。5)DLM(分布式鎖管理器)通信量如果TCKT_AVAIL值特大,會(huì)導(dǎo)致系統(tǒng)很慢,有時(shí)會(huì)導(dǎo)致OPS掛起;如果TCKT_WAIT是YES,則系統(tǒng)已經(jīng)耗盡DLM資源(沒有ticket),因此必須確保足夠的TCKT_AVAIL。select * from gv$dlm_traffic_controller order by TCKT_AVAIL; 參照第六章的問題96)按第六章描述逐步檢查2性能很低1)檢查DLM資源檢查gv$dlm_traffic_controller
7、視圖,如果TCKT_AVAIL值特大,會(huì)導(dǎo)致系統(tǒng)很慢select * from gv$dlm_traffic_controller order by TCKT_AVAIL; 參照第六章的問題9 2)按第六章的問題12、13、14檢查塊沖突,按問題8檢查L(zhǎng)M資源3)按第六章描述逐步檢查。3分布式鎖進(jìn)程LMD0占用資源很大 該問題一般是ORACLE的BUG,需要打補(bǔ)丁。另外,也可通過下列設(shè)置減少LMD0的工作。_lm_send_direct_send=lkmgr六、OPS問題通用分析方法 所有OPS問題均可通過下列腳本進(jìn)行分析,這些腳本檢查OPS涉及的各方面的內(nèi)容,所以,對(duì)所有OPS問題,如果時(shí)間
8、允許,應(yīng)按下列方法全面檢查。另外,這些腳本也可用于日常維護(hù)。1)等待會(huì)話下列SQL語句查詢運(yùn)行時(shí)間較長(zhǎng),且等待一個(gè)非空閑的等待事件,事件名可從V$EVENT_NAME中查找。select sw.inst_id, sw.sid, sw.state, sw.event, sw.seconds_in_wait seconds, sw.p1, sw.p2, sw.p3, sa.sql_text last_sql from gv$session_wait sw, gv$session s, gv$sqlarea sa where sw.event not in ('rdbms ipc messa
9、ge','smon timer','pmon timer', 'SQL*Net message from client','lock manager wait for remote message', 'ges remote message', 'client message', 'pipe get', 'Null event', 'PX Idle Wait', 'single-task message', 'PX De
10、q: Execution Msg', 'KXFQ: kxfqdeq - normal deqeue', 'listen endpoint status','slave wait','wakeup time manager') and seconds_in_wait > 0 and (sw.inst_id = s.inst_id and sw.sid = s.sid) and (s.inst_id = sa.inst_id and s.sql_address = sa.address) order by seconds
11、 desc; 其中p1、p2、p3通過下列語句查找select distinct event, p1text, p2text, p3text from gv$session_wait sw where sw.event not in ('rdbms ipc message','smon timer','pmon timer', 'SQL*Net message from client','lock manager wait for remote message', 'ges remote message
12、39;, 'client message', 'SQL*Net more data from client', 'pipe get', 'Null event', 'PX Idle Wait', 'single-task message', 'wakeup time manager') and seconds_in_wait > 0 order by event; 2)PCM鎖資源阻擋 這個(gè)腳本顯示正在持有鎖的會(huì)話正在阻擋另一個(gè)會(huì)話,其中g(shù)rant_level表示被授予的PCM
13、鎖,request_level表示正在請(qǐng)求的鎖,lockstate表示鎖的狀態(tài),seconds_in_wait表示會(huì)話等待時(shí)間。select dl.inst_id, s.sid, p.spid, dl.resource_name1, decode(substr(dl.grant_level,1,8),'KJUSERNL','Null','KJUSERCR','Row-S (SS)', 'KJUSERCW','Row-X (SX)','KJUSERPR','Share'
14、,'KJUSERPW','S/Row-X (SSX)', 'KJUSEREX','Exclusive',request_level) as grant_level, decode(substr(dl.request_level,1,8),'KJUSERNL','Null','KJUSERCR','Row-S (SS)', 'KJUSERCW','Row-X (SX)','KJUSERPR','Share'
15、,'KJUSERPW','S/Row-X (SSX)', 'KJUSEREX','Exclusive',request_level) as request_level, decode(substr(dl.lockstate,1,8),'KJUSERGR','Granted','KJUSEROP','Opening', 'KJUSERCA','Canceling','KJUSERCV','Converting
16、9;) as lockstate, s.sid, sw.event, sw.seconds_in_wait sec from gv$dlm_locks dl, gv$process p, gv$session s, gv$session_wait sw where blocker = 1 and (dl.inst_id = p.inst_id and dl.pid = p.spid) and (p.inst_id = s.inst_id and p.addr = s.paddr) and (s.inst_id = sw.inst_id and s.sid = sw.sid) order by
17、sw.seconds_in_wait desc; 3)PCM鎖等待 這個(gè)腳本顯示正在等待鎖資源的會(huì)話,其中g(shù)rant_level表示被授予的PCM鎖,request_level表示正在請(qǐng)求的鎖,lockstate表示鎖的狀態(tài),seconds_in_wait表示會(huì)話等待時(shí)間。select dl.inst_id, s.sid, p.spid, dl.resource_name1, decode(substr(dl.grant_level,1,8),'KJUSERNL','Null','KJUSERCR','Row-S (SS)', &
18、#39;KJUSERCW','Row-X (SX)','KJUSERPR','Share','KJUSERPW','S/Row-X (SSX)', 'KJUSEREX','Exclusive',request_level) as grant_level, decode(substr(dl.request_level,1,8),'KJUSERNL','Null','KJUSERCR','Row-S (SS)', &
19、#39;KJUSERCW','Row-X (SX)','KJUSERPR','Share','KJUSERPW','S/Row-X (SSX)', 'KJUSEREX','Exclusive',request_level) as request_level, decode(substr(dl.lockstate,1,8),'KJUSERGR','Granted','KJUSEROP','Opening',
20、9;KJUSERCA','Cancelling','KJUSERCV','Converting') as lockstate, s.sid, sw.event, sw.seconds_in_wait sec from gv$dlm_locks dl, gv$process p, gv$session s, gv$session_wait sw where blocked = 1 and (dl.inst_id = p.inst_id and dl.pid = p.spid) and (p.inst_id = s.inst_id and p
21、.addr = s.paddr) and (s.inst_id = sw.inst_id and s.sid = sw.sid) order by sw.seconds_in_wait desc; 4)本地鎖等待 這個(gè)腳本顯示本地鎖等待,其中,addr表示鎖地址,type表示鎖類型,如果鎖類型是TM,則id1表示目標(biāo)ID,id2為0;如果鎖類型是TX,則id1表示事務(wù)槽數(shù)(可不管),select l.inst_id, l.sid, l.addr, l.type, l.id1, l.id2, decode(l.block,0,'blocked',1,'blocking
22、39;,2,'global') block, sw.event, sw.seconds_in_wait sec from gv$lock l, gv$session_wait sw where (l.sid = sw.sid and l.inst_id = sw.inst_id) and l.block in (0,1) order by l.type, l.inst_id, l.sid; 5)門閂鎖保持該鎖是oracle的內(nèi)部鎖,用戶無法控制,一般持有時(shí)間很短,可不考慮。但如果發(fā)現(xiàn)門閂鎖沖突或'latch free'等待時(shí)間,則應(yīng)執(zhí)行下列腳本檢查。select
23、 distinct lh.inst_id, s.sid, s.username, p.username os_user, from gv$latchholder lh, gv$session s, gv$process p where (lh.sid = s.sid and lh.inst_id = s.inst_id) and (s.inst_id = p.inst_id and s.paddr = p.addr) order by lh.inst_id, s.sid; 門鎖的命中率腳本如下:select distinct lh.inst_id, s.sid, s.usern
24、ame, p.username os_user, from gv$latchholder lh, gv$session s, gv$process p where (lh.sid = s.sid and lh.inst_id = s.inst_id) and (s.inst_id = p.inst_id and s.paddr = p.addr) order by lh.inst_id, s.sid; 或select inst_id, name latch_name, round(immediate_gets/(immediate_gets+immediate_misses),
25、 3) hit_ratio, round(sleeps/decode(immediate_misses,0,1,immediate_misses),3) "SLEEPS/MISS" from gv$latch where round(immediate_gets/(immediate_gets+immediate_misses), 3) < .99 and immediate_gets + immediate_misses > 0 order by round(immediate_gets/(immediate_gets+immediate_misses), 3
26、); 6) 全局緩存一致性讀性能腳本 這個(gè)腳本顯示了平均一致性讀的等待時(shí)間(AVG CR BLOCK RECEIVE TIME),一般情況下該值為15毫秒左右,該值與DB_MULTI_BLOCK_READ_COUNT參數(shù)有關(guān),DB_MULTI_BLOCK_READ_COUNT值越大,等待時(shí)間也越大。 select b1.inst_id, b2.value "GCS CR BLOCKS RECEIVED", b1.value "GCS CR BLOCK RECEIVE TIME", (b1.value / b2.value) * 10) "AVG
27、 CR BLOCK RECEIVE TIME (ms)" from gv$sysstat b1, gv$sysstat b2 where = 'global cache cr block receive time' and = 'global cache cr blocks received' and b1.inst_id = b2.inst_id ; 7)全局緩存鎖性能該腳本顯示平均全局隊(duì)列得到時(shí)間(AVG GLOBAL LOCK GET TIME),一般情況下,AVG GLOBAL LOCK GET TIME為20-
28、30毫秒(包括分配全局隊(duì)列和初始化時(shí)間)。如果該值很大,則系統(tǒng)必然存在許多超時(shí)時(shí)間,請(qǐng)按上述方法檢查。 select b1.inst_id, (b1.value + b2.value) "GLOBAL LOCK GETS", b3.value "GLOBAL LOCK GET TIME", (b3.value / (b1.value + b2.value) * 10) "AVG GLOBAL LOCK GET TIME (ms)" from gv$sysstat b1, gv$sysstat b2, gv$sysstat b3 whe
29、re = 'global lock sync gets' and = 'global lock async gets' and = 'global lock get time' and b1.inst_id = b2.inst_id and b2.inst_id = b3.inst_id; 8)資源應(yīng)用 這個(gè)腳本檢查資源利用情況。select inst_id, resource_name, current_utilization, max_utilization, initial_allocatio
30、n from gv$resource_limit where max_utilization > 0 order by inst_id, resource_name; 9)DLM(分布式鎖管理器)通信量如果該值特大,會(huì)導(dǎo)致系統(tǒng)很慢,有時(shí)會(huì)導(dǎo)致OPS掛起。如果TCKT_WAIT是YES,則系統(tǒng)已經(jīng)耗盡DLM資源(沒有ticket),因此必須確保足夠的TCKT_AVAIL。select * from gv$dlm_traffic_controller order by TCKT_AVAIL; 10)DLM雜項(xiàng)select * from gv$dlm_misc; 11)鎖轉(zhuǎn)換詳細(xì)信息selec
31、t * from gv$lock_activity; 12)前10個(gè)寫PING/FUSION數(shù)最大的的目標(biāo)select inst_id, name, kind, file#, status, BLOCKS, READ_PINGS, WRITE_PINGS from (select p.inst_id, , p.kind, p.file#, p.status, count(p.block#) BLOCKS, sum(p.forced_reads) READ_PINGS, sum(p.forced_writes) WRITE_PINGS from gv$ping p, gv$dataf
32、ile df where p.file# = df.file# (+) group by p.inst_id, , p.kind, p.file#, p.status order by sum(p.forced_writes) desc) where rownum < 11 order by WRITE_PINGS desc; 13)前10個(gè)讀PINGING/FUSION數(shù)最大的目標(biāo)select inst_id, name, kind, file#, status, BLOCKS, READ_PINGS, WRITE_PINGS from (select p.inst_id,
33、 , p.kind, p.file#, p.status, count(p.block#) BLOCKS, sum(p.forced_reads) READ_PINGS, sum(p.forced_writes) WRITE_PINGS from gv$ping p, gv$datafile df where p.file# = df.file# (+) group by p.inst_id, , p.kind, p.file#, p.status order by sum(p.forced_reads) desc) where rownum < 11 order
34、 by READ_PINGS desc; 14)前10個(gè)FALSE PINGING數(shù)最大的目標(biāo) 這個(gè)可通過gc_files_to_locks避免。select inst_id, name, kind, file#, status, BLOCKS, READ_PINGS, WRITE_PINGS from (select p.inst_id, , p.kind, p.file#, p.status, count(p.block#) BLOCKS, sum(p.forced_reads) READ_PINGS, sum(p.forced_writes) WRITE_PINGS from
35、 gv$false_ping p, gv$datafile df where p.file# = df.file# (+) group by p.inst_id, , p.kind, p.file#, p.status order by sum(p.forced_writes) desc) where rownum < 11 order by WRITE_PINGS desc; 15)前10個(gè)等待事件select inst_id, event, time_waited, total_waits, total_timeouts from (select inst_id, eve
36、nt, time_waited, total_waits, total_timeouts from gv$system_event where event not in ('rdbms ipc message','smon timer', 'pmon timer','SQL*Net message from client', 'lock manager wait for remote message','ges remote message', 'client message', &
37、#39;SQL*Net more data from client', 'pipe get', 'Null event', 'PX Idle Wait', 'single-task message', 'wakeup time manager') order by time_waited desc) where rownum < 11 order by time_waited desc; 16)顯示會(huì)話、進(jìn)程、程序select p.inst_id, s.sid, s.serial#, p.pid, p
38、.spid, gram, s.username, p.username os_user, sw.event, sw.seconds_in_wait sec from gv$process p, gv$session s, gv$session_wait sw where (p.inst_id = s.inst_id and p.addr = s.paddr) and (s.inst_id = sw.inst_id and s.sid = sw.sid) order by p.inst_id, s.sid; 17)沒有授予等待會(huì)話的PCM鎖這個(gè)腳本顯示當(dāng)前沒有授予任何等待會(huì)話的鎖,由于
39、沖突這些鎖很可能有問題。select dl.inst_id, s.sid, p.spid, dl.resource_name1, decode(substr(dl.grant_level,1,8),'KJUSERNL','Null','KJUSERCR','Row-S (SS)', 'KJUSERCW','Row-X (SX)','KJUSERPR','Share','KJUSERPW','S/Row-X (SSX)', 'KJ
40、USEREX','Exclusive',request_level) as grant_level, decode(substr(dl.request_level,1,8),'KJUSERNL','Null','KJUSERCR','Row-S (SS)', 'KJUSERCW','Row-X (SX)','KJUSERPR','Share','KJUSERPW','S/Row-X (SSX)', 'KJ
41、USEREX','Exclusive',request_level) as request_level, decode(substr(dl.lockstate,1,8),'KJUSERGR','Granted','KJUSEROP','Opening', 'KJUSERCA','Cancelling','KJUSERCV','Converting') as lockstate, s.sid, sw.event, sw.seconds_in_wa
42、it sec from gv$dlm_locks dl, gv$process p, gv$session s, gv$session_wait sw where lockstate not like 'KJUSERGRANTED%' and (dl.inst_id = p.inst_id and dl.pid = p.spid) and (p.inst_id = s.inst_id and p.addr = s.paddr) and (s.inst_id = sw.inst_id and s.sid = sw.sid) and sw.event not in ('rd
43、bms ipc message','smon timer','pmon timer', 'SQL*Net message from client','lock manager wait for remote message', 'ges remote message', 'client message', 'pipe get', 'Null event', 'PX Idle Wait', 'single-task message'
44、;, 'PX Deq: Execution Msg', 'KXFQ: kxfqdeq - normal deqeue', 'listen endpoint status','slave wait','wakeup time manager') and sw.seconds_in_wait > 0 order by sec desc; PCM鎖沒有授予等待會(huì)話的鎖全部信息:select * from gv$dlm_all_locks where resource_name1 in (select dl.reso
45、urce_name1 from gv$dlm_locks dl, gv$process p, gv$session s, gv$session_wait sw where lockstate not like 'KJUSERGRANTED%' and (dl.inst_id = p.inst_id and dl.pid = p.spid) and (p.inst_id = s.inst_id and p.addr = s.paddr) and (s.inst_id = sw.inst_id and s.sid = sw.sid) and sw.event not in (
46、9;rdbms ipc message','smon timer','pmon timer', 'SQL*Net message from client','lock manager wait for remote message', 'ges remote message', 'client message', 'SQL*Net more data from client', 'pipe get', 'Null event', 'PX Idle Wait', 'single-task message', 'wakeup time manager') and sw.seconds_in_wait > 0) order by resource_name1, inst_id, lockstate, pid; PCM鎖沒有授予等待會(huì)話的資源全部信息:select * from gv$dlm_ress where resource_name in (select dl.resource_name1 from gv$dlm_locks dl, gv$process p
溫馨提示
- 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請(qǐng)下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請(qǐng)聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會(huì)有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
- 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
- 5. 人人文庫網(wǎng)僅提供信息存儲(chǔ)空間,僅對(duì)用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對(duì)用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對(duì)任何下載內(nèi)容負(fù)責(zé)。
- 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請(qǐng)與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時(shí)也不承擔(dān)用戶因使用這些下載資源對(duì)自己和他人造成任何形式的傷害或損失。
最新文檔
- 員工轉(zhuǎn)合伙合同范本
- 土地抵押糾紛合同范本
- 商業(yè)用房裝修合同范本
- 春分營(yíng)銷策略探討
- 班級(jí)挑戰(zhàn)書霸氣簡(jiǎn)短
- 初一地理的探索之旅
- 國(guó)家建安合同范例
- 國(guó)外設(shè)備合同范例
- 合同范例公示寫好
- 出口銷售合同范例
- 教科版科學(xué)四年級(jí)下冊(cè)教師用書
- 娛樂主播如何轉(zhuǎn)型做帶貨主播
- 四川省德陽市各縣區(qū)鄉(xiāng)鎮(zhèn)行政村村莊村名居民村民委員會(huì)明細(xì)及行政區(qū)劃代碼
- 青島版三年級(jí)數(shù)學(xué)下冊(cè)全套單元測(cè)試卷
- (參考)食品加工操作流程圖
- 2023高中物理步步高大一輪 第十章 第1講 磁場(chǎng)及其對(duì)電流的作用
- 空分設(shè)備安全培訓(xùn)課件
- Adobe-Illustrator-(Ai)基礎(chǔ)教程
- 沒頭腦和不高興-竇桂梅.精選優(yōu)秀PPT課件
- 鋼棧橋計(jì)算書(excel版)
- 租賃合同審批表
評(píng)論
0/150
提交評(píng)論