
nmon-Based Analysis of a PowerHA Outage

Contents
NMON mem sheet
NMON memnew sheet
NMON memuse sheet
NMON net sheet
NMON page sheet
NMON topas sheet
Interpreting the %Processor by PID chart

Overview
This analysis is based on logs captured with the nmon command. Collection started at 00:00, with one sample every two minutes, for a total of 24 hours. The logs used here are from March 31 and April 1; the host is jddbs02.

NMON mem sheet

Official description
The main graph on this sheet shows the amount of Real Free memory in Mbytes by time of day. This would be the same as dividing the fre values reported by vmstat over the same interval by 256. The small graph shows the amount of real memory. This is useful in determining if dynamic reconfiguration has been used during the collection period.

For AIX, other columns on the sheet are as follows:
Real Free: the percentage of real pages on the free list.
Virtual Free: the percentage of unallocated virtual slots on the paging spaces.
Real Free (MB): the amount of memory on the free list in Mbytes.
Virtual Free (MB): the amount of unallocated space on the paging spaces.
Real Total (MB): the total amount of memory available to AIX.
Virtual Total (MB): the total amount of space allocated for paging spaces.

Note: you can calculate the amount of memory used during an interval simply by subtracting the Real Free (MB) value from the Real Total (MB) value. This will, however, include file pages. The graph on the MEMUSE sheet gives a more accurate assessment of memory used by programs (computational pages).

Item-by-item analysis
The free-memory percentage began to drop sharply at around 16:00.
Starting at 16:02, the real-memory free percentage declined steadily to about 50% and did not recover afterwards.

Summary
From 16:00 onward the free-memory percentage kept falling; in other words, memory usage kept growing.

NMON memnew sheet

Official description
The graph shows the allocation of memory split into the three major categories:

pages used by user processes, file system cache, and pages used by the system (kernel).
Process%: the percentage of real memory allocated to user processes.
FSCache%: the percentage of real memory allocated to file system cache.
System%: the percentage of real memory used by system segments.
Free%: the percentage of unallocated real memory.
User%: the percentage of real memory used by non-system segments.

Item-by-item analysis
Overall, memory usage changes gently: the System item (memory held by the system) does not grow, while the Process item shows a slow upward trend.
From around 16:00, the User% share rises continuously; the other items do not change.

Summary
The additional memory usage comes mainly from continued growth in user-type processes.

NMON memuse sheet

Official description
Except for %comp, the values on this sheet are the same as would be reported by the vmtune command.
%numperm: the percentage of real memory allocated to file pages.
%minperm: value specified on the vmtune command or system default of 20%. This will normally be constant for the run unless the vmtune or rmss commands are used during collection.
%maxperm: value specified on the vmtune command or system default of 80%. This will normally be constant for the run unless the vmtune or rmss commands are used during collection.
minfree: the minimum number of pages AIX is to keep on the free list. Specified on the vmtune command or system default of maxfree - 8.
maxfree: the maximum number of pages AIX will steal in order to replenish the free list. Specified on the vmtune command or system default.
%comp: the percentage of real memory allocated to computational pages. NMON_Analyser calculates this value. Computational pages are those backed by page space and include working storage and program text segments. They exclude data, executable and shared library files.

The Analyser generates two graphs. The first shows the split between computational and file pages by time of day. The second plots the values of %numperm, %minperm, %maxperm and %comp.
If %numperm falls below %minperm then computational pages will be stolen.
If %numperm rises above %maxperm then computational pages cannot be stolen.
Low values for both %minperm and %maxperm indicate that the system has been tuned for a database server.
You may also want to check the setting of STRICT_MAXPERM on the BBBP sheet (if present).

Item-by-item analysis
The graph shows that computational memory did not change noticeably, while persistent (file) memory started to grow continuously after 16:00 and the added memory was never released.
As the graph shows, %numperm and %numclient memory started to grow after 16:00 and levelled off at around 16:30; the memory gained during that growth was never released afterwards.

Summary
The memory that kept growing from 16:00 appears to have gone to persistent (file) pages, and specifically to the network file system type (inferred from numclient).

NMON net sheet

Official description
This sheet shows the data rates, in Kbytes/sec, for each network adapter in the system (including SP switch if present). This is the same as produced by the netpmon -O dd command.
NMON_Analyser adds one column for each adapter showing the total data rate (read + write) and two columns showing Total Read and Total Write. Note that the Total Write is calculated as a negative number for graphing.
The Analyser generates three graphs. The first graph shows total network traffic broken down as Total-Read and Total-Write. The writes are shown below the X-Axis. Note the area chart can be easily converted to a line chart if required. Simply right click on the white space within the chart area, then select Chart Type > Line > OK.

Item-by-item analysis
As the graph shows,

around 16:00 the en4 adapter carried a large volume of read traffic.

Summary
This indirectly corroborates the heavy use of numclient memory.

NMON page sheet

Official description
This sheet has the paging statistics as recorded by NMON.
faults: the number of page faults per second. This is not a count of page faults that generate I/O, because some page faults can be resolved without I/O.
pgin: the total rate/sec of page-in operations to both paging space and file systems during the interval.
pgout: the total rate/sec of page-out operations to both paging space and file systems during the interval.
pgsin: the rate/sec of page-in operations from paging space during the interval. This is the same as the pi value reported by vmstat. If pgsin is consistently higher than pgsout this may indicate thrashing.
pgsout: the rate/sec of page-out operations to paging space during the interval. This is the same as the po value reported by vmstat.
reclaims: from NMON 10 onwards this field is the same as the fr value reported by vmstat and represents the number of pages/sec freed by the replacement routine.
scans: the number of pages/sec examined by the page replacement routine. This is the same as the sr value reported by vmstat. Page replacement is initiated when the number of free pages falls below minfree and stops when the number of free pages exceeds maxfree.
cycles: the number of times/sec the page replacement routine had to scan the entire Page Frame Table in order to replenish the free list. This is the same as the cy value reported by vmstat but note that vmstat reports this number as an integer whereas nmon reports it as a real number.
fsin: calculated by the Analyser as pgin - pgsin for graphing.
fsout: calculated by the Analyser as pgout - pgsout for graphing.
sr/fr: calculated by the Analyser as scans / reclaims for graphing.

NMON_Analyser produces two graphs. The first shows paging operations to/from paging space. The ideal here would be

no more than 5 operations/sec per page space (see the BBBC sheet for details). The second graph shows the scan:free rate. Memory may be over-committed when this figure is greater than 4, although you also need to examine the MEM and PAGE sheets as well.

Item-by-item analysis
Around 16:00 there was a burst of file-system-based page-out activity.
The burst shows up in both pgout and fsout at around 16:00; the faults value is high throughout the whole collection period.

Summary
The page I/O around 16:00 can be explained by the heavy numclient memory activity in that period. The high faults value is left open for now.

NMON topas sheet

Official description
This sheet is only generated if you specify the -t flag on the NMON command line. The output is similar to that produced using the ps v command.
Note that, because of the limitation of having only 65,000 lines on a single sheet, some data may be omitted for very large files and this may mean that entire PIDs or even commands may be missing from the analysis.
Note that data are only present for processes that consumed a significant amount of CPU during an interval. The TOP sheet does not represent a complete view of the system.

NMON_Analyser does the following:
Reorders the columns for easier processing.
Sorts the data on the sheet into COMMAND name order, using TIME as a minor sort key.
Creates a table at the end of the sheet summarising the data by command name and used for graphing.

You can see the detail section by scrolling to the top of the sheet. The summary table is largely obscured by the graphs and so you will need to move (or delete) them for easier viewing.
PID: in the detail section this is the process ID of a specific invocation of a command. In the summary table this is the command name.
%CPU:

in the detail section this is the utilisation of a single processor (rather than of the system) by that PID during the interval. In the summary table this is the average amount of CPU used by all invocations of the command during the collection period.
%Usr: in the detail section this is the average amount of User-mode CPU used by that PID during the interval.
%Sys: in the detail section this is the average amount of Kernel-mode CPU used by that PID during the interval.
Threads: the number of (software) threads being used by this command.
Size: the average amount of paging space (in Kbytes) allocated for the data section (private segment + shared library data pages) for one invocation of this command. This is the same as the SIZE figure on the ps v command. Note that if Size is greater than ResData it means some working segment pages are currently paged out.
ResText: the average amount of real memory (in Kbytes) used for the code segments of one invocation of this command. Note that multiple concurrent invocations will normally share these pages.
ResData: the average amount of real memory (in Kbytes) used for the data segments of one invocation of this command. A method of calculating real memory usage for a command is ResText + (ResData * N).
CharIO: this is the count of bytes/sec being passed via the read and write system calls. The bulk of this is reading and writing to disks but also includes data to/from terminals, sockets and pipes. Use this to work out which processes are doing the

I/O.
%RAM: this is an indication of what percentage of real memory this command is using. This is (ResText + ResData) / Real Mem; it is the same as the %MEM value on the ps v command. Due to rounding/truncation, and the large amounts of memory in modern systems, this is usually 0.
Paging: sum of all page faults for this process. Use this to identify which process is causing paging but note that the figure includes asynchronous I/O and can be misleading.
Command: name of the command.
WLMClass: name of the Workload Partition or Workload Manager superclass to which this command has been allocated (64-bit kernel only).
IntervalCPU: generated by the Analyser. In the detail section this shows the total amount of CPU used by all invocations of a command in the time interval. It is calculated as the sum of CPU used by all PIDs running the same command divided by the number of active processors (physical cores) available during the interval. In the summary section this is broken down as Average, Weighted Average and Maximum and is used to generate the graph.
WSet: generated by the Analyser. In the detail section this shows the total amount of memory used by all invocations of a command recorded in the time interval. It is calculated as ResText + (ResData * N) (where "N" is the number of copies of this command running concurrently during the interval). In the summary section this is broken down as Minimum, Average and Maximum and is used to generate the graph.
User: generated by the Analyser if a UARG sheet is present. This contains the name of the user running the process.
Arg: generated by the Analyser if a UARG sheet is present. This contains the complete argument string entered for the command.

The Analyser generates four graphs using data in the generated table:
A graph showing Average, Weighted Average and Maximum CPU Utilisation by command.
A graph showing Minimum, Average and Maximum Memory Utilisation by command.
A graph showing Average, Weighted Average and Maximum CHARIO by command.
A graph showing the CPU utilisation for each PID for each interval as a scatter chart. Note that this chart is only produced if there are fewer than 32,000 lines on the TOP sheet. See below for notes on interpreting this chart.

Interpreting the %Processor by PID chart
The purpose of the chart
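The IntervalCPU and WSet definitions quoted in the TOP sheet description can be sketched in a few lines of Python. This is only an illustration of the two formulas, not the Analyser's own code: the function name `summarise_top`, the dictionary keys, the command name `db_worker` and the sample numbers are all hypothetical and do not come from the jddbs02 logs.

```python
from collections import defaultdict


def summarise_top(rows, active_cores):
    """Group TOP rows by (time, command) and derive IntervalCPU and WSet.

    rows: list of dicts with the hypothetical keys
          time, command, cpu, res_text, res_data (sizes in Kbytes).
    active_cores: number of active physical cores during the interval.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["time"], row["command"])].append(row)

    summary = {}
    for key, procs in groups.items():
        n = len(procs)  # N: copies of the command running in this interval
        avg_res_text = sum(p["res_text"] for p in procs) / n
        avg_res_data = sum(p["res_data"] for p in procs) / n
        summary[key] = {
            # Sum of per-PID CPU divided by the number of active cores.
            "interval_cpu": sum(p["cpu"] for p in procs) / active_cores,
            # ResText + (ResData * N): code pages counted once, data per copy.
            "wset_kb": avg_res_text + avg_res_data * n,
        }
    return summary


# Illustrative values only.
rows = [
    {"time": "T0001", "command": "db_worker", "cpu": 40.0,
     "res_text": 1000, "res_data": 2000},
    {"time": "T0001", "command": "db_worker", "cpu": 60.0,
     "res_text": 1000, "res_data": 4000},
]
result = summarise_top(rows, active_cores=4)
print(result[("T0001", "db_worker")])
```

Counting ResText once reflects the note that concurrent invocations normally share code pages, while ResData is multiplied by the number of copies.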

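The derived values described for the MEM and PAGE sheets (Real Free in Mbytes from the vmstat fre value, memory used, fsin, fsout and sr/fr) reduce to simple arithmetic. The sketch below restates those relations under the assumption of 4 KB pages, which is what the divide-by-256 rule implies; the function names and the sample numbers are hypothetical and are not taken from the analysed logs.

```python
PAGES_PER_MB = 256  # assumes 4 KB pages: 256 pages per Mbyte


def real_free_mb(fre_pages):
    """Convert the vmstat fre value (pages) to Real Free (MB)."""
    return fre_pages / PAGES_PER_MB


def memory_used_mb(real_total_mb, free_mb):
    """Real Total (MB) minus Real Free (MB); note this includes file pages."""
    return real_total_mb - free_mb


def derive_page_columns(pgin, pgout, pgsin, pgsout, scans, reclaims):
    """Compute the columns the Analyser adds to the PAGE sheet."""
    return {
        "fsin": pgin - pgsin,     # page-ins from file systems
        "fsout": pgout - pgsout,  # page-outs to file systems
        # scan:free ratio; undefined when nothing was reclaimed
        "sr_fr": scans / reclaims if reclaims else None,
    }


# Illustrative values only.
print(real_free_mb(131072))
print(memory_used_mb(16384, 8192))
print(derive_page_columns(pgin=120, pgout=300, pgsin=20, pgsout=0,
                          scans=800, reclaims=200))
```

A large fsout with pgsout at zero corresponds to the pattern described for the 16:00 window: page-outs going to file systems rather than to paging space.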