William Stallings, Computer Organization and Architecture, 6th Edition
Chapter 4  Cache Memory (a key chapter)

4.1 Computer Memory System Overview

1. Characteristics of Memory Systems (Page 96)
Location, capacity, unit of transfer, access method, performance, physical type, physical characteristics, organization.

(1) Location
- CPU (registers, cache)
- Internal (main memory, cache)
- External (disk, tape)

(2) Capacity
- Word size / word length: the natural unit of organisation, e.g. 8, 16 or 32 bits.
- Number of words (or bytes).
- Note: the number of bits in a word differs from machine to machine, so the word is not a universally agreed unit; the byte is the standard unit (1 byte = 8 bits).
- Capacity = number of words (N) x word length (bits per word).

(3) Unit of Transfer
- Internal: a "word", usually governed by the data bus width and equal to the word length (or some number of bits).
- External: a "block", usually much larger than a word (e.g. 4, 8 or 16 words); 1 block = 2^n words.

(4) Access Methods (1)
- Sequential access: start at the beginning and read through in order; access time depends on the location of the data and on the previous location; e.g. tape.
- Direct access: individual blocks have a unique address; access time depends on the location of the data and on the previous location; e.g. disk.

Access Methods (2)
- Random access: individual addresses identify locations exactly; access time is independent of location and of the previous access; e.g. RAM (main memory).
- Associative access: data is located by comparison with the contents of a portion of the store; access time is independent of location and of the previous access; e.g. cache.

(5) Performance (Page 98)
- Access time: the time between presenting the address and getting valid data.
- Memory cycle time: the memory may need time to "recover" before the next access; cycle time = access time + recovery time.
- Transfer rate: the rate at which data can be moved.
- Note: "access" covers both directions; it includes fetch and store.

(6) Physical Types
- Semiconductor: RAM (DRAM and SRAM)
- Magnetic: disk and tape
- Optical: CD and DVD
- Magneto-optical: MO (rarely used)

(7) Physical Characteristics
- Volatile / non-volatile
- Erasable (read/write) / non-erasable (read-only)

(8) Organization
- The physical arrangement of bits into words.

2. Memory Hierarchy (Page 99)
- Registers: in the CPU.
- Internal or main memory: may include one or more levels of cache; "RAM".
- External memory: backing store.

Memory Hierarchy - Diagram (Page 100)
- Inboard memory: registers, cache, main memory.
- Outboard storage: magnetic disk, CD, DVD.
- Off-line storage: magnetic tape, magneto-optical disk, write-once (WORM) discs.
- Moving down the hierarchy: cost per bit decreases, capacity increases, and access time increases (speed decreases).

Hierarchy List
- Registers (in the CPU)
- L1 cache
- L2 cache
- Main memory
- Disk cache
- Disk
- Optical
- Tape

Example: two levels of memory (Page 101)
- Level 1: fast but small; access time T1 = 0.01 us.
- Level 2: slow but large; access time T2 = 0.1 us.
- Assumption: everything the CPU needs is in Level 2, and may also be in Level 1.

Access method:
- The CPU first accesses Level 1. If the data is found there (a hit), it is delivered directly, which is fast.
- If it is not found (a miss), Level 2 is accessed, the required data is read into Level 1, and it is then delivered from Level 1 to the CPU.

Example: two levels of memory (Page 102)
- Let H be the hit ratio, T the average access time, and T1, T2 the access times of Level 1 and Level 2. Then
  T = H*T1 + (1 - H)*(T1 + T2) = T1 + (1 - H)*T2.
- If H is near 100%, T approaches T1; if H is near 0%, T approaches T1 + T2.
- Suppose H = 0.95; then T = T1 + (1 - 0.95)*T2 = 0.015 us.
- Result: the speed is close to the Level 1 access time, while the capacity is that of Level 2.

Performance of a simple two-level memory (Page 101): the curve of T = T1 + (1 - H)*T2 against H.
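As a quick check on this arithmetic, here is a small C sketch (ours, not part of the slides; the name avg_access_time is made up) that evaluates T = T1 + (1 - H)*T2 for the same numbers:

```c
#include <stdio.h>

/* Average access time of a two-level memory:
 * T = H*T1 + (1-H)*(T1+T2) = T1 + (1-H)*T2 */
static double avg_access_time(double t1, double t2, double hit_ratio)
{
    return t1 + (1.0 - hit_ratio) * t2;
}

int main(void)
{
    const double t1 = 0.01, t2 = 0.1;   /* microseconds, as in the example */
    printf("H=0.95: T=%.3f us\n", avg_access_time(t1, t2, 0.95)); /* 0.015 */
    printf("H=1.00: T=%.3f us\n", avg_access_time(t1, t2, 1.00)); /* 0.010 */
    printf("H=0.00: T=%.3f us\n", avg_access_time(t1, t2, 0.00)); /* 0.110 */
    return 0;
}
```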

So you want it fast?
- It is possible to build a computer that uses only static RAM (see later).
- It would be very fast, and it would need no cache (a simpler structure) - how would you cache a cache anyway?
- But it would cost a very large amount. (Page 103)

4.2 Cache (Page 103)
- A small amount of fast memory (like Level 1 above: fast, small - the "retail" end) that sits between normal main memory (like Level 2: slow, large - the "wholesale" end) and the CPU.
- It may be located on the CPU chip or on a module.

Cache operation - overview (Page 104)
- The CPU requests the contents of a memory location.
- The cache is checked for this data.
- If it is present (a hit), it is delivered from the cache (fast).
- If it is not present (a miss), the required block is read from main memory into the cache, and then delivered from the cache to the CPU.
- The cache includes tags to identify which block of main memory is in each cache slot (line).
- A small code sketch of this read flow follows the figure notes below.

Cache read operation - flowchart (Page 105). RA = real address.

Typical Cache Organization (Page 106)

Cache / main-memory structure (Page 104): main memory holds M blocks and the cache holds C lines, with C much smaller than M; block size = line size = K words (w0 ... w(K-1)).
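The read flow above can be written out in C. This is a hypothetical sketch rather than anything from the slides: find_candidate_line() and fetch_block() are placeholder names, and which line(s) a block may occupy is exactly what the mapping functions of Section 4.3 decide.

```c
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_BYTES 4

struct line {
    bool     valid;
    uint32_t block_no;            /* which main-memory block is cached here */
    uint8_t  data[BLOCK_BYTES];
};

/* Placeholders: the mapping function (Section 4.3) chooses the candidate
 * line, and main memory supplies a whole block on a miss.               */
extern struct line *find_candidate_line(uint32_t block_no);
extern void         fetch_block(uint32_t block_no, uint8_t out[BLOCK_BYTES]);

uint8_t read_byte(uint32_t addr)
{
    uint32_t block_no = addr / BLOCK_BYTES;     /* main-memory block number   */
    uint32_t offset   = addr % BLOCK_BYTES;     /* word/byte inside the block */
    struct line *l = find_candidate_line(block_no);

    if (!(l->valid && l->block_no == block_no)) {   /* miss */
        fetch_block(block_no, l->data);             /* main memory -> cache   */
        l->valid    = true;
        l->block_no = block_no;
    }
    return l->data[offset];                         /* cache -> CPU           */
}
```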

4.3 Elements of Cache Design (Page 107)
- Cache size (capacity)
- Mapping function
- Replacement algorithm
- Write policy
- Line size
- Number of caches

1. Size does matter
- Cost: more cache is more expensive.
- Speed: more cache is faster (up to a point), but checking the cache for data takes time.

2. Mapping Function (1)
- Because there are fewer cache lines than main-memory blocks, an algorithm is needed for mapping main-memory blocks onto cache lines, and for determining which main-memory block currently occupies a given cache line.
- Three techniques: direct mapping, associative mapping, and set-associative mapping.

(1) Direct Mapping
- Each block of main memory maps to only one (fixed) cache line; i.e. if a block is in the cache, it must be in one specific place.
- The address is divided into three fields:
  - the rightmost bits identify a unique word within the block (the word number, word#);
  - the middle bits specify the cache line (the line number, line#);
  - the leftmost bits serve as a tag.
- Terminology (Pages 107-112): distinguish the word number (word#) from the number of words, the line number (line#) from the number of lines, and the block number from the number of blocks; the address length is the length of the real address (RA).

A small example (not from the book): main memory = 16 words, line size = block size = 2 words, cache size = 8 words.

Elements for all three cases (the book's example, Page 107):
- Cache size = 64 KBytes = 2^16 bytes.
- Cache line size = 4 bytes = 2^2 bytes, so the number of cache lines = 2^16 / 2^2 = 2^14 (16K lines of 4 bytes each).
- Block size = line size = 4 bytes = 2^2 bytes.
- Main memory size = 16 MBytes = 2^24 bytes, so the number of memory blocks = 2^24 / 2^2 = 2^22.
- Length of RA = 24 bits (2^24 = 16M). In general M = 2^N, where N is the number of address bits and M the addressable memory capacity. RA = real address.

Direct Mapping Address Structure (Page 110)
- 24-bit address (RA): tag (8 bits) | line# (14 bits) | word# (2 bits), i.e. a 2-bit word# for 4-byte blocks, a 14-bit line#, and an 8-bit tag (= 24 - 2 - 14).
- No two blocks that map to the same line have the same tag field.
- Check the contents of the cache by finding the line and checking the tag.

Direct Mapping Cache Organization (Page 109)
- (1) Split the RA into tag, line# and word#; (2) use the line# to select a cache line; (3) compare the stored tag with the address tag - equal means a hit, unequal means a miss; (4) on a miss, the block is brought from main memory into that cache line.

Direct Mapping Example (Page 111)
- Address format: tag 8 bits | line# 14 bits | word# 2 bits.
- e.g. RA = 16339CH = 0001 0110 | 00 1100 1110 0111 | 00 B, so tag = 16H, line = 0CE7H, word = 0H.
- Equivalently, dividing the low 16 address bits 339CH by 4 gives 0CE7H, the line number.
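The field extraction can be checked with a few shifts and masks. A minimal C sketch of ours, assuming the 8/14/2 split above:

```c
#include <stdio.h>

/* Split the 24-bit address 16339CH into an 8-bit tag,
 * a 14-bit line# and a 2-bit word# (direct mapping).  */
int main(void)
{
    unsigned addr = 0x16339C;              /* the slide's example address */
    unsigned word = addr & 0x3;            /* low 2 bits                  */
    unsigned line = (addr >> 2) & 0x3FFF;  /* next 14 bits                */
    unsigned tag  = addr >> 16;            /* top 8 bits of the 24-bit RA */
    printf("tag=%02Xh line=%04Xh word=%Xh\n", tag, line, word);
    return 0;
}
```

It prints tag=16h, line=0CE7h, word=0h, matching the values above.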

Direct Mapping Summary (Page 110)
- Address length = n bits (RA); number of addressable units = 2^n words or bytes.
- Block size = line size = 2^w words or bytes; size of the word# field = w bits.
- Number of blocks in main memory = 2^n / 2^w = 2^(n-w).
- Number of lines in the cache = 2^L; size of the line# field = L bits.
- Size of the tag = (n - w - L) bits, so the n-bit address is: tag (n-w-L bits) | line# (L bits) | word# (w bits).

Direct Mapping pros & cons (Page 111)
- Pros: simple and inexpensive to implement; a given block has a fixed location.
- Cons: if a program repeatedly accesses two blocks that map to the same line, cache misses are very frequent (a conflict) and the hit ratio drops.

(2) Associative Mapping (fully associative mapping) (Page 112)
- A main-memory block can be loaded into any line of the cache.
- The memory address is interpreted as just a tag and a word#.
- The tag uniquely identifies a block of memory.
- Every line's tag is examined for a match, so searching the cache gets expensive.

Associative Mapping Address Structure
- Tag (22 bits) | Word# (2 bits).
- A 22-bit tag is stored with each 32-bit block of data; the tag field of the address is compared with the tag entry of every cache line to check for a hit.

Fully Associative Cache Organization (Page 113)
- (1) Split the RA into tag and word#; (2) compare the tag with the tags of all lines in parallel (a 1:m comparison) - a match is a hit, no match is a miss; (3) on a miss, the block from main memory may be placed in any cache line.

Associative Mapping Example (Page 114)
- Address format: tag 22 bits | word# 2 bits.
- e.g. RA = 16339CH = 00 0101 1000 1100 1110 0111 | 00 B, so tag = 058CE7H, word = 0H.

Associative Mapping Summary (Page 112)
- Address length = n bits = tag bits + word# bits; number of addressable units = 2^n words or bytes.
- Block size = line size = 2^w words or bytes; number of blocks in main memory = 2^n / 2^w = 2^(n-w).
- Number of lines in the cache: not determined by the address format (a block can map to any line).
- Size of the tag = (n - w) bits.

Associative Mapping pros & cons (Page 112)
- Flexible.
- Requires complex circuitry to examine the tags of all cache lines in parallel; the tag comparison is the main hardware difficulty.

Review
- Direct mapping: each block of main memory maps to only one (fixed) cache line; simple and inexpensive, but conflicts make misses frequent.
- Associative mapping: a main-memory block can be loaded into any line of the cache; flexible, but the tag comparison is difficult.

(3) Set Associative Mapping (Page 112)
- The cache is divided into a number of sets, and each set contains a number of lines.
- A given block maps to any line within one given (fixed) set; e.g. block B can be in any line of set i.
- With 2 lines per set this is 2-way associative mapping: a given block can be in one of 2 lines, in only one (fixed) set. In general, k lines per set gives k-way set-associative mapping.
- Address format: tag | set# | word#.

Two-Way Set Associative Cache Organization (Page 116)
- (1) Split the RA into tag, set# and word#; (2) use the set# to select a set; (3) compare the tag with the tags of the k lines of that set (a 1:k comparison, with k much smaller than m) - a match is a hit, otherwise a miss; (4) on a miss, the block is brought from main memory into one of the lines of that set.

Set Associative Mapping Address Structure
- Number of sets in the cache = 2^14 lines / 2 lines per set = 2^13 sets.
- Tag = RA - set# - word# = 24 - 13 - 2 = 9 bits.
- Format: tag (9 bits) | set# (13 bits) | word# (2 bits).
- Use the set field to determine which cache set to look in, then compare the tag field to see whether we have a hit.

Two-Way Set Associative Mapping Example
- Address format: tag 9 bits | set# 13 bits | word# 2 bits.
- e.g. RA = 16339CH = 0 0010 1100 | 0 1100 1110 0111 | 00 B, so tag = 02CH, set = 0CE7H, word = 0H.
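The same address can be checked under the other two mappings with the same kind of bit arithmetic. Again a sketch of ours, assuming the 22-bit tag split and the 9/13/2 split above:

```c
#include <stdio.h>

/* The address 16339CH under fully associative (22-bit tag + 2-bit word#)
 * and 2-way set-associative (9-bit tag + 13-bit set# + 2-bit word#).     */
int main(void)
{
    unsigned addr  = 0x16339C;
    unsigned word  = addr & 0x3;      /* 2-bit word# in both schemes     */
    unsigned block = addr >> 2;       /* 22-bit block number             */

    unsigned assoc_tag = block;       /* fully associative: tag = block# */

    unsigned set = block & 0x1FFF;    /* 2-way set assoc.: low 13 bits   */
    unsigned tag = block >> 13;       /* remaining 9 bits                */

    printf("associative : tag=%06Xh word=%Xh\n", assoc_tag, word);
    printf("2-way set   : tag=%03Xh set=%04Xh word=%Xh\n", tag, set, word);
    return 0;
}
```

It prints tag=058CE7h for the fully associative case and tag=02Ch, set=0CE7h for the 2-way set-associative case, matching the two examples above.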

Set Associative Mapping Summary (Page 115)
- Address length = n bits; number of addressable units = 2^n words or bytes.
- Block size = line size = 2^w words or bytes; number of blocks in main memory = 2^(n-w).
- Number of lines per set = k; number of sets = v = 2^s.
- Number of lines in the cache = k*v = k * 2^s.
- Size of the tag = (n - s - w) bits.

Mapping exercise
- Given: a cache of 64K words, a cache block of 4 words, and a 16M-word main memory, organised as a k-way set-associative cache (in practice k = 2, 4, 8 or 16).
- Memory size gives RA = ? bits; block size (= line size) gives word# = ? bits; cache size / line size / (k lines per set) gives set# = ? bits; the number of lines in the cache gives line# = ? bits.
- A small calculator for these field widths is sketched below.
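One way to work the exercise is to compute the widths directly from the sizes. The following sketch is ours and assumes a word-addressable memory, so the RA counts words; it prints the fields for k = 2, 4, 8, 16:

```c
#include <stdio.h>

/* Integer log2 of a power of two. */
static unsigned lg2(unsigned long long x)
{
    unsigned n = 0;
    while (x > 1) { x >>= 1; n++; }
    return n;
}

/* Field widths of a k-way set-associative cache; sizes are in words. */
static void fields(unsigned long long mem, unsigned long long cache,
                   unsigned long long block, unsigned k)
{
    unsigned n     = lg2(mem);                 /* RA length                  */
    unsigned w     = lg2(block);               /* word# bits                 */
    unsigned lines = lg2(cache / block);       /* line# bits (2^lines lines) */
    unsigned s     = lg2(cache / block / k);   /* set# bits (2^s sets)       */
    printf("k=%2u: RA=%u, word#=%u, line#=%u, set#=%u, tag=%u bits\n",
           k, n, w, lines, s, n - s - w);
}

int main(void)
{
    /* The exercise's sizes: 16M-word memory, 64K-word cache, 4-word blocks. */
    unsigned ks[] = {2, 4, 8, 16};
    for (int i = 0; i < 4; i++)
        fields(16ULL << 20, 64ULL << 10, 4, ks[i]);
    return 0;
}
```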

3. Replacement Algorithms (1): Direct mapping (Page 115)
- When a new block is brought into the cache, one of the existing blocks must be replaced - but which one?
- With direct mapping there is no choice: each block maps to exactly one fixed line, so that line is replaced.

Replacement Algorithms (2): Associative & Set Associative
- The algorithm must be implemented in hardware, for speed. The usual candidates:
  - Least recently used (LRU)
  - First-in first-out (FIFO)
  - Least frequently used (LFU)
  - Random

Replacement Algorithms (2-1): Associative & Set Associative
- Random: replace a block at random.
- Least frequently used (LFU): replace the block that has had the fewest hits, using a counter to record the frequency.
- First-in first-out (FIFO): replace the block that has been in the cache longest, using a counter to record the history.

Replacement Algorithms (2-2): Associative & Set Associative
- Least recently used (LRU): each line includes a USE bit. When a line is referenced (accessed), its USE bit is set to 1 and the USE bit of the other line in the same set is set to 0.
- A USE bit of 0 therefore marks the least recently used line, and replacement takes place in that line.

Replacement Algorithms (2-3): Associative & Set Associative
- LRU in a 2-way set-associative cache: which of the two blocks in a set is the LRU one?
- Each of the two ways of a set has a USE flag bit. USE = 1 on the left way means it has just been accessed, so the replacement takes the right way (USE = 0), and vice versa.
- A small sketch of this USE-bit bookkeeping follows.
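A minimal sketch of the USE-bit bookkeeping for one 2-way set; the types and names are ours, not the slides':

```c
#include <stdbool.h>
#include <stdint.h>

struct way { bool valid; uint32_t tag; bool use; /* 1 = recently referenced */ };
struct set { struct way w[2]; };

/* On a hit in way i: mark it most recently used; the other way becomes LRU. */
static void touch(struct set *s, int i)
{
    s->w[i].use     = true;
    s->w[1 - i].use = false;
}

/* On a miss: the victim is the way whose USE bit is 0 (the LRU one). */
static int victim(const struct set *s)
{
    return s->w[0].use ? 1 : 0;
}
```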

4. Write Policy (Page 118)
- If the CPU modifies the contents of the cache, the cache line differs from main memory. When should the contents of main memory be updated?
- Two policies: write through and write back.

(1) Write through (Page 118)
- All writes go to main memory as well as to the cache: when the data in a cache line is changed (written), the change must be made in the corresponding memory block at the same time.
- This generates a lot of bus traffic and slows down writes. (Beware of bogus "write-through" caches that never actually implemented it.)

(2) Write back
- Updates are initially made in the cache only; an update (dirty) bit for the cache line is set when the update occurs.
- When a block is about to be replaced, it is written back to main memory only if its update bit is set.
- Write operations account for roughly 15% of memory operations.
- A sketch contrasting the two policies follows.
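The difference between the two policies is when the memory write happens. A C sketch of ours; write_word_to_memory() is a placeholder for the slow main-memory access, and the line holds a single word for brevity:

```c
#include <stdbool.h>
#include <stdint.h>

struct line { bool valid; bool dirty; uint32_t tag; uint32_t data; };

extern void write_word_to_memory(uint32_t addr, uint32_t value);

/* Write through: every write also goes to main memory immediately. */
static void write_through(struct line *l, uint32_t addr, uint32_t value)
{
    l->data = value;
    write_word_to_memory(addr, value);   /* bus traffic on every write */
}

/* Write back: only the cache is updated and the dirty (update) bit is set;
 * the write to memory is deferred until the line is replaced.             */
static void write_back(struct line *l, uint32_t value)
{
    l->data  = value;
    l->dirty = true;
}

/* On replacement: write the old block back only if it was modified. */
static void on_replace(struct line *l, uint32_t old_block_addr)
{
    if (l->dirty)
        write_word_to_memory(old_block_addr, l->data);
    l->dirty = false;
}
```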

5. Line Size (Page 119)
- Line size = block size. A larger block (and hence line) size may raise the hit ratio, but it also increases the traffic on the bus.

6. Number of Caches (Page 119)
- Multilevel caches: an on-chip cache (L1) and a board-level cache (L2) between the CPU and memory, with relative speeds on the order of 1:1, 10:2, 100:50.
- Unified versus split caches: an on-chip cache may hold both data and instructions.
  - Unified cache: the share of capacity used for data versus instructions adjusts dynamically with the hit ratio.
  - Split cache: separate data and instruction caches; data is written back to memory, but instructions are not (they are only read).

4.4 Pentium 4 Cache and PowerPC Cache (Page 121)
- 80386: no on-chip cache.
- 80486: a single on-chip cache of 8 KBytes, with 16-byte lines and a four-way set-associative organization.
- Pentium (all versions): two on-chip L1 caches, one for data and one for instructions.
- Pentium 4 L1 caches (data cache and instruction cache): the data cache is 8 KBytes, with 64-byte lines, four-way set associative.
- Pentium 4 L2 cache: feeds both L1 caches; 256 KBytes, 128-byte lines, 8-way set associative.

Pentium 4 diagram, simplified (Page 122): L1 instruction cache, L1 data cache, and the L2 cache.

PowerPC Cache Organization
- 601: a single 32 KB cache, 8-way set associative.
- 603: 16 KB (2 x 8 KB), two-way set associative.
- 604: 32 KB.
- 620: 6 ...
