




Pipelining

Outline
- An overview of pipelining
- A pipelined datapath
- Pipelined control
- Data hazards and forwarding
- Data hazards and stalls
- Branch hazards
- Exceptions
- Superscalar and dynamic pipelining

Pipelining Is Natural!
Laundry example: Ann, Brian, Cathy, and Dave each have one load of clothes to wash, dry, and fold.
- Washer takes 30 minutes
- Dryer takes 40 minutes
- “Folder” takes 20 minutes

Sequential Laundry
- Sequential laundry takes 6 hours for 4 loads
- If they learned pipelining, how long would it take?
[Figure: task order A to D against time, 6 PM to midnight; each load takes 30 + 40 + 20 minutes, one load after another.]

Pipelined Laundry: Start ASAP
- Pipelined laundry takes 3.5 hours for 4 loads
[Figure: task order A to D against time, 6 PM to midnight; the timeline splits into 30, 40, 40, 40, 40, 20 minutes.]

Pipelining Lessons
- Pipelining doesn't help the latency of a single task, but it helps the throughput of the entire workload
- Pipeline rate is limited by the slowest stage
- Multiple tasks work at the same time using different resources
- Potential speedup = number of pipe stages
- Unbalanced stage lengths and the time to “fill” and “drain” the pipeline reduce speedup
- Stall for dependences

Single cycle vs. Pipeline
[Figure: single-cycle implementation (Load, then Store, with wasted time in the clock cycle of the shorter instruction) compared with the pipeline implementation (Load, Store, R-type overlapped over Cycles 1 to 10, each passing through Ifetch, Reg, Exec, Mem, Wr).]

Pipeline Performance
- Single-cycle (Tc = 800 ps)
- Pipelined (Tc = 200 ps)

Why Pipeline? Because the Resources Are There!
[Figure: instruction order (Inst 0 to Inst 4) against time in clock cycles; each instruction uses Im, Reg, ALU, Dm, Reg in successive cycles.]

Single-cycle Datapath
[Figure: single-cycle datapath.]
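The laundry arithmetic above can be checked with a short sketch. It is not part of the original slides: the function names are made up for illustration, and the model assumes that a finished load may wait in front of a busy stage for as long as needed.

```python
def sequential_time(stage_minutes, n_tasks):
    """Each task runs through every stage before the next task starts."""
    return n_tasks * sum(stage_minutes)


def pipelined_time(stage_minutes, n_tasks):
    """Tasks overlap; every stage serves one task at a time, in order.

    finish[s] holds the time at which stage s finished its latest task.
    """
    finish = [0] * len(stage_minutes)
    for _ in range(n_tasks):
        ready = 0  # time at which this task leaves the previous stage
        for s, duration in enumerate(stage_minutes):
            start = max(ready, finish[s])  # wait for the task and for the stage
            finish[s] = start + duration
            ready = finish[s]
    return finish[-1]


stages = [30, 40, 20]  # washer, dryer, folder (minutes)
seq = sequential_time(stages, 4)
pipe = pipelined_time(stages, 4)
print(f"sequential: {seq / 60} h, pipelined: {pipe / 60} h, speedup: {seq / pipe:.2f}")
```

With three stages the ideal speedup would be 3, but the 40-minute dryer sets the pace and the fill and drain time is not amortized over only four loads, so the measured ratio stays well below that.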
Designing a Pipelined Processor
- Examine the datapath and control diagram
  - Start with the single-cycle datapath
  - Single-cycle control?
- Partition the datapath into stages: IF (instruction fetch), ID (instruction decode and register file read), EX (execution or address calculation), MEM (data memory access), WB (write back)
- Associate resources with stages
- Ensure that flows do not conflict, or figure out how to resolve the conflicts
- Assert control in the appropriate stage

Multi-Execution Steps
- But use the single-cycle datapath ...
[Figure: single-cycle datapath.]

Split Single-cycle Datapath
- What to add to split the datapath into stages?
[Figure: single-cycle datapath with the feedback path marked.]

Add Pipeline Registers
- Pipeline registers (latches)
- Use registers between stages to carry data and control

Consider load
- IF: Instruction Fetch. Fetch the instruction from the Instruction Memory
- ID: Instruction Decode. Register fetch and instruction decode
- EX: Calculate the memory address
- MEM: Read the data from the Data Memory
- WB: Write the data back to the register file
[Figure: Load occupying Ifetch, Reg/Dec, Exec, Mem, Wr in Cycles 1 to 5.]

Pipelining load
[Figure: 1st lw, 2nd lw, 3rd lw, each starting one cycle after the previous one, Cycles 1 to 7.]
The 5 functional units in the pipeline datapath are:
- Instruction Memory for the Ifetch stage
- Register File's read ports (busA and busB) for the Reg/Dec stage
- ALU for the Exec stage
- Data Memory for the MEM stage
- Register File's write port (busW) for the WB stage

IF Stage of load
  IR = mem[PC]; PC = PC + 4
  (IR and PC+4 are saved in the IF/ID pipeline register)

ID Stage of load
  A = Reg[IR[25-21]]; B = Reg[IR[20-16]];

EX Stage of load
  ALUout = A + sign-ext(IR[15-0])

MEM Stage of load
  MDR = mem[ALUout]

WB Stage of load
  Reg[IR[20-16]] = MDR
  Who will supply this address? By the time the load reaches WB, later instructions have been fetched, so the destination register number has to travel with the load through the pipeline registers.
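As a rough illustration of the five register-transfer steps of load listed above, the sketch below runs them once on a toy machine state. The dictionaries standing in for instruction and data memory, the helper names, and the example values are assumptions made for this example; only the RTL in the comments comes from the slides.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_load(pc, imem, regs, dmem):
    """Run the five RTL steps of lw once and return the updated PC."""
    # IF: IR = mem[PC]; PC = PC + 4
    ir = imem[pc]
    pc = pc + 4
    # ID: A = Reg[IR[25-21]]; B = Reg[IR[20-16]] (B is read but unused by lw)
    rs = (ir >> 21) & 0x1F
    rt = (ir >> 16) & 0x1F
    a = regs[rs]
    # EX: ALUout = A + sign-ext(IR[15-0])
    alu_out = a + sign_extend16(ir)
    # MEM: MDR = mem[ALUout]
    mdr = dmem[alu_out]
    # WB: Reg[IR[20-16]] = MDR
    regs[rt] = mdr
    return pc


# lw $10, 20($1): opcode 0x23, rs = 1, rt = 10, immediate = 20
LW_10_20_1 = (0x23 << 26) | (1 << 21) | (10 << 16) | 20
imem = {0: LW_10_20_1}      # toy instruction memory (assumed contents)
regs = [0] * 32
regs[1] = 100               # assumed base address
dmem = {120: 42}            # assumed data word at address 100 + 20
next_pc = execute_load(0, imem, regs, dmem)
print(next_pc, regs[10])
```

In the pipelined datapath each commented step happens in a different clock cycle, and the values passed between steps (`ir`, `a`, `alu_out`, `mdr`, and the register number `rt`) are what the pipeline registers have to hold.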
The Four Stages of R-type
- IF: fetch the instruction from the Instruction Memory
- ID: register fetch and instruction decode
- EX: ALU operates on the two register operands
- WB: write the ALU output back to the register file
[Figure: R-type occupying Ifetch, Reg/Dec, Exec, Wr in Cycles 1 to 4.]

Pipelining R-type and load
- We have a structural hazard: two instructions try to write to the register file at the same time!
- Only one write port
[Figure: R-type, R-type, Load, R-type, R-type issued in consecutive cycles, Cycles 1 to 9. Oops! We have a problem!]

Important Observation
[Figure: Load uses stages 1 to 5 (Ifetch, Reg/Dec, Exec, Mem, Wr); R-type uses stages 1 to 4 (Ifetch, Reg/Dec, Exec, Wr).]
- Each functional unit can only be used once per instruction
- Each functional unit must be used at the same stage for all instructions:
  - Load uses the Register File's write port during its 5th stage
  - R-type uses the Register File's write port during its 4th stage
- Several ways to solve: forwarding, adding a pipeline bubble, making all instructions the same length

Solution: Delay R-type's Write
- Delay R-type's register write by one cycle:
  - R-type also uses the Register File's write port at stage 5
  - MEM is a NOP stage: nothing is being done
  - R-type also has 5 stages
[Figure: R-type, R-type, Load, R-type, R-type, each now passing through Ifetch, Reg/Dec, Exec, Mem, Wr, Cycles 1 to 9.]
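The write-port conflict described above can be reproduced with a small sketch. It assumes one instruction is fetched per cycle with no stalls; the table and function names are invented for this example.

```python
# Stage sequences before and after padding R-type with a no-op MEM stage.
UNPADDED = {
    "load": ["Ifetch", "Reg/Dec", "Exec", "Mem", "Wr"],
    "r-type": ["Ifetch", "Reg/Dec", "Exec", "Wr"],
}
PADDED = {**UNPADDED, "r-type": ["Ifetch", "Reg/Dec", "Exec", "Mem", "Wr"]}


def write_port_conflicts(program, stages):
    """Return {cycle: [instruction indices]} for cycles with more than one writer.

    Instruction i (0-based) is fetched in cycle i + 1, one issue per cycle.
    """
    writers = {}
    for i, kind in enumerate(program):
        cycle = i + 1 + stages[kind].index("Wr")
        writers.setdefault(cycle, []).append(i)
    return {c: ids for c, ids in writers.items() if len(ids) > 1}


# The instruction mix from the slide: R-type, R-type, Load, R-type, R-type.
program = ["r-type", "r-type", "load", "r-type", "r-type"]
conflicts_before = write_port_conflicts(program, UNPADDED)
conflicts_after = write_port_conflicts(program, PADDED)
print(conflicts_before)
print(conflicts_after)
```

With the 4-stage R-type, the load and the R-type fetched right after it reach Wr in the same cycle; once every instruction has five stages, each cycle has at most one writer.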
The Four Stages of store
- IF: fetch the instruction from the Instruction Memory
- ID: register fetch and instruction decode
- EX: calculate the memory address
- MEM: write the data into the Data Memory
- Add an extra stage:
  - WB: NOP
[Figure: Store occupying Ifetch, Reg/Dec, Exec, Mem, Wr in Cycles 1 to 4 plus the added stage.]

The Three Stages of beq
- IF: fetch the instruction from the Instruction Memory
- ID: register fetch and instruction decode
- EX: compare the two register operands, select the correct branch target address, latch it into PC
- Add two extra stages:
  - MEM: NOP
  - WB: NOP
[Figure: Beq occupying Ifetch, Reg/Dec, Exec, Mem, Wr.]

Pipelined Datapath
[Figure: pipelined datapath.]

Graphically Representing Pipelines
- Can help with answering questions like:
  - How many cycles does it take to execute this code?
  - What is the ALU doing during cycle 4?
- Helps in understanding datapaths

Example 1
[Figures: Example 1, pipeline state in Cycles 1 to 3.]
[Figures: Example 1, pipeline state in Cycles 4 to 6.]

Pipeline Control: Control Signals
[Figure: pipelined datapath with control signals.]

Group Signals According to Stages (Fig. 4.22)
- Can use the control signals of the single-cycle CPU

Data Stationary Control (Fig. 4.50)
- Pass control signals along just like the data
- Main control generates the control signals during ID
[Figure: Main Control in ID generates ExtOp, ALUOp, RegDst, ALUSrc, Branch, MemWr, MemtoReg, RegWr. The ID/Ex register carries all of them; EX uses ExtOp, ALUOp, RegDst, ALUSrc; the Ex/MEM register carries Branch, MemWr, MemtoReg, RegWr; MEM uses Branch and MemWr; the MEM/WB register carries MemtoReg and RegWr, which are used in WB.]

Data Stationary Control (cont.)
- Signals for EX (ExtOp, ALUSrc, ...) are used 1 cycle later
- Signals for MEM (MemWr, Branch) are used 2 cycles later
- Signals for WB (MemtoReg, RegWr) are used 3 cycles later

WB Stage of load
  Reg[IR[20-16]] = MDR
  Who will supply this address?

Datapath with Control
[Figure: pipelined datapath with control.]

Let's Try It Out
  lw  $10, 20($1)
  sub $11, $2, $3
  and $12, $4, $5
  or  $13, $6, $7
  add $14, $8, $9

Example 2
[Figures: Example 2, pipeline state in Cycles 1 to 9; Cycles 7 and 8 are Fig. 6.34.]

Summary of Pipeline Basics
- Pipelining is a fundamental concept
  - Multiple steps using distinct resources
- Utilize the capabilities of the datapath by pipelined instruction processing
  - Start the next instruction while working on the current one
  - Limited by the length of the longest stage (plus fill/flush)
  - Need to detect and resolve hazards
- What makes it easy in MIPS?
  - All instructions are of the same length
  - Just a few instruction formats
  - Memory operands only in loads and stores
- What makes pipelining hard? Hazards
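The data stationary control idea described above (generate every control signal in ID, then let each group ride along in the pipeline registers until its stage) can be sketched as follows. The signal names come from the slides; the `step` function, the dictionary representation, and the particular signal values chosen for lw are assumptions for illustration only.

```python
# Control signals grouped by the stage that consumes them (names from the slides).
EX_SIGNALS = ("ExtOp", "ALUSrc", "ALUOp", "RegDst")
MEM_SIGNALS = ("MemWr", "Branch")
WB_SIGNALS = ("MemtoReg", "RegWr")

# Assumed control word for lw, as produced by the main control during ID.
LW_CONTROL = {
    "ExtOp": 1, "ALUSrc": 1, "ALUOp": "add", "RegDst": 0,
    "MemWr": 0, "Branch": 0,
    "MemtoReg": 1, "RegWr": 1,
}


def step(pipeline_regs, decoded):
    """Advance one clock edge.

    pipeline_regs is (ID/Ex, Ex/MEM, MEM/WB); decoded is the control word
    generated in ID this cycle, or None when nothing is being decoded.
    Each register keeps only the signals still needed by later stages.
    """
    id_ex, ex_mem, _ = pipeline_regs
    new_mem_wb = None if ex_mem is None else {k: ex_mem[k] for k in WB_SIGNALS}
    new_ex_mem = None if id_ex is None else {k: id_ex[k] for k in MEM_SIGNALS + WB_SIGNALS}
    new_id_ex = None if decoded is None else dict(decoded)
    return new_id_ex, new_ex_mem, new_mem_wb


regs = (None, None, None)
regs = step(regs, LW_CONTROL)  # end of ID: all signals sit in ID/Ex (EX group used next cycle)
regs = step(regs, None)        # end of EX: MEM and WB groups sit in Ex/MEM
regs = step(regs, None)        # end of MEM: only the WB group is left, in MEM/WB
print(regs)
```

After three clock edges the only signals still travelling with the load are MemtoReg and RegWr, which matches the "used 3 cycles later" line for the WB group.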
Data hazards and forwarding (R-type and R-type)

Pipeline Hazards
- Structural hazards: attempt to use the same resource in two different ways at the same time
  - Ex.: combined washer/dryer, or folder busy doing something else (watching TV)
- Data hazards: attempt to use an item before it is ready
  - Instruction depends on the result of a prior instruction still in the pipeline
- Control hazards: attempt to make a decision before the condition is evaluated
  - Ex.: washing football uniforms and needing to see the result of the previous load to get the proper detergent level
  - Branch instructions
- Can always resolve hazards by waiting
  - Pipeline control must detect the hazard
  - Take action (or delay action) to resolve hazards

Structural Hazard: Single Memory
[Figure: instruction order (Load, Instr 1 to Instr 4) against time; with a single memory, the Load's data access and a later instruction fetch need the memory in the same cycle.]

Pipeline Hazards Illustrated
[Figure: one instruction in IF ID EX MEM WB and a later one starting IF ID ..., with the structural hazard marked on the time axis.]

Structural Hazard: Single Memory
- Use 2 memories: data memory and instruction memory
[Figure: the same instruction sequence with separate instruction and data memories.]

[Figure: pipelined datapath (PC, instruction memory, registers, sign extend, ALU, data memory, and the IF/ID, ID/EX, EX/MEM, MEM/WB registers) with the feedback paths marked.]

Data Hazards
  sub $2, $1, $3
  and $12, $2, $5
  or  $13, $6, $2
  add $14, $2, $2
  sw  $15, 100($2)
[Figure: program execution order against time, CC 1 to CC 9. Value of register $2: 10 in CC 1 to CC 4, 10/-20 in CC 5, -20 in CC 6 to CC 9.]

Types of Data Hazards
Three types (inst. i1 followed by inst. i2):
- RAW (read after write): i2 tries to read an operand before i1 writes it
- WAR (write after read): i2 tries to write an operand before i1 reads it
  - i1 gets the wrong operand, e.g., with autoincrement addressing
  - Can't happen in the MIPS 5-stage pipeline because all instructions take 5 stages, reads are always in stage 2, and writes are always in stage 5
- WAW (write after write): i2 tries to write an operand before i1 writes it
  - Leaves the wrong result (i1's, not i2's); occurs only in pipelines that write in more than one stage
  - Can't happen in the MIPS 5-stage pipeline because all instructions take 5 stages and writes are always in stage 5

Pipeline Hazards Illustrated
[Figure: pairs of instructions in IF ID EX MEM WB on a time axis, marking a RAW (read after write) data hazard, a WAW (write after write) data hazard, and a WAR (write after read) data hazard.]

Handling Data Hazards
- Use simple, fixed designs
  - Eliminate WAR by always fetching operands early (ID) in the pipeline
  - Eliminate WAW by doing all write backs in order (last stage, static)
  - These features have a lot to do with ISA design
- Internal forwarding in the register file:
  - Write in the first half of the clock cycle and read in the second half
  - A read delivers what is written, which resolves the hazard between sub and add
- Detect and resolve the remaining ones
  - Compiler inserts NOPs (software solution)
  - Forward (hardware solution)
  - Stall (hardware solution)

Software Solution
- Have the compiler guarantee no hazards
- Where do we insert the NOPs?
    sub $2, $1, $3
    and $12, $2, $5
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)
- Problem: this really slows us down!

Data Hazards
- Insert two nops
[Figure: the sub, and, or, add, sw sequence against time, CC 1 to CC 9, with the value of register $2 changing from 10 to -20 in CC 5.]

Data Hazards: Forwarding
[Figure: the same sequence, CC 1 to CC 9, with the result of sub forwarded to the dependent instructions.]
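To make the software solution concrete, the following sketch pads a program with NOPs. It relies on the register file behaviour stated above (write in the first half of the cycle, read in the second), so a reader must be issued at least 3 slots after the writer of one of its sources; the tuple format and the function name `insert_nops` are invented for this example, and a real compiler would try to reorder useful instructions into the gaps instead.

```python
def insert_nops(program, distance=3):
    """Pad a program with NOPs so every reader trails its writer by `distance` slots.

    program: list of (text, dest, sources); dest is None when nothing is written.
    """
    scheduled = []
    last_write = {}  # register -> issue slot of its most recent writer
    for text, dest, sources in program:
        slot = len(scheduled)
        for reg in sources:
            if reg in last_write:
                slot = max(slot, last_write[reg] + distance)
        scheduled.extend(["nop"] * (slot - len(scheduled)))
        scheduled.append(text)
        if dest is not None and dest != "$0":
            last_write[dest] = slot
    return scheduled


program = [
    ("sub $2, $1, $3", "$2", ["$1", "$3"]),
    ("and $12, $2, $5", "$12", ["$2", "$5"]),
    ("or $13, $6, $2", "$13", ["$6", "$2"]),
    ("add $14, $2, $2", "$14", ["$2", "$2"]),
    ("sw $15, 100($2)", None, ["$15", "$2"]),
]
padded = insert_nops(program)
for line in padded:
    print(line)
```

For the sequence from the slide, only the and right after sub is too close to its producer, so the two NOPs go between them; or, add, and sw are then far enough from sub on their own.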
Datapath with Forwarding
[Figure: pipelined datapath with a forwarding unit.]

Control: Detecting Data Hazards
- Hazard conditions:
  1a. EX/MEM.RegisterRd = ID/EX.RegisterRs
  1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
  2a. MEM/WB.RegisterRd = ID/EX.RegisterRs
  2b. MEM/WB.RegisterRd = ID/EX.RegisterRt
- Two optimizations:
  - Don't forward if the instruction does not write a register => check if RegWrite is asserted
  - Don't forward if the destination register is $0 => check if RegisterRd = 0

Detecting Data Hazards (cont.)
- Hazard conditions using control signals:
  - At EX stage: EX/MEM.RegWrite and (EX/MEM.RegRd ≠ 0) and (EX/MEM.RegRd = ID/EX.RegRs)
  - At MEM stage: MEM/WB.RegWrite and (MEM/WB.RegRd ≠ 0) and (MEM/WB.RegRd = ID/EX.RegRs)
  (substitute ID/EX.RegRt for ID/EX.RegRs to get the other two conditions)

Resolving Hazards: Forwarding
- Use temporary results, e.g., those in pipeline registers; don't wait for them to be written

Datapath with Forwarding
[Figure: pipelined datapath with forwarding paths into the ALU inputs.]

Forwarding Logic
- Forwarding: input to the ALU can come from any pipeline register
  - Add multiplexors to the ALU inputs
  - Control forwarding in EX => carry Rs in ID/EX
- Control signals for forwarding:
  - If both WB and MEM could forward, e.g., add $1,$1,$2; add $1,$1,$3; add $1,$1,$4; => let MEM forward, since it holds the more recent result
  - EX hazard:
      if (EX/MEM.RegWrite and (EX/MEM.RegRd ≠ 0)
          and (EX/MEM.RegRd = ID/EX.RegRs)) ForwardA = 10
  - MEM hazard:
      if (MEM/WB.RegWrite and (MEM/WB.RegRd ≠ 0)
          and (EX/MEM.RegRd ≠ ID/EX.RegRs)
          and (MEM/WB.RegRd = ID/EX.RegRs)) ForwardA = 01
  (for the second ALU operand, substitute ID/EX.RegRt for ID/EX.RegRs and ForwardB for ForwardA)

Example 3
[Figures: Example 3, pipeline state in Cycles 3 to 6; Cycle 4 is Fig. 6.41 and Cycle 6 is Fig. 6.42.]
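The forwarding conditions above translate almost directly into code. The sketch below is one possible rendering, not the slides' own implementation: the `StageReg` class and function names are made up, and the priority rule is expressed by testing the EX hazard first, which is a slightly stronger form of the (EX/MEM.RegRd ≠ ID/EX.RegRs) term in the MEM hazard condition.

```python
from dataclasses import dataclass


@dataclass
class StageReg:
    """The fields of a pipeline register that the forwarding unit inspects."""
    reg_write: bool = False
    reg_rd: int = 0


def forward_select(src, ex_mem, mem_wb):
    """Return the 2-bit mux select for one ALU operand whose register is src."""
    if ex_mem.reg_write and ex_mem.reg_rd != 0 and ex_mem.reg_rd == src:
        return 0b10  # EX hazard: take the ALU result held in EX/MEM
    if mem_wb.reg_write and mem_wb.reg_rd != 0 and mem_wb.reg_rd == src:
        return 0b01  # MEM hazard: take the value held in MEM/WB
    return 0b00      # no hazard: use the register file value from ID/EX


def forwarding_unit(id_ex_rs, id_ex_rt, ex_mem, mem_wb):
    """Return (ForwardA, ForwardB) for the instruction currently in EX."""
    return (forward_select(id_ex_rs, ex_mem, mem_wb),
            forward_select(id_ex_rt, ex_mem, mem_wb))


# add $1,$1,$2; add $1,$1,$3; add $1,$1,$4 with the third add in EX:
# both older adds write $1, and the newer one (in EX/MEM) must win.
triple_add = forwarding_unit(1, 4, StageReg(True, 1), StageReg(True, 1))

# or $13, $6, $2 in EX, and $12, ... in EX/MEM, sub $2, ... in MEM/WB.
or_case = forwarding_unit(6, 2, StageReg(True, 12), StageReg(True, 2))
print(triple_add, or_case)
```

Writes to register $0 and instructions that do not write a register never trigger forwarding, which covers the two optimizations listed above.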
Data hazards and stalls (Load and R-type)

Can't Always Forward
- lw can still cause a hazard: if it is followed by an instruction that reads the loaded register
- Use stalling or the compiler to resolve

Stalling
- Stall the pipeline by keeping instructions in the same stage and inserting a NOP instead

Datapath with Stalling Unit
- Forwarding controls the ALU inputs; hazard detection controls PC, IF/ID, and the control signals

Control: Handling Stalls
- Hazard detection unit in ID inserts a stall between a load instruction and its use:
    if (ID/EX.MemRead and
        ((ID/EX.RegisterRt = IF/ID.RegisterRs) or
         (ID/EX.RegisterRt = IF/ID.RegisterRt)))
      stall the pipeline for one cycle
  (ID/EX.MemRead = 1 indicates a load instruction)
- How to stall?
  - Stall the instructions in IF and ID: do not change PC and IF/ID => the stages re-execute the instructions
  - What to move into EX: insert a NOP by changing the EX, MEM, WB control fields of the ID/EX pipeline register to 0
  - As the control signals propagate, all control signals to EX, MEM, WB are deasserted and no registers or memories are written

Example 4
[Figures: Example 4, pipeline state in Cycles 2 to 7; Cycle 6 is Fig. 6.49.]
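The stall condition above is small enough to sketch directly. The `Instr` tuple, the function names, and the three-instruction program are hypothetical and chosen only to exercise the load-use case; the sketch tracks which instruction enters EX each cycle rather than modelling the full datapath.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    text: str
    mem_read: bool  # True only for loads (ID/EX.MemRead)
    rs: int
    rt: int


def load_use_stall(id_ex: Instr, if_id: Instr) -> bool:
    """Hazard detection: the load in EX writes a register the instruction in ID reads."""
    return id_ex.mem_read and id_ex.rt in (if_id.rs, if_id.rt)


def ex_stream(program):
    """List what enters EX each cycle, with 'nop' for every inserted bubble."""
    stream = []
    in_ex = None  # instruction currently described by ID/EX
    for instr in program:
        if in_ex is not None and load_use_stall(in_ex, instr):
            # PC and IF/ID are held; the ID/EX control fields are zeroed.
            stream.append("nop")
        stream.append(instr.text)
        in_ex = instr
    return stream


program = [
    Instr("lw $2, 20($1)", True, 1, 2),
    Instr("and $4, $2, $5", False, 2, 5),
    Instr("or $4, $4, $2", False, 4, 2),
]
stream = ex_stream(program)
print(stream)
```

Only one bubble is needed: after the one-cycle stall the loaded value is in MEM/WB and can be forwarded to the dependent instruction, so the third instruction proceeds without waiting.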
Branch hazards

[Figure: pipelined datapath (PC, instruction memory, registers, sign extend, ALU, data memory, and the IF/ID, ID/EX, EX/MEM, MEM/WB registers) with the feedback paths marked.]

Pipeline Datapath with Control Signals
[Figure: pipelined datapath with control signals.]

Branch Hazards
- When we decide to branch, other instructions are already in the pipeline!

Handling Branch Hazard
- Predict branch always not taken
  - Need to add hardware for flushing instructions if the prediction is wrong
  - Branch decision made at MEM => need to flush the instructions in IF/ID and ID/EX by changing control values to 0
- Reduce the delay of a taken branch by moving branch execution earlier in the pipeline
  - Move the branch address calculation up to ID
  - Check branch equality at ID (using XOR) by comparing the two registers read during ID
  - Branch decision made at ID => one instruction to flush
  - Add a control signal, IF.Flush, to zero the instruction field of IF/ID => making the instruction a NOP
- Dynamic branch prediction
- Compiler rescheduling, delayed branch

Pipeline with Flushing
[Figure: pipelined datapath with IF.Flush.]

Example 5
[Figures: Example 5, pipeline state in Cycles 3 and 4.]

Dynamic Branch Prediction
- In deeper and superscalar pipelines, the branch penalty is more significant
- Use dynamic prediction (e.g., for loops)
  - Branch prediction buffer (i.e., branch history table)
  - Indexed by recent branch instruction addresses
  - Stores the outcome (taken/not taken)
  - To execute a branch:
    - Check the table, expect the same outcome
    - Start fetching from the fall-through or the target
    - If wrong, flush the pipeline and flip the prediction

1-Bit Predictor: Shortcoming
- Inner loop branches are mispredicted twice!
    outer: ...
    inner: ...
           beq ..., ..., inner
           beq ..., ..., outer
  - Mispredict as taken on the last iteration of the inner loop
  - Then mispredict as not taken on the first iteration of the inner loop the next time around

2-Bit Predictor
- Only change the prediction on two successive mispredictions

Calculating the Branch Target
- Even with a predictor, we still need to calculate the target address
  - 1-cycle penalty for a taken branch
- Branch target buffer
  - Cache of target addresses
  - Indexed by PC when the instruction is fetched
  - If it hits and the instruction is a branch predicted taken, the target can be fetched immediately

Delayed Branch
- Predict-not-taken + branch decision at ID => the following instruction is always executed => branches take effect 1 cycle later
- 0 clock cycle penalty per branch instruction if we can find an instruction to put in the slot (50% of the time)
[Figure: instruction order (add, beq, misc, lw) against time in clock cycles.]
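The difference between the 1-bit and 2-bit predictors on an inner-loop branch can be seen in a short simulation. This is a sketch under assumptions: the 2-bit scheme is modelled as a saturating counter (one common realization of "change prediction only after two successive mispredictions"), both predictors start out predicting not taken, and the loop shape (9 taken iterations, then fall through, entered 5 times) is made up for the example.

```python
def mispredictions_1bit(outcomes, prediction=False):
    """Count misses for a predictor that remembers only the last outcome."""
    misses = 0
    for taken in outcomes:
        if taken != prediction:
            misses += 1
            prediction = taken  # flip the prediction on every miss
    return misses


def mispredictions_2bit(outcomes, counter=0):
    """Count misses for a 2-bit saturating counter.

    Counter values 0 and 1 predict not taken; 2 and 3 predict taken.
    """
    misses = 0
    for taken in outcomes:
        if taken != (counter >= 2):
            misses += 1
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return misses


# Inner-loop branch: taken 9 times, then falls through; the loop is entered 5 times.
outcomes = ([True] * 9 + [False]) * 5
one_bit = mispredictions_1bit(outcomes)
two_bit = mispredictions_2bit(outcomes)
print(one_bit, two_bit)
```

Once warmed up, the 1-bit predictor misses twice per pass through the inner loop (the exit, then the re-entry), while the 2-bit counter misses only on the exit because a single not-taken outcome is not enough to change its prediction.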