Computer Organization and Architecture (English Edition): Exercise Solutions

Solutions Manual
COMPUTER ORGANIZATION AND ARCHITECTURE: Designing for Performance, Seventh Edition
William Stallings
Copyright 2005: William Stallings

© 2005 by William Stallings. All rights reserved. No part of this document may be reproduced, in any form or by any means, or posted on the Internet, without permission in writing from the author.

Notice
This manual contains solutions to all of the review questions and homework problems in Computer Organization and Architecture, Seventh Edition. If you spot an error in a solution or in the wording of a problem, I would greatly appreciate it if you would forward the information via email to ws@. An errata sheet for this manual, if needed, is available at WilliamS
W.S.

TABLE OF CONTENTS
Chapter 2: Computer Evolution and Performance
Chapter 3: Computer Function and Interconnection
Chapter 4: Cache Memory
Chapter 5: Internal Memory
Chapter 6: External Memory
Chapter 7: Input/Output
Chapter 8: Operating System Support
Chapter 9: Computer Arithmetic
Chapter 10: Instruction Sets: Characteristics and Functions
Chapter 11: Instruction Sets: Addressing Modes and Formats
Chapter 12: Processor Structure and Function
Chapter 13: Reduced Instruction Set Computers (RISCs)
Chapter 14: Instruction-Level Parallelism and Superscalar Processors
Chapter 15: The IA-64 Architecture
Chapter 16: Control Unit Operation
Chapter 17: Microprogrammed Control
Chapter 18: Parallel Processing
Appendix A: Number Systems
Appendix B: Digital Logic

Chapter 2 Computer Evolution and Performance

Answers to Questions

2.1 In a stored program computer, programs are represented in a form suitable for storing in memory alongside the data. The computer gets its instructions by reading them from memory, and a program can be set or altered by setting the values of a portion of memory.
2.2 A main memory, which stores both data and instructions; an arithmetic and logic unit (ALU) capable of operating on binary data; a control unit, which interprets the instructions in memory and causes them to be executed; and input and output (I/O) equipment operated by the control unit.
2.3 Gates, memory cells, and interconnections among gates and memory cells.
2.4 Moore observed that the number of transistors that could be put on a single chip was doubling every year and correctly predicted that this pace would continue into the near future.
2.5 Similar or identical instruction set: In many cases, the same set of machine instructions is supported on all members of the family. Thus, a program that executes on one machine will also execute on any other. Similar or identical operating system: The same basic operating system is available for all family members. Increasing speed: The rate of instruction execution increases in going from lower to higher family members. Increasing number of I/O ports: In going from lower to higher family members. Increasing memory size: In going from lower to higher family members. Increasing cost: In going from lower to higher family members.
2.6 In a microprocessor, all of the components of the CPU are on a single chip.

Answers to Problems

2.1 This program is developed in [HAYE98]. The vectors A, B, and C are each stored in 1,000 contiguous locations in memory, beginning at locations 1001, 2001, and 3001, respectively. The program begins with the left half of location 3. A counting variable N is set to 999 and decremented after each step until it reaches -1. Thus, the vectors are processed from high location to low location.

Location  Instruction        Comments
0         999                Constant (count N)
1         1                  Constant
2         1000               Constant
3L        LOAD M(2000)       Transfer A(I) to AC
3R        ADD M(3000)        Compute A(I) + B(I)
4L        STOR M(4000)       Transfer sum to C(I)
4R        LOAD M(0)          Load count N
5L        SUB M(1)           Decrement N by 1
5R        JUMP +M(6,20:39)   Test N and branch to 6R if nonnegative
6L        JUMP M(6,0:19)     Halt
6R        STOR M(0)          Update N
7L        ADD M(1)           Increment AC by 1
7R        ADD M(2)
8L        STOR M(3,8:19)     Modify address in 3L
8R        ADD M(2)
9L        STOR M(3,28:39)    Modify address in 3R
9R        ADD M(2)
10L       STOR M(4,8:19)     Modify address in 4L
10R       JUMP M(3,0:19)     Branch to 3L

2.2 a.
Opcode    Operand
00000001  000000000010
 b. First, the CPU must access memory to fetch the instruction. The instruction contains the address of the data we want to load. During the execute phase, the CPU accesses memory again to load the data value located at that address, for a total of two trips to memory.
2.3 To read a value from memory, the CPU puts the address of the value it wants into the MAR. The CPU then asserts the Read control line to memory and places the address on the address bus. Memory places the contents of the addressed location on the data bus. This data is then transferred to the MBR. To write a value to memory, the CPU puts the address of the value it wants to write into the MAR. The CPU also places the data it wants to write into the MBR. The CPU then asserts the Write control line to memory and places the address on the address bus and the data on the data bus. Memory transfers the data on the data bus into the corresponding memory location.
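The read and write sequences described in 2.3 can be sketched in a few lines of Python. This is only an illustrative model, not part of the manual: the class names `Memory` and `CPU`, the method names, and the 4096-word memory size are assumptions chosen for the example, and the control lines and buses are reduced to ordinary method calls.

```python
class Memory:
    """Word-addressable memory; read/write stand in for the Read/Write control lines."""

    def __init__(self, size):
        self.cells = [0] * size

    def read(self, address):
        # Memory places the contents of the addressed location on the data bus.
        return self.cells[address]

    def write(self, address, data):
        # Memory copies the value on the data bus into the addressed location.
        self.cells[address] = data


class CPU:
    """Hypothetical CPU model exposing only the MAR and MBR registers."""

    def __init__(self, memory):
        self.memory = memory
        self.mar = 0  # memory address register
        self.mbr = 0  # memory buffer register

    def load(self, address):
        self.mar = address                     # address goes into the MAR
        self.mbr = self.memory.read(self.mar)  # data returns through the MBR
        return self.mbr

    def store(self, address, value):
        self.mar = address                       # address goes into the MAR
        self.mbr = value                         # data to write goes into the MBR
        self.memory.write(self.mar, self.mbr)    # memory latches the data


memory = Memory(4096)  # 12-bit address space, an assumption for the sketch
cpu = CPU(memory)
cpu.store(0x0FA, 42)
value = cpu.load(0x0FA)
```

Every memory access in this model passes through both registers, which is why an instruction such as the LOAD in 2.2 costs two trips to memory: one to fetch the instruction and one to fetch its operand.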

2.4
Address  Contents
08A      LOAD M(0FA); STOR M(0FB)
08B      LOAD M(0FA); JUMP +M(08D)
08C      LOAD -M(0FA); STOR M(0FB)
08D
This program will store the absolute value of the content at memory location 0FA into memory location 0FB.
2.5 All data paths to/from MBR are 40 bits. All data paths to/from MAR are 12 bits. Paths to/from AC are 40 bits. Paths to/from MQ are 40 bits.
2.6 The purpose is to increase performance. When an address is presented to a memory module, there is some time delay before the read or write operation can be performed. While this is happening, an address can be presented to the other module. For a series of requests for successive words, the maximum rate is doubled.
2.7 The discrepancy can be explained by noting that other system components aside from clock speed make a big difference in overall system speed. In particular, memory systems and advances in I/O processing contribute to the performance ratio. A system is only as fast as its slowest link. In recent years, the bottlenecks have been the performance of memory modules and bus speed.
2.8 As noted in the answer to Problem 2.7, even though the Intel machine may have a faster clock speed (2.4 GHz vs. 1.2 GHz), that does not necessarily mean the system will perform faster. Different systems are not comparable on clock speed. Other factors such as the system components (memory, buses, architecture) and the instruction sets must also be taken into account. A more accurate measure is to run both systems on a benchmark. Benchmark programs exist for certain tasks, such as running office applications, performing floating point operations, graphics operations, and so on. The systems can be compared to each other on how long they take to complete these tasks. According to Apple Computer, the G4 is comparable to or better than a higher-clock-speed Pentium on many benchmarks.
2.9 This representation is wasteful because to represent a single decimal digit from 0 through 9 we need to have ten tubes. If we could have an arbitrary number of these tubes ON at the same time, then those same tubes could be treated as binary bits. With ten bits, we can represent $2^{10}$ patterns, or 1024 patterns. For integers, these patterns could be used to represent the numbers from 0 through 1023.
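The counting argument in 2.9 can be checked with a short Python sketch. The function names are made up for the illustration, and treating tube 0 as the least significant bit is an arbitrary assumption.

```python
def one_hot_digit(digit, tubes=10):
    """Return tube states with exactly one tube ON, the decimal scheme in 2.9."""
    if not 0 <= digit < tubes:
        raise ValueError("digit out of range for this many tubes")
    return [1 if index == digit else 0 for index in range(tubes)]


def binary_value(tube_states):
    """Read the same tubes as binary bits, with tube 0 as the least significant bit."""
    return sum(bit << index for index, bit in enumerate(tube_states))


def pattern_count(tubes):
    """Number of distinct ON/OFF patterns when any subset of tubes may be ON."""
    return 2 ** tubes


decimal_patterns = len({tuple(one_hot_digit(d)) for d in range(10)})
binary_patterns = pattern_count(10)
largest_binary = binary_value([1] * 10)
```

With one tube ON at a time the ten tubes distinguish ten values; read as bits, the same ten tubes distinguish `pattern_count(10)` values, the largest being the all-ON pattern.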

2.10
                              Ic   p    m    k    τ
Instruction set architecture  X    X
Compiler technology           X    X    X
Processor implementation           X              X
Cache and memory hierarchy                   X    X
Source: [HWAN93]
2.11 MIPS rate $= f/(\mathrm{CPI} \times 10^6)$
2.12 a. We can express the MIPS rate as: $(\text{MIPS rate}) \times 10^6 = I_c/T$, so that $I_c = T \times (\text{MIPS rate}) \times 10^6$. The ratio of the instruction count of the RS/6000 to the VAX is $[x \times 18]/[12x \times 1] = 1.5$.
 b. For the VAX, CPI = (5 MHz)/(1 MIPS) = 5. For the RS/6000, CPI = 25/18 = 1.39.
2.13 CPI = 1.55; MIPS rate = 25.8; Execution time = 3.87 ms. Source: [HWAN93]
2.14 a. Ultimately, the user is concerned with the execution time of a system, not its execution rate. If we take the arithmetic mean of the MIPS rates of various benchmark programs, we get a result that is proportional to the sum of the inverses of execution times. But this is not inversely proportional to the sum of execution times. In other words, the arithmetic mean of the MIPS rate does not cleanly relate to execution time. On the other hand, the harmonic mean MIPS rate is the inverse of the average execution time.
 b.
            Arithmetic mean  Harmonic mean  Rank
Computer A  25.3 MIPS        0.25 MIPS      2
Computer B  2.8 MIPS         0.21 MIPS      3
Computer C  3.25 MIPS        2.1 MIPS       1
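The point made in 2.14(a) can be illustrated numerically. The sketch below uses made-up MIPS rates (they are not the data of the problem) and assumes each benchmark executes one million instructions, so that a rate of r MIPS corresponds to an execution time of 1/r seconds.

```python
def arithmetic_mean(rates):
    """Plain average of the MIPS rates."""
    return sum(rates) / len(rates)


def harmonic_mean(rates):
    """Number of rates divided by the sum of their reciprocals."""
    return len(rates) / sum(1.0 / rate for rate in rates)


# Hypothetical MIPS rates for three benchmark programs on one machine.
rates = [100.0, 10.0, 5.0]

# Assumption: every program executes 10^6 instructions, so time = 1 / rate seconds.
times = [1.0 / rate for rate in rates]
average_time = sum(times) / len(times)

am = arithmetic_mean(rates)
hm = harmonic_mean(rates)
inverse_average_time = 1.0 / average_time
```

Under that equal-instruction-count assumption, `hm` coincides with `inverse_average_time`, while `am` is dominated by the single fast program and says little about how long the whole workload takes.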

Chapter 3 Computer Function and Interconnection

Answers to Questions

3.1 Processor-memory: Data may be transferred from processor to memory or from memory to processor. Processor-I/O: Data may be transferred to or from a peripheral device by transferring between the processor and an I/O module. Data processing: The processor may perform some arithmetic or logic operation on data. Control: An instruction may specify that the sequence of execution be altered.
3.2 Instruction address calculation (iac): Determine the address of the next instruction to be executed. Instruction fetch (if): Read instruction from its memory location into the processor. Instruction operation decoding (iod): Analyze instruction to determine type of operation to be performed and operand(s) to be used. Operand address calculation (oac): If the operation involves reference to an operand in memory or available via I/O, then determine the address of the operand. Operand fetch (of): Fetch the operand from memory or read it in from I/O. Data operation (do): Perform the operation indicated in the instruction. Operand store (os): Write the result into memory or out to I/O.
3.3 (1) Disable all interrupts while an interrupt is being processed. (2) Define priorities for interrupts and allow an interrupt of higher priority to cause a lower-priority interrupt handler to be interrupted.
3.4 Memory to processor: The processor reads an instruction or a unit of data from memory. Processor to memory: The processor writes a unit of data to memory. I/O to processor: The processor reads data from an I/O device via an I/O module. Processor to I/O: The processor sends data to the I/O device. I/O to or from memory: For these two cases, an I/O module is allowed to exchange data directly with memory, without going through the processor, using direct memory access (DMA).
3.5 With multiple buses, there are fewer devices per bus. This (1) reduces propagation delay, because each bus can be shorter, and (2) reduces bottleneck effects.
3.6 System pins: Include the clock and reset pins.
Address and data pins: Include 32 lines that are time multiplexed for addresses and data.
Interface control pins: Control the timing of transactions and provide coordination among initiators and targets.
Arbitration pins: Unlike the other PCI signal lines, these are not shared lines. Rather, each PCI master has its own pair of arbitration lines that connect it directly to the PCI bus arbiter.
Error reporting pins: Used to report parity and other errors.
Interrupt pins: These are provided for PCI devices that must generate requests for service.
Cache support pins: These pins are needed to support a memory on PCI that can be cached in the processor or another device.
64-bit bus extension pins: Include 32 lines that are time multiplexed for addresses and data and that are combined with the mandatory address/data lines to form a 64-bit address/data bus.
JTAG/boundary scan pins: These signal lines support testing procedures defined in IEEE Standard 1149.1.

Answers to Problems

3.1 Memory (contents in hex): 300: 3005; 301: 5940; 302: 7006
 Step 1: 3005 → IR; Step 2: 3 → AC
 Step 3: 5940 → IR; Step 4: 3 + 2 = 5 → AC
 Step 5: 7006 → IR; Step 6: AC → Device 6
3.2 1. a. The PC contains 300, the address of the first instruction. This value is loaded into the MAR.
 b. The value in location 300 (which is the instruction with the value 1940 in hexadecimal) is loaded into the MBR, and the PC is incremented. These two steps can be done in parallel.
 c. The value in the MBR is loaded into the IR.
 2. a. The address portion of the IR (940) is loaded into the MAR.
 b. The value in location 940 is loaded into the MBR.
 c. The value in the MBR is loaded into the AC.
 3. a. The value in the PC (301) is loaded into the MAR.
 b. The value in location 301 (which is the instruction with the value 5941) is loaded into the MBR, and the PC is incremented.
 c. The value in the MBR is loaded into the IR.
 4. a. The address portion of the IR (941) is loaded into the MAR.
 b. The value in location 941 is loaded into the MBR.
 c. The old value of the AC and the value in the MBR are added and the result is stored in the AC.
 5. a. The value in the PC (302) is loaded into the MAR.
 b. The value in location 302 (which is the instruction with the value 2941) is loaded into the MBR, and the PC is incremented.
 c. The value in the MBR is loaded into the IR.
 6. a. The address portion of the IR (941) is loaded into the MAR.
 b. The value in the AC is loaded into the MBR.
 c. The value in the MBR is stored in location 941.
3.3 a. $2^{24}$ = 16 MBytes
 b. (1) If the local address bus is 32 bits, the whole address can be transferred at once and decoded in memory. However, because the data bus is only 16 bits, it will require 2 cycles to fetch a 32-bit instruction or operand.
 (2) The 16 bits of the address placed on the address bus can't access the whole memory. Thus a more complex memory interface control is needed to latch the first part of the address and then the second part (because the microprocessor will send the address in two steps). For a 32-bit address, one may assume the first half will decode to access a "row" in memory, while the second half is sent later to access a "column" in memory. In addition to the two-step address operation, the microprocessor will need 2 cycles to fetch the 32-bit instruction/operand.
 c. The program counter must be at least 24 bits. Typically, a 32-bit microprocessor will have a 32-bit external address bus and a 32-bit program counter, unless on-chip segment registers are used that may work with a smaller program counter. If the instruction register is to contain the whole instruction, it will have to be 32 bits long; if it will contain only the opcode (called the opcode register) then it will have to be 8 bits long.
3.4 In cases (a) and (b), the microprocessor will be able to access $2^{16}$ = 64 Kbytes; the only difference is that with an 8-bit memory each access will transfer a byte, while with a 16-bit memory an access may transfer a byte or a 16-bit word. For case (c), separate input and output instructions are needed, whose execution will generate separate "I/O signals" (different from the "memory signals" generated with the execution of memory-type instructions); at a minimum, one additional output pin will be required to carry this new signal. For case (d), it can support $2^8$ = 256 input and $2^8$ = 256 output byte ports and the same number of input and output 16-bit ports; in either case, the distinction between an input and an output port is defined by the different signal that the executed input or output instruction generated.
3.5 Clock cycle = 125 ns
 Bus cycle = 4 × 125 ns = 500 ns
 2 bytes transferred every 500 ns; thus transfer rate = 4 MBytes/sec
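The arithmetic in 3.5 follows a pattern that recurs in later answers (clock period, clocks per bus cycle, bytes per bus cycle). A minimal Python sketch, with a hypothetical function name and parameter names chosen for this illustration, makes the steps explicit:

```python
def transfer_rate_bytes_per_second(clock_period_ns, clocks_per_bus_cycle, bytes_per_bus_cycle):
    """Peak transfer rate when bus cycles follow one another without gaps."""
    bus_cycle_ns = clock_period_ns * clocks_per_bus_cycle
    # 1 second = 1_000_000_000 ns, so scale the per-cycle byte count accordingly.
    return bytes_per_bus_cycle * 1_000_000_000 / bus_cycle_ns


# Figures from 3.5: 125 ns clock, 4 clocks per bus cycle, 2 bytes per bus cycle.
rate = transfer_rate_bytes_per_second(125, 4, 2)
rate_in_mbytes = rate / 1_000_000
```

Here `rate_in_mbytes` evaluates to 4.0, matching the 4 MBytes/sec stated above. Doubling either the clock frequency (halving `clock_period_ns`) or the bus width (doubling `bytes_per_bus_cycle`) doubles the result.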
Doubling the frequency may mean adopting a new chip manufacturing technology (assuming each instruction will have the same number of clock cycles); doubling the external data bus means wider (maybe newer) on-chip data bus drivers/latches and modifications to the bus control logic. In the first case, the speed of the memory chips will also need to double (roughly) not to slow down the microprocessor; in the second case, the "word length" of the memory will have to double to be able to send/receive 32-bit quantities.
3.6 a. Input from the Teletype is stored in INPR. The INPR will only accept data from the Teletype when FGI = 0. When data arrives, it is stored in INPR, and FGI is set to 1. The CPU periodically checks FGI. If FGI = 1, the CPU transfers the contents of INPR to the AC and sets FGI to 0.
 When the CPU has data to send to the Teletype, it checks FGO. If FGO = 0, the CPU must wait. If FGO = 1, the CPU transfers the contents of the AC to OUTR and sets FGO to 0. The Teletype sets FGO to 1 after the word is printed.
 b. The process described in (a) is very wasteful. The CPU, which is much faster than the Teletype, must repeatedly check FGI and FGO. If interrupts are used, the Teletype can issue an interrupt to the CPU whenever it is ready to accept or send data. The IEN register can be set by the CPU (under programmer control).
3.7 a. During a single bus cycle, the 8-bit microprocessor transfers one byte while the 16-bit microprocessor transfers two bytes. The 16-bit microprocessor has twice the data transfer rate.
 b. Suppose we do 100 transfers of operands and instructions, of which 50 are one byte long and 50 are two bytes long. The 8-bit microprocessor takes 50 + (2 × 50) = 150 bus cycles for the transfer. The 16-bit microprocessor requires 50 + 50 = 100 bus cycles. Thus, the data transfer rates differ by a factor of 1.5. Source: [PROT88].
3.8 The whole point of the clock is to define event times on the bus; therefore, we wish for a bus arbitration operation to be made each clock cycle. This requires that the priority signal propagate the length of the daisy chain (Figure 3.26) in one clock period. Thus, the maximum number of masters is determined by dividing the clock period by the amount of time it takes the priority signal to pass through one bus master.
3.9 The lowest-priority device is assigned priority 16. This device must defer to all the others. However, it may transmit in any slot not reserved by the other SBI devices.
3.10 At the beginning of any slot, if none of the TR lines is asserted, only the priority 16 device may transmit. This gives it the lowest average wait time under most circumstances. Only when there is heavy demand on the bus, which means that most of the time there is at least one pending request, will the priority 16 device not have the lowest average wait time.
3.11 a. With a clocking frequency of 10 MHz, the clock period is $10^{-7}$ s = 100 ns. The length of the memory read cycle is 300 ns.
 b. The Read signal begins to fall at 75 ns from the beginning of the third clock cycle (middle of the second half of T3). Thus, memory must place the data on the bus no later than 55 ns from the beginning of T3. Source: [PROT88]
3.12 a. The clock period is 125 ns. Therefore, two clock cycles need to be inserted.
 b. From Figure 3.19, the Read signal begins to rise early in T2. To insert two clock cycles, the Ready line can be put in low at the beginning of T2 and kept low for 250 ns. Source: [PROT88]
3.13 a. A 5 MHz clock corresponds to a clock period of 200 ns. Therefore, the Write signal has a duration of 150 ns.
 b. The data remain valid for 150 + 20 = 170 ns.
 c. One wait state. Source: [PROT88]
3.14 a. Without the wait states, the instruction takes 16 bus clock cycles. The instruction requires four memory accesses, resulting in 8 wait states. The instruction, with wait states, takes 24 clock cycles, for an increase of 50%.
 b. In this case, the instruction takes 26 bus cycles without wait states and 34 bus cycles with wait states, for an increase of about 31%. Source: [PROT88]
3.15 a. The clock period is 125 ns. One bus read cycle takes 500 ns = 0.5 μs. If the bus cycles repeat one after another, we can achieve a data transfer rate of 2 MB/s.
 b. The wait state extends the bus read cycle by 125 ns, for a total duration of 0.625 μs. The corresponding data transfer rate is 1/0.625 = 1.6 MB/s. Source: [PROT88]
3.16 A bus cycle takes 0.25 μs, so a memory cycle takes 1 μs. If both operands are even-aligned, it takes 2 μs to fetch the two operands. If one is odd-aligned, the time required is 3 μs. If both are odd-aligned, the time required is 4 μs. Source: [PROT88].
3.17 Consider a mix of 100 instructions and operands. On average, they consist of 20 32-bit items, 40 16-bit items, and 40 bytes. The number of bus cycles required for the 16-bit microprocessor is (2 × 20) + 40 + 40 = 120. For the 32-bit microprocessor, the number required is 100. This amounts to an improvement of 20/120 or about 17%. Source: [PROT88].
3.18 The processor needs another nine clock cycles to complete the instruction. Thus, the Interrupt Acknowledge will start after 900 ns. Source: [PROT88].
3.19

Chapter 4 Cache Memory

Answers to Questions

4.1 Sequential access: Memory is organized into units of data, called records. Access must be made in a specific linear sequence. Direct access: Individual blocks or records have a unique address based on physical location. Access is accomplished by direct access to reach a general vicinity plus sequential searching, counting, or waiting to reach the final location. Random access: Each addressable location in memory has a unique, physically wired-in addressing mechanism. The time to access a given location is independent of the sequence of prior accesses and is constant.
4.2 Faster access time, greater cost per bit; greater capacity, smaller cost per bit; greater capacity, slower access time.
4.3 It is possible to organize data across a memory hierarchy such that the percentage of accesses to each successively lower level is substantially less than that of the level above. Because memory references tend to cluster, the data in the higher-level memory need not change very often to satisfy memory access requests.
4.4 In a cache system, direct mapping maps each block of main memory into only one possible cache line. Associative mapping permits each main memory block to be loaded into any line of the cache. In set-associative mapping, the cache is divided into a number of sets of cache lines; each main memory block can be mapped into any line in a particular set.
4.5 One field identifies a unique word or byte within a block of main memory. The remaining two fields specify one of the blocks of main memory. These two fields are a line field, which identifies one of the lines of the cache, and a tag field, which identifies one of the blocks that can fit into that line.
4.6 A tag field uniquely identifies a block of main memory. A word field identifies a unique word or byte within a block of main memory.
4.7 One field identifies a unique word or byte within a block of main memory. The remaining two fields specify one of the blocks of main memory. These two fields are a set field, which identifies one of the sets of the cache, and a tag field, which identifies one of the blocks that can fit into that set.
4.8 Spatial locality refers to the tendency of execution to involve a number of memory locations that are clustered. Temporal locality refers to the tendency for a processor to access memory locations that have been used recently.
4.9 Spatial locality is generally exploited by using larger cache blocks and by incorporating prefetching mechanisms (fetching items of anticipated use) into the cache control logic. Temporal locality is exploited by keeping recently used instruction and data values in cache memory and by exploiting a cache hierarchy.

Answers to Problems

4.1 The cache is divided into 16 sets of 4 lines each. Therefore, 4 bits are needed to identify the set number. Main memory consists of 4K = $2^{12}$ blocks. Therefore, the set plus tag lengths must be 12 bits and therefore the tag length is 8 bits. Each block contains 128 words. Therefore, 7 bits are needed to specify the word.
                       TAG  SET  WORD
Main memory address =  8    4    7
4.2 There are a total of 8 kbytes/16 bytes = 512 lines in the cache. Thus the cache consists of 256 sets of 2 lines each. Therefore 8 bits are needed to identify the set number. For the 64-Mbyte main memory, a 26-bit address is needed. Main memory consists of 64 Mbyte/16 bytes = $2^{22}$ blocks. Therefore, the set plus tag lengths must be 22 bits, so the tag length is 14 bits and the word field length is 4 bits.
                       TAG  SET  WORD
Main memory address =  14   8    4
4.3
Address            111111     666666     BBBBBB
a. Tag/Line/Word   11/444/1   66/1999/2  BB/2EEE/3
b. Tag/Word        44444/1    199999/2   2EEEEE/3
c. Tag/Set/Word    22/444/1   CC/1999/2  177/EEE/3
4.4 a. Address length: 24; number of addressable units: $2^{24}$; block size: 4; number of blocks in main memory: $2^{22}$; number of lines in cache: $2^{14}$; size of tag: 8.
 b. Address length: 24; number of addressable units: $2^{24}$; block size: 4; number of blocks in main memory: $2^{22}$; number of lines in cache: 4000 hex; size of tag: 22.
 c. Address length: 24; number of addressable units: $2^{24}$; block size: 4; number of blocks in main memory: $2^{22}$; number of lines in set: 2; number of sets: $2^{13}$; number of lines in cache: $2^{14}$; size of tag: 9.
4.5 Block frame size = 16 bytes = 4 doublewords

Number of block frames in cache =

Number of sets =

Example: doubleword from location ABCDE8F8 is mapped onto: set 143, any line, doubleword 2:
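The mappings in 4.3 and the example in 4.5 all come from slicing an address into fixed-width bit fields. The Python sketch below shows that slicing; `split_address` is a hypothetical helper written for this illustration. For the 4.5 example it assumes a 20-bit tag, an 8-bit set field, a 2-bit doubleword field and a 2-bit byte field, which is the split implied by the stated result (set 143, doubleword 2) rather than something given explicitly above.

```python
def split_address(address, field_widths):
    """Split an address into bit fields.

    field_widths lists the widths from the most significant field to the
    least significant one; the returned tuple uses the same order.
    """
    fields = []
    for width in reversed(field_widths):
        fields.append(address & ((1 << width) - 1))  # take the low `width` bits
        address >>= width                            # then discard them
    return tuple(reversed(fields))


# 4.5 example, assuming tag/set/doubleword/byte widths of 20/8/2/2 bits.
tag, set_index, doubleword, byte = split_address(0xABCDE8F8, (20, 8, 2, 2))

# 4.3(a): direct mapping with tag/line/word widths of 8/14/2 bits.
direct = split_address(0x111111, (8, 14, 2))

# 4.3(c): two-way set-associative mapping with tag/set/word widths of 9/13/2 bits.
set_assoc = split_address(0xBBBBBB, (9, 13, 2))
```

With these widths, `set_index` is 143 and `doubleword` is 2, in agreement with the example, and `direct` and `set_assoc` reproduce the 11/444/1 and 177/EEE/3 entries of the 4.3 table when printed in hexadecimal.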

4.6

4.7 A 32-bit address consists of a 21-bit tag field, a 7-bit set field, and a 4-bit word field. Each set in the cache includes 3 LRU bits and four lines. Each line consists of 4 32-bit words, a valid bit, and a 21-bit tag.
4.8 a. 8 leftmost bits = tag; 5 middle bits = line number; 3 rightmost bits = byte number
 b. slot 3; slot 6; slot 3; slot 21
 c. Bytes with addresses 0001 1010 0001 1000 through 0001 1010 0001 1111 are stored in the cache
 d. 256 bytes
 e. Because two items with two different memory addresses can be stored in the same place in the cache. The tag is used to distinguish between them.
4.9 a. The bits are set according to the following rules with each access to the set:
 1. If the access is to L0 or L1, B0 ← 1.
 2. If the access is to L0, B1 ← 1.
 3. If the access is to L1, B1 ← 0.
 4. If the access is to L2 or L3, B0 ← 0.
 5. If the access is to L2, B2 ← 1.
 6. If the access is to L3, B2 ← 0.
 The replacement algorithm works as follows (Figure 4.15): When a line must be replaced, the cache will first determine whether the most recent use was from L0 and L1 or L2 and L3. Then the cache will determine which of the pair of blocks was least recently used and mark it for replacement. When the cache is initialized or flushed, all 128 sets of three LRU bits are set to zero.
 b. The 80486 divides the four lines in a set into two pairs (L0, L1 and L2, L3). Bit B0 is used to select the pair that has been least-recently used. Within each pair, one bit is used to determine which member of the pair was least-recently used. However, the ultimate selection only approximates LRU. Consider the case in which the order of use was: L0, L2, L3, L1. The least-recently used pair is (L2, L3) and the least-recently used member of that pair is L2, which is selected for replacement. However, the least-recently used line of all is L0. Depending on the access history, the algorithm will always pick the least-recently used entry or the second least-recently used entry.
 c. The most straightforward way to implement true LRU for a four-line set is to associate a two bit counter with each line. When an access occurs, the counter for that block is set to 0; all counters with values lowe
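The three-bit update rules in 4.9(a) and the pair-based selection in 4.9(b) can be sketched in Python as follows. This is an illustrative model only: the class and function names are invented for the example, lines L0 to L3 are represented by the integers 0 to 3, and `true_lru_victim` is a plain reference implementation added for comparison, not the counter scheme of part (c).

```python
class PseudoLRUSet:
    """Three-bit pseudo-LRU state (B0, B1, B2) for one four-line set."""

    def __init__(self):
        # All bits are zero after initialization or a flush.
        self.b0 = 0
        self.b1 = 0
        self.b2 = 0

    def access(self, line):
        """Apply rules 1-6 for an access to line 0..3."""
        if line in (0, 1):
            self.b0 = 1                       # rule 1
            self.b1 = 1 if line == 0 else 0   # rules 2 and 3
        elif line in (2, 3):
            self.b0 = 0                       # rule 4
            self.b2 = 1 if line == 2 else 0   # rules 5 and 6
        else:
            raise ValueError("line must be in 0..3")

    def victim(self):
        """Pick the line to replace from the least recently used pair."""
        if self.b0 == 1:
            # Most recent use was in (L0, L1), so replace within (L2, L3).
            return 3 if self.b2 == 1 else 2
        # Most recent use was in (L2, L3), so replace within (L0, L1).
        return 1 if self.b1 == 1 else 0


def true_lru_victim(history):
    """Line whose most recent use is oldest; assumes all four lines appear."""
    last_use = {line: index for index, line in enumerate(history)}
    return min(last_use, key=lambda line: last_use[line])


history = [0, 2, 3, 1]  # the access order L0, L2, L3, L1 discussed in part (b)
cache_set = PseudoLRUSet()
for line in history:
    cache_set.access(line)

pseudo_choice = cache_set.victim()
exact_choice = true_lru_victim(history)
```

For this history `pseudo_choice` is line 2 while `exact_choice` is line 0, which is the discrepancy described in part (b): the three-bit scheme selects the second least-recently used line here.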
