Introduction to Static and Dynamic Memory Design

STMicroelectronics
Semiconductor Memory Design (SRAM & DRAM)
Kaushik Saha
Contact: mobile 98110-64398

Understanding the Memory Trade
The memory market is the most
- Volatile
- Cost competitive
- Innovative
in the IC trade.
[Figure: the memory market is shaped by demand, supply, and technical change]

Classification of Memories
- RW Memory
  - Random Access: SRAM (Static), DRAM (Dynamic)
  - Non-Random Access: FIFO (Queue), LIFO (Stack), SR (Shift Register), CAM (Content Addressable)
- NVRWM: EPROM, EEPROM, FLASH
- ROM: Mask Programmed, PROM (Fuse Programmed)

Feature Comparison Between Memory Types

Memory selection: cost and performance
DRAM, EPROM
- Merit: cheap, high density
- Demerit: low speed, high power
SRAM
- Merit: high speed or low power
- Demerit: expensive, low density
Large memory with cost pressure:
- DRAM
Large memory with very fast speed:
- SRAM, or
- DRAM main memory + SRAM cache
Back-up main memory for no data loss on power failure:
- SRAM with battery back-up
- EEPROM

Trends in Storage Technology
[Figure: storage capacity per generation; *MB = Mbytes]
- Die size increases by a factor of 1.5 per generation
- Combined with cell size shrinking by a factor of 2.6 per generation

The Need for Innovation in Memory Industry
The learning rate (viz. the constant b) is the highest for the memory industry
- Because prices drop most steeply among all ICs, due to the nature of demand and supply
- Yet margins must be maintained
Techniques must be applied to reduce production cost
Often, memories are the launch vehicles for a technology node
- This leads to the volatile nature of prices

Memory Hierarchy of a Modern Computer System
By taking advantage of the principle of locality:
- Present the user with as much memory as is available in the cheapest technology.
- Provide access at the speed offered by the fastest technology.
[Figure: hierarchy levels with typical speed and size]
- Registers (in the processor, next to control and datapath): 1s of ns, 100s of bytes
- On-chip cache and second-level cache (SRAM): 10s of ns, Ks of bytes
- Main memory (DRAM): 100s of ns, Ms of bytes
- Secondary storage (disk): 10,000,000s of ns (10s of ms), Gs of bytes
- Tertiary storage (tape): 10,000,000,000s of ns (10s of sec), Ts of bytes

How is the hierarchy managed?
Registers <-> Memory
- by compiler (programmer?)
Cache <-> memory
- by the hardware
Memory <-> disks
- by the hardware and operating system (virtual memory)
- by the programmer (files)

Memory Hierarchy Technology
Random Access:
- "Random" is good: access time is the same for all locations
- DRAM: Dynamic Random Access Memory
  - High density, low power, cheap, slow
  - Dynamic: needs to be "refreshed" regularly
- SRAM: Static Random Access Memory
  - Low density, high power, expensive, fast
  - Static: content will last "forever" (until power is lost)
"Not-so-random" Access Technology:
- Access time varies from location to location and from time to time
- Examples: Disk, CDROM

Main Memory Background
Performance of Main Memory:
- Latency: Cache Miss Penalty
  - Access Time: time between the request and the arrival of the word
  - Cycle Time: time between requests
- Bandwidth: I/O & Large Block Miss Penalty (L2)
Main Memory is DRAM: Dynamic Random Access Memory
- Dynamic since it needs to be refreshed periodically
- Addresses divided into 2 halves (memory as a 2D matrix):
  - RAS or Row Access Strobe
  - CAS or Column Access Strobe
Cache uses SRAM: Static Random Access Memory
- No refresh (6 transistors/bit vs. 1 transistor)
Size: DRAM/SRAM - 4-8
Cost/Cycle time: SRAM/DRAM - 8-16

Memory Interfaces
Address i/ps
- May be latched with strobe signals
Write Enable (/WE)
- To choose between read / write
- To control writing of new data to memory
Chip Select (/CS)
- To choose between memory chips / banks on the system
Output Enable (/OE)
- To control the o/p buffer in the read circuitry
Data i/os
- For large memories, data i/p and o/p are muxed on the same pins, selected with /WE
Refresh signals

Memory - Basic Organization
- N words, M bits per word
- N select lines (S0 ... SN-1), one per word (Word 0 ... Word N-1)
- 1:N decoder
- Very inefficient design
- Difficult to place and route
[Figure: a column of N words built from single storage cells, with an M-bit output word]

Memory - Real Organization
[Figure: a row decoder driven by log2(R) address lines selects one of R rows (S0 ... SR-1); each row holds C words of M bits; column select, driven by log2(C) address lines, picks the M-bit data word; N = R * C]

Array-Structured Memory Architecture
Problem: aspect ratio, or height >> width
[Figure: address bits A_K ... A_(L-1) drive the row decoder, which selects one of 2^(L-K) word lines; address bits A_0 ... A_(K-1) drive the column decoder, which selects the appropriate word from the M * 2^K bit lines; sense amplifiers / drivers amplify the swing to rail-to-rail amplitude; input-output is M bits wide]

Hierarchical Memory Architecture
[Figure: blocks sharing a global data bus; row address, column address and block address inputs; block selector; global amplifier/driver; I/O; control circuitry]
Advantages:
1. Shorter wires within blocks
2. Block address activates only 1 block = power savings

Memory - Organization and Cell Design Issues
Aspect ratio (height : width) should be relatively square
- Row / column organisation (matrix)
- R = log2(N_rows); C = log2(N_columns)
- R + C = N (N_address_bits)
Number of rows should be a power of 2
- Number of bits in a row
Sense amplifiers to amplify the voltage from each memory cell
1-of-2^R row decoder
1-of-2^C column decoder
- Implement M of the column decoders (M bits, one per bit); M = output word width

Semiconductor Manufacturing Process

Basic Micro Technology

Semiconductor Manufacturing Process
Fundamental Processing Steps
1. Silicon Manufacturing
   a) Czochralski method
   b) Wafer Manufacturing
   c) Crystal structure
2. Photolithography
   a) Photoresists
   b) Photomask and Reticles
   c) Patterning

Lithography Requirements

Excimer Laser DUV & EUV lithography
[Figure: NovaLine Laser, Lambda Physik; pulse rate and power o/p]

Dry or Plasma Etching

Dry or Plasma Etching
Combination of chemical and physical etching: Reactive Ion Etching (RIE)
- Directional etching due to ion assistance.
- In RIE processes the wafers sit on the powered electrode. This placement sets up a negative bias on the wafer, which accelerates positively charged ions toward the surface.
- These ions enhance the chemical etching mechanisms and allow anisotropic etching.
- Wet etches are simpler, but dry etches provide better line width control because they are anisotropic.

Dry Etching
Reactive Ion Etching (RIE)

CMOS fabrication sequence (Technology slides credited to Paulo Moreira)
4.2 Local oxidation of silicon (LOCOS)
- The photoresist mask is removed
- The SiO2/SiN layers will now act as masks
- The thick field oxide is then grown by exposing the surface of the wafer to a flow of oxygen-rich gas
- The oxide grows in both the vertical and lateral directions
- This results in an active area smaller than patterned
[Figure: n-well in p-type substrate with field oxide (FOX); patterned active area vs. active area after LOCOS]

LOCOS: Local Oxidation

Advanced CMOS processes
- Shallow trench isolation
- n+ and p+-doped polysilicon gates (low threshold)
- Source-drain extensions, LDD (hot-electron effects)
- Self-aligned silicide (spacers)
- Non-uniform channel doping (short-channel effects)
[Figure: cross-section showing n-well, p+ and n+ regions, p-doping and n-doping, silicide, oxide spacer, n+ poly and p+ poly gates, shallow-trench isolation, p-type substrate, source-drain extension]

Process enhancements
- Up to eight metal levels in modern processes
- Copper for metal levels 2 and higher
- Stacked contacts and vias
- Chemical Mechanical Polishing (CMP) for technologies with several metal levels
- For analog applications some processes offer:
  - capacitors
  - resistors
  - bipolar transistors (BiCMOS)

Metalisation
- Metal is deposited first, followed by photoresist
- Metal is then etched away to leave the pattern, and the gaps are filled with SiO2

Electroplating Based Damascene Process Sequence
Simple, low-cost, hybrid, robust fill solution
- Sequence: pre-clean, IMP barrier + copper, electroplating, CMP
- Thickness labels in the figure: 25 nm, 10-20 nm + 100-200 nm

Example CMOS SRAM Process
- 0.7u n-channel minimum gate length, 0.6u Leff
- 1.0u FOX isolation using SiN/SiO2 masking
- 0.25u N+ to P+ spacing
- Thin epi material to suppress latchup
- Twin well to suppress parasitic channel through field transistors
- LDD structure for n & p transistors to suppress hot carrier effects
- Buried contacts to overlying metal or underlying gates
- Metal salicide to reduce poly resistivity
- 2 metals to reduce die area
- Planarisation after all major process steps
  - To reduce step coverage problems on contact cut fills
  - Large oxide depositions

SRAM Application Areas
- Main memory in high performance small systems
- Main memory in low power consumption systems
- Simpler and less expensive system if built without a cache
- Battery back-up
- Battery operated systems

SRAM Performance vs Application Families

Typical Application Scenarios
Hand phone and cache
[Figure: i586 based PC; CPU core with MMU, FPU, BIU and ALU; L1 cache 16KB and L2 cache 256KB in SRAM; 64MB main memory in DRAM; I/O over PCI and ISA]

Market View by Application

Overview of SRAM Types
SRAMs
- Asynchronous: Low Speed, Medium Speed, High Speed
- Synchronous: Flow Through / Pipelined, Zero Bus Turnaround, Double Data Rate, Dual Port, Interleaved / Linear Burst
- Special: CAM / Cache Tag, FIFO, Multiport

Array Organization (SRAM Array)
- Common bit precharge lines
- Need sense amplifier
[Figure: SRAM array with select lines SL0, SL1, SL2]

Logic Diagram of a Typical SRAM
Write Enable is usually active low (WE_L)
Din and Dout are combined to save pins:
- A new control signal, output enable (OE_L), is needed
- WE_L = 0, OE_L = 1: D serves as the data input pin
- WE_L = 1, OE_L = 0: D is the data output pin
- Both asserted (WE_L = 0, OE_L = 0): result is unknown. Don't do that!
[Figure: 2^N words x M bit SRAM block with address inputs A0-AN, M-bit data pins D, and control inputs WE_L, OE_L and CS!]

Simple 4x4 SRAM Memory
- 2 bit width: M = 2
- R = 2, so N_rows = 2^R = 4
- C = 1, so N_columns = 2^C x M = 4
- N = R + C = 3
- Array size = N_rows x N_columns = 16
[Figure: row decoder driving word lines WL0 ... WL3; bit lines BL and !BL with bit line precharge; column decoder, sense amplifiers and write circuitry; address inputs A0, A1, A2; clocking and control block with enable and read precharge outputs and inputs WE!, OE!]
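As a rough illustration of the 4x4 example and the WE_L / OE_L pin sharing described above, the following sketch models the array organization and the control-pin behavior. It is a behavioral toy only, not a timing-accurate model. The class name `SramModel`, the choice of the low address bits for the column, and reading unwritten cells as 0 are assumptions made for this example and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class SramModel:
    """Behavioral sketch of a small SRAM array (hypothetical helper)."""
    row_bits: int   # R: address bits fed to the row decoder
    col_bits: int   # C: address bits fed to the column decoder
    word_bits: int  # M: data word width
    cells: dict = field(default_factory=dict)  # (row, col) -> stored word

    @property
    def n_rows(self):
        return 2 ** self.row_bits

    @property
    def n_columns(self):
        # Each row holds 2^C words of M bits.
        return (2 ** self.col_bits) * self.word_bits

    @property
    def array_size_bits(self):
        return self.n_rows * self.n_columns

    def split_address(self, addr):
        """Split an (R + C)-bit address into (row, col).

        Assumption: the low C bits select the column.
        """
        if not 0 <= addr < 2 ** (self.row_bits + self.col_bits):
            raise ValueError("address out of range")
        col = addr & ((1 << self.col_bits) - 1)
        row = addr >> self.col_bits
        return row, col

    def access(self, addr, we_l, oe_l, data_in=None):
        """Return the value driven on D, or None when D is high-impedance."""
        if we_l == 0 and oe_l == 0:
            raise ValueError("WE_L and OE_L both asserted: result is unknown")
        key = self.split_address(addr)
        if we_l == 0:
            # Write: D acts as the data input pin.
            if data_in is None:
                raise ValueError("write needs data_in")
            self.cells[key] = data_in & ((1 << self.word_bits) - 1)
            return None
        if oe_l == 0:
            # Read: D acts as the data output pin.
            # Assumption: unwritten cells read as 0 (real cells power up unknown).
            return self.cells.get(key, 0)
        return None  # neither asserted: output buffers stay high-Z
```

With `SramModel(row_bits=2, col_bits=1, word_bits=2)` the properties reproduce the slide's numbers: 4 rows, 4 columns and a 16-cell array addressed by 3 bits.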

Basic Memory Read Cycle
- System selects memory with /CS=L
- System presents correct address (A0-AN)
- System turns o/p buffers on with /OE=L
- System tri-states previous data sources within a permissible time limit (tOLZ or tCLZ)
- System must wait a minimum time of tAA, tAC or tOE to get correct data

Basic Memory Write Cycle
- System presents correct address (A0-AN)
- System selects memory with /CS=L
- System waits a minimum time equal to the internal setup time of new addresses (tAS)
- System enables writing with /WE=L
- System waits for the minimum time to disable the o/p driver (tWZ)
- System inputs data and waits a minimum time (tDW) for data to be written in the core, then turns off write (/WE=H)

Memory Timing: Definitions
[Figure: READ, WRITE and DATA waveforms marking read access, read cycle, data valid, write access, write cycle and data written]

Memory Timing: Approaches
- DRAM timing: multiplexed addressing. Row address (MSB) and column address (LSB) share the address bus, with RAS-CAS timing.
- SRAM timing: self-timed. An address transition on the address bus initiates the memory operation.

The system level view of Async SRAMs

The system level view of synch SRAMs

Typical Async SRAM Timing
[Figure: 2^N words x M bit SRAM with N address lines A, M data lines D, WE_L and OE_L]
- Write timing: write address on A and data in on D, with write setup time and write hold time relative to WE_L
- Read timing: read address on A with OE_L asserted; D goes from high Z through junk to data out after the read access time

SRAM Read Timing (typical)
- tAA (access time for address): how long it takes to get stable output after a change in address.
- tACS (access time for chip select): how long it takes to get stable output after CS is asserted.
- tOE (output enable time): how long it takes for the three-state output buffers to leave the high-impedance state when OE and CS are both asserted.
- tOZ (output-disable time): how long it takes for the three-state output buffers to enter the high-impedance state after OE or CS is negated.
- tOH (output-hold time): how long the output data remains valid after a change to the address inputs.

SRAM Read Timing (typical)
[Figure: ADDR, CS_L, OE_L and DOUT waveforms with WE_L = HIGH, marking tAA, tACS, tOE, tOZ, tOH and Max(tAA, tACS)]

SRAM Architecture and Read Timings
[Figure: tAA, tACS, tOE, tOZ and tOH mapped onto the SRAM architecture]

SRAM write cycle timing
- /WE controlled
- /CS controlled

SRAM Architecture and Write Timings
[Figure: write driver path, marking tWP - tDW, setup time = tDW, and tDH]

SRAM Architecture

SRAM Cell Design
Memory array typically needs to store lots of bits
- Need to optimize cell design for area and performance
- Peripheral circuits can be complex
  - They are smaller compared to the array (60-70% of area in the array, 30-40% in the periphery)
Memory cell design
- 6T cell, full CMOS
- 4T cell with high resistance poly load
- TFT load cell

Anatomy of the SRAM Cell
Write:
- Set bit lines to new data value
  - b' = opposite of b
- Raise word line to "high"
  - Sets cell to new state
  - May need to flip old state
Read:
- Set bit lines high
- Set word line high
- See which bit line goes low

SRAM Cell Operating Principle
Inverter amplifies
- Negative gain
- Slope < -1 in middle
- Saturates at ends

Bistable Element
Stability
- Require Vin = V2
- Stable at endpoints: recovers from perturbation
- Metastable in middle: falls out when perturbed
Ball on ramp analogy

SRAM Cell technologies
- Bipolar
  - ECL: NPN with dual emitter
- NMOS load
  - A) Enhancement: additional load gate bias
  - B) Depletion: no additional load gate bias
- High Load Resistance (4T)
- Full CMOS (6T)
- Thin Film Transistors

6T & 4T cell Implementation
- 6T bistable latch
- 4T bistable latch with high resistance poly

Reading a Cell
ΔV = Icell * Δt / Cb
[Figure: cell current Icell discharging bit line capacitance Cb into a sense amplifier]

Writing a Cell
[Figure: 1 -> 0 and 0 -> 1 transitions]

Cell Static Noise Margin
Cell state may be disturbed by:
- DC
  - Layout pattern offset
  - Process mismatches: non-uniformity of implantation, gate pattern size errors
- AC
  - Alpha particles
  - Crosstalk
  - Voltage supply ripple
  - Thermal noise
SNM = maximum value of Vn without flipping cell state

SNM: Butterfly Curves

SNM for Poly Load Cell

6T Cell Layout
[Figure: VDD, GND, SEL, B-, B+, Q and Q/ nodes; SEL MOSFET, PMOS pull up, NMOS pull down; substrate connection and N well connection]

6T SRAM Array Layout

Another 6T Cell Layout
[Figure: stick diagram in a 2 metal layer process; bit, bit_b, word, Gnd and VDD lines; four contacts shared with the (mirrored) cell below; GND and contact shared with the cell to the left]

6T Array Layout (2x2)
[Figure: stick diagram with bit, bit_b, word, Gnd and VDD lines]

6T Cell Full Layout
Transistor sizing
- M2 (pMOS) 4:3
- M1 (nMOS) 6:2
- M3 (nMOS) 4:2
All boundaries shared
38λ H x 28λ W
Reduced cap on bit lines

6T Cell Example Layout & Abutment
[Figure: 4 x 4 array and 2 x 2 abutment of cells (transistors T1-T6) with shared Vdd, Vss and B lines]

6T and 4T Cell Layouts
[Figure: 4T cell with T1-T4 and load resistors R1, R2; 6T cell with T1-T6; BIT, BIT!, word line, Vdd and GND]

6T - 4T Cell Comparison
6T cell
- Merits
  - Faster
  - Better noise immunity
  - Low standby current
- Demerits
  - Large size due to 6 transistors
4T cell
- Merits
  - Smaller cell, only 4 transistors
  - HR poly stacked above transistors
- Demerits
  - Additional process step due to HR poly
  - Poor noise immunity
  - Large standby current
  - Thermal instability

Transistor Level View of Core
[Figure: precharge, row decode, column decode, sense amp]

SRAM, Putting it all together
- 2^n rows, 2^m * k columns
- n + m address lines, k bits data width

Hierarchical Array Architecture
[Figure: blocks sharing a global data bus; row address, column address and block address inputs; block selector; global amplifier/driver; I/O; control circuitry]
Advantages:
1. Shorter wires within blocks
2. Block address activates only 1 block = power savings
Subblocks
- 1 per output bit
- Select 1 column per subblock
- 1 sense amp per subblock

Standalone SRAM Floorplan Example

Divided bit-line structure

SRAM Partitioning: Partitioned Bitline

SRAM Partitioning: Divided Wordline Arch

Partitioning summary
Partitioning involves a trade-off between area, power and speed
For high speed designs, use short blocks (e.g. 64 rows x 128 columns)
- Keep local bitline heights small
For low power designs, use tall narrow blocks (e.g. 256 rows x 64 columns)
- Keep the number of columns the same as the access width to minimize wasted power

Redundancy
[Figure: memory array with redundant columns and redundant rows; column decoder and row decoder; row address and column address; fuse bank]

Periphery
- Decoders
- Sense amplifiers
- Input/output buffers
- Control / timing circuitry

Asynchronous & Synchronous SRAMs

Address Transition Detection Provides Clock for Asynch RAMs
[Figure: each address input A0, A1, ... AN-1 passes through a DELAY td element; the transition detectors combine into a shared ATD signal pulled up to VDD]

Row Decoders
Collection of 2^R complex logic gates organized in a regular, dense fashion
(N)AND decoder, 9 -> 512
- WL(0) = !A8 !A7 !A6 !A5 !A4 !A3 !A2 !A1 !A0
- WL(511) = A8 A7 A6 A5 A4 A3 A2 A1 A0
- With NAND gates the outputs are the active-low complements, written WL(0)/ and WL(511)/ on the slide
NOR decoder, 9 -> 512
- WL(0) = !(A8 + A7 + A6 + A5 + A4 + A3 + A2 + A1 + A0)
- WL(511) = !(!A8 + !A7 + !A6 + !A5 + !A4 + !A3 + !A2 + !A1 + !A0)

A NAND decoder using 2-input pre-decoders
[Figure: pre-decoders on address pairs (A0, A1) and (A2, A3) feeding the word line gates WL0, WL1, ...]
Splitting the decoder into two or more logic layers produces a faster and cheaper implementation

Row Decoders (contd)
[Figure: inputs A0 ... A3; predecoded terms A0/ A1/, A0 A1/, A0/ A1, A0 A1 and A2/ A3/, A2 A3/, A2/ A3, A2 A3, and so forth; outputs R0/, R1/, R2/]

Dynamic Decoders
- Dynamic 2-to-4 NOR decoder
- 2-to-4 MOS dynamic NAND decoder
Propagation delay is the primary concern
[Figure: precharge devices to VDD, word lines WL0 ... WL3, inputs A0, !A0, A1, !A1, GND]

Dynamic NOR Row Decoder
[Figure: Vdd, inputs A0, !A0, A1, !A1, word lines WL0 ... WL3, Precharge/]

Dynamic NAND Row Decoder
[Figure: inputs !A0, A0, !A1, A1, word lines WL0 ... WL3, Precharge/]

Decoders
An n:2^n decoder consists of 2^n n-input AND gates
- One needed for each row of memory
- Build AND from NAND or NOR gates
Make devices on address lines minimal size
Scale devices on decoder o/p to drive word lines
[Figure: 2:4 decoders for word0 ... word3 from A0 and A1, in static CMOS and in pseudo-nMOS, with transistor widths annotated]

Decoder Layout
Decoders must be pitch-matched to the SRAM cell
- Requires very skinny gates
[Figure: layout with GND and VDD rails, inputs A0 ... A3, NAND gate, buffer inverter, word output]

Large Decoders
For n > 4, NAND gates become slow
- Break large gates into multiple smaller gates
[Figure: inputs A0 ... A3 decoded to word0 ... word15]

Predecoding
- Group address bits in predecoder
- Saves area
- Same path effort
[Figure: predecoders on A0 ... A3 produce 1-of-4 hot predecoded lines that drive word0 ... word15]

Column Circuitry
Some circuitry is required for each column
- Bitline conditioning
- Sense amplifiers
- Column multiplexing
Need hazard-free reading & writing of RAM cell
Column decoder drives a MUX; the two are often merged

Typical Column Access

Pass Transistor Based Column Decoder
[Figure: 2 input NOR decoder on A1, A0 generating select lines S0 ... S3, which gate bit lines BL0 ... BL3 and !BL0 ... !BL3 onto Data and !Data]
- Advantage: speed, since there is only one extra transistor in the signal path
- Disadvantage: large transistor count

Tree Decoder Mux
Column MUX can use pass transistors
- Use nMOS only, precharge outputs
One design is to use k series transistors for a 2^k:1 mux
- No external decoder logic needed
[Figure: bit lines B0 ... B7 and their complements, selected by A0, !A0, A1, !A1, A2, !A2 onto Y and !Y, which go to the sense amps and write circuits]

Bitline Conditioning
- Precharge bitlines high before reads
- Equalize bitlines to minimize voltage difference when using sense amplifiers
[Figure labels: bit, bit_b]
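The predecoding and tree mux ideas above can be sketched at the logic level as follows. This is a minimal functional illustration, assuming 2-bit predecode groups and least-significant-first bit ordering. The function names `one_hot`, `predecode`, `word_lines` and `tree_mux` are hypothetical, and the code says nothing about transistor sizing or delay.

```python
def one_hot(value, width):
    """Return a list of `width` bits with only position `value` set."""
    return [1 if i == value else 0 for i in range(width)]


def predecode(addr, n_bits, group=2):
    """Split the address into `group`-bit fields; decode each to 1-of-2^group."""
    if n_bits % group:
        raise ValueError("n_bits must be a multiple of group")
    mask = (1 << group) - 1
    fields = []
    for g in range(n_bits // group):
        fields.append(one_hot((addr >> (g * group)) & mask, 1 << group))
    return fields  # fields[0] covers the least significant address bits


def word_lines(addr, n_bits, group=2):
    """Final stage: each word line ANDs one predecoded line from every group."""
    mask = (1 << group) - 1
    fields = predecode(addr, n_bits, group)
    lines = []
    for row in range(1 << n_bits):
        selected = 1
        for g, hot in enumerate(fields):
            selected &= hot[(row >> (g * group)) & mask]
        lines.append(selected)
    return lines


def tree_mux(bitlines, col_addr, k):
    """2^k:1 tree mux: level i keeps the branch chosen by address bit i.

    The selected signal passes through k stages, mirroring the k series
    pass transistors; no separate decoder is needed.
    """
    if len(bitlines) != 1 << k:
        raise ValueError("expected 2^k bit lines")
    level = list(bitlines)
    for i in range(k):
        bit = (col_addr >> i) & 1
        level = [level[j + bit] for j in range(0, len(level), 2)]
    return level[0]
```

For a 4-bit row address, `word_lines(9, 4)` raises only word line 9, using two 1-of-4 predecoded groups and one 2-input AND per row instead of a 4-input gate per row. `tree_mux` picks bit line B5 out of B0 ... B7 for column address 5.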
