Slide 1
STMicroelectronics
Semiconductor Memory Design (SRAM & DRAM)
Kaushik Saha
Contact: mobile 98110-64398

Slide 2: Understanding the Memory Trade
The memory market is the most
- Volatile
- Cost competitive
- Innovative
in the IC trade.
[Figure: the memory market, driven by demand, supply and technical change]

Slide 3: Classification of Memories
- RW memory, random access: SRAM (static), DRAM (dynamic)
- RW memory, non-random access: FIFO (queue), LIFO (stack), SR (shift register), CAM (content addressable)
- NVRWM: EPROM, EEPROM, FLASH
- ROM: mask programmed, PROM (fuse programmed)

Slide 4: Feature Comparison Between Memory Types

Slide 5: Memory selection: cost and performance
DRAM, EPROM
- Merit: cheap, high density
- Demerit: low speed, high power
SRAM
- Merit: high speed or low power
- Demerit: expensive, low density
Large memory with cost pressure:
- DRAM
Large memory with very fast speed:
- SRAM, or
- DRAM main + SRAM cache
Back-up main memory for no data loss on power failure:
- SRAM with battery back-up
- EEPROM

Slide 6: Trends in Storage Technology
[Chart by generation; *MB = Mbytes]
- Die size increases by a factor of 1.5 per generation
- Combined with cell size shrinking by a factor of 2.6 per generation

Slide 7: The Need for Innovation in Memory Industry
The learning rate (viz. the constant b) is the highest for the memory industry
- Because prices drop most steeply among all ICs, due to the nature of demand + supply
- Yet margins must be maintained
Techniques must be applied to reduce production cost
Often, memories are the launch vehicles for a technology node
- Leads to volatile nature of prices

Slide 8: Memory Hierarchy of a Modern Computer System
By taking advantage of the principle of locality:
- Present the user with as much memory as is available in the cheapest technology.
- Provide access at the speed offered by the fastest technology.
[Figure: processor (control, datapath, registers, on-chip cache), second level cache (SRAM), main memory (DRAM), secondary storage (disk), tertiary storage (tape)]
- Registers: speed 1s of ns, size 100s of bytes
- Cache (on-chip and second level): speed 10s of ns, size Ks of bytes
- Main memory: speed 100s of ns, size Ms of bytes
- Secondary storage (disk): speed 10,000,000s of ns (10s of ms), size Gs of bytes
- Tertiary storage (tape): speed 10,000,000,000s of ns (10s of sec), size Ts of bytes

Slide 9: How is the hierarchy managed?
Registers <-> memory
- by compiler (programmer?)
Cache <-> memory
- by the hardware
Memory <-> disks
- by the hardware and operating system (virtual memory)
- by the programmer (files)

Slide 10: Memory Hierarchy Technology
Random access:
- "Random" is good: access time is the same for all locations
- DRAM: Dynamic Random Access Memory
  High density, low power, cheap, slow
  Dynamic: needs to be "refreshed" regularly
- SRAM: Static Random Access Memory
  Low density, high power, expensive, fast
  Static: content will last "forever" (until power is lost)
"Not-so-random" access technology:
- Access time varies from location to location and from time to time
- Examples: disk, CDROM

Slide 11: Main Memory Background
Performance of main memory:
- Latency: cache miss penalty
  Access time: time between the request and the arrival of the word
  Cycle time: time between requests
- Bandwidth: I/O & large block miss penalty (L2)
Main memory is DRAM: Dynamic Random Access Memory
- Dynamic since it needs to be refreshed periodically
- Addresses divided into 2 halves (memory as a 2D matrix):
  RAS or Row Address Strobe
  CAS or Column Address Strobe
Cache uses SRAM
SRAM: Static Random Access Memory
- No refresh (6 transistors/bit vs. 1 transistor)
- Size: DRAM/SRAM ratio of 4-8
- Cost/Cycle time: SRAM/DRAM ratio of 8-16

Slide 12: Memory Interfaces
Address i/ps
- May be latched with strobe signals
Write Enable (/WE)
- To choose between read / write
- To control writing of new data to memory
Chip Select (/CS)
- To choose between memory chips / banks on system
Output Enable (/OE)
- To control o/p buffer in read circuitry
Data i/os
- For large memories data i/p and o/p muxed on same pins, selected with /WE
Refresh signals

Slide 13: Memory - Basic Organization
- N words, M bits per word
- N select lines
- 1:N decoder
- Very inefficient design
- Difficult to place and route
[Figure: words 0 to N-1 built from single storage cells, selected by lines S0 to SN-1; M bit output word]

Slide 14: Memory - Real Organization
[Figure: a row decoder driven by log2(R) address lines selects one of R rows (S0 to SR-1); each row holds C words of M bits; a column select driven by log2(C) address lines picks the M bit data word]
N = R * C

Slide 15: Array-Structured Memory Architecture
Problem: ASPECT RATIO, or HEIGHT >> WIDTH
[Figure: row decoder on address bits AK to AL-1 drives 2^(L-K) word lines; column decoder on A0 to AK-1 selects the appropriate word from M * 2^K bit lines; sense amplifiers / drivers amplify the swing to rail-to-rail amplitude; input-output is M bits wide]

Slide 16: Hierarchical Memory Architecture
[Figure: row address, column address and block address; block selector; global data bus; global amplifier/driver; I/O; control circuitry]
Advantages:
1. Shorter wires within blocks
2. Block address activates only 1 block = power savings

Slide 17: Memory - Organization and Cell Design Issues
Aspect ratio (height : width) should be relatively square
- Row / column organisation (matrix)
- R = log2(N_rows); C = log2(N_columns)
- R + C = N (N_address_bits)
Number of rows should be a power of 2
- number of bits in a row
Sense amplifiers to amplify the voltage from each memory cell
1 -> 2^R row decoder
1 -> 2^C column decoder
- implement M of the column decoders (M bits, one per bit)
- M = output word width

Slide 18: Semiconductor Manufacturing Process

Slide 19: Basic Micro Technology

Slide 20: Semiconductor Manufacturing Process
Fundamental Processing Steps
1. Silicon Manufacturing
   a) Czochralski method
   b) Wafer manufacturing
   c) Crystal structure
2. Photolithography
   a) Photoresists
   b) Photomask and reticles
   c) Patterning

Slide 21: Lithography Requirements

Slide 22: Excimer Laser DUV & EUV lithography
[Figure: NovaLine laser (Lambda Physik); pulse rate and power o/p]

Slide 23: Dry or Plasma Etching
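
Looking back at the organization rules under "Memory - Real Organization" and "Memory - Organization and Cell Design Issues", the sketch below splits N address bits into R row bits and C column bits so that the cell array comes out as square as possible. Only the relations R + C = N, N_rows = 2^R and N_columns = 2^C x M are taken from the slides. The function names and the squareness measure (ratio of the longer side to the shorter side, counted in cells) are assumptions made for illustration.

```python
def split_address_bits(n_address_bits: int, word_width: int) -> tuple[int, int]:
    """Pick (R, C) with R + C = n_address_bits for a near-square array.

    rows    = 2**R
    columns = 2**C * word_width  (M bit lines per selected word)
    Squareness is judged by max(rows, columns) / min(rows, columns),
    which assumes square cells (an illustrative simplification).
    """
    if n_address_bits < 0 or word_width < 1:
        raise ValueError("need n_address_bits >= 0 and word_width >= 1")
    best = None
    for r in range(n_address_bits + 1):
        c = n_address_bits - r
        rows = 2 ** r
        cols = (2 ** c) * word_width
        ratio = max(rows, cols) / min(rows, cols)
        # Keep the first split with the smallest ratio.
        if best is None or ratio < best[0]:
            best = (ratio, r, c)
    return best[1], best[2]


def array_shape(r_bits: int, c_bits: int, word_width: int) -> tuple[int, int]:
    """Return (rows, columns) of the cell array for a given split."""
    return 2 ** r_bits, (2 ** c_bits) * word_width
```

With N = 3 address bits and M = 2, the sketch picks R = 2 and C = 1, which gives a 4 x 4 cell array.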
Slide 24: Dry or Plasma Etching

Slide 25: Dry or Plasma Etching
Combination of chemical and physical etching: Reactive Ion Etching (RIE)
- Directional etching due to ion assistance.
- In RIE processes the wafers sit on the powered electrode. This placement sets up a negative bias on the wafer, which accelerates positively charged ions toward the surface.
- These ions enhance the chemical etching mechanisms and allow anisotropic etching.
- Wet etches are simpler, but dry etches provide better line width control since they are anisotropic.

Slide 26: Dry Etching
Reactive Ion Etching - RIE

Slide 27: CMOS fabrication sequence (slide footer: Paulo Moreira, Technology)
4.2 Local oxidation of silicon (LOCOS)
- The photoresist mask is removed
- The SiO2/SiN layers will now act as masks
- The thick field oxide is then grown by exposing the surface of the wafer to a flow of oxygen-rich gas
- The oxide grows in both the vertical and lateral directions
- This results in an active area smaller than patterned
[Figure: n-well, p-type substrate, field oxide (FOX), patterned active area, active area after LOCOS]

Slide 28: LOCOS: Local Oxidation

Slide 29: Advanced CMOS processes
- Shallow trench isolation
- n+ and p+-doped polysilicon gates (low threshold)
- Source-drain extensions, LDD (hot-electron effects)
- Self-aligned silicide (spacers)
- Non-uniform channel doping (short-channel effects)
[Figure: n-well, p+, n+, p-doping, n-doping, silicide, oxide spacer, n+ poly, p+ poly, shallow-trench isolation, p-type substrate, source-drain extension]

Slide 30: Process enhancements
- Up to eight metal levels in modern processes
- Copper for metal levels 2 and higher
- Stacked contacts and vias
- Chemical Mechanical Polishing (CMP) for technologies with several metal levels
- For analog applications some processes offer:
  - capacitors
  - resistors
  - bipolar transistors (BiCMOS)

Slide 31: Metalisation
Metal is deposited first, followed by photoresist. The metal is then etched away to leave the pattern, and the gaps are filled with SiO2.

Slide 32: Electroplating Based Damascene Process Sequence
Simple, low-cost, hybrid, robust fill solution
Sequence: pre-clean, IMP barrier + copper, electroplating, CMP
[Figure annotations: 25 nm; 10-20 nm + 100-200 nm]

Slides 33-34: [no text recovered]

Slide 35: Example CMOS SRAM Process
- 0.7 µm n-channel min gate length, 0.6 µm Leff
- 1.0 µm FOX isolation using SiN/SiO2 masking
- 0.25 µm N+ to P+ spacing
- Thin epi material to suppress latchup
- Twin well to suppress parasitic channel through field transistors
- LDD structure for n & p transistors to suppress hot carrier effects
- Buried contacts to overlying metal or underlying gates
- Metal salicide to reduce poly resistivity
- 2 metals to reduce die area
- Planarisation after all major process steps, to reduce step coverage problems on:
  - contact cut fills
  - large oxide depositions

Slide 36: SRAM Application Areas
- Main memory in high performance small system
- Main memory in low power consumption system
- Simpler and less expensive system if without a cache
- Battery back-up
- Battery operated system

Slide 37: SRAM Performance vs Application Families

Slide 38: Typical Application Scenarios
[Figure: hand phone; i586 based PC with core (MMU, FPU, BIU, ALU), 16 KB L1 and 256 KB L2 cache (SRAM), 64 MB DRAM, I/O on PCI and ISA]

Slide 39: Market View by Application

Slide 40: Overview of SRAM Types
SRAMs
- Asynchronous: low speed, medium speed, high speed
- Synchronous: flow through / pipelined, zero bus turnaround, double data rate, dual port, interleaved / linear burst
- Special: CAM / cache tag, FIFO, multiport

Slide 41: SRAM Array
Array organization
- common bit precharge lines
- need sense amplifier
[Figure: select lines SL0, SL1, SL2]

Slide 42: Logic Diagram of a Typical SRAM
Write Enable is usually active low (WE_L)
Din and Dout are combined to save pins:
- A new control signal, output enable (OE_L), is needed
- WE_L = 0, OE_L = 1: D serves as the data input pin
- WE_L = 1, OE_L = 0: D is the data output pin
- Both asserted (WE_L = 0, OE_L = 0): result is unknown. Don't do that!
[Figure: 2^N words x M bit SRAM with N address lines A0-AN, M data lines D, and control inputs WE_L, OE_L and CS!]

Slide 43: Simple 4x4 SRAM Memory
- 2 bit width: M = 2
- R = 2, N_rows = 2^R = 4
- C = 1, N_columns = 2^C x M = 4
- N = R + C = 3
- Array size = N_rows x N_columns = 16
[Figure: address inputs A0 to A2; row decoder driving word lines WL0 to WL3; column decoder; bit lines BL and !BL with bit line precharge; sense amplifiers and write circuitry; clocking and control block with enable, read, precharge, WE! and OE!]
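
As a rough illustration of the control-pin behaviour and of the R = 2, C = 1, M = 2 example, here is a minimal behavioural model. The class name `TinySram`, the method names, and the choice to use the upper address bits for the row and the lower bits for the column are all assumptions of this sketch; the slides do not fix which address bits feed which decoder. The model treats the case where WE_L and OE_L are both asserted (both 0) as an error, because the shared data pin D cannot be input and output at once.

```python
class TinySram:
    """Behavioural model of a 2**N-word x M-bit SRAM with a shared data pin D.

    Control inputs are active low: cs_l, we_l, oe_l (0 means asserted).
    """

    def __init__(self, row_bits: int = 2, col_bits: int = 1, word_width: int = 2):
        self.row_bits = row_bits
        self.col_bits = col_bits
        self.word_width = word_width
        # cells[row][col] holds one M-bit word, stored as an int.
        self.cells = [[0] * (2 ** col_bits) for _ in range(2 ** row_bits)]

    def decode(self, address: int) -> tuple[int, int]:
        """Split an address into (row, column).

        Assumption: upper bits select the row, lower bits the column.
        """
        if not 0 <= address < 2 ** (self.row_bits + self.col_bits):
            raise ValueError("address out of range")
        row = address >> self.col_bits
        col = address & ((1 << self.col_bits) - 1)
        return row, col

    def access(self, address, cs_l, we_l, oe_l, d_in=None):
        """Return the word the SRAM drives on D, or None when D is high-Z."""
        if cs_l == 1:
            return None  # chip not selected
        if we_l == 0 and oe_l == 0:
            raise ValueError("WE_L and OE_L both asserted: result is unknown")
        row, col = self.decode(address)
        if we_l == 0:  # write: D is the data input pin
            if d_in is None:
                raise ValueError("write needs d_in")
            self.cells[row][col] = d_in & ((1 << self.word_width) - 1)
            return None
        if oe_l == 0:  # read: D is the data output pin
            return self.cells[row][col]
        return None  # selected, but neither reading nor writing
```

A write followed by a read would look like `s.access(5, 0, 0, 1, d_in=3)` and then `s.access(5, 0, 1, 0)`, which returns the stored word.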
Slide 44: Basic Memory Read Cycle
1. System selects memory with /CS=L
2. System presents correct address (A0-AN)
3. System turns o/p buffers on with /OE=L
4. System tri-states previous data sources within a permissible time limit (tOLZ or tCLZ)
5. System must wait minimum time of tAA, tAC or tOE to get correct data

Slide 45: Basic Memory Write Cycle
1. System presents correct address (A0-AN)
2. System selects memory with /CS=L
3. System waits a minimum time equal to internal setup time of new addresses (tAS)
4. System enables writing with /WE=L
5. System waits for minimum time to disable o/p driver (tWZ)
6. System inputs data and waits minimum time (tDW) for data to be written in core, then turns off write (/WE=H)

Slide 46: Memory Timing: Definitions
[Figure: READ, WRITE and DATA waveforms marking read access, read cycle and data valid; write access, write cycle and data written]

Slide 47: Memory Timing: Approaches
[Figure labels: address bus with RAS and CAS strobes (RAS-CAS timing); address bus where an address transition initiates the memory operation]
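
The waiting rules in the basic read and write cycles can be sketched as simple arithmetic on event times. The function names, the argument order and the example numbers are hypothetical; only the parameter names tAA, tAC, tOE, tAS, tWZ and tDW come from the slides. The read function assumes each delay is measured from its own triggering event, so the data is valid once the slowest of the three paths has settled.

```python
def read_data_valid_time(t_addr, t_cs, t_oe, tAA, tAC, tOE):
    """Earliest time at which read data can be trusted.

    t_addr, t_cs, t_oe: times at which the address became stable and
    /CS and /OE went low. tAA, tAC, tOE: the matching access times.
    """
    return max(t_addr + tAA, t_cs + tAC, t_oe + tOE)


def write_cycle_schedule(t_start, tAS, tWZ, tDW):
    """Event times for the basic write cycle, following the listed steps.

    t_start: time at which the address is valid and /CS is low.
    Returns the times for /WE low, data applied, and /WE high.
    """
    we_low = t_start + tAS    # wait for address setup
    data_in = we_low + tWZ    # wait for the o/p driver to turn off
    we_high = data_in + tDW   # hold data until it is written in the core
    return {"we_low": we_low, "data_in": data_in, "we_high": we_high}
```

For instance, with made-up values tAA = tAC = 10 and tOE = 5 (all in ns), an address applied at 0, /CS at 2 and /OE at 8 gives valid data at 13 ns, set by the /OE path.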
[Figure labels: DRAM timing uses multiplexed addressing (row address from the MSBs, column address from the LSBs); SRAM timing is self-timed]

Slide 48: The system level view of Async SRAMs

Slide 49: The system level view of synch SRAMs

Slide 50: Typical Async SRAM Timing
[Figure: 2^N words x M bit SRAM with N address lines A, M data lines D, WE_L and OE_L. Write timing: write address on A, data in on D, WE_L pulse with write setup time and write hold time. Read timing: read address on A, OE_L low, D goes from high Z through junk to data out after the read access time]

Slide 51: SRAM Read Timing (typical)
- tAA (access time for address): how long it takes to get stable output after a change in address.
- tACS (access time for chip select): how long it takes to get stable output after CS is asserted.
- tOE (output enable time): how long it takes for the three-state output buffers to leave the high-impedance state when OE and CS are both asserted.
- tOZ (output-disable time): how long it takes for the three-state output buffers to enter the high-impedance state after OE or CS is negated.
- tOH (output-hold time): how long the output data remains valid after a change to the address inputs.

Slide 52: SRAM Read Timing (typical)
[Figure: ADDR (stable), CS_L, OE_L and DOUT (valid) waveforms with WE_L = HIGH, annotated with tAA, tACS, tOE, tOZ, tOH and max(tAA, tACS)]

Slide 53: SRAM Architecture and Read Timings
[Figure annotations: tAA, tACS, tOE, tOZ, tOH]

Slide 54: SRAM write cycle timing
[Figure: /WE controlled and /CS controlled write cycles]

Slide 55: SRAM Architecture and Write Timings
[Figure annotations: write driver; tWP - tDW; setup time = tDW; tDH]

Slide 56: SRAM Architecture

Slide 57: SRAM Cell Design
Memory array typically needs to store lots of bits
- Need to optimize cell design for area and performance
- Peripheral circuits can be complex, but are smaller compared to the array (60-70% area in array, 30-40% in periphery)
Memory cell design
- 6T cell, full CMOS
- 4T cell with high resistance poly load
- TFT load cell

Slide 58: Anatomy of the SRAM Cell
Write:
- set bit lines to new data value (!b = opposite of b)
- raise word line to "high"
- sets cell to new state; may need to flip old state
Read:
- set bit lines high
- set word line high
- see which bit line goes low

Slide 59: SRAM Cell Operating Principle
Inverter amplifies
- Negative gain
- Slope < -1 in middle
- Saturates at ends

Slide 60: Bistable Element
Stability
- Requires Vin = V2
- Stable at endpoints: recovers from perturbation
- Metastable in middle: falls out when perturbed
Ball on ramp analogy

Slide 61: SRAM Cell technologies
- Bipolar ECL: NPN with dual emitter
- NMOS load
  A) Enhancement: additional load gate bias
  B) Depletion: no additional load gate bias
- High load resistance (4T)
- Full CMOS (6T)
- Thin film transistors

Slide 62: 6T & 4T cell Implementation
[Figure: 6T bistable latch; 4T bistable latch with high resistance poly]

Slide 63: Reading a Cell
ΔV = Icell * t / Cb
[Figure: cell current Icell discharging the bit-line capacitance Cb into the sense amplifier]

Slide 64: Writing a Cell
[Figure: cell transitions 1 -> 0 and 0 -> 1]

Slide 65: Bistable Element (same content as slide 60)

Slide 66: Cell Static Noise Margin
Cell state may be disturbed by
- DC
  - Layout pattern offset
  - Process mismatches: non-uniformity of implantation, gate pattern size errors
- AC
  - Alpha particles
  - Crosstalk
  - Voltage supply ripple
  - Thermal noise
SNM = maximum value of Vn without flipping cell state

Slide 67: SNM: Butterfly Curves
[Figure: butterfly curves of inverters 1 and 2 with the SNM marked in each lobe]

Slide 68: SNM for Poly Load Cell

Slide 69: 6T Cell Layout
[Figure labels: VDD, GND, SEL, B-, B+, Q, /Q, SEL MOSFET, PMOS pull up, NMOS pull down, substrate connection, N well connection]

Slide 70: 6T SRAM Array Layout

Slide 71: Another 6T Cell Layout
Stick diagram, 2 metal layer process
[Figure labels: bit, bit (complement), word, Gnd, VDD, six transistors T; "These four contacts shared with (mirrored) cell below"; "GND and contact shared with cell to left"]

Slide 72: 6T Array Layout (2x2)
Stick diagram
[Figure labels: bit, word, Gnd, VDD]

Slide 73: 6T Cell Full Layout
Transistor sizing
- M2 (pMOS) 4:3
- M1 (nMOS) 6:2
- M3 (nMOS) 4:2
All boundaries shared
38 λ H x 28 λ W
Reduced cap on bit lines

Slide 74: 6T Cell Example Layout & Abutment
[Figure: 4 x 4 array of cells with transistors T1 to T6, sharing Vdd, Vss and bit lines B]
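
The relation on the "Reading a Cell" slide, read here as ΔV = Icell * t / Cb, can be turned around to estimate how long a cell needs to develop a given bit-line swing. The sketch below is only that rearrangement. The function names and the example values (100 µA cell current, 1 pF bit-line capacitance, 100 mV swing) are illustrative assumptions, not figures from the deck, and the cell current is treated as constant.

```python
def bitline_swing(i_cell: float, t: float, c_bitline: float) -> float:
    """Bit-line voltage change in volts after time t: dV = Icell * t / Cb."""
    return i_cell * t / c_bitline


def time_to_swing(delta_v: float, i_cell: float, c_bitline: float) -> float:
    """Time in seconds for the cell to develop delta_v on the bit line."""
    return c_bitline * delta_v / i_cell


# Hypothetical numbers: 100 uA cell current, 1 pF bit line, 100 mV swing.
t_sense = time_to_swing(0.1, 100e-6, 1e-12)  # about 1 ns
```

The same arithmetic shows why shorter bit lines help: halving Cb halves the time needed to reach the swing the sense amplifier requires.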
[Figure: 2 x 2 abutment of cells with transistors T1 to T6, sharing Vdd and Vss]

Slide 75: 6T and 4T Cell Layouts
[Figure labels, 4T cell: BIT, BIT!, GND, Vdd, word line, T1 to T4, R1, R2. 6T cell: VDD, GND, Q, /Q, WL, BL, /BL, T1 to T6]

Slide 76: 6T - 4T Cell Comparison
6T cell
- Merits
  - Faster
  - Better noise immunity
  - Low standby current
- Demerits
  - Large size due to 6 transistors
4T cell
- Merits
  - Smaller cell, only 4 transistors
  - HR poly stacked above transistors
- Demerits
  - Additional process step due to HR poly
  - Poor noise immunity
  - Large standby current
  - Thermal instability

Slide 77: Transistor Level View of Core
[Figure labels: precharge, row decode, column decode, sense amp]

Slide 78: SRAM, Putting it all together
- 2^n rows, 2^m * k columns
- n + m address lines, k bits data width

Slide 79: Hierarchical Array Architecture
[Figure: row address, column address and block address; block selector; global data bus; global amplifier/driver; I/O; control circuitry]
Advantages:
1. Shorter wires within blocks
2. Block address activates only 1 block = power savings
Subblocks
- 1 / output bit
- Select 1 column / subblock
- 1 sense amp / subblock

Slide 80: Standalone SRAM Floorplan Example

Slide 81: Divided bit-line structure

Slide 82: SRAM Partitioning: Partitioned Bitline

Slide 83: SRAM Partitioning: Divided Wordline Arch

Slide 84: Partitioning summary
Partitioning involves a trade off between area, power and speed
For high speed designs, use short blocks (e.g. 64 rows x 128 columns)
- Keep local bitline heights small
For low power designs, use tall narrow blocks (e.g. 256 rows x 64 columns)
- Keep the number of columns the same as the access width to minimize wasted power

Slide 85: Redundancy
[Figure: memory array with redundant columns and redundant rows, column decoder, row decoder]
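
To make the trade-off in the partitioning summary concrete, the sketch below works out, for a given block shape, how many blocks a memory needs and how the address bits divide among block, row and column fields. The function name `partition`, the returned field names, the 1 Mbit capacity and the 64-bit access width are all assumptions chosen for illustration; only the two block shapes (64 rows x 128 columns and 256 rows x 64 columns) come from the slide.

```python
def partition(total_bits: int, block_rows: int, block_cols: int,
              access_width: int) -> dict:
    """Describe a memory built from identical blocks.

    All sizes are assumed to be powers of two so that each address
    field is a whole number of bits.
    """
    words_per_row = block_cols // access_width
    n_blocks = total_bits // (block_rows * block_cols)
    if block_cols % access_width or total_bits % (block_rows * block_cols):
        raise ValueError("sizes must divide evenly")
    for value in (n_blocks, block_rows, words_per_row):
        if value < 1 or value & (value - 1):
            raise ValueError("block count, rows and words per row "
                             "must be powers of two")
    return {
        "blocks": n_blocks,
        "block_address_bits": n_blocks.bit_length() - 1,
        "row_address_bits": block_rows.bit_length() - 1,
        "column_address_bits": words_per_row.bit_length() - 1,
        # Cells hanging on each local bitline (sets bitline capacitance).
        "cells_per_bitline": block_rows,
        # Columns activated by the word line but not read out.
        "idle_columns_per_access": block_cols - access_width,
    }


fast = partition(2 ** 20, 64, 128, 64)       # short blocks
low_power = partition(2 ** 20, 256, 64, 64)  # tall narrow blocks
```

Under these assumptions the short block has 64 cells per bitline but activates 64 columns that are not read, while the tall narrow block has 256 cells per bitline and no idle columns, which mirrors the speed versus power trade-off stated on the slide.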
[Figure labels: row address, column address, fuse bank]

Slide 86: Periphery
- Decoders
- Sense amplifiers
- Input/output buffers
- Control / timing circuitry

Slide 87: Asynchronous & Synchronous SRAMs

Slide 88: Address Transition Detection Provides Clock for Asynch RAMs
[Figure: each address input A0, A1, ... AN-1 feeds a delay stage td; the stage outputs combine on the ATD node, which is pulled up to VDD]

Slide 89: Row Decoders
Collection of 2^R complex logic gates organized in a regular, dense fashion
(N)AND decoder, 9 -> 512
- WL(0) = !A8 !A7 !A6 !A5 !A4 !A3 !A2 !A1 !A0
- WL(511) = A8 A7 A6 A5 A4 A3 A2 A1 A0
- (a NAND gate produces the complemented outputs, written WL(0)/ and WL(511)/ on the slide)
NOR decoder, 9 -> 512
- WL(0) = !(A8+A7+A6+A5+A4+A3+A2+A1+A0)
- WL(511) = !(!A8+!A7+!A6+!A5+!A4+!A3+!A2+!A1+!A0)

Slide 90: A NAND decoder using 2-input pre-decoders
[Figure: pre-decoders form the four combinations of A0, A1 and the four combinations of A2, A3; each word line WL0, WL1, ... combines one line from each group]
Splitting decoder into two or more logic layers produces a faster and cheaper implementation

Slide 91: Row Decoders (contd)
[Figure: inputs A0 to A3; predecoded lines A0/ A1/, A0 A1/, A0/ A1, A0 A1 and A2/ A3/, A2 A3/, A2/ A3, A2 A3, and so forth; outputs R0/, R1/, R2/]

Slide 92: Dynamic Decoders
[Figure: dynamic 2-to-4 NOR decoder and 2-to-4 MOS dynamic NAND decoder, with precharge devices, VDD, GND, inputs A0, !A0, A1, !A1 and word lines WL0 to WL3]
Propagation delay is primary concern

Slide 93: Dynamic NOR Row Decoder
[Figure labels: Vdd, A0, !A0, A1, !A1, WL0 to WL3, Precharge/]

Slide 94: Dynamic NAND Row Decoder
[Figure labels: A0, !A0, A1, !A1, WL0 to WL3, Precharge/]

Slide 95: Decoders
n:2^n decoder consists of 2^n n-input AND gates
- One needed for each row of memory
- Build AND from NAND or NOR gates
Make devices on address line minimal size
Scale devices on decoder O/P to drive word lines
[Figure: static CMOS and pseudo-nMOS 2:4 decoders with inputs A0, A1 and outputs word0 to word3, annotated with device sizes]

Slide 96: Decoder Layout
Decoders must be pitch-matched to SRAM cell
- Requires very skinny gates
[Figure labels: GND, VDD, word, buffer inverter, NAND gate, A0 to A3 and their complements]

Slide 97: Large Decoders
For n > 4, NAND gates become slow
- Break large gates into multiple smaller gates
[Figure: A0 to A3 decoded to word0 ... word15]

Slide 98: Predecoding
- Group address bits in predecoder
- Saves area
- Same path effort
[Figure: A0 to A3 feed predecoders that drive 1 of 4 hot predecoded lines, which select word0 ... word15]

Slide 99: Column Circuitry
Some circuitry is required for each column
- Bitline conditioning
- Sense amplifiers
- Column multiplexing
Need hazard-free reading & writing of RAM cell
Column decoder drives a MUX; the two are often merged

Slide 100: Typical Column Access

Slide 101: Pass Transistor Based Column Decoder
[Figure: 2 input NOR decoder on A0, A1 drives select lines S0 to S3; pass transistors connect BL0 to BL3 and !BL0 to !BL3 to Data and !Data]
- Advantage: speed, since there is only one extra transistor in the signal path
- Disadvantage: large transistor count

Slide 102: Tree Decoder Mux
Column MUX can use pass transistors
- Use nMOS only, precharge outputs
One design is to use k series transistors for 2^k:1 mux
- No external decoder logic needed
[Figure: bitlines B0 to B7 and their complements, select signals A0, !A0, A1, !A1, A2, !A2, outputs Y and !Y to sense amps and write circuits]

Slide 103: Bitline Conditioning
Precharge bitlines high before reads
Equalize bitlines to minimize voltage difference when using sense amplifiers
[Figure labels: bit, bit_b]
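
Two of the decoding ideas above, predecoding and the tree decoder mux, can be checked at the logic level with a short sketch. It models only the Boolean behaviour and ignores transistor sizing, precharge and delay. The function names and the bit ordering (address bit 0 pairs adjacent lines) are assumptions made for illustration.

```python
def predecoded_row_decoder(address: int, n_bits: int = 4) -> list[int]:
    """One-hot word lines built from 2-bit predecoders (n_bits must be even)."""
    if n_bits % 2 or not 0 <= address < 2 ** n_bits:
        raise ValueError("need an even n_bits and an in-range address")
    # Each predecoder turns one pair of address bits into 1-of-4 hot lines.
    groups = []
    for g in range(n_bits // 2):
        pair = (address >> (2 * g)) & 0b11
        groups.append([1 if pair == k else 0 for k in range(4)])
    # Each word line ANDs one predecoded line from every group.
    word_lines = []
    for w in range(2 ** n_bits):
        hot = 1
        for g, lines in enumerate(groups):
            hot &= lines[(w >> (2 * g)) & 0b11]
        word_lines.append(hot)
    return word_lines


def tree_mux(bitlines: list[int], address: int) -> int:
    """2**k:1 tree mux: level i keeps the half selected by address bit i."""
    if not bitlines:
        raise ValueError("need at least one bitline")
    k = len(bitlines).bit_length() - 1
    if len(bitlines) != 2 ** k or not 0 <= address < len(bitlines):
        raise ValueError("need 2**k bitlines and an in-range address")
    level = list(bitlines)
    for i in range(k):
        bit = (address >> i) & 1
        # A_i passes the odd line of each pair, !A_i the even line.
        level = [level[2 * j + bit] for j in range(len(level) // 2)]
    return level[0]
```

With 4 address bits the predecoded version needs two 2:4 predecoders plus sixteen 2-input gates instead of sixteen 4-input gates, and the tree mux passes a signal through k series stages without any separate decoder logic.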