HPE 3PAR數(shù)據(jù)消重壓縮方案_第1頁(yè)
HPE 3PAR數(shù)據(jù)消重壓縮方案_第2頁(yè)
HPE 3PAR數(shù)據(jù)消重壓縮方案_第3頁(yè)
HPE 3PAR數(shù)據(jù)消重壓縮方案_第4頁(yè)
HPE 3PAR數(shù)據(jù)消重壓縮方案_第5頁(yè)
已閱讀5頁(yè),還剩47頁(yè)未讀 繼續(xù)免費(fèi)閱讀

下載本文檔

版權(quán)說(shuō)明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請(qǐng)進(jìn)行舉報(bào)或認(rèn)領(lǐng)

文檔簡(jiǎn)介

1、HPE 3PAR數(shù)據(jù)消重壓縮方案HPE 3PAR Adaptive Data ReductionThe HPE 3PAR StoreServ data reduction storyWhen used together, the Adaptive Data Reduction technologies will operate in this orderWe will go through them in this order as this is the correct way to position them to customersDeduplicationPrevent storing

2、 duplicate dataCompressionReduce data footprintData PackingPack odd-sized data togetherZero DetectRemove zeros inline1234數(shù)據(jù)消重 Deduplication3HPE 3PAR StoreServ deduplicationAdvanced inline, in-memory deduplicationHost writes data to the array held in cache pages to increase write performanceDuplicate

- Duplicates are removed; only unique data is flushed to the SSDs, reducing writes.
- The 3PAR ASIC, paired with Express Index lookup tables, provides high-performance, low-latency inline deduplication.
- The 3PAR ASIC/CPU checks the dedup lookup table to see whether incoming pages are duplicates of existing pages.
- Potential duplicates are confirmed with a bit-for-bit check, accelerated by the ASIC and Express Indexing.

3PAR Thin Deduplication write flow (figure: the host write lands in write cache; the hash is resolved through the L1/L2/L3 tables to an LBA):
1. The host write arrives and is acknowledged to the host from write cache.
2. The ASIC computes a hash of the page.
3. The ASIC performs a fast metadata lookup with Express Indexing.
4. Match?
- No: write the new host data to the back end.
- Yes: read the existing data from cache or the back end and XOR it with the new host write data to prevent any hash collision:
  - If the XOR result = 0, the data is identical; just update the metadata.
  - If the XOR result ≠ 0, it is a collision; write the new host data to the back end.
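The match-then-verify flow above can be sketched in a few lines. This is an illustrative model only, not 3PAR code: the hash function, table layout, and in-memory structures are stand-ins for the ASIC and Express Index tables.

```python
import hashlib

PAGE_SIZE = 16 * 1024           # 3PAR works on 16 KiB pages
dedup_table = {}                # hash -> stored page (stands in for the Express Index)
backend = []                    # pages actually written to the back end

def write_page(page: bytes) -> str:
    """Dedup write path: hash, look up, then verify bit-for-bit (the XOR step)."""
    digest = hashlib.sha256(page).digest()
    existing = dedup_table.get(digest)
    if existing is None:
        # No match: write the new data to the back end and record the hash.
        dedup_table[digest] = page
        backend.append(page)
        return "written"
    # Potential duplicate: XOR old and new data; an all-zero result means identical.
    if all(a ^ b == 0 for a, b in zip(existing, page)):
        return "metadata-only"   # true duplicate, just update metadata
    # Same hash, different data: a hash collision, store separately.
    backend.append(page)
    return "collision"

page = bytes(PAGE_SIZE)          # a page of zeros
print(write_page(page))          # first write goes to the back end
print(write_page(page))          # duplicate detected, metadata update only
```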
HPE 3PAR deduplication
Implementation in HPE 3PAR OS 3.2.x (TDVV1/2)
- A dedup-enabled TPVV has a private space (one per dedup-enabled TPVV) and a shared space (one per CPG).
- New incoming writes have a hash calculated. If the hash hasn't been seen before, the data is stored in the shared space (DDS). The DDS can grow significantly.
- If a calculated hash matches an existing one, the array uses the ASIC to detect whether the new data matches the existing data (XOR). If it does, no new data is written.
- If the data does not match, it is a hash collision (the hash is the same but the data is different). This data is stored in the VV's private space.
- When data in the shared space is no longer referenced by any dedup-enabled VV, the space is freed from the shared space and is available for new data.
The small private space contains only collision data; the large shared space contains most of the data, including a large amount of unique data.
Deduplication in the real world
A small amount of data is very heavily duplicated.
- Example dataset: 80GB of data duplicated 11:1, plus 720GB of unique data.
- Rehydrated (logical) size: 80GB x 11 = 880GB, + 720GB = 1.6TB.
- Without deduplication: 880GB + 720GB = 1.6TB stored (1:1).
- With deduplication: 80GB + 720GB = 800GB stored (2:1).

Compression-optimized deduplication approach
New compression-friendly implementation (TDVV3)
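The arithmetic behind the 2:1 figure above is easy to verify; a quick sketch using the slide's example numbers:

```python
# The slide's example: 80 GB duplicated 11:1, plus 720 GB of unique data.
dup_unique_gb = 80          # unique bytes behind the duplicated set
dup_ratio = 11              # each of those blocks is written 11 times
unique_gb = 720             # data with no duplicates at all

logical_gb = dup_unique_gb * dup_ratio + unique_gb   # what hosts wrote
stored_gb = dup_unique_gb + unique_gb                # what dedup keeps

print(f"logical: {logical_gb} GB")                 # 1600 GB (1.6 TB rehydrated)
print(f"stored:  {stored_gb} GB")                  # 800 GB
print(f"ratio:   {logical_gb / stored_gb}:1")      # 2.0:1
```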
- All writes with new hashes are written to the private space first, not the shared space. All hashes are tracked.
- When data is seen for the second time, it is moved from the private space to the shared space. Both VVs then reference that data (as do future duplicate writes).
- The majority of data by capacity will be held in private space since, once deduplicated, duplicate data accounts for only a small amount of consumed capacity.
- Shared space is used more efficiently, meaning a smaller amount of shared space can offer increased deduplication scalability, while reduced metadata improves performance.
As before, there is a private space (one per dedup-enabled TPVV) and a shared space (one per CPG) - but now the small shared space contains only duplicate data, and the large private spaces hold the majority of the data.

TDVV3 metadata structure
(Figure: each VV's L1/L2/L3 exception tables map LBAs either to 16K pages in its own SD space - the DDC, or Dedup Client / private space, one per VV - or, via the DDS table, to shared 16K pages.)
(In the figure, the DDS - Dedup Store / shared space, one per CPG - holds the duplicate pages; VV1 and VV2 both resolve their LBAs to the same shared page on a duplicate match.)

Deduplication hash use: reducing back-end reads
Taking advantage of the larger hash
- For each 16K cache page, 32 bits of the hash are used as the DDS entry index, and 64 bits are stored in the DDS for collision detection.
- If the stored 64-bit hash doesn't match, we have a hash collision and we don't need a back-end read.
- If the hash matches, then we perform a read and an XOR to verify.

A quick note on TDVV3
- TDVV1 and 2 are working perfectly for almost all of our existing customers.
- The new deduplication implementation is designed to enable support for deduplication and compression together on the same VV (more detail coming up!).
- A single DDS goes further than ever before.
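A minimal sketch of the two-level hash use described above. The 32-bit/64-bit widths come from the slide; the hash function and table layout are illustrative assumptions:

```python
import hashlib

def hash_parts(page: bytes):
    """Split a page hash into a 32-bit table index and a 64-bit verifier."""
    digest = hashlib.sha256(page).digest()
    index = int.from_bytes(digest[:4], "big")        # 32 bits: DDS entry index
    verifier = int.from_bytes(digest[4:12], "big")   # 64 bits: stored for collision detection
    return index, verifier

dds = {}  # index -> (verifier, page); stands in for the DDS table

def is_duplicate(page: bytes) -> bool:
    index, verifier = hash_parts(page)
    entry = dds.get(index)
    if entry is None:
        dds[index] = (verifier, page)
        return False
    stored_verifier, stored_page = entry
    if stored_verifier != verifier:
        return False              # 64-bit mismatch: collision, no back-end read needed
    return stored_page == page    # verifier matches: read and compare (the XOR step)
```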
On average, duplicate data accounts for 10% of capacity consumed
- In a typical system, the DDS holds 10% of the data and the DDCs hold 90%. The maximum DDS size is 64TB.
- Given that 10% of the data by volume stored is duplicate and 90% is unique, a single 64TB TDVV3 DDS allows us to deduplicate 640TB of written data.
- After the first 640TB, further new data is 100% unique (and will be written to the DDCs), but any incoming data that's a duplicate of data already stored in the DDS can continue to be deduplicated.
- This will not affect the amount of savings (in GB), but it will reduce the deduplication ratio, as the ratio of unique to shared data changes while the DDCs continue to grow.
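The 640TB figure follows directly from the 10%/90% split; a quick check:

```python
dds_max_tb = 64            # maximum DDS (shared space) size
duplicate_fraction = 0.10  # share of stored capacity that is duplicate data

# If 10% of stored data lives in the DDS, a full 64 TB DDS corresponds to:
total_written_tb = dds_max_tb / duplicate_fraction
ddc_tb = total_written_tb - dds_max_tb

print(f"{total_written_tb:.0f} TB of written data deduplicated")  # 640 TB
print(f"{ddc_tb:.0f} TB of DDC (private) space alongside it")     # 576 TB
```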
576TB of total DDC space accompanies a 64TB DDS (640TB in total).

Provisioning changes in 3.3.1
"TDVV" and "Thinly Deduplicated" are going away
- In the SSMC 3.1 VV provisioning screen, deduplication is no longer a volume type.
- Deduplication and compression are now attributes of TPVVs (thin volumes).
- They are always shown, but only enabled when a system supports deduplication and compression.
- They can be enabled together or independently of one another.
- You can still use the tune tools to convert a VV between thin, dedup, compress, and deco.
- You can easily see the status of dedup/compression.
What does the new implementation mean?
Advantages of the new deduplication implementation
- Up to 8x better scalability: better deduplication scalability with more efficient use of the DDS.
- Improved savings: storing only duplicate data in the DDS means more chances to deduplicate data.
- Simplified management: improved scalability means fewer CPGs for more deduplicated volumes.
- Improved performance: increased IOPS and bandwidth, and reduced latency, for all platforms.

Compression

HPE 3PAR StoreServ compression
Advanced inline, in-memory compression
- HPE 3PAR StoreServ arrays leverage Express Scan technology to prevent wasted CPU cycles.
- The host writes data to the array; it is held in cache pages to increase write performance.
- A page is compressed using the CPU once Express Scan verifies the data is compressible.
- Compressed pages are written to SSD for permanent storage.
- Compression uses the same three-layer exception tables used for deduplication.

Lossy vs. lossless compression
Comparing methods for compression
- Uncompressed image: 12MB (compression ratio 1:1)
- Lossless (PNG) compressed image: 6MB (compression ratio 2:1)
- Lossy (JPEG) compressed image: 0.1MB (compression ratio 120:1)
Compression algorithm and block size
Compression ratio for Oracle test data

Algorithm | Full file | 64KB blocks | 16KB blocks | Compression (MB/s) | Decompression (MB/s)
lz4       | 3.74      | 3.69        | 3.45        | 441                | 1460
lzo       | 3.77      | 3.75        | 3.58        | 404                | 610
gzip      | 4.73      | 3.64        | 3.47        | 246                | 133

- For compression, larger block sizes offer increased savings; 3PAR's 16KB page size is naturally a great choice for compression.
- 16KB shows just a 6.5% loss in savings compared to a 64KB block size.
- Telemetry data tells us that the majority of write bandwidth is driven by block sizes of 16KB and larger.
- Compression with blocks smaller than 16KB is CPU-intensive work and is very inefficient in terms of savings.
- A perfect example is EMC XtremIO: when they introduced compression in XOS 3.0, they were forced to change their block size from 4KB to 8KB (which is still less efficient than a 16KB block size).
- lz4 offers excellent savings when using 16KB pages and the best performance for compression and decompression.
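The block-size effect is easy to demonstrate. The sketch below uses Python's standard-library zlib as a stand-in for lz4 and synthetic data, so the absolute ratios are illustrative only; the trend - larger blocks compress better - is the point.

```python
import zlib

# Synthetic, mildly repetitive data standing in for the Oracle test set.
data = (b"ORDER_ROW:customer=ACME;status=OPEN;pad=" + bytes(24)) * 4096

def ratio_for_block_size(data: bytes, block: int) -> float:
    """Compress each block independently, as an inline array must."""
    compressed = sum(
        len(zlib.compress(data[i:i + block])) for i in range(0, len(data), block)
    )
    return len(data) / compressed

for block in (4 * 1024, 16 * 1024, 64 * 1024):
    print(f"{block // 1024}KB blocks -> {ratio_for_block_size(data, block):.2f}:1")
```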
Compression
Data reduction by compressing data
- Compression algorithms work by inspecting data in blocks and removing redundant information.
- Within each block, there will be repeated data and often padding around the real data.
- Compression removes the repeated data and padding space to reduce the capacity required to store the data.
(Figure: a 16K page of data segments containing runs of repeated bits and zero padding, shown before and after compression.)

Compression metadata structure
(Figure: a 16KB page from SD LD space holding several compressed pages plus page metadata and padding space; the metadata records each compressed page's offset within the 16KB page.)
- In this example, three 16KiB pages are compressed into a single 16KiB page.
- The host has written 48KiB to the volume, but the array is only consuming 16KiB, resulting in 3:1 data reduction.
- There is no change to the metadata path for compressed pages, since the existing Express Index (L1/L2/L3) tables are leveraged for increased simplicity.
- Up to eight compressed pages can be stored in a single SD LD page.

Data Packing
Compressed data presents a unique problem
After compression, blocks are odd sizes
- Example: six 16KB pages (96KB uncompressed) compress to 2.3KB, 8.3KB, 5.2KB, 3.2KB, 10.7KB, and 1.1KB - 30.8KB in total, saving 65.2KB (3.12:1).
- The odd sizes of compressed pages make them difficult to store.

Append-only data structures
Used by many all-flash arrays
- As data is written to the system, it is compressed and then combined into a single stripe.
- The complete stripe, with any metadata required, is written to SSD sequentially.
- When hosts overwrite data, the old blocks are invalidated and the new data is written to new stripes.
- At some point, a post-process task must take the existing data and write it to a new stripe along with data from other partial stripes.
Drawbacks:
- Extremely inefficient use of space as data is overwritten.
- Requires the array to hide space for housekeeping (overprovisioning).
- Requires back-end I/O-intensive housekeeping to keep up with a massive amount of garbage.
- Garbage collection and housekeeping need to run at both drive and system level.
- Virtually impossible to accurately report space consumption and true data reduction ratios.

Storing compressed data in variable block sizes
Some systems use variable block sizes (base-2: 4K, 8K, 16K)
- The same six 16KB pages compress to 2.3KB, 8.3KB, 5.2KB, 3.2KB, 10.7KB, and 1.1KB (30.8KB, 3.12:1) - but rounded up to base-2 block sizes they consume 4KB, 16KB, 8KB, 4KB, 16KB, and 4KB.
- Total on the back end: 52KB, saving only 44KB (1.85:1) instead of 65.2KB (3.12:1).
- Padding per back-end page means lots of wasted space, resulting in lower total system efficiency - and the array would likely still report this as 3.12:1.
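The padding penalty can be reproduced with the slide's numbers:

```python
UNCOMPRESSED_KB = 16
compressed_kb = [2.3, 8.3, 5.2, 3.2, 10.7, 1.1]   # the slide's six pages
block_sizes_kb = [4, 8, 16]                        # base-2 variable blocks

def rounded_up(size_kb: float) -> int:
    """Smallest supported block that fits the compressed page (all fit under 16KB here)."""
    return next(b for b in block_sizes_kb if b >= size_kb)

logical = UNCOMPRESSED_KB * len(compressed_kb)      # 96 KB written by hosts
ideal = sum(compressed_kb)                          # 30.8 KB, the true compressed size
padded = sum(rounded_up(s) for s in compressed_kb)  # what variable blocks consume

print(f"ideal:  {ideal:.1f} KB -> {logical / ideal:.2f}:1")   # 30.8 KB -> 3.12:1
print(f"padded: {padded} KB -> {logical / padded:.2f}:1")     # 52 KB -> 1.85:1
```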
HPE 3PAR Data Packing
No compromise on the efficiency gained through compression
(Example: five 16KB pages compress to 2.3KB, 1.1KB, 5.2KB, 3.2KB, and 2.9KB and are packed together.)

How does it work? Compressed data page format
- The Control Buffer Header is 256 bytes and contains pointers to the compressed pages.
- Each 16 KiB data page can hold up to 8 compressed pages (limited by the available page table entry space): a Control Buffer Header (256 B) followed by Compressed Data 0 through Compressed Data 7.
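A toy model of the packing scheme. The 256-byte header and the 8-pages-per-16KiB limit come from the slide; the first-fit placement policy is an assumption for illustration:

```python
PAGE_BYTES = 16 * 1024
HEADER_BYTES = 256
MAX_SLOTS = 8

class DataPage:
    """One 16 KiB back-end page: a 256 B control header plus packed compressed pages."""
    def __init__(self):
        self.slots = []                     # sizes of the packed compressed pages
    def free_bytes(self) -> int:
        return PAGE_BYTES - HEADER_BYTES - sum(self.slots)
    def try_pack(self, size: int) -> bool:
        if len(self.slots) < MAX_SLOTS and size <= self.free_bytes():
            self.slots.append(size)
            return True
        return False

def pack(sizes):
    """First-fit packing of compressed pages into 16 KiB data pages."""
    pages = []
    for size in sizes:
        if not any(p.try_pack(size) for p in pages):
            page = DataPage()
            page.try_pack(size)
            pages.append(page)
    return pages

# The slide's example: 2.3, 1.1, 5.2, 3.2 and 2.9 KB all fit in one data page.
sizes = [int(kb * 1024) for kb in (2.3, 1.1, 5.2, 3.2, 2.9)]
print(len(pack(sizes)))   # 1
```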
How does it work? Packing
(Figure: three 16 KiB uncompressed cache pages (CMPs) are compressed into a 16 KiB compression buffer and written out as a single 16 KiB data page behind a Control Buffer Header.)

How does it work? Overwrites
- (Figure, shrink case: a data page packs 3 KiB + 2 KiB + 8 KiB compressed pages; new host data compresses to 1 KiB and replaces the 2 KiB page, giving 3 KiB + 1 KiB + 8 KiB.)
- (Figure, grow case: the same page receives new data that compresses to 4 KiB; the page is repacked as 3 KiB + 4 KiB + 8 KiB.)

How does it work? Express Scan
(Figure: a 16 KiB uncompressed CMP is compressed into a 16 KiB compression buffer to test its compressibility.)
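Express Scan's job is to avoid wasting CPU on data that won't benefit from compression. The exact 3PAR heuristic isn't given in this deck beyond the 51% figure shown, so the sketch below is an assumed interpretation: keep the compressed form only when it beats a ~51% threshold, with zlib standing in for the real algorithm.

```python
import os
import zlib

THRESHOLD = 0.51   # assumed: keep compression only if output <= 51% of input

def express_scan_write(page: bytes) -> bytes:
    """Store a page compressed only when it is worthwhile; otherwise store it raw."""
    candidate = zlib.compress(page)
    if len(candidate) <= THRESHOLD * len(page):
        return candidate          # compressible: store the compressed form
    return page                   # incompressible: store raw

compressible = b"A" * 16384       # repeated bytes compress extremely well
incompressible = os.urandom(16384)  # random data does not compress

print(len(express_scan_write(compressible)) < 16384)     # True
print(len(express_scan_write(incompressible)) == 16384)  # True
```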
(In the Express Scan figure, the sample page compresses to 51% of its original size.)

Adaptive Data Reduction

HPE 3PAR StoreServ data reduction technologies
Technologies work together for optimal results
- When used together, duplicate pages are removed first and unique pages are then compressed.
- Data in cache flows through deduplication (the 3PAR ASIC with Express Index tables), then compression (the Intel CPU with Express Scan), then Data Packing, and the resulting data is written to SSD.

Deduplication compared with compression
A simpler way to understand the differences and target use cases
- Compression works within datasets.
- Deduplication works across datasets.

Deduplication and compression on HPE 3PAR
(Figure: several private spaces (DDCs) around one shared space (DDS); 90% of the data sits in the DDCs at a 2:1 compression ratio, 10% in the DDS with 10-20 references per page on average, for a CPG dedup ratio of 2:1.)
- When new pages are received, the hash is calculated; if a page is unique, it is compressed and written to the DDC.
- When a duplicate page is detected, it is written uncompressed to the DDS, and a pointer in the L3 exception table points to that location. The existing page's L3 exception is also updated with the DDS location.
- The original, compressed page is now marked as invalid and collected during the next GC run.
- Only 10% of data is stored in the DDS, and that data is referenced between 10 and 20 times on average, resulting in a 10-20:1 ratio within the DDS. Compressing this data would offer limited savings.
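Worked through with the slide's round numbers, the combined dedup-plus-compression effect looks like this. The 100 TB written and the single reference count of 10 are assumptions for the arithmetic; the 10%/90% split, 2:1 compression, and 10-20 references come from the figure:

```python
written_tb = 100.0          # assume 100 TB written by hosts

dup_fraction = 0.10         # 10% of data is duplicate and lands in the DDS
dds_refs = 10               # each DDS page referenced ~10-20 times; take 10
compression_ratio = 2.0     # unique (DDC) data compresses ~2:1

dds_logical = written_tb * dup_fraction        # 10 TB of duplicate writes
dds_stored = dds_logical / dds_refs            # 1 TB kept, uncompressed
ddc_logical = written_tb - dds_logical         # 90 TB unique
ddc_stored = ddc_logical / compression_ratio   # 45 TB after compression

total = dds_stored + ddc_stored
print(f"stored: {total:.1f} TB, reduction {written_tb / total:.2f}:1")
```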
Volume type positioning
Provisioning types range from Full, through Deduplicated, Deduplicated + Compressed, and Compressed, to Thin - trading performance against space savings.

8200 and 8400 feature matrix
- Various software combinations are supported on an array at the same time; four supported option sets combine File Persona, AFC, dedup, compression, Sync RC, Periodic RC, and Async RC.
- These restrictions are for the 8200 and 8400 models only; they are in place due to the limited control cache (16GB) available on these two models.

8440, 8450 and 20000 feature matrix
- At release, there will be no support for compression and Async Streaming together on the same system.
- This restriction is under review and may be lifted at a later date.

7000 and 10000 feature matrix
- Compression is not supported on Gen4 platforms (7000 and 10000 series).
- Async Streaming is not supported on Gen4 platforms; only existing Gen4 systems running Async Streaming before upgrading to 3.3.1 will be supported.
- Any 7K or 10K customer who wants to start using Async Streaming on 3.3.1 will need to file a CER.
When to use what: deduplicated volumes
Good candidates for deduplication - any data that has a high level of redundancy:
- VDI: persistent desktops can achieve excellent deduplication ratios.
- VMs: OS images from multiple VMs can benefit from dedup; the app data may or may not dedup.
- Home directories and file shares: users often store copies of the same file, so these may benefit from dedup.
Poor candidates for deduplication:
- Databases: most databases do not contain redundant data blocks.
- Previously deduplicated, compressed, or encrypted data will not compact further. (This does not include self-encrypting drives, where data is deduped before it is written.)

When to use what: compressed volumes
Good candidates for compression - data with little redundancy will not dedup well but can benefit from compression:
- Databases: typically do not have redundant blocks, but do have redundant data within blocks.
- VM images with a lot of application data can benefit from compression of the application data.
- VDI with non-persistent desktops can achieve excellent compression ratios.
Poor candidates for compression:
- Compressed data: data that is compressed at the host will not compress further.
- Encrypted data: host- or SAN-encrypted data will not benefit from storage compression. (This does not include self-encrypting drives, where data is compressed before it is written.)
- Be careful with file data, as it may contain compressed data such as JPEGs and MP3s.
When to use what: deduplicated and compressed volumes (DECO)
Good candidates for DECO:
- VM images: OS images from multiple VMs can benefit from dedup, and the application data will compress.
- VDI: both persistent and non-persistent desktops can achieve excellent data reduction ratios.
- Home directories and file shares: deduplication and compression can offer significant space savings.
- Email applications such as Exchange.
Poor candidates for DECO:
- Databases: most databases will not dedup; compression alone is best for databases.
- Deduplicated data: data that has already been deduplicated on the host will not dedup further.
- Data compressed or encrypted at the host or switch will not dedup or compress further. (This does not include self-encrypting drives, where data is deduped before it is written.)

Selective Adaptive Data Reduction
Allowing more efficient use of system resources
- Different data types have different requirements. For each data type, enable the technologies that provide benefits and disable the technologies that don't:
  - Oracle database: compressed (2:1)
  - Exchange server: deduplicated + compressed (1.5:1)
  - Compressed video: thin provisioned only
  - VDI environment: deduplicated + compressed (2:1+)

Understanding compaction ratios
Compaction is a factor of total system efficiency and includes all VVs
- Example system: VV1 deduplicated 2:1; VV2 and VV3 compressed 2:1 with 1.5:1 thin savings; VV4 and VV5 deduplicated with 1.5:1 thin savings; VV6 and VV7 thin provisioned with 1.5:1 thin savings; VV8 and VV9 thick provisioned.
- System compaction: 1.2:1. Why? Because of the thick-provisioned VVs, which consume a lot of capacity.

SSMC and CLI changes
Estimating savings from dedup and compression
- For existing VVs, dedup and compression estimates start dry runs.
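System compaction is simply total logical capacity over total consumed capacity across every VV, which is why a few thick volumes drag the number down. A toy calculation - the per-VV sizes and consumed figures are invented for illustration; only the VV mix follows the slide:

```python
# Per-VV (logical GB, consumed GB). Thick VVs consume their full logical size.
vvs = {
    "VV1": (1000, 500),    # deduplicated 2:1
    "VV2": (1000, 333),    # compressed 2:1 with 1.5:1 thin savings
    "VV3": (1000, 333),
    "VV4": (1000, 400),    # deduplicated with thin savings
    "VV5": (1000, 400),
    "VV6": (1000, 667),    # thin provisioned, 1.5:1
    "VV7": (1000, 667),
    "VV8": (1000, 1000),   # thick provisioned
    "VV9": (1000, 1000),   # thick provisioned
}

logical = sum(l for l, _ in vvs.values())
consumed = sum(c for _, c in vvs.values())

# With these invented sizes compaction lands around 1.7:1; larger thick VVs
# would pull it further down toward the slide's 1.2:1.
print(f"system compaction: {logical / consumed:.2f}:1")
```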
