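Cloud Computing and Cloud Data Management
Jiaheng Lu (陸嘉恒), Renmin University of China
"Advanced Data Management" Frontier Workshop
2023/12/6

Outline
- Overview of cloud computing
- Google cloud computing technologies: GFS, Bigtable and MapReduce
- Yahoo cloud computing technologies and Hadoop
- Challenges of cloud data management

Renmin University's new course "Distributed Systems and Cloud Computing"
- Overview of distributed systems
- Survey of distributed cloud computing technologies
- Distributed cloud computing platforms
- Distributed cloud computing program development

Part 1: Overview of Distributed Systems
- Chapter 1: Introduction to distributed systems
- Chapter 2: Client-server architecture
- Chapter 3: Distributed objects
- Chapter 4: Common Object Request Broker Architecture (CORBA)

Part 2: Survey of Cloud Computing
- Chapter 5: Introduction to cloud computing
- Chapter 6: Cloud services
- Chapter 7: Comparison of cloud-related technologies
  - 7.1 Grid computing and cloud computing
  - 7.2 Utility computing and cloud computing
  - 7.3 Parallel and distributed computing and cloud computing
  - 7.4 Cluster computing and cloud computing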

Part 3: Cloud Computing Platforms
- Chapter 8: The three key technologies of the Google cloud platform
- Chapter 9: Technologies of the Yahoo cloud platform
- Chapter 10: Technologies of the Aneka cloud platform
- Chapter 11: Technologies of the Greenplum cloud platform
- Chapter 12: Technologies of the Amazon Dynamo cloud platform

Part 4: Development on Cloud Computing Platforms
- Chapter 13: Development on Hadoop
- Chapter 14: Development on HBase
- Chapter 15: Development on Google Apps
- Chapter 16: Development on MS Azure
- Chapter 17: Development on Amazon EC2

Cloud computing

Why do we use cloud computing?

Case 1: Writing a file
- Save it locally: if the computer goes down, the file is lost.
- Files stored in the cloud are always available and never lost.

Case 2: Using software
- Use IE: download, install, use
- Use QQ: download, install, use
- Use C++: download, install, use
- ...
- Instead, get the service from the cloud.

What is cloud and cloud computing?

Cloud: on-demand resources or services delivered over the Internet, with the scale and reliability of a data center.

Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet. Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud" that supports them.

Characteristics of cloud computing
- Virtual: software, databases, Web servers, operating systems, storage and networking are provided as virtual servers.
- On demand: processors, memory, network bandwidth and storage can be added and removed as needed.

Types of cloud service
- IaaS: Infrastructure as a Service
- PaaS: Platform as a Service
- SaaS: Software as a Service

SaaS: software delivery model
- No hardware or software to manage
- Service delivered through a browser
- Customers use the service on demand
- Instant scalability

SaaS: examples
- Your current CRM package is not managing the load, or you simply don't want to host it in-house. Use a SaaS provider such as S…
- Your email is hosted on an Exchange server in your office and it is very slow. Outsource this using Hosted Exchange.

PaaS: platform delivery model
- Platforms are built upon infrastructure, which is expensive
- Estimating demand is not a science!
- Platform management is not fun!

PaaS: examples
- You need to host a large file (5Mb) on your website and make it available to 35,000 users for only two months. Use CloudFront from Amazon.
- You want to start storage services on your network for a large number of files and you do not have the storage capacity. Use Amazon S3.

IaaS: computer infrastructure delivery model
- A platform virtualization environment
- Computing resources, such as storage and processing capacity
- Virtualization taken a step further

IaaS: examples
- You want to run a batch job but you don't have the infrastructure necessary to run it in a timely manner. Use Amazon EC2.
- You want to host a website, but only for a few days. Use Flexiscale.

Cloud computing and other computing techniques

The 21st Century Vision of Computing

Leonard Kleinrock, one of the chief scientists of the original Advanced Research Projects Agency Network (ARPANET) project, which seeded the Internet, said:
"As of now, computer networks are still in their infancy, but as they grow up and become sophisticated, we will probably see the spread of 'computer utilities' which, like present electric and telephone utilities, will service individual homes and offices across the country."

Sun Microsystems co-founder Bill Joy also indicated:
"It would take time until these markets to mature to generate this kind of value. Predicting now which companies will capture the value is impossible. Many of them have not even been created yet."

Definitions: cloud, grid, cluster, utility
- Utility computing is the packaging of computing resources, such as computation and storage, as a metered service similar to a traditional public utility.
- A computer cluster is a group of linked computers, working together closely so that in many respects they form a single computer.
- Grid computing is the application of several computers to a single problem at the same time, usually a scientific or technical problem that requires a great number of computer processing cycles or access to large amounts of data.
- Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.

Grid computing and cloud computing
- They share a lot in common: intention, architecture and technology.
- They differ in programming model, business model, compute model, applications, and virtualization.

The problems are mostly the same:
- manage large facilities;
- define methods by which consumers discover, request and use resources provided by the central facilities;
- implement the often highly parallel computations that execute on those resources.

Virtualization
- Grid: grids do not rely on virtualization as much as clouds do; each individual organization maintains full control of its resources.
- Cloud: virtualization is an indispensable ingredient for almost every cloud.

Any questions or comments?

Outline

- Overview of cloud computing
- Google cloud computing technologies: GFS, Bigtable and MapReduce
- Yahoo cloud computing technologies and Hadoop
- Challenges of cloud data management

Google cloud computing techniques

The Google File System (GFS)
- A scalable distributed file system for large distributed data-intensive applications
- Multiple GFS clusters are currently deployed. The largest ones have:
  - 1000+ storage nodes
  - 300+ terabytes of disk storage
  - heavy access by hundreds of clients on distinct machines

Introduction
- Shares many of the same goals as previous distributed file systems: performance, scalability, reliability, etc.
- The GFS design has been driven by four key observations of Google's application workloads and technological environment

Intro: Observations (1)
1. Component failures are the norm
   - constant monitoring, error detection, fault tolerance and automatic recovery are integral to the system
2. Huge files (by traditional standards)
   - Multi-GB files are common
   - I/O operations and block sizes must be revisited

Intro: Observations (2)
3. Most files are mutated by appending new data
   - This is the focus of performance optimization and atomicity guarantees
4. Co-designing the applications and APIs benefits the overall system by increasing flexibility

The Design
- A cluster consists of a single master and multiple chunkservers and is accessed by multiple clients

The Master
- Maintains all file system metadata: namespace, access control info, file-to-chunk mappings, chunk (including replica) locations, etc.
- Periodically communicates with chunkservers in HeartBeat messages to give instructions and check state
- Makes sophisticated chunk placement and replication decisions using global knowledge
- For reading and writing, the client contacts the master to get chunk locations, then deals directly with chunkservers
- The master is therefore not a bottleneck for reads and writes

Chunkservers
- Files are broken into chunks. Each chunk has an immutable, globally unique 64-bit chunk handle, assigned by the master at chunk creation
- Chunk size is 64 MB
- Each chunk is replicated on 3 (default) servers

Clients
- Linked to apps using the file system API
- Communicate with the master and chunkservers for reading and writing
  - Master interactions only for metadata
  - Chunkserver interactions for data
- Cache only metadata; data is too large to cache

Chunk Locations
- The master does not keep a persistent record of the locations of chunks and replicas
- It polls chunkservers for this at startup, and when chunkservers join or leave
- It stays up to date by controlling placement of new chunks and through HeartBeat messages (when monitoring chunkservers)

Operation Log
- Record of all critical metadata changes
- Stored on the master and replicated on other machines
- Defines the order of concurrent operations
- Also used to recover the file system state

System Interactions:

Leases and Mutation Order
- Leases maintain a consistent mutation order across all chunk replicas
- The master grants a lease to one replica, called the primary
- The primary chooses the serial mutation order, and all replicas follow this order
- This minimizes management overhead for the master

Atomic Record Append
- The client specifies the data to write; GFS chooses the offset, appends the data to each replica at least once, and returns the offset to the client
- Heavily used by Google's distributed applications
- No need for a distributed lock manager, because GFS chooses the offset, not the client

Atomic Record Append: How?
- Follows a control flow similar to other mutations
- The primary tells the secondary replicas to append at the same offset as the primary
- If a record append fails at any replica, the client retries it. As a result, replicas of the same chunk may contain different data, including duplicates, whole or in part, of the same record
- GFS does not guarantee that all replicas are bitwise identical. It only guarantees that the data is written at least once as an atomic unit
- Data must be written at the same offset on all chunk replicas for success to be reported

Detecting Stale Replicas
- The master keeps a chunk version number to distinguish up-to-date and stale replicas
- The version is increased when a lease is granted
- If a replica is not available, its version is not increased
- The master detects stale replicas when chunkservers report their chunks and versions
- Stale replicas are removed during garbage collection

Garbage Collection
- When a client deletes a file, the master logs the deletion like other changes and renames the file to a hidden name
- While scanning the file system namespace, the master removes files that have been hidden for longer than 3 days; their metadata is also erased
- In HeartBeat messages, each chunkserver sends the master a subset of its chunks, and the master tells it which of them no longer have metadata. The chunkserver then removes those chunks on its own

Fault Tolerance: High Availability
- Fast recovery: the master and chunkservers can restart in seconds
- Chunk replication
- Master replication
  - "shadow" masters provide read-only access when the primary master is down
  - mutations are not considered done until recorded on all master replicas

Fault Tolerance: Data Integrity
- Chunkservers use checksums to detect corrupt data
- Since replicas are not bitwise identical, each chunkserver maintains its own checksums
- For reads, the chunkserver verifies the checksum before sending the chunk
- Checksums are updated during writes

Introduction to

MapReduce

MapReduce: Insight
- "Consider the problem of counting the number of occurrences of each word in a large collection of documents."
- How would you do it in parallel?

MapReduce Programming Model
- Inspired by the map and reduce operations commonly used in functional programming languages like Lisp
- Users implement an interface of two primary methods:
  1. Map: (key1, val1) → (key2, val2)
  2. Reduce: (key2, [val2]) → [val3]
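The two signatures above can be made concrete with a tiny single-process driver. This is only a sketch under stated assumptions: `run_mapreduce`, `url_mapper`, `url_reducer` and the sample log data are hypothetical names invented for illustration, and nothing here is distributed or fault tolerant. It counts URL accesses, one of the applications listed later in the slides.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Toy in-memory driver (hypothetical): map, group by key2, then reduce."""
    groups = defaultdict(list)
    for key1, val1 in records:
        # Map: (key1, val1) -> zero or more (key2, val2) pairs
        for key2, val2 in mapper(key1, val1):
            groups[key2].append(val2)
    # Reduce: (key2, [val2]) -> [val3]
    return {key2: reducer(key2, vals) for key2, vals in groups.items()}


def url_mapper(log_name, log_text):
    # key1 = log file name, val1 = log contents (one URL per line, assumed format)
    for line in log_text.splitlines():
        url = line.strip()
        if url:
            yield url, 1


def url_reducer(url, counts):
    # Return a list to mirror the [val3] shape of the Reduce signature
    return [sum(counts)]


logs = [
    ("server1.log", "/index.html\n/about.html\n"),
    ("server2.log", "/index.html\n"),
]
access_counts = run_mapreduce(logs, url_mapper, url_reducer)
```

The driver only needs to know how to group values by intermediate key; everything application-specific lives in the two user-supplied functions.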

Map operation
- Map, a pure function written by the user, takes an input key/value pair and produces a set of intermediate key/value pairs, e.g. (doc-id, doc-content)
- Drawing an analogy to SQL, map can be visualized as the group-by clause of an aggregate query
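To illustrate the SQL analogy, the sketch below shows a map function choosing the grouping key, roughly what `GROUP BY dept` does in a query over an employee table. The record layout, the `map_employee` name and the sample rows are assumptions made up for this example.

```python
from collections import defaultdict


def map_employee(record_id, record):
    # record is a hypothetical (name, dept, salary) tuple;
    # emitting dept as the key plays the role of GROUP BY dept
    name, dept, salary = record
    yield dept, salary


rows = [
    (1, ("Ann", "db", 100)),
    (2, ("Bob", "os", 80)),
    (3, ("Cid", "db", 120)),
]

# Collect the intermediate pairs by key, as the framework would do
grouped = defaultdict(list)
for record_id, record in rows:
    for key, value in map_employee(record_id, record):
        grouped[key].append(value)
grouped = dict(grouped)
```

Each list in `grouped` is what an aggregate function would later be computed over.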

Reduce operation
- On completion of the map phase, all the intermediate values for a given output key are combined together into a list and given to a reducer
- Can be visualized as an aggregate function (e.g., average) that is computed over all the rows with the same group-by attribute

Pseudo-code

    map(String input_key, String input_value):
      // input_key: document name
      // input_value: document contents
      for each word w in input_value:
        EmitIntermediate(w, "1");

    reduce(String output_key, Iterator intermediate_values):
      // output_key: a word
      // intermediate_values: a list of counts
      int result = 0;
      for each v in intermediate_values:
        result += ParseInt(v);
      Emit(AsString(result));

MapReduce: Execution overview

MapReduce: Example
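As one possible example, the word-count pseudo-code from the earlier slide can be written as runnable Python. This is a single-process sketch, not the Google implementation: `word_count` stands in for the framework's grouping step, and the two sample documents are made up.

```python
from collections import defaultdict


def map_fn(doc_name, contents):
    # Emit (word, "1") for every word, as in the pseudo-code
    for word in contents.split():
        yield word, "1"


def reduce_fn(word, counts):
    # Sum the string counts for one word and return the total as a string
    result = 0
    for value in counts:
        result += int(value)
    return str(result)


def word_count(documents):
    # Stand-in for the framework: group intermediate values by word
    intermediate = defaultdict(list)
    for name, contents in documents.items():
        for word, count in map_fn(name, contents):
            intermediate[word].append(count)
    return {word: reduce_fn(word, counts) for word, counts in intermediate.items()}


counts = word_count({"d1": "the cat sat", "d2": "the dog sat down"})
```

Counts are kept as strings only to stay close to the pseudo-code; integers would be the more natural choice in Python.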

MapReduce in Parallel: Example

MapReduce: Fault Tolerance
- Handled via re-execution of tasks; task completion is committed through the master
- What happens if a mapper fails? Re-execute its completed and in-progress map tasks
- What happens if a reducer fails? Re-execute its in-progress reduce tasks
- What happens if the master fails? Potential trouble!

MapReduce: Walkthrough of One More Application

MapReduce: PageRank
- PageRank models the behavior of a "random surfer"
- C(t) is the out-degree of t, and (1-d) is a damping factor (random jump)
- The "random surfer" keeps clicking on successive links at random, not taking content into consideration
- Each page distributes its PageRank equally among all pages it links to
- The damping factor models the surfer "getting bored" and typing an arbitrary URL

PageRank: Key Insights

- The effects at each iteration are local: iteration i+1 depends only on iteration i
- At iteration i, the PageRank of individual nodes can be computed independently

PageRank using MapReduce
- Use a sparse matrix representation (M)
- Map each row of M to a list of PageRank "credit" to assign to out-link neighbours
- These prestige scores are reduced to a single PageRank value for a page by aggregating over them
- Map: distribute PageRank "credit" to link targets
- Reduce: gather up PageRank "credit" from multiple sources to compute the new PageRank value
- Iterate until convergence
- Source of image: Lin 2008
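The map and reduce steps above can be sketched as follows. The slides do not reproduce the update rule, so this example assumes the common form PR(x) = (1 - d) + d * sum of PR(t)/C(t) over pages t linking to x, with d = 0.85. The function names, the three-page graph and the convergence tolerance are hypothetical, and the sketch assumes every page has at least one out-link.

```python
from collections import defaultdict

DAMPING = 0.85  # assumed value of d; not given in the slides


def pagerank_map(url, state):
    rank, outlinks = state
    # Pass the link structure through so the reducer can re-emit it
    yield url, ("links", outlinks)
    # Distribute this page's rank equally as "credit" to its link targets
    for target in outlinks:
        yield target, ("credit", rank / len(outlinks))


def pagerank_reduce(url, values):
    outlinks, total = [], 0.0
    for kind, payload in values:
        if kind == "links":
            outlinks = payload
        else:
            total += payload
    # Gather the credit and apply the assumed damping formula
    return (1 - DAMPING) + DAMPING * total, outlinks


def iterate(graph):
    # One MapReduce round: map every page, group by target URL, reduce
    grouped = defaultdict(list)
    for url, state in graph.items():
        for key, value in pagerank_map(url, state):
            grouped[key].append(value)
    return {url: pagerank_reduce(url, values) for url, values in grouped.items()}


def pagerank(links, tol=1e-6, max_iter=100):
    graph = {url: (1.0, outs) for url, outs in links.items()}
    for _ in range(max_iter):
        new_graph = iterate(graph)
        # Convergence check done outside the map/reduce steps
        delta = max(abs(new_graph[u][0] - graph[u][0]) for u in graph)
        graph = new_graph
        if delta < tol:
            break
    return {url: rank for url, (rank, _) in graph.items()}


ranks = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
```

Because each iteration reads only the previous iteration's ranks, every call to `pagerank_map` and `pagerank_reduce` is independent of the others, which is what makes the computation parallelizable.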

Phase 1: Process HTML
- The map task takes (URL, content) pairs and maps them to (URL, (PR_init, list-of-urls))
  - PR_init is the "seed" PageRank for URL
  - list-of-urls contains all pages pointed to by URL
- The reduce task is just the identity function
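A minimal sketch of this phase, assuming links are taken from the `href` attribute of `<a>` tags and that the seed rank is 1.0 (the slides specify neither). `LinkExtractor`, `process_html_map`, `identity_reduce` and the example URLs are hypothetical; the parsing uses Python's standard `html.parser` module.

```python
from html.parser import HTMLParser

PR_INIT = 1.0  # assumed seed PageRank


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def process_html_map(url, content):
    # (URL, content) -> (URL, (PR_init, list-of-urls))
    parser = LinkExtractor()
    parser.feed(content)
    parser.close()
    yield url, (PR_INIT, parser.links)


def identity_reduce(url, values):
    # The reduce task passes every value through unchanged
    for value in values:
        yield url, value


page = '<p><a href="http://b.example/">b</a> <a href="http://c.example/">c</a></p>'
pairs = list(process_html_map("http://a.example/", page))
reduced = list(identity_reduce("http://a.example/", [value for _, value in pairs]))
```

A real crawler would also resolve relative links and drop duplicates; both are left out to keep the sketch short.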

Phase 2: PageRank Distribution
- The reduce task gets (URL, url_list) and many (URL, val) values
  - Sum the vals and fix up with d to get the new PR
  - Emit (URL, (new_rank, url_list))
- Check for convergence using a non-parallel component

MapReduce: Some More Apps
- Distributed grep
- Count of URL access frequency
- Clustering (K-means)
- Graph algorithms
- Indexing systems
- MapReduce programs in the Google source tree (figure)

MapReduce: Extensions and similar apps

- Pig (Yahoo)
- Hadoop (Apache)
- DryadLINQ (Microsoft)

Large-Scale Systems Architecture using MapReduce
- User app
- MapReduce
- Distributed file system (GFS)

BigTable: A Distributed Storage System for Structured Data

Introduction
- BigTable is a distributed storage system for managing structured data
- Designed to scale to a very large size: petabytes of data across thousands of servers
- Used for many Google projects: Web indexing, Personalized Search, Google Earth, Google Analytics, Google Finance, …
- Flexible, high-performance solution for all of Google's products

Motivation
- Lots of (semi-)structured data at Google
  - URLs: contents, crawl metadata, links, anchors, pagerank, …
  - Per-user data: user preference settings, recent queries/search results, …
  - Geographic locations: physical entities (shops, restaurants, etc.), roads, satellite image data, user annotations, …
- Scale is large
  - Billions of URLs, many versions/page (~20K/version)
  - Hundreds of millions of users, thousands of q/sec
  - 100TB+ of satellite image data

Why not just use a commercial DB?
- Scale is too large for most commercial databases
- Even if it weren't, cost would be very high
  - Building internally means the system can be applied across many projects for low incremental cost
- Low-level storage optimizations help performance significantly
  - Much harder to do when running on top of a database layer

Goals
- Want asynchronous processes to be continuously updating different pieces of data
  - Want access to the most current data at any time
- Need to support:
  - Very high read/write rates (millions of ops per second)
  - Efficient scans over all or interesting subsets of data
  - Efficient joins of large one-to-one and one-to-many datasets
- Often want to examine data changes over time
  - E.g. contents of a web page over multiple crawls

BigTable
- Distributed multi-level map
- Fault-tolerant, persistent
- Scalable
  - Thousands of servers
  - Terabytes of in-memory data
  - Petabyte of disk-based data
  - Millions of reads/writes per second, efficient scans
- Self-managing
  - Servers can be added/removed dynamically
  - Servers adjust to load imbalance

Building Blocks
- Building blocks:
  - Google File System (GFS): raw storage
  - Scheduler: schedules jobs onto machines
  - Lock service: distributed lock manager
  - MapReduce: simplified large-scale data processing
- BigTable's uses of the building blocks:
  - GFS: stores persistent data (SSTable file format for storage of data)
  - Scheduler: schedules jobs involved in BigTable serving
  - Lock service: master election, location bootstrapping
  - MapReduce: often used to read/write BigTable data

Basic Data Model
- A BigTable is a sparse, distributed, persistent multi-dimensional sorted map: (row, column, timestamp) -> cell contents
- Good match for most Google applications

WebTable Example
- Want to keep a copy of a large collection of web pages and related information
- Use URLs as row keys
- Various aspects of a web page as column names
- Store the contents of web pages in the contents: column under the timestamps when they were fetched

Rows
- Name is an arbitrary string
  - Access to data in a row is atomic
  - Row creation is implicit upon storing data
- Rows are ordered lexicographically
  - Rows close together lexicographically are usually on one or a small number of machines

Rows (cont.)
- Reads of short row ranges are efficient and typically require communication with a small number of machines
- Can exploit this property by selecting row keys so they get good locality for data access
- Example: … vs.
  edu.gatech.math, edu.gatech.phys, edu.uga.math, edu.uga.phys

Columns
- Columns have a two-level name structure: family:optional_qualifier
- Column family
  - Unit of access control
  - Has associated type information
- Qualifier gives unbounded columns
  - Additional levels of indexing, if desired

Timestamps
- Used to store different versions of data in a cell
  - New writes default to the current time, but timestamps for writes can also be set explicitly by clients
- Lookup options:
  - "Return most recent K values"
  - "Return all values in timestamp range (or all values)"
- Column families can be marked with attributes:
  - "Only retain most recent K values in a cell"
  - "Keep values until they are older than K seconds"

Implementation: Three Major Components
- Library linked into every client
- One master server, responsible for:
  - Assigning tablets to tablet servers
  - Detecting addition and expiration of tablet servers
  - Balancing tablet-server load
  - Garbage collection
- Many tablet servers
  - Each tablet server handles read and write requests to its tablets
  - Splits tablets that have grown too large

Implementation (cont.)
- Client data doesn't move through the master server. Clients communicate directly with tablet servers for reads and writes.
- Most clients never communicate with the master server, leaving it lightly loaded in practice.

Tablets
- Large tables are broken into tablets at row boundaries
  - A tablet holds a contiguous range of rows
  - Clients can often choose row keys to achieve locality
  - Aim for ~100MB to 200MB of data per tablet
- Each serving machine is responsible for ~100 tablets
  - Fast recovery: 100 machines each pick up 1 tablet from the failed machine
  - Fine-grained load balancing: migrate tablets away from an overloaded machine; the master makes load-balancing decisions

Tablet Location
- Since tablets move around from server to server, given a row, how do clients find the right machine?
  - Need to find the tablet whose row range covers the target row

Tablet Assignment
- Each tablet is assigned to one tablet server at a time
- The master server keeps track of the set of live tablet servers and the current assignment of tablets to servers. It also keeps track of unassigned tablets.
- When a tablet is unassigned, the master assigns it to a tablet server with sufficient room

API
- Metadata operations
  - Create/delete tables and column families, change metadata
- Writes (atomic)
  - Set(): write cells in a row
  - DeleteCells(): delete cells in a row
  - DeleteRow(): delete all cells in a row
- Reads
  - Scanner: read arbitrary cells in a bigtable
    - Each row read is atomic
    - Can restrict returned rows to a particular range
    - Can ask for just data from 1 row, all rows, etc.
    - Can ask for all columns, just certain column families, or specific columns

Refinements: Compression
- Many opportunities for compression
  - Similar values in the same row/column at different timestamps
  - Similar values in different columns
  - Similar values across adjacent rows
- Two-pass custom compression scheme
  - First pass: compress long common strings across a large window
  - Second pass: look for repetitions in a small window
- Speed emphasized, but good space reduction (10-to-1)

Refinements: Bloom Filters
- A read operation has to read from disk when the desired SSTable isn't in memory
- Reduce the number of accesses by specifying a Bloom filter
  - Allows us to ask if an SSTable might contain data for a specified row/column pair
  - A small amount of memory for Bloom filters drastically reduces the number of disk seeks for read operations
  - In practice, this means that most lookups for non-existent rows or columns do not need to touch disk

Outline

- Overview of cloud computing
- Google cloud computing technologies: GFS, Bigtable and MapReduce
- Yahoo cloud computing technologies and Hadoop
- Challenges of cloud data management

Yahoo! Cloud computing

Search Results of the Future (figure): babycenter, epicurious, LinkedIn, webmd, Gawker, New York Times

What's in the Horizontal Cloud?
- Simple Web Service APIs
- Horizontal cloud services
  - Edge content services, e.g., YCS, YCPI
  - Provisioning & virtualization, e.g., EC2
  - Batch storage & processing, e.g., Hadoop & Pig
  - Operational storage, e.g., S3, MObStor, Sherpa
  - Other services: messaging, workflow, virtual DBs & Web serving
- Shared infrastructure: ID & account management; monitoring & QoS; metering, billing, accounting; security
- Common approaches to QA, production engineering, performance engineering, datacenter management, and optimization

Yahoo! Cloud Stack (figure)
- EDGE: horizontal cloud services such as YCS, YCPI, Brooklyn, …
- WEB: horizontal cloud services on VM/OS, such as yApache
- APP: horizontal cloud services on VM/OS, …
- STORAGE: horizontal cloud services such as Sherpa, MObStor, …
- BATCH: horizontal cloud services such as Hadoop, …
- Across all layers: provisioning (self-serve); monitoring/metering/security
- Other labels in the figure: Data Highway, Serving Grid, PHP, App Engine

Web Data Management
- Large data analysis (Hadoop)
  - Scan-oriented workloads
  - Focus on sequential disk I/O
  - $ per CPU cycle
- Structured record storage (PNUTS/Sherpa)
  - CRUD
  - Point lookups and short scans
  - Index-organized table and random I/Os
  - $ per latency
- Blob storage (SAN/NAS)
  - Object retrieval and streaming
  - Scalable file storage
  - $ per GB

The World Has Changed
- Web serving applications need:
  - Scalability! Preferably elastic
  - Flexible schemas
  - Geographic distribution
  - High availability
  - Reliable storage
- Web serving applications can do without:
  - Complicated queries
  - Strong transactions

PNUTS/SHERPA: To Help You Scale Your Mountains of Data

Yahoo! Serving Storage Problem
- Small records: 100KB or less
- Structured records: lots of fields, evolving
- Extreme data scale: tens of TB
- Extreme request scale: tens of thousands of requests/sec
- Low latency globally: 20+ datacenters worldwide
- High availability: outages cost $millions
- Variable usage patterns: as applications and users change

The PNUTS/Sherpa Solution
- The next generation global-scale record store
  - Record-orientation: routing, data storage optimized for low-latency record access
  - Scale out: add machines to scale throughput (while keeping latency low)
  - Asynchrony: pub-sub replication to far-flung datacenters to mask propagation delay
  - Consistency model: reduce complexity of asynchrony for the application programmer
  - Cloud deployment model: hosted, managed service to reduce app time-to-market and enable on-demand scale and elasticity

What is PNUTS/Sherpa?
- Parallel database
- Geographic replication
- Structured, flexible schema
- Hosted, managed infrastructure
- Figure: a sample Parts table with records such as A 42342 E, B 42521 W, C 66354 W, D 12352 E, E 75656 C, F 15677 E, shown in several copies, created with:

    CREATE TABLE Parts (
      ID VARCHAR,
      StockNumber INT,
      Status VARCHAR
      …
    )

What Will It Become?
- Figure: the same sample Parts table, shown in several copies, with the statement CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHA…
