Cloud Computing

Evolution of Computing with Network (1/2)
- Network computing
  - The network is the computer (client-server)
  - Separation of functionalities
- Cluster computing
  - Tightly coupled computing resources: CPU, storage, data, etc.
  - Usually connected within a LAN
  - Managed as a single resource
  - Commodity, open source

Evolution of Computing with Network (2/2)
- Grid computing
  - Resource sharing across several domains
  - Decentralized, open standards
  - Global resource sharing
- Utility computing
  - Don't buy computers, lease computing power
  - Upload, run, download
  - Ownership model

The Next Step: Cloud Computing
- Services and data are in the cloud, accessible from any device connected to the cloud with a browser
- A key technical issue for developers: scalability
- Services are not known geographically

Applications on the Web

Cloud Computing
- Definition
  - Cloud computing is a concept of using the internet to allow people to access technology-enabled services. It allows users to consume services without knowledge of, or control over, the technology infrastructure that supports them. (Wikipedia)

Major Types of Cloud
- Compute and data cloud
  - Amazon Elastic Compute Cloud (EC2), Google MapReduce, science clouds
  - Provide a platform for running science code
- Host cloud
  - Google AppEngine
  - High availability, fault tolerance, and robustness for web capability

Cloud Computing Example - Amazon EC2
- /ec2

Cloud Computing Example - Google AppEngine
- Google AppEngine API
  - Python runtime environment
  - Datastore API
  - Images API
  - Mail API
  - Memcache API
  - URL Fetch API
  - Users API
- A free account can use up to 500 MB of storage, and enough CPU and bandwidth for about 5 million page views a month
- /appengine/

Cloud Computing
- Advantages
  - Separation of infrastructure maintenance duties from application development
  - Separation of application code from physical resources
  - Ability to use external assets to handle peak loads
  - Ability to scale to meet user demands quickly
  - Sharing capability among a large pool of users, improving overall utilization

Cloud Computing Summary
- Cloud computing is a kind of network service and is a trend for future computing
- Scalability matters in cloud computing technology
- Users focus on application development
- Services are not known geographically

Counting the numbers vs. Programming model
- Personal computer: one to one
- Client/server: one to many
- Cloud computing: many to many

What Powers Cloud Computing in Google?
- Commodity hardware
  - Performance: single machine not interesting
  - Reliability
    - Even the most reliable hardware will still fail: fault-tolerant software needed
    - Fault-tolerant software enables use of commodity components
  - Standardization: use standardized machines to run all kinds of applications

What Powers Cloud Computing in Google?
- Infrastructure software
  - Distributed storage: Distributed File System (GFS)
  - Distributed semi-structured data system: BigTable
  - Distributed data processing system: MapReduce
- What are the common issues of all this software?

Google File System
- Files broken into chunks (typically 64 MB)
- Chunks replicated across three machines for safety (tunable)
- Data transfers happen directly between clients and chunkservers

GFS Usage @ Google
- 200+ clusters
- Filesystem clusters of up to 5000+ machines
- Pools of 10000+ clients
- 5+ petabyte filesystems
- All in the presence of frequent hardware failure

BigTable
- Data model: (row, column, timestamp) → cell contents

BigTable
- Distributed multi-level sparse map
  - Fault-tolerant, persistent
- Scalable
  - Thousands of servers
  - Terabytes of in-memory data
  - Petabytes of disk-based data
- Self-managing
  - Servers can be added/removed dynamically
  - Servers adjust to load imbalance

Why not just use commercial DB?
- Scale is too large or cost is too high for most commercial databases
- Low-level storage optimizations help performance significantly
  - Much harder to do when running on top of a database layer
- Also fun and challenging to build large-scale systems

BigTable Summary
- Data model applicable to a broad range of clients
  - Actively deployed in many of Google's services
- System provides high-performance storage on a large scale
  - Self-managing
  - Thousands of servers
  - Millions of ops/second
  - Multiple GB/s reading/writing
- Currently: 500+ BigTable cells
- Largest BigTable cell manages 3 PB of data spread over several thousand machines

Distributed Data Processing
- Problem: how to count words in the text files?
  - Input files: N text files
  - Size: multiple physical disks
  - Processing phase 1: launch M processes
    - Input: N/M text files
    - Output: partial results of each word's count
  - Processing phase 2: merge M output files of step 1

Pseudo Code of Word Count

Task Management
- Logistics
  - Decide which computers to run phase 1, make sure the files are accessible (NFS-like or copy)
  - Similar for phase 2
- Execution
  - Launch the phase 1 programs with appropriate command line flags, re-launch failed tasks until phase 1 is done
  - Similar for phase 2
- Automation: build task scripts on top of existing batch system

Technical issues
- File management: where to store files?
  - Store all files on the same file server: bottleneck
  - Distributed file system: opportunity to run locally
- Granularity: how to decide N and M?
- Job allocation: assign which task to which node?
  - Prefer local job: knowledge of file system
- Fault recovery: what if a node crashes?
  - Redundancy of data
  - Crash detection and job re-allocation necessary

MapReduce
- A simple programming model that applies to many data-intensive computing problems
- Hide messy details in MapReduce runtime library
  - Automatic parallelization
  - Load balancing
  - Network and disk transfer optimization
  - Handling of machine failures
  - Robustness
- Easy to use

MapReduce Programming Model
- Borrowed from functional programming
  - map(f, [x1, …, xm, …]) = [f(x1), …, f(xm), …]
  - reduce(f, x1, [x2, x3, …]) = reduce(f, f(x1, x2), [x3, …]) = … (continue until the list is exhausted)
- Users implement two functions
  - map(in_key, in_value) → (key, value) list
  - reduce(key, [value1, …, valuem]) → f_value

MapReduce - A New Model and System
- Two phases of data processing
  - Map: (in_key, in_value) → {(keyj, valuej) | j = 1…k}
  - Reduce: (key, [value1, …, valuem]) → (key, f_value)

MapReduce Version of Pseudo Code
- No file I/O
- Only data processing logic

Example - Word Count (1/2)
- Input is files with one document per record
- Specify a map function that takes a key/value pair
  - key = document URL
  - value = document contents
- Output of map function is key/value pairs. In our case, output (w, "1") once per word in the document

Example - Word Count (2/2)
- MapReduce library gathers together all pairs with the same key (shuffle/sort)
- The reduce function combines the values for a key. In our case, compute the sum
- Output of reduce is paired with the key and saved
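
The word count walk-through above can be sketched in a few lines of Python. This is only a single-process illustration of the map, shuffle/sort, and reduce steps, not the Google MapReduce library or Hadoop; the names `map_fn`, `shuffle`, `reduce_fn`, and `run_wordcount`, and the two sample documents, are assumptions made up for this example.

```python
from collections import defaultdict


def map_fn(doc_url, contents):
    """Map: emit (word, "1") once per word in the document."""
    return [(word, "1") for word in contents.split()]


def shuffle(pairs):
    """Group all values by key, standing in for the library's shuffle/sort."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_fn(word, values):
    """Reduce: combine the values for one key by summing them."""
    return word, sum(int(v) for v in values)


def run_wordcount(documents):
    """Run map, shuffle, and reduce sequentially in one process."""
    pairs = []
    for url, contents in documents.items():
        pairs.extend(map_fn(url, contents))
    return dict(reduce_fn(w, vs) for w, vs in shuffle(pairs).items())


# Hypothetical input: key = document URL, value = document contents.
docs = {
    "doc1": "the cloud is the network",
    "doc2": "the network is the computer",
}
counts = run_wordcount(docs)
print(counts)
```

Note that `map_fn` and `reduce_fn` contain no file I/O and no scheduling logic. In a real deployment the runtime library would take over the role of `shuffle` and `run_wordcount`, and would distribute the work across machines.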

MapReduce Framework
- For certain classes of problems, the MapReduce framework provides:
  - Automatic & efficient parallelization/distribution
  - I/O scheduling: run mapper close to input data
  - Fault tolerance: restart failed mapper or reducer tasks on the same or different nodes
  - Robustness: tolerate even massive failures, e.g. large-scale network maintenance: once lost 1800 out of 2000 machines
  - Status/monitoring

Task Granularity And Pipelining
- Fine granularity tasks: many more map tasks than machines
  - Minimizes time for fault recovery
  - Can pipeline shuffling with map execution
  - Better dynamic load balancing
- Often use 200,000 map / 5000 reduce tasks with 2000 machines

MapReduce: Uses at Google
- Typical configuration: 200,000 mappers, 500 reducers on 2,000 nodes
- Broad applicability has been a pleasant surprise
  - Quality experiences, log analysis, machine translation, ad-hoc data processing
- Production indexing system: rewritten with MapReduce
  - ~10 MapReductions, much simpler than old code

MapReduce Summary
- MapReduce has proven to be a useful abstraction
- Greatly simplifies large-scale computation at Google
- Fun to use: focus on problem, let library deal with messy details

A Data Playground
- MapReduce + BigTable + GFS = data playground
  - Substantial fraction of internet available for processing
  - Easy-to-use teraflops/petabytes, quick turn-around
  - Cool problems, great colleagues

Open Source Cloud Software: Project Hadoop
- Google published papers on GFS ('03), MapReduce ('04) and BigTable ('06)
- Project Hadoop
  - An open source project with the Apache Software Foundation
  - Implements Google's cloud technologies in Java
  - HDFS (GFS) and Hadoop MapReduce are available. HBase (BigTable) is being developed
- Google is not directly involved in the development, to avoid conflict of interest

Industrial Interest in Hadoop
- Yahoo! hired core Hadoop developers
  - Announced on Feb. 19, 2008 that their Webmap is produced on a Hadoop cluster with 2000 hosts (dual/quad cores)
- Amazon EC2 (Elastic Compute Cloud) supports Hadoop
  - Write your mapper and reducer, upload your data and program, run and pay by resource utilization
  - Tiff-to-PDF conversion of 11 million