版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請進行舉報或認領(lǐng)
文檔簡介
1、How Uber Builds A Cross Data Center Replication Platform on Apache Kafka基于Kafka的跨數(shù)據(jù)中心復制平臺架構(gòu)Apache Kafka at UberApache Kafka pipeline & replicationuReplicatorData loss detectionQ&AAgendaReal-time Dynamic PricingStream ProcessingDynamic pricingApp ViewsVehicle InformationApache KafkaUber Eats - Real-T
2、ime ETAsA bunch more .Fraud DetectionDriver & Rider Sign-ups, etc.Apache Kafka - Use CasesGeneral pub-sub, messaging queueStream processingAthenaX - self-service streaming analytics platform (Apache Samza & Apache Flink)Database changelog transportCassandra, MySQL, etc.IngestionHDFS, S3LoggingData I
3、nfrastructure UberPRODUCERSCONSUMERSReal-time Analytics, Alerts, DashboardsSamza / FlinkApplications Data ScienceAnalytics ReportingKafkaVertica / HiveRider AppDriver AppAPI / ServicesEtc.Ad-hoc ExplorationELKDebuggingHadoopMobile AppCassandra MySQLDATABASES(Internal) ServicesAWS S3PaymentSurgeTrill
4、ionsPBsMessages / DayDataTens of ThousandsTopicsScaleexcluding replicationApache Kafka at UberApache Kafka pipeline & replicationuReplicatorData loss detectionQ&AAgendaApache Kafka Pipeline UberDC2DC1Applications ProxyClientKafka REST ProxyRegional KafkaApplications ProxyClientKafka REST ProxyRegion
5、al KafkaSecondaryAggregate KafkauReplicatorOffset Sync ServiceAggregate KafkauReplicatorAggregationAggregate KafkauReplicatorOffset Sync ServiceAggregate KafkauReplicatorDC1Regional KafkaDC2Regional KafkaGlobal viewCross-Data Center FailoverAggregate KafkauReplicatorOffset Sync ServiceAggregate Kafk
6、auReplicatorDC1Regional KafkaDC2Regional KafkaDuring runtimeuReplicator reports offset mapping to offset sync serviceOffset sync service is all-active and the offset info is replicated across data centersDuring failoverConsumers ask offset sync service for offsets to resume consumption based on its
7、last commit offsetsOffset sync service translates offsets between aggregate clustersIngestionHDFS/S3DC1Aggregate KafkaHDFSS3DC2Aggregate KafkaHDFS/S3Topic MigrationConsumerDC1Regional KafkaDC2Regional KafkaSetup uReplicator from new cluster to old clusterMove producerMove consumerProducerTopic Migra
8、tionRegional KafkaConsumerDC1DC2Regional KafkaSetup uReplicator from new cluster to old clusterMove producerMove consumerRemove uReplicatorProduceruReplicatorTopic MigrationRegional KafkaConsumerDC1DC2Regional KafkaSetup uReplicator from new cluster to old clusterMove producerMove consumerRemove uRe
9、plicatorProduceruReplicatorConsumerTopic MigrationRegional KafkaDC1DC2Regional KafkaSetup uReplicator from new cluster to old clusterMove producerMove consumerRemove uReplicatorProduceruReplicatorConsumerTopic MigrationConsumerDC1Regional KafkaDC2Regional KafkaSetup uReplicator from new cluster to o
10、ld clusterMove producerMove consumerRemove uReplicatorProducerReplication - Use CasesAggregationGlobal viewAll-activeIngestionMigrationMove topics between clusters/DCsApache Kafka at UberApache Kafka pipeline & replicationuReplicatorData loss detectionQ&AAgendaMotivation - MirrorMakerPain pointExpen
11、sive rebalancingDifficulty adding topicsPossible data lossMetadata sync issuesRequirementsStable replicationSimple operationsHigh throughputNo data lossAuditingDesign - uReplicatorApache HelixuReplicator controllerStable replicationAssign topic partitions to each worker processHandle topic/worker ch
12、angesSimple operationsHandle adding/deleting topicsDesign - uReplicatoruReplicator workerApache Helix agentDynamic Simple ConsumerApache Kafka producerCommit after flushApache ZooKeeper Apache HelixControllerWorker ThreadApache HelixAgentFetcherManagertopic: testTopic partition:0FetcherThreadSimpleC
13、onsumerLeaderFind ThreadLinkedBlockingQueueApache Kafka ProducerRequirementsStable replicationSimple operationsHigh throughputNo data lossAuditingPerformance IssuesCatch-up time is too long (4 hours failover)Full-speed phase: 23 hoursLong tail phase: 56 hoursProblem: Full Speed PhaseDestination brok
14、ers are bound by CPUsSolution 1: Increase Batch SizeDestination brokers are bound by CPUsIncrease throughputproducer.batch.size: 64KB = 128KBproducer.linger.ms: 100 = 1000EffectBatch size increases: 1022KB = 5090KBCompress rate: 4062% = 2735% (compressed size)Solution 2: 1-1 Partition MappingRound r
15、obinTopic with N partitionsN2 connectionDoS-like trafficDeterministic partitionTopic with N partitionsN connectionReduce contentionMM 1MM 2MM 3p0p1p3MM 4SourceDestinationp2p0p1p2p3MM 1MM 2MM 3p0p1p3MM 4SourceDestinationp2p0p1p2p3Full Speed Phase ThroughputFirst hour: 27.2MB/s = 53.6MB/s per aggregat
16、e brokerBeforeAfterProblem 2: Long Tail PhaseDuring full speed phase, all workers are busySome workers catch up and become idle, but the others are still busySolution 1: Dynamic Workload BalanceDuring full speed phase, all workers are busyOriginal partition assignmentNumber of partitionsHeavy partit
17、ion on the same workerWorkload-based assignmentTotal workload when addedSource cluster bytes-in-rateRetrieved from Chaperone3Dynamic rebalance periodicallyExceeds 1.5 times the average workloadSolution 2: Lag-Feedback RebalanceDuring full speed phase, all workers are busyMonitors topic lagsBalance t
18、he workers based on lagsLarger lag = heavier workloadDedicated workers for lagging topicsDynamic Workload RebalancePeriodically adjust workload every 10 minutesMultiple lagging topics on the same worker spreads to multiple workersRequirementsStable replicationSimple operationsHigh throughputNo data
19、lossAuditingRequirementsStable replicationSimple operationsHigh throughputNo data lossAuditingMore IssuesScalabilityNumber of partitionsUp to 1 hour for a full rebalance during rolling restartOperationNumber of deploymentsNew clusterSanityLoopDouble routeDesign - Federated uReplicatorScalabilityMult
20、iple routeOperationDeployment groupOne managerDesign - Federated uReplicatoruReplicator ManagerDecide when to create new routeDecide how many workers in each routeAuto-scalingTotal workload of routeTotal lag in the routeTotal workload / expected workload on workerAutomatically add workers to routeDesign - Federated uReplicatoruReplicator Frontwhitelist/blacklist topics (topic, src, dst)Persist in DBSanity checksAssigns to uReplicator ManagerRequirementsStable replicationSimple operationsHigh throughputNo data lossAuditingApache Kafka at UberApache Kafka pi
溫馨提示
- 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會有圖紙預覽,若沒有圖紙預覽就沒有圖紙。
- 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
- 5. 人人文庫網(wǎng)僅提供信息存儲空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負責。
- 6. 下載文件中如有侵權(quán)或不適當內(nèi)容,請與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。
最新文檔
- 2024年五年級品社下冊《香港和澳門》教學設(shè)計 未來版
- 2024日用百貨采購合同格式
- 2024天津市(事業(yè))單位勞動合同范本
- 2024如何免除倉儲合同違約責任
- 2024國際刊號代理服務合同
- 國有企業(yè)全面預算管理存在的問題與對策研究
- 年度復雜精密壓鑄模具戰(zhàn)略市場規(guī)劃報告
- 方程的知識課件
- 《大數(shù)據(jù)采集與預處理》Scrapy各組件的用法 教案 -2
- 尋找春娃娃大班課件
- 2024年青海三新農(nóng)電限責任公司招聘專業(yè)技能技術(shù)人員29人(高頻重點提升專題訓練)共500題附帶答案詳解
- 二、制訂課余生活計劃(教學設(shè)計)魯科版四年級上冊綜合實踐活動
- 人工智能訓練師職業(yè)技能大賽備賽試題庫(含答案)
- ISO50001-2018能源管理體系全套資料管理手冊及程序文件1
- 2024特變電工新能源股份限公司招聘191人【重點基礎(chǔ)提升】模擬試題(共500題)附帶答案詳解
- 口語交際:講民間故事 (教學設(shè)計) 統(tǒng)編版語文五年級上冊
- 2024年新修訂公司法知識競賽題庫及答案
- 阿爾茲海默病致病機制及新型治療藥物研究進展
- 中考物理總復習知識點總結(jié)
- 部編版六年級年冊《第七單元習作 我的拿手好戲》課件
- (附答案)2024公需課《百縣千鎮(zhèn)萬村高質(zhì)量發(fā)展工程與城鄉(xiāng)區(qū)域協(xié)調(diào)發(fā)展》試題廣東公需科
評論
0/150
提交評論