How Uber Builds A Cross Data Center Replication Platform on Apache Kafka
Agenda
- Apache Kafka at Uber
- Apache Kafka pipeline & replication
- uReplicator
- Data loss detection
- Q&A

Real-Time Use Cases
- Real-time dynamic pricing: stream processing over app views and vehicle information, delivered through Apache Kafka
- Uber Eats real-time ETAs
- Fraud detection
- Driver & rider sign-ups, etc.
- ...and a bunch more

Apache Kafka - Use Cases
- General pub-sub, messaging queue
- Stream processing: AthenaX, a self-service streaming analytics platform (Apache Samza & Apache Flink)
- Database changelog transport (Cassandra, MySQL, etc.)
- Ingestion into HDFS and S3
- Logging

Data Infrastructure @ Uber
- Producers: rider app, driver app, mobile apps, API/services, databases (Cassandra, MySQL), (internal) services, etc.
- Kafka sits between producers and consumers
- Consumers: real-time analytics, alerts, and dashboards (Samza/Flink); applications and data science; analytics reporting (Vertica/Hive); ad-hoc exploration; ELK for debugging; Hadoop; AWS S3
- Example workloads: payment, surge pricing

Scale (excluding replication)
- Trillions of messages per day
- PBs of data
- Tens of thousands of topics

Agenda (section: Apache Kafka pipeline & replication)

Apache Kafka Pipeline @ Uber
- In each data center (DC1, DC2): applications → proxy client → Kafka REST proxy → regional Kafka
- uReplicator copies each regional Kafka into aggregate Kafka clusters (plus a secondary cluster)
- An offset sync service ties the aggregate clusters across data centers together
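As an illustration of the first hop of this pipeline, here is a minimal sketch of a service producing into its local regional cluster. It uses the plain Apache Kafka Java client in place of Uber's internal proxy client and REST proxy, and the bootstrap address and topic name ("trip-events") are made-up placeholders, not values from the talk.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class RegionalProduceSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical bootstrap address of the local regional cluster (DC1).
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "regional-kafka-dc1:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            // Events land in the regional cluster first; uReplicator later
            // copies them into the aggregate clusters in every data center.
            producer.send(new ProducerRecord<>("trip-events", "example-payload".getBytes()));
        }
    }
}
```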

Aggregation
- uReplicator replicates both DC1 and DC2 regional Kafka clusters into an aggregate Kafka cluster in each data center
- Every aggregate cluster therefore holds the global view

Cross-Data Center Failover
- During runtime:
  - uReplicator reports offset mappings to the offset sync service
  - The offset sync service is all-active, and the offset info is replicated across data centers
- During failover:
  - Consumers ask the offset sync service where to resume consumption, based on their last committed offsets
  - The offset sync service translates offsets between aggregate clusters

Ingestion
- The aggregate Kafka cluster in each data center (DC1, DC2) is ingested into HDFS and S3
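To make the failover path concrete, here is a minimal sketch of a consumer resuming on the surviving aggregate cluster, assuming a translation lookup like the one the offset sync service provides. The OffsetSyncService interface is a hypothetical stand-in for Uber's internal service, not a real API.

```java
import java.util.Map;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class FailoverResumeSketch {
    /** Hypothetical stand-in for the offset sync service: maps a consumer's
     *  last committed offset in the failed aggregate cluster to the
     *  equivalent offset in the surviving aggregate cluster. */
    interface OffsetSyncService {
        long translate(String topic, int partition, long committedOffset);
    }

    static void resume(KafkaConsumer<byte[], byte[]> survivingClusterConsumer,
                       OffsetSyncService offsetSync,
                       Map<TopicPartition, Long> lastCommitted) {
        survivingClusterConsumer.assign(lastCommitted.keySet());
        for (Map.Entry<TopicPartition, Long> e : lastCommitted.entrySet()) {
            TopicPartition tp = e.getKey();
            // Raw offsets are not comparable across clusters, so ask the
            // sync service where this position lands in the other cluster...
            long translated = offsetSync.translate(tp.topic(), tp.partition(), e.getValue());
            // ...and seek there before polling.
            survivingClusterConsumer.seek(tp, translated);
        }
    }
}
```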

Topic Migration
- Shown step by step across four slides (a consumer and producer attached to the old regional Kafka in DC1 move to the new regional Kafka in DC2); the runbook sketch below walks the same steps:
  1. Set up uReplicator from the new cluster to the old cluster
  2. Move the producer
  3. Move the consumer
  4. Remove uReplicator

Replication - Use Cases
- Aggregation: global view, all-active consumption
- Ingestion
- Migration: move topics between clusters/data centers
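The ordering of those migration steps is what makes the move lossless: replication from new to old is set up before producers move, so consumers still on the old cluster keep seeing every message. A sketch of the runbook follows; every method here is hypothetical glue for illustration, not a real uReplicator API.

```java
/** Sketch of the four-step topic migration; all methods are hypothetical stubs. */
public class TopicMigrationRunbook {

    void migrate(String topic, String oldCluster, String newCluster) {
        // 1. Replicate new -> old first, so anything produced to the new
        //    cluster still reaches consumers on the old cluster.
        setupUReplicatorRoute(topic, newCluster, oldCluster);
        // 2. Repoint producers at the new cluster.
        moveProducers(topic, newCluster);
        // 3. Once the old cluster's lag drains, repoint consumers too.
        moveConsumers(topic, newCluster);
        // 4. Tear down the temporary replication route.
        removeUReplicatorRoute(topic, newCluster, oldCluster);
    }

    void setupUReplicatorRoute(String topic, String src, String dst) { /* stub */ }
    void moveProducers(String topic, String cluster) { /* stub */ }
    void moveConsumers(String topic, String cluster) { /* stub */ }
    void removeUReplicatorRoute(String topic, String src, String dst) { /* stub */ }
}
```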

Agenda (section: uReplicator)

Motivation - MirrorMaker
- Pain points:
  - Expensive rebalancing
  - Difficulty adding topics
  - Possible data loss
  - Metadata sync issues

Requirements
- Stable replication
- Simple operations
- High throughput
- No data loss
- Auditing

Design - uReplicator
- Built on Apache Helix
- uReplicator controller:
  - Stable replication: assigns topic partitions to each worker process; handles topic/worker changes
  - Simple operations: handles adding/deleting topics

Design - uReplicator (worker)
- Apache Helix agent
- Dynamic SimpleConsumer
- Apache Kafka producer
- Commit after flush (see the sketch below)
- (Diagram: a worker thread holds a Helix agent and a FetcherManager; a FetcherThread with a SimpleConsumer and a leader-find thread fetches topic "testTopic" partition 0 into a LinkedBlockingQueue, which the Apache Kafka producer drains; Apache ZooKeeper and the Helix controller coordinate the workers.)
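The "commit after flush" rule is the worker's no-data-loss guarantee: source offsets are committed only after the destination producer has flushed, so a crash mid-batch causes re-reads (duplicates) rather than loss. Below is a minimal sketch of that loop using the modern Java consumer API in place of the legacy SimpleConsumer the slides mention; cluster addresses, group id, and poll interval are placeholders.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ReplicationWorkerSketch {
    public static void main(String[] args) {
        Properties src = new Properties();
        src.put("bootstrap.servers", "regional-kafka-dc1:9092"); // placeholder
        src.put("group.id", "ureplicator-sketch");               // placeholder
        src.put("enable.auto.commit", "false"); // offsets are committed manually below
        src.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        src.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        Properties dst = new Properties();
        dst.put("bootstrap.servers", "aggregate-kafka-dc1:9092"); // placeholder
        dst.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        dst.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(src);
             KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(dst)) {
            consumer.subscribe(List.of("testTopic")); // topic name from the worker diagram
            while (true) {
                ConsumerRecords<byte[], byte[]> batch = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<byte[], byte[]> r : batch) {
                    // 1-1 partition mapping (see the full-speed-phase fix later):
                    // write each record to the same partition number it came from.
                    producer.send(new ProducerRecord<>(r.topic(), r.partition(), r.key(), r.value()));
                }
                producer.flush();      // wait until the destination has acked every send
                consumer.commitSync(); // only now commit source offsets
            }
        }
    }
}
```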

Requirements (revisited): stable replication, simple operations, high throughput, no data loss, auditing

Performance Issues
- Catch-up time is too long (4+ hours to recover after a failover)
  - Full-speed phase: 2-3 hours
  - Long-tail phase: 5-6 hours

Problem: Full Speed Phase
- Destination brokers are bound by CPU

Solution 1: Increase Batch Size
- Since destination brokers are CPU-bound, raise throughput per unit of broker CPU:
  - producer.batch.size: 64 KB → 128 KB
  - producer.linger.ms: 100 → 1000
- Effect:
  - Average batch size increases: 10-22 KB → 50-90 KB
  - Compression ratio (compressed size): 40-62% → 27-35%
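In client terms, those two knobs are ordinary producer settings. A minimal sketch, assuming the numbers on the slide map directly onto batch.size and linger.ms; the compression codec is an assumption, since the slide reports compression ratios without naming one.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class BatchTuningSketch {
    static Properties tunedProducerProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "aggregate-kafka-dc1:9092"); // placeholder
        // The two changes from the slide: a larger batch ceiling, and a
        // longer wait for batches to fill before sending.
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 128 * 1024); // 64 KB -> 128 KB
        props.put(ProducerConfig.LINGER_MS_CONFIG, 1000);        // 100 ms -> 1000 ms
        // Bigger batches compress better, which is what relieves the
        // CPU-bound destination brokers (codec assumed, not from the slide).
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip");
        return props;
    }
}
```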

Solution 2: 1-1 Partition Mapping
- Round robin: a topic with N partitions fans out into N² connections, producing DoS-like traffic on the destination
- Deterministic partitioning: a topic with N partitions needs only N connections, reducing contention
- (Diagram: with round robin, MM workers 1-4 each write to all of p0-p3 on the destination; with 1-1 mapping, each source partition maps to the matching destination partition.)

Full Speed Phase Throughput
- First hour: 27.2 MB/s → 53.6 MB/s per aggregate broker (before → after)

Problem 2: Long Tail Phase
- During the full-speed phase, all workers are busy
- Then some workers catch up and become idle while the others are still busy

Solution 1: Dynamic Workload Balance
- The original partition assignment balances only the number of partitions, so heavy partitions can land on the same worker
- Workload-based assignment instead considers the total workload when a partition is added, measured by the source cluster's bytes-in rate, retrieved from Chaperone
- Rebalance dynamically and periodically whenever a worker exceeds 1.5 times the average workload (see the sketch after this section)

Solution 2: Lag-Feedback Rebalance
- Monitor topic lags and balance the workers based on lag
- Larger lag = heavier workload
- Dedicated workers for lagging topics

Dynamic Workload Rebalance
- Adjust workload periodically, every 10 minutes
- Multiple lagging topics on the same worker get spread across multiple workers

Requirements (revisited): stable replication, simple operations, high throughput, no data loss, auditing
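As a concrete reading of the 1.5× trigger above, here is a minimal sketch that flags overloaded workers given each worker's total workload (e.g. the summed bytes-in rate of its partitions); the threshold is from the slides, the function shape is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class RebalanceTriggerSketch {
    /** Returns the workers whose total workload exceeds 1.5x the average,
     *  i.e. the candidates whose partitions should be redistributed. */
    static List<String> overloadedWorkers(Map<String, Double> workloadByWorker) {
        double average = workloadByWorker.values().stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(0.0);
        double threshold = 1.5 * average;
        List<String> overloaded = new ArrayList<>();
        for (Map.Entry<String, Double> e : workloadByWorker.entrySet()) {
            if (e.getValue() > threshold) {
                overloaded.add(e.getKey());
            }
        }
        return overloaded;
    }
}
```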

More Issues
- Scalability: as the number of partitions grows, a full rebalance during a rolling restart can take up to 1 hour
- Operations: the number of deployments grows with every new cluster, along with sanity checks and the risk of loops and double routes

Design - Federated uReplicator
- Scalability: multiple routes
- Operations: deployment groups, each with one manager

Design - Federated uReplicator: uReplicator Manager
- Decides when to create a new route
- Decides how many workers each route gets
- Auto-scaling, driven by: total workload of the route; total lag in the route; total workload / expected workload per worker (see the sizing sketch below)
- Automatically adds workers to a route

Design - Federated uReplicator: uReplicator Front
- Whitelists/blacklists topics as (topic, src, dst) tuples
- Persists them in a DB
- Runs sanity checks
- Assigns topics to a uReplicator Manager

Requirements (revisited): stable replication, simple operations, high throughput, no data loss, auditing

Agenda (section: data loss detection)
- Apache Kafka at Uber
- Apache Kafka pipeline & replication
- uReplicator
- Data loss detection
- Q&A
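The Manager's sizing input "total workload / expected workload on a worker" reads as a simple quotient; here is a minimal sketch of that arithmetic, assuming the ceiling of the quotient is the target worker count (the slides do not spell out the exact formula or how lag adjusts it).

```java
public class RouteAutoScalerSketch {
    /** Target worker count for a route: total route workload divided by the
     *  expected workload one worker can handle (same units, e.g. MB/s). */
    static int workersNeeded(double totalRouteWorkload, double expectedWorkloadPerWorker) {
        if (expectedWorkloadPerWorker <= 0) {
            throw new IllegalArgumentException("expected per-worker workload must be positive");
        }
        return (int) Math.ceil(totalRouteWorkload / expectedWorkloadPerWorker);
    }

    public static void main(String[] args) {
        // Example: a route carrying 900 MB/s with workers expected to handle
        // 100 MB/s each needs 9 workers; the manager adds workers to match.
        System.out.println(workersNeeded(900.0, 100.0)); // prints 9
    }
}
```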
