Serengeti: Virtualize Your Big Data Applications (VMware) Slides


Serengeti: Virtualize Your Big Data Applications
藺永華, VMware, Inc.
© 2009 VMware Inc. All rights reserved

Agenda
- Today's big data system
- Why virtualize Hadoop?
- Serengeti introduction
- Common questions about virtualization
- Serengeti solution
- Deep insight into Serengeti
- Summary
- Q&A

Today's Big Data System
[Diagram: ETL; unstructured data (HDFS); real-time structured database; big SQL; data-parallel batch processing; real-time streams; real-time processing (S4, Storm); analytics]

Challenges of Using Hadoop on Physical Infrastructure

Deployment
- Difficult to deploy: it takes several people several days or even months
- Difficult to tune cluster performance

Low efficiency
- Hadoop clusters are typically not 100% utilized across all hardware resources
- Difficult to share resources safely between different workloads

Single point of failure
- Single point of failure for the NameNode and JobTracker
- No HA for Hive, HCatalog, etc.

Why Virtualize Hadoop? Get Your Hadoop Cluster in Minutes
- 1/1000 of the human effort, with the least Hadoop operations knowledge required
- Manual process, costing days: server preparation, OS installation, network configuration, Hadoop installation and configuration
- Fully automated process: 10 minutes to get a Hadoop/HBase cluster from scratch, automated by Serengeti on vSphere with best practices

Why Virtualize Hadoop? Consolidate Sprawling Clusters
- Cluster sprawl: single-purpose clusters for various business applications (Hadoop Dev, Hadoop Prod, HBase, Finance Hadoop, Portal Hadoop) lead to cluster sprawl
- Cluster consolidation: clusters share servers with strong isolation on one virtualization platform
- Simplify: single hardware infrastructure, unified operations
- Optimize: shared resources give higher utilization; elastic resources give faster on-demand access
- 30% CAPEX down

Why Virtualize Hadoop? Utilize All Your Resources to Solve the Priority Problem
- 50%+ of resources are sitting idle while a high-priority job is burning up its cluster
- Utilize all resources from the pool on demand
- Dynamic elastic scaling on a shared resource pool
- Faster to get analytic results

vSphere High Availability (HA): Protection Against Unplanned Downtime

Overview
- Protection against host and OS failures
- Automatic failure detection (host, guest OS)
- Automatic virtual machine restart in minutes, on any available host in the cluster
- OS- and application-independent; does not require complex configuration changes
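As a rough illustration of the "automatic restart on any available host" idea, the following toy model reassigns the VMs of a failed host to surviving hosts with free capacity. It is only a sketch under stated assumptions: the function name, the slot-based capacity model and the "most free slots first" rule are hypothetical and are not how vSphere HA actually selects hosts.

```python
def restart_vms(placement, failed_host, capacity):
    """Return a new VM -> host map with the failed host's VMs restarted elsewhere.

    placement: dict of VM name -> host name (every host must appear in capacity)
    capacity:  dict of host name -> maximum number of VMs the host can run
    """
    # Count how many VMs each surviving host already runs.
    load = {host: 0 for host in capacity if host != failed_host}
    if not load:
        raise RuntimeError("no surviving hosts")
    for vm, host in placement.items():
        if host != failed_host:
            load[host] += 1

    new_placement = dict(placement)
    for vm, host in sorted(placement.items()):
        if host != failed_host:
            continue
        # Hypothetical policy: pick the surviving host with the most free slots.
        target = max(load, key=lambda h: capacity[h] - load[h])
        if capacity[target] - load[target] <= 0:
            raise RuntimeError(f"no capacity left to restart {vm}")
        new_placement[vm] = target
        load[target] += 1
    return new_placement


placement = {"namenode": "esx1", "jobtracker": "esx1", "worker1": "esx2"}
capacity = {"esx1": 2, "esx2": 2, "esx3": 2}
after_failure = restart_vms(placement, "esx1", capacity)
print(after_failure)
```

The point of the sketch is that the restart logic never looks inside the VM, which is what makes this kind of protection independent of the guest OS and the application.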

High Availability for the Hadoop Stack
[Diagram: the Hadoop stack and its management servers:
- HDFS (Hadoop Distributed File System)
- HBase (key-value store)
- MapReduce (job scheduling/execution system)
- Pig (data flow), Hive (SQL), HCatalog
- Zookeeper (coordination)
- BI reporting, ETL tools, RDBMS
- Management servers: Jobtracker, Namenode, Hive MetaDB, HCatalog MDB]

vSphere Fault Tolerance (FT) Provides Continuous Protection

Overview
- Single identical VMs running in lockstep on separate hosts
- Zero downtime, zero data loss failover for all virtual machines in case of hardware failures
- Integrated with VMware HA/DRS
- No complex clustering or specialized hardware required
- Single common mechanism for all applications and operating systems

Zero downtime for the NameNode, JobTracker and other components in Hadoop clusters.
[Diagram: application/OS VMs on VMware ESX hosts, protected by HA and FT]

Serengeti: Easy and Rapid Deployment and Management
- Open source project launched in June 2012; 0.8 released in Apr. and 0.9 to be released in Jun.
- Toolkit that leverages virtualization to simplify Hadoop deployment and operations
  - Deploy a cluster in minutes, fully automated
  - Customize Hadoop and HBase clusters
  - Automated cluster operation
- Comes with ecosystem components
- Supports all popular Hadoop distributions

Demo: 10 Minutes to a Hadoop Cluster with Serengeti

Common Questions About Virtualization
- Local disk: Can local disk be used in a virtualization environment?
- Flexibility and scalability: How to flexibly schedule resources between clusters and the different applications mentioned above?
- Data stability: In a virtual environment, how can we distribute data across hosts and racks?
- Data locality: Hadoop schedules compute tasks near the data to reduce network IO for data R/W. Can a virtual environment get the same result?
- Performance: How about the performance in a virtual environment?

Can I use local disk easily?

Serengeti: Extend Virtual Storage Architecture to Include Local Disk
- Shared storage (SAN or NAS): easy to provision, automated cluster rebalancing
- Hybrid storage: SAN for boot images and other workloads; local disk for Hadoop HDFS
[Diagram: hosts running Hadoop VMs alongside other VMs]

How to flexibly scale in/scale out? How to flexibly schedule resources between clusters and different applications?

Evolution of Hadoop on VMs: Data/Compute Separation
- Current Hadoop: combined storage/compute on each slave node
- Hadoop in VM: VM lifecycle determined by the Datanode; limited elasticity
- Separate storage: separate compute from data; remove the elasticity constraint imposed by the Datanode; elastic compute; raise utilization
- Separate compute clusters: separate virtual compute; compute cluster per tenant (T1, T2); stronger VM-grade security and resource isolation
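The reason data/compute separation removes the elasticity constraint can be shown with a small model: compute VMs hold no HDFS blocks, so their count can change freely while the data nodes stay put. This is a minimal sketch; the class and method names are hypothetical and do not come from Serengeti.

```python
from dataclasses import dataclass, field


@dataclass
class SeparatedCluster:
    """Toy model of a data/compute separated cluster (hypothetical names)."""
    datanodes: int                  # storage VMs: kept stable because they hold HDFS blocks
    compute_nodes: int              # stateless compute-only VMs
    history: list = field(default_factory=list)

    def resize_compute(self, target: int) -> int:
        """Set the number of compute VMs and return the change; data nodes are untouched."""
        if target < 0:
            raise ValueError("target must be non-negative")
        delta = target - self.compute_nodes
        self.compute_nodes = target
        self.history.append(delta)
        return delta


cluster = SeparatedCluster(datanodes=3, compute_nodes=2)
cluster.resize_compute(6)   # scale out while a high-priority job runs
cluster.resize_compute(1)   # scale in afterwards to free the shared pool
print(cluster.datanodes, cluster.compute_nodes, cluster.history)
```

In a combined cluster the same scale-in would remove data nodes as well, forcing block re-replication, which is why its elasticity is limited.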

Serengeti Node Scale Out/Scale In
[Diagram: NameNode and JobTracker, with hosts each running data (D) and compute (C) nodes]

Serengeti Ballooning Enhancement for Java Application
[Diagram: JVM, guest OS, host]

How to keep data stability? How to access data locally if the data node and compute node are located in different VMs?

Distributed and Data/Compute Associated Placement
[Diagram: three layouts spread across hosts in Rack1 and Rack2:
- Datanode and tasktracker combined cluster: a master host plus worker hosts, each with a data node and a tasktracker
- Data/compute separated cluster: a master host plus hosts that each run a data node and separate tasktrackers
- Compute-only clusters: compute-only cluster 1 and compute-only cluster 2, each with a job tracker, on top of one HDFS cluster with its name node]
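The two placement ideas named above can be sketched in a few lines: "distributed" spreads data nodes across racks and hosts, and "associated" puts compute nodes on hosts that already run a data node so reads can stay local. This is an illustrative sketch only; the function names and the round-robin policy are assumptions, not Serengeti's placement algorithm.

```python
from itertools import cycle


def place_datanodes(hosts_by_rack, count):
    """Spread data nodes round-robin across racks, at most one per host."""
    queues = {rack: list(hosts) for rack, hosts in hosts_by_rack.items()}
    if count > sum(len(hosts) for hosts in queues.values()):
        raise ValueError("not enough hosts for the requested data nodes")
    placement = []
    racks = cycle(sorted(queues))
    while len(placement) < count:
        rack = next(racks)
        if queues[rack]:                      # skip racks with no free host left
            placement.append((rack, queues[rack].pop(0)))
    return placement


def place_compute(datanode_placement, count):
    """Associate each compute node with a host that already runs a data node."""
    if count and not datanode_placement:
        raise ValueError("no data node hosts to associate with")
    hosts = cycle(datanode_placement)
    return [next(hosts) for _ in range(count)]


hosts_by_rack = {"rack1": ["host1", "host2"], "rack2": ["host3", "host4"]}
data_nodes = place_datanodes(hosts_by_rack, 3)
compute_nodes = place_compute(data_nodes, 4)
print(data_nodes)
print(compute_nodes)
```

Spreading data nodes over racks protects replicas from a single rack or host failure, while association keeps each compute node next to some of the data it reads.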

Hadoop Topology Changes for Virtualization
Hadoop Topology Awareness: Serengeti HVE
[Diagram: two topology trees. The first has the levels D1-D2, R1-R4 and H1-H12; the second inserts a level N1-N8 between the R and H levels.]

Hadoop Virtualization Extensions for Topology (HVE)
- Hadoop Common: Hadoop network topology extension
- HDFS: replica placement policy extension, replica choosing policy extension, replica removal policy extension, balancer policy extension
- MapReduce: task scheduling policy extension
- JIRAs: HADOOP-8468 (umbrella JIRA), HADOOP-8469, HADOOP-8470, HADOOP-8472, HDFS-3495, HDFS-3498, MAPREDUCE-4309, MAPREDUCE-4310
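To make the effect of the extra topology level concrete, the sketch below computes a Hadoop-style network distance: the number of hops from each node up to their closest common ancestor, summed. The path strings and the function name are hypothetical; the four-level layout (data center, rack, node group, node) follows the HVE idea of inserting a node group level between rack and node.

```python
def distance(path_a: str, path_b: str) -> int:
    """Hops from each node up to the closest common ancestor, summed."""
    a = path_a.strip("/").split("/")
    b = path_b.strip("/").split("/")
    if len(a) != len(b):
        raise ValueError("paths must have the same depth")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)


# Hypothetical paths of the form /datacenter/rack/nodegroup/node
same_group = distance("/d1/r1/n1/vm1", "/d1/r1/n1/vm2")
same_rack = distance("/d1/r1/n1/vm1", "/d1/r1/n2/vm3")
other_rack = distance("/d1/r1/n1/vm1", "/d1/r2/n3/vm4")
print(same_group, same_rack, other_rack)
```

With only three levels, two VMs on the same physical host and two VMs on different hosts in one rack look equally close; the node group level lets replica placement and task scheduling policies tell them apart.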

Is there significant performance degradation in a virtualization environment? Is there any performance data?

Virtualized Hadoop Performance
[Chart: native versus virtual platforms, hosts, disks/host]

Serengeti Architecture Diagram
[Diagram components:
- Clients: UI client (Flex UI) and CLI client (Spring Shell), calling the REST API
- Serengeti web service: Spring Batch steps (VM placement calculation, VM provision step, software management step, VHM step, update MetaDB step), Hibernate/DAO, vPostgres, VC adapter
- Ironfan service (Thrift service, Ironfan progress report), RabbitMQ
- Chef server (REST API, cookbooks), package repository
- VM runtime manager
- Virtualization platform: vCenter and hosts running Hadoop nodes with a Chef client]

Customizing Your Hadoop/HBase Cluster with Serengeti
- Choice of distros
- Storage configuration: choice of shared storage or local disk
- Resource configuration
- High availability option
- Number of nodes

    …
    "distro": "apache",
    "groups": [
      {
        "name": "master",
        "roles": [
          "hadoop_namenode",
          "hadoop_jobtracker"
        ],
        "storage": {
          "type": "SHARED",
          "sizeGB": 20
        },
        "instance_type": "MEDIUM",
        "instance_num": 1,
        "ha": true
      },
      {
        "name": "worker",
        "roles": [
          "hadoop_datanode",
          "hadoop_tasktracker"
        ],
        "instance_type": "SMALL",
        "instance_num": 5,
        "ha": false
    …
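A spec of this shape is easy to inspect before submitting it. The sketch below parses a closed-off copy of the excerpt and reports the distro, the total node count and the groups that request HA. The field names are taken from the slide; the closing brackets, the `SPEC` constant and the `summarize` helper are assumptions added for illustration, since the slide elides the rest of the file.

```python
import json

# Closed-off copy of the slide's excerpt; the real spec file continues past
# the second group, so this is an assumed minimal version.
SPEC = """
{
  "distro": "apache",
  "groups": [
    {"name": "master",
     "roles": ["hadoop_namenode", "hadoop_jobtracker"],
     "storage": {"type": "SHARED", "sizeGB": 20},
     "instance_type": "MEDIUM", "instance_num": 1, "ha": true},
    {"name": "worker",
     "roles": ["hadoop_datanode", "hadoop_tasktracker"],
     "instance_type": "SMALL", "instance_num": 5, "ha": false}
  ]
}
"""


def summarize(spec_text):
    """Return the distro, total node count and HA-enabled group names."""
    spec = json.loads(spec_text)
    groups = spec["groups"]
    return {
        "distro": spec["distro"],
        "total_nodes": sum(group["instance_num"] for group in groups),
        "ha_groups": [group["name"] for group in groups if group.get("ha")],
    }


summary = summarize(SPEC)
print(summary)
```

Parsing the file with a strict JSON parser also catches the kind of slips that creep into hand-edited specs, such as curly quotes or an unclosed `roles` list.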

One Command to Scale Out Your Cluster with Serengeti

    > cluster resize --name <name> --nodegroup worker --instanceNum <#>

Configure/Reconfigure Hadoop with Ease by Serengeti

Modify Hadoop cluster configuration from Serengeti:
- Use the "configuration" section of the JSON spec file
- Specify Hadoop attributes in core-site.xml, hdfs-site.xml, mapred-site.xml, hadoop-env.sh, perties
- Apply the new Hadoop configuration using the edited spec file

    "configuration": {
      "hadoop": {
        "core-site.xml": {
          // check for all settings at /common/docs/r1.0.0/core-default.html
        },
        "hdfs-site.xml": {
          // check for all settings at /common/docs/r1.0.0/hdfs-default.html
        },
        "mapred-site.xml": {
          // check for all settings at /common/docs/r1.0.0/mapred-default.html
          "io.sort.mb": "300"
        },
        "hadoop-env.sh": {
          // "HADOOP_HEAPSIZE": "",
          // "HADOOP_NAMENODE_OPTS": "",
          // "HADOOP_DATANODE_OPTS": "",
    …

    > cluster config --name myHadoop --specFile /home/serengeti/myHadoop.json
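Editing the "configuration" section by hand is error-prone, so a small helper can set one attribute at a time before the spec is applied with `cluster config`. This is a sketch under assumptions: `set_hadoop_attr` is a hypothetical helper, and only the nesting shown on the slide (configuration, hadoop, file name, attribute) is relied on.

```python
import json


def set_hadoop_attr(spec, config_file, key, value):
    """Set one attribute under configuration -> hadoop -> <config_file>."""
    hadoop = spec.setdefault("configuration", {}).setdefault("hadoop", {})
    # The slide's example stores values as strings ("io.sort.mb": "300").
    hadoop.setdefault(config_file, {})[key] = str(value)
    return spec


spec = {"distro": "apache"}
set_hadoop_attr(spec, "mapred-site.xml", "io.sort.mb", 300)
set_hadoop_attr(spec, "hadoop-env.sh", "HADOOP_HEAPSIZE", 1024)
print(json.dumps(spec, indent=2))
```

The printed JSON can be saved as the spec file passed to `--specFile`; note that standard JSON does not allow the `//` comments shown on the slide, so they have to be left out of a file produced this way.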

Freedom of Choice and Open Source

Distributions
- Flexibility to choose from major distributions

    > cluster create --name myHadoop --distro apache

Community projects
- Support for multiple projects
- Open architecture to welcome industry participation
- Contributing Hadoop Virtualization Extensions (HVE) to the open source community

HDFS2 with Namenode Federation and HA

Deploy a CDH4 Hadoop cluster:
- Name Node Federation
- Name Node HA
- MapReduce v1
- HBase, Pig, Hive, and Hive Server
- CDH4 configurations
- Scale out
- Elasticity
- JobTracker HA/FT

[Diagram: Namenode group 1 and Namenode group 2, each with an active and a standby Namenode, coordinated by a Zookeeper group (ZK, ZK, ZK) and backed by a quorum-based metadata store; the Datanodes send block reports to the Namenodes]

Proactive Monitoring and Tuning with VCOPs
- Proactive monitoring through VCOPs
- Gain comprehensive visibility
- Eliminate manual processes with intelligent automation
- Proactively manage operations

VMware Brings Agility, Efficiency, and Elasticity to Big Data

Agility
- Deploy, configure and monitor Hadoop clusters on the fly
- Dynamically reconfigure Hadoop to meet changing business demands

Efficiency
- Consolidate Hadoop to achieve higher utilization
- Pool resources to allow for increased performance and priority job processing

Elasticity
- Enable full elasticity through separation of data and compute
- Scale Hadoop in/out under resource constraints

Serengeti Resources
- Download and try Serengeti
- VMware Hadoop site: /hadoop

