![2022 數(shù)據(jù)峰會(huì) 數(shù)據(jù)庫(kù) -面向未來(lái)的數(shù)據(jù)倉(cāng)庫(kù) 理查德·溫特_第1頁(yè)](http://file4.renrendoc.com/view/b6b725464cc8128862bda2d78b59a21e/b6b725464cc8128862bda2d78b59a21e1.gif)
![2022 數(shù)據(jù)峰會(huì) 數(shù)據(jù)庫(kù) -面向未來(lái)的數(shù)據(jù)倉(cāng)庫(kù) 理查德·溫特_第2頁(yè)](http://file4.renrendoc.com/view/b6b725464cc8128862bda2d78b59a21e/b6b725464cc8128862bda2d78b59a21e2.gif)
![2022 數(shù)據(jù)峰會(huì) 數(shù)據(jù)庫(kù) -面向未來(lái)的數(shù)據(jù)倉(cāng)庫(kù) 理查德·溫特_第3頁(yè)](http://file4.renrendoc.com/view/b6b725464cc8128862bda2d78b59a21e/b6b725464cc8128862bda2d78b59a21e3.gif)
![2022 數(shù)據(jù)峰會(huì) 數(shù)據(jù)庫(kù) -面向未來(lái)的數(shù)據(jù)倉(cāng)庫(kù) 理查德·溫特_第4頁(yè)](http://file4.renrendoc.com/view/b6b725464cc8128862bda2d78b59a21e/b6b725464cc8128862bda2d78b59a21e4.gif)
![2022 數(shù)據(jù)峰會(huì) 數(shù)據(jù)庫(kù) -面向未來(lái)的數(shù)據(jù)倉(cāng)庫(kù) 理查德·溫特_第5頁(yè)](http://file4.renrendoc.com/view/b6b725464cc8128862bda2d78b59a21e/b6b725464cc8128862bda2d78b59a21e5.gif)
版權(quán)說(shuō)明:本文檔由用戶(hù)提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請(qǐng)進(jìn)行舉報(bào)或認(rèn)領(lǐng)
文檔簡(jiǎn)介
HowtoSelectYour
CloudDataWarehousePlatformStrategically
May2022
RichardWinter,CEO
WinterCorpLLC
TheExpertsinAnalyticDataManagementatScale
WC
RichardWinter
Presenter
RichardWinter,CEOandPrincipalConsultant,WinterCorp
?Careerfocusinanalyticdatamanagementatlargescale
?Pioneeringdeveloperofdatawarehousetechnologyandproducts
?Hasperformedmorethan50datawarehouseevaluations,oftenincludingbenchmarks,forleadingenterprises
?Advisertoindustryleadersinanalytics,userandvendor
?Skilledatdatawarehousestrategies,requirements,strategicdesign,engineeringanalysis,testdesignandevaluation
?Experiencedconsultantandtechnologyeducator,widelypublished
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC2
TheDataDrivenEnterpriseofthe2020s
MostCompaniesareIncreasingUseof:
1.Digital,realtimebusinessprocesses
2.Automateddecisionmaking(AI/ML)
3.Focusoncompetingwithanalytics
Alltrendsacceleratedbytheincreasingspeedofbusinesscompetition
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC3
SourceSystems
…
Orders
Report1
Products
Products
Report2
Suppliers
ReportN
Shipments
Cleanseddata,integrated
acrosstheenterprise
WeeklyLoads
TheDataWarehouseImpetus:1990s
Orders
CONSISTENT,ACCURATEREPORTING
DataWarehouse
Inventory
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC4
ANALYZING
WHYdidit
happen?
REPORTING
WHAT
happened?
A
STRATEGICINTELLIGENCE
AT
PREDICTING
WHATWILL
happen?
StagesofDataWarehouseEvolution
OPERATIONALINTELLIGENCE
OPERATIONALIZING
WHATIS
happeningnow?
ACTIVATING
MAKEithappen!
A
T
A
T
Batch
AdHoc
AnalyticsContinuousupdates, tacticalqueriesEventdriven
A
T
Source:DiagramadaptedfromBrobst&Rarey,2003 ?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.WWCC5
WC
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
LeadingEdge–200%CAGR
EB
5
4
3
2
Estimated
1
2010201420182022
Data
Warehouse
Growth
Source:WinterCorpPrimaryResearch–surveysandcustomerreports
WC6
WC
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
Key
Dimensions
ofData
Warehouse
Scale
WC7
Averagecommercialbuilding(20ksf)
BurjKhalifa,Dubai–TallestBuildingintheWorld
Average
single
family
home
(2500sf)
?2717feettall
?3.3millionsquarefeetinteriorspace
?57elevators
?62milesofpipesforplumbing(plus132milesforthefireemergencysystem)
?24kwindows
?62khpcoolingsystem
?1100xthesizeofasinglefamilyhome
?165xthesizeoftheaveragecommercialbuilding
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC8
SCALE+COMPLEXITY
Ifyouonlyrememberonethingtoday…
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC9
DataMartvsIntegratedDataWarehouse(IDW)
?Purposebuilt
?Onelargetable
?Dimensionsusuallysmall
?1-1or1-to-manyrelationships
?Commonpartitioningkeys
?Mostjoinsnotsodemanding
?User-visiblepre-joinscanbeok
StarSchema–DataMart
Store
PK
storeid
storename
street
city
state
zip
managerid
storeclass
storesize
salesstrategy
buildingtype
dateopened
dateclosed
numberofregisterscachementpopulation
SMSA
censustract
Transaction
PK
transid
transtype
date
time
productid
storeid
customerid
associateid
basketid
quantity
unitprice
discount
taxcategory
taxamount
extendedpricepaymentmethodaccountid
Product
PK
productid
namedescription
SKU
package
size
color
style
attr1-n
Customer
PK
customerid
name
street
city
state
zip
phone
maritalstatusfamilysizeagegroupincomegroupeducationallevelgenderbirthyear
attr1-n
?Manylargetableswithmultiplejoinpaths
?Many-to-manyrelationships
?Differentpartitioningkeys
?Joinsaremoredemanding
?User-visiblepre-joinsareaproblem
RelationalSchema-IDW
Organization
PK
FK1
name
street
city
state
zip
country
phone
fax
emailwebaddressindustrycodenumberofempbusinesssizestatus
startdate
enddate
orgtype
partyID
Person
PK
personID
FK1
FK2
name
street
city
state
zip
phone
gender
dob
incomelevelstatus
startdateenddatehouseholdIDpartyID
Household
PK
householdID
FK1
street
ctiy
state
zip
phone
householdsizeincomelevelpartyID
Party
PK
partyID
partytypestartdateenddate
Arrangement
PK
arrangementID
arrangementtypenamedescription
startdate
enddate
notes
Resource
PK
resourceID
FK1
resourcetypestartdataenddatenamedescriptionarrangementID
Location
PK
locationID
FK1
locationtype
name
street
city
state
zip
partyID
Product
PK
productID
FK2
productclassnamedescriptionproductstatusarrangementID
Conditions
PK
conditionID
FK1
conditiontypestartdateenddateamountarrangementID
Statement
PK
statement_id
FK1
balancepast_due_amountpastduedayseventID
Event
PK
eventID
FK1
eventtypeeventdateeventtimeeventamountaccountID
Relationship
PK
relationshipID
FK1
FK2
FK3
rel_typeaccountIDarrangementIDpartyID
Account
PK
accountID
accounttypestartdateenddateaccountstatus
orgID
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC10
IllustrativeDataWarehouseUses:FinancialServices
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC11
WC
12
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
TheDataRelationshipsinanIntegratedDataWarehouse
?Analysisofactualdatarelationshipsinanintegrateddatawarehouse,largeUScompany
?Dotsaretablesorviews
?Linesaredatarelationshipsthatactuallyoccurinqueries
?Diagramillustratesthatacombinatorialexplosionofjoinsdevelopsasthebusinessvalueofthedatarelationshipsisexploited
WC
IntegratedDataWarehouse
TYPICALCOREREQUIREMENTS
?Supportefficient(bigtable,bigtable)joinswithoutco-locationorpre-join(Toomanydifferentjoinsarerequired)
?Oftenlargerscale
?Supportcomplexqueriesalongavarietyofjoinpaths
?Effectivecomplexqueryoptimization
?Efficient,fasttacticalquery
?Managecomplex,mixedworkloads
?Supportmanyconcurrentoperations
?Supportupdatesthroughoutthedayaswellasanynightlybatchruns
?High,ifnotcontinuous,availabilityofup-to-datedata
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC13
JoinPerformanceatScale
?Challengingproblemiswhenthesetstobematchedarelarge
?Costandtimearemanytimesgreaterwhen:
—Techniquesarenotscalable
—Intermediateresultsspilltodisk
?Customerscanbemisledwhenthesamequeriesworkfineatsmallerscale
?Dominatestheperformancepictureinmanydatawarehouses
?Allpopulardatawarehouseplatformsperformsomejoinswell
?Manyperformthelargerandmorecomplexjoinspoorly
?Differencebetweenthebestandtheworstisoftenverylarge
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC14
JoinTestDatabase
?TPC-Hbenchmarkdatabase
?ScaleFactors:10,100,1000,10000
?AtSF1databaseis1GB
TableName
RowCount
RowSize
(Bytes)
EstimatedTableSize(MB,Uncompressed)
LineItem
6,000,000
121
725
Order
1,500,00
110
165
Part
200,00
120
24
Part-Supplier
800,000
143
114
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC15
Orders
X4
6millionrows
660MB
Part-Supplier
800Krows
114MB
N
Part
200Krows
24MB
4
N
N
1
1
DatabaseQuery–BigJoin
AtSF1,resultingjoinisatablewith:
?24millionrows
?450bytesperrow
?≈10.8GBtotalsize,uncompressed
Allsizesincreaseproportionallywithscalefactor
Line
Item
6millionrows
725MB
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC16
QueryPerformanceandCostvsDataSize
?Runonsmallvirtualdatawarehouse
?Twoservers,eachwith8virtualprocessors
DataSize
ResponseTime
(Seconds)
OnDemand
Cost
Comparedto
LinearGrowth
10GB
156
$0.09
-
100GB
1626
$0.90
104%
1TB
30,656
$17
196%
10TB
736,072
$409
454%
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC17
$17at1TB
(196%ofprojectedcost)
QueryCostisWorseThanLinearwithDataSize
POPULARCLOUDDATAWAREHOUSESERVICE
QueryCost($)
$409at10TB
Source:WinterCorpCloudDatabaseLabTestingonGCP
400
(472%ofprojectedcost)
320
240
ActualProjected
160
$0.90at100GB
Shouldbe$90at10TBGB
80
DataSize(GB)
100
1000
500
Shouldbe
$9at1TB
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC18
BigJoinExperiment:Interpretation
?Forthesystemtested,causeoftheperformanceproblemwasthatintermediatequeryresultsweresignificantlylargerthanSSDperserver
?Largeintermediateresultsoccurwith:
—Joinofmultiplebigtables—commonintheintegrateddatawarehouse
—Self-joinofalargetable—commoninpathanalytics,networkanalytics,IoT
?Specificquerypatternoccursinpractice,thoughinfrequently
?Sameproblemoccursinnconcurrentquerieseachneedcreate(1/n)intermediateresults—canoccurinawiderangeofusecases
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC19
15
?Independentconsultant
?Differentrequirement
ProductB
12
mightfavoradifferentproduct
9
WC
20
AnnualCloudChargesfortheSameDataWarehouse
WorkloadwithThreePopularCloudDWEngines
?Realcustomerdatabase&workload
$Millions
ProductA
6
Nooneproductis
alwaysbest!!!
ProductC
3
Source:BEZNextWhitePaper-WhichPlatformIsBestforYourCloudDataWarehouse
/#readthepaper
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC
DataWarehousingintheCloud
THEPROMISE
—Agility
—Elasticity
—Cost
—Scalability
—WorkloadManagement
Thesub-text:don’tworry,moreresourcesarealwaysavailable
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC21
DataWarehousingintheCloud
THEREALITY
?Datawarehouseengineeringchallengesarestillfundamental—Performance
—Cost
—DataAvailability
majorriskareas
hardtodeliveratenterprisescaleandcomplexity
?Resourcesnotfree
?Doublingtheresourcesoftendoesn’tdoubletheperformanceorthroughput
?Scalerisesfast
DatawarehouseproductsdiffergreatlyinperformanceandcostandtodaywearefocusingonBIGdifferences(usually>3x)
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC22
WWCC
MostWidelyUsedCloudDataWarehousePlatforms
Vendor
AmazonWeb
Services
(AWS)
Cloudera
Microsoft
Oracle
Snowflake
Teradata
Product/Service
Redshift
ClouderaData
Warehouse
AzureSynapseAnalytics
BigQuery
AutonomousDataWarehouse(ADW)
SnowflakeDataPlatform
Vantage
CloudAvailability
AWS
AWS,Azure,GCP
Azure
GCP
OracleCloud(AWSNotAs-a-Service)
AWS,Azure,GCP
AWS
GCPAnnounced
MicrosoftAzure
TeradataIntellicloud
OnPremisesAvailability
No
Yes,onHadoop
MicrosoftSQLServer
(Similarbutnotidentical)
No
Yes
Oracle
Cloud@Customer
No
Yes,optionallywithconsumptionpricing
Serverless
No
No
No
Yes
Twomodes:dedicatedandshared(~serverless)
ServerT-ShirtSizes
No
PricingModel
Default:ondemand
pricingbythesecond;
reservedinstance
discount(1-3yearterm);
payseparatelyfordata
storage
Priceperhourperserver
forsoftware;customer
payscloudprovider
separatelyforcompute
andstorage
Separatecomputeand
storage.Payfordata
stored,withflexible
scaling.Payforthe
computelevel,with
optiontopausecompute
entirely.Monthlypricing
(with3yearterm)
Default:ondemandpricing-
payperquerybasedondata
scanned;flatratepricingfor
reserved"queryslots"(a
blendedresourceunit);pay
separatelyfordatastorage
Unitpricingfor"shared"
publiccloudusefor
computeandstorage;
dedicatedhardwarewith
licenseportability
availableinpubliccloud
andonpremise;private
cloudonpremise
availableonsubscription
basis.
Separationofcompute
andstorage;"Snowflake
onDemand"persecond
pricing;reservedcapacity
availableonrequest;3
editions
Licenseportability;
consumptionpricing--
payonlyforcompleted
queries(cloudandon
prem);customernot
chargedforoverhead
operationssuchas
statistics;alsooffer
capacitypricing;
?2020,2021WinterCorpLLC,TyngsboroMA
23
WC
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
The
Datanolongerneedstobein
onecentralstore:cloud,onprem,multi-cloud,allofabove
Modern
Data
Warehouse
WC24
TheModernAnalyticDataPlatform
DataLake
Alltypesofdata
including“unstructured”
(largevideo,scans,etc.)
Managesalltypesofdataforanalyticuseincludingdatathatis/has:
?largeandvariableinsize
?relatively“unstructured”
?lowervaluedensity
?lessshareduse
?variedlevelsofcuration
?colder
Data&Requests
Data&Requests
Data&requestsflowinboth
directions
Boundariesareunclearand
changing
Maybeitwillsoonbeone
“dataplatform”–sometimes
positionedasthe“datacloud”
DataWarehouse
Manytypesofdata,mosthassomestructure
Managesmanytypesof
dataforanalyticusethat
ismost:
?intensivelyused
?widelyshared
?highlycurated
?integrated
?valuable
DataLakehousehassomecharacteristicsofbothlakeandwarehouse
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC25
Industry
Trends
?QuantifiedRequirementsare:
—DatabaseMacro
structure
Projected
Growth**
—Scale
—Workload(e.g.,queryclassesandfrequencies)
—Userpopulation
—ServiceLevels
Estimateinranges,but
estimate!!
DefiningQuantifiedArchitecturalRequirements
Business**Interests/Vision
CurrentNeeds
Projected
Requirements
(Quantified)
?2019-2022WinterCorpLLC,Tyngsboro,MA.AllRightsReserved.
WC26
DoaQuantifiedEvaluation,withMeasurement
WorkingfromyourEstimateofMacro-Requirements
1.Canthecandidateplatformsperformatyourtargetlevelsofscaleandcomplexity?
?Notonlyfeatures,butanengineeringassessment
?Requireproductionreferenceswithsimilarmacro-structureandscale
2.Criti
溫馨提示
- 1. 本站所有資源如無(wú)特殊說(shuō)明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請(qǐng)下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請(qǐng)聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶(hù)所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁(yè)內(nèi)容里面會(huì)有圖紙預(yù)覽,若沒(méi)有圖紙預(yù)覽就沒(méi)有圖紙。
- 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
- 5. 人人文庫(kù)網(wǎng)僅提供信息存儲(chǔ)空間,僅對(duì)用戶(hù)上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對(duì)用戶(hù)上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對(duì)任何下載內(nèi)容負(fù)責(zé)。
- 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請(qǐng)與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時(shí)也不承擔(dān)用戶(hù)因使用這些下載資源對(duì)自己和他人造成任何形式的傷害或損失。
最新文檔
- 2025年度花卉基質(zhì)研發(fā)成果轉(zhuǎn)化應(yīng)用合同
- 2025年度國(guó)際物流樞紐國(guó)際公路貨物運(yùn)輸合同模板
- 二零二五年度成品油公路運(yùn)輸安全防護(hù)設(shè)施采購(gòu)合同4篇
- 二零二四年度企業(yè)專(zhuān)用拖車(chē)運(yùn)輸合同3篇
- 二零二四年城市停車(chē)場(chǎng)停車(chē)場(chǎng)租賃與增值服務(wù)合同3篇
- 2025年綠色出行合法車(chē)牌租賃合同范本
- 2025年新型環(huán)保化學(xué)產(chǎn)品代理分銷(xiāo)合同
- 2025合同模板動(dòng)產(chǎn)質(zhì)押合同范本
- 2025年度鍋爐房智能化改造工程承包合同
- 2025年度股權(quán)激勵(lì)計(jì)劃合同范本(適用于中小企業(yè))
- 人教版PEP版小學(xué)英語(yǔ)三年級(jí)下冊(cè)Unit 4 Healthy food Part A課件
- 2024年陜西省中考道德與法治真題(A卷)(原卷版)
- (2024)湖北省公務(wù)員考試《行測(cè)》真題及答案解析
- 中小學(xué)校食品安全與膳食經(jīng)費(fèi)管理工作指引
- 電商平臺(tái)客服人員績(jī)效考核手冊(cè)
- 04S519小型排水構(gòu)筑物(含隔油池)圖集
- YB∕T 4146-2016 高碳鉻軸承鋼無(wú)縫鋼管
- 多圖中華民族共同體概論課件第十三講先鋒隊(duì)與中華民族獨(dú)立解放(1919-1949)根據(jù)高等教育出版社教材制作
- 高考英語(yǔ)單詞3500(亂序版)
- 《社區(qū)康復(fù)》課件-第五章 脊髓損傷患者的社區(qū)康復(fù)實(shí)踐
評(píng)論
0/150
提交評(píng)論