Modeling Analytics for Computational Storage
Veronica Lagrange
Memory Solutions Lab
Samsung Semiconductor, Inc.
San Jose, U.S.A.
veronica.l@
Harry (Huan) Li
Memory Solutions Lab
Samsung Semiconductor, Inc.
San Jose, U.S.A.
harry.li@
Anahita Shayesteh
Memory Solutions Lab
Samsung Semiconductor, Inc.
San Jose, U.S.A.
Abstract
Next generation flash storage will be armed with a substantial amount of computing power. In this
paper, we investigate opportunities to utilize this computational capability to optimize Online Analytical
Processing (OLAP) applications. We have directed our analysis at the performance of a subset of TPC-DS
queries using Apache Hadoop clusters and two database engines, Apache SPARK-SQL and Presto¹. We
model the expected speed-up achieved by offloading a few operations that are executed first within
most SQL plans. Offloading these operations requires minimal cooperation from the database engine,
and no changes to the existing plan. We show that the speed-up achieved varies significantly among
queries and between engines, and that the queries benefiting the most are I/O heavy with high selectivity,
of the “needle in the haystack” variety. Our main contribution is estimating the speed-up anticipated
from pushing the execution of a few key SQL building blocks (scan, filter, and project operations) to
computational storage when using read-optimized, columnar Apache Parquet format files².
CCS Concepts
• Computing methodologies → Modeling and simulation → Model development and analysis → Model verification and validation;
• Hardware → Communication hardware, interfaces and storage → External storage;
• Information systems → Data management systems → Database management system engines → Database query processing → Query planning;
• Information systems → Data management systems → Database management system engines → Online analytical processing engines;
¹ Presto is a registered trademark of Facebook, Inc.
² An earlier version of this report will appear in the Proceedings of ICPE 2020.
Keywords
Columnar Database, Parquet, SQL, Smart Storage, acceleration, offloading, TPC-DS, Spark, Presto, OLAP
1 Introduction
Current developments in “big data” storage solutions are geared towards moving data processing closer to
where the data resides, reducing unnecessary movement and speeding up data processing considerably.
Computational storage is an emerging trend where a comparatively large amount of data processing
occurs inside the storage layer. Examples of new devices exposing flash storage internal computing
power include Samsung’s SmartSSD [1], NGD Systems [2], and ScaleFlux [3]. This new functionality signals
performance improvement opportunities for I/O heavy workloads containing operations amenable to
being completed near the storage source. One of the most critical types of database analytics – OLAP –
exemplifies this type of opportunity well. It is typically very I/O intensive and contains quite a few building
blocks that may be seamlessly moved to, or executed by, a computational storage device.
Offloading is not a new concept. Network processors, GPUs and, recently, machine learning specialized
processors are widely used to accelerate specific compute kernels while freeing CPU resources. We will
show that the offloading of many more time-consuming operations from the host CPU to storage improves
both workload performance and system efficiency. The immediate benefit, of course, is a sizeable decrease
in I/O volume. This reduction in I/O leads to less host resource utilization, which not only improves
performance of individual queries, but also increases server capacity. Besides database operations, other
frequent operations that can be executed near the storage device include encryption and compression.
Database analytics workloads are especially read-intensive. It is not uncommon for I/O reads to take 90%
or more of the total execution time. Offloading some of that to storage reduces I/O bandwidth along
with other host resource usage, and may improve performance considerably. Furthermore, SSDs have an
internal bandwidth that is much higher than that which is exposed to the host computer through existing
channels (SAS, SATA, PCI-E, etc.) [4], which means that computational storage has a large amount of
untapped potential to exploit.
This paper discusses the expected performance benefits of offloading some important basic database
operations – namely Scan, Filter and Project – to computational storage. We evaluate the performance
estimate model using the TPC-DS workload and two database engines running on Hadoop clusters:
SPARK-SQL and Presto.
This paper is organized as follows: after covering previous computational storage database offloading
work, we explain the OLAP workload selection and the configuration of our two clusters. In Section
4 we dive into TPC-DS characteristics and examine the overall performance from running on the two
Hadoop clusters, which have been the focus of our experimentation. In Section 5, we explain our modeling
methodologies, and in Section 6 we describe and analyze results from that modeling. Specifically, we
show how a substantial speed-up from computational storage optimization can depend on multiple
factors. Finally, we briefly discuss other SQL building blocks amenable to computational storage
pushdowns, and conclude.
2 Previous Work
Most previous work on pushing SQL functions down to computational storage concentrates on specific
functions of a specific database engine. Summarizer [5] modifies the existing NVMe command interface
to implement four operations: initialize variables or set queries; read data and execute computation; read
data and filter (the selection case); and transfer the output results to the host. From their case
studies, using 3 TPC-H queries and a very small scale factor (100MB – 0.1SF), we determined that they
could do similarity joins as well. They compare different degrees of computation offloading for these three
queries. The authors show that somewhat complex computations can be carried out near storage, and
briefly discuss the data integration problem: how to combine data from different formats and sources.
They concentrate on one specific integration problem, similarity join, and describe the heuristics they use.
Left unanswered is the bigger issue of how to integrate truly distinct formats.
YourSQL [6] is based on MariaDB. YourSQL allows complex query operations to be offloaded to a smart
SSD in the form of an ISC task. That paper devotes most of its discussion to optimizer heuristics.
One very interesting observation from the authors’ performance analysis is that while a typical SQL
application – OLTP or OLAP – cannot exhaust NVMe bandwidth, its near-storage implementation can.
Biscuit [7] is what YourSQL uses to enable its computational storage operations. It provides the user
application with C++ APIs. The user’s SSD-side C++ program with Biscuit APIs, called an SSDlet, is loaded
in the device. A host-side program invokes and coordinates execution of the SSDlet tasks using libsisc;
communication is done by linking input and output ports to specific tasks. Here they also claim that the
APIs used to access files are nearly identical to standard libraries.
ExtraV [8] is IBM’s effort at computational storage for graph processing based on their CAPRI [9]. This
paper describes an FPGA prototype that executes common graph traversal functions near the device. It
works like virtual memory for graph applications, as it provides the host with the illusion that the entire
graph lives in memory, while it is actually partly stored and compressed in an SSD. The authors have stated
that graph processing is mostly done in memory, either in single servers or clusters, and that it cannot be
done efficiently when graphs grow beyond the available memory.
PG-Strom [10] is an accelerator for PostgreSQL that offloads part of the SQL workload to a GPU. It supports
Joins and Aggregates. However, at the time of that publication [10], all data fed to the GPU came from
main memory (not storage).
Netezza was the first successful product to use FPGAs as computational storage computing accelerators
for analytics data engines. It does not require any software installation or tuning; it is plug and play.
The Netezza database engine is based on Postgres [11], and implements four functions in its FPGA engine:
Compress, Project, Restrict and Visibility. Francisco [12] claims that Netezza’s engine decompresses
data at wire speed. Project and Restrict operations filter out columns and rows, respectively, based on
the parameters in the SELECT and WHERE clauses of a query. The Netezza Visibility engine is focused on
database integrity: it filters out rows that should not be seen by the query, such as any rows
being inserted by a transaction that has not yet committed.
Computational storage has also attracted interest beyond SQL and database applications. For example,
REGISTOR [24] is an FPGA platform applying regex search, on-the-fly, to any file being transferred from an
SSD to the host; INSIDER [25], also an FPGA-based drive controller, exposes a virtual file system with
embedded programmability, allowing programmers to push down operations customized to the application’s
specific needs.
3 Workload and Setup
Here, we explain the TPC-DS benchmark, as well as the two cluster configurations used in the
experiments described. Moreover, we describe the two database engines (SPARK-SQL and Presto), and explain
the rationale behind using the Parquet file format to offload SQL operations to computational storage.
3.1 TPC-DS
“The TPC Benchmark DS (TPC-DS) is a decision support benchmark that models several generally
applicable aspects of a decision support system” [13].
TPC-DS contains 24 tables, organized as a snowflake schema. It contains 6 very large FACT tables, and
many small DIMENSION tables. Furthermore, TPC-DS comprises 99 queries, each one representing
a different business question. So, even though this is an artificial benchmark, it tries to mirror real-life
applications. The schema is scalable, with the smallest being 1GB and the largest 100TB. The 1GB dataset is
used for QA only. Performance is measured in Queries per Hour @ Scale Factor (QphDS@SF), and must
include multiple tests (pertaining to power, throughput, and data maintenance). In this study, we
consider a subset of the power test. For a more detailed explanation of the TPC-DS benchmark, we refer the
reader to [14].
TPC-DS has been around since 2007, but did not catch on until recently, after a major rewrite, with
the first published official report dated March 2018 (Cisco) [15]. As of January 2020, there are only six
official reports published. Nonetheless, subsets of TPC-DS are heavily used informally by the industry to
demonstrate up-and-coming trends [16][17]. TPC-DS is one of many Transaction Processing Performance
Council (TPC) benchmarks [18], and as such covers enough general OLAP cases to be useful to
practitioners.
Because FACT tables are orders of magnitude larger than DIMENSION tables, we will gravitate towards
queries that are “FACT table Scan heavy,” as opposed to queries that are “DIMENSION table Scan heavy.”
3.2 Test Configuration
Two clusters are used in this paper, and since they are configured to run SPARK-SQL and Presto, we refer
to them simply as the SPARK-SQL cluster and the Presto cluster. Each has eight data nodes, with
different hardware. The detailed configuration is listed in Table 1. Both engines use Apache Hive metadata
and the Parquet file format.
SPARK-SQL is Apache Spark’s high-level tool for structured data processing [19]. It is an in-memory,
distributed RDBMS that understands SQL and a Dataset API (available in Java and Scala). User
applications interface with SPARK-SQL via a command-line module, JDBC or ODBC. SPARK-SQL also
supports reading and writing data stored in an existing Apache Hive installation.
                        Spark-SQL                                   Presto
Data Node Hardware
  CPU                   Intel(R) Xeon(R) Gold 6152 CPU @ 2.10GHz    Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
  Memory                256 GB                                      256 GB to 1024 GB
  Local Storage         2x NVMe SSD 3.2 TB                          3x NVMe SSD 1.6 TB
Software Stack
  OS                    Linux Kernel 4.13.0                         Linux Kernel 4.x.x
  SPARK-SQL / Presto    2.3.0                                       0.205
  Hadoop                2.7.3                                       2.9.0
  Hive                  1.2.1000                                    1.2.2
  HDFS Replication      1 (both clusters)
TPC-DS
  Scale Factor          10000 (both clusters)
  Storage Format        Parquet (both clusters)
Table 1 - Cluster Configuration
Presto is a distributed SQL query engine designed to query large datasets distributed over one or more
heterogeneous data sources [20]. Presto provides a CLI interface and query processing (parser, planner,
scheduler), but uses data and metadata provided by other software components (Apache HBase,
Apache Hive, MySQL, etc.). Presto interacts with these other components via connectors, and this is its
claim to fame, as it is possible to combine multiple, different data sources into one query seamlessly.
There is no need to run very expensive ETL (Extract-Transform-Load) jobs on datasets in order to analyze them.
Similar to a classic massively parallel processing (MPP) DBMS [21], Presto is a distributed system that runs
on a cluster. A Presto client submits SQL statements to a master daemon coordinator. Using metadata
from connectors, the coordinator parses the query, generates the plan, and then schedules and
coordinates how it is executed by the workers. Workers get data from connectors, execute assigned tasks, and
deliver results to the client. All processing happens in memory, and data is pipelined across the network
between different stages.
Parquet is an open source columnar file format that was designed to be used with OLAP systems [22].
The Parquet file format is READ optimized, as inserts or updates can be expensive operations. It was
inspired by the “Dremel” paper [23], and is extensively used in the Hadoop ecosystem. Each
Parquet disk file contains the table’s schema. This feature resolves the issue of the device being aware
of the table metadata, a requirement for any computational storage processing. In addition, existing
Parquet readers are capable of projecting and filtering certain data types using statistics provided in
metadata. Implementing some functionality in a computational storage device is complementary to, and in
addition to, the existing pushdown capabilities of Parquet.
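The statistics-based filtering and projection mentioned above can be sketched in plain Python. This is only an illustration, not a real Parquet reader: `RowGroup`, `prune_row_groups`, `scan_filter_project`, and the sample data are hypothetical names and values introduced here. The idea is that a reader may skip any row group whose min/max statistics show that no row can satisfy the predicate, and then return only the requested columns from the remaining groups.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class RowGroup:
    """Hypothetical stand-in for one Parquet row group."""
    stats: Dict[str, Tuple[int, int]]  # column name -> (min, max) taken from metadata
    columns: Dict[str, List[int]]      # column name -> decoded column values


def prune_row_groups(groups: List[RowGroup], column: str, lower: int) -> List[RowGroup]:
    """Keep only row groups that may contain rows with `column > lower`."""
    return [g for g in groups if g.stats[column][1] > lower]


def scan_filter_project(groups: List[RowGroup], filter_col: str, lower: int,
                        project_cols: List[str]) -> List[tuple]:
    """Scan the surviving row groups, filter rows, and keep only the projected columns."""
    rows = []
    for g in prune_row_groups(groups, filter_col, lower):
        for i, value in enumerate(g.columns[filter_col]):
            if value > lower:
                rows.append(tuple(g.columns[c][i] for c in project_cols))
    return rows


# Made-up data: two row groups with a "price" and an "item" column.
groups = [
    RowGroup({"price": (1, 40)}, {"price": [1, 25, 40], "item": [10, 11, 12]}),
    RowGroup({"price": (50, 90)}, {"price": [50, 90, 70], "item": [13, 14, 15]}),
]
result = scan_filter_project(groups, "price", 60, ["item"])
print(result)
```

In this sketch the first row group is never decoded, because its maximum `price` is below the threshold. A computational storage device could apply the same kind of test before any data leaves the drive.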
TPC-DS queries were downloaded from the TPC website and results were verified against sample output
from the TPC. All queries run sequentially as a single test job. Before each query, the memory cache is
cleared. In addition,
• SPARK-SQL is restarted before every query
• Presto is restarted before the job
4 TPC-DS Characterization
In this section, we discuss the many stages (or fragments) of the execution plan generated by the query
optimizer. Next, we show SPARK-SQL and Presto query runtime results for TPC-DS. We list them side by
side to show that they behave differently, in order to illustrate and explain the different speed-ups that
one might see for the same query executed with different engines. Next, we examine the concept of Scan
Ratio, and how we use it to characterize and rank queries.
4.1 Typical TPC-DS query plan
SQL query plans are composed of basic building blocks. They form an execution tree. Each building block
typically focuses on one specific operation, and is scheduled by a SQL engine. How these building blocks
are assembled dictates query performance. Main building blocks include: Scan, Filter, Project, Aggregate,
Sort, Join, Merge, Union. Figure 1(A) illustrates a typical query sequence, with building blocks being
executed from top to bottom. Figure 1(B) is the building block sequence created by the SPARK-SQL planner
for TPC-DS Query 44.
(A) Generic SQL query plan (B) SPARK-SQL plan for Query 44
Figure 1 - SQL query plans
The functionality of some building blocks includes:
• Scan: Read database content from storage to compute host memory and apply any needed transformations
• Filter: Filter table rows in memory with given criteria
• Project: Select table columns in memory
• Join: Combine two tables based on given criteria
Figure 2 - TPC-DS runtime query comparison (axis: query runtime, sec.)
4.2 Performance of all queries
Figure 2 shows the runtime for all TPC-DS queries for SPARK-SQL and Presto. With the 10TB dataset,
SPARK-SQL completes 91 and Presto completes 61 queries. Both database engines store all intermediate results
in memory, and the queries that failed incurred an “out-of-memory” error. The query runtime has a wide
range, from less than a minute to many hours. We have not matched our cluster hardware configurations
for SPARK-SQL and Presto, as it is not our goal to compare the performance between them. The point of
this paper is to showcase a subset of the many different system parameters influencing the potentially
substantial speed-up afforded by computational storage devices. We demonstrate that even though
computational storage can provide impressive speed-ups, the benefits vary significantly depending on
many other parameters such as table size, selectivity, query plan, etc.
In the following sections, we will select five queries from each cluster based on system characterization
of the queries and potential offload benefits, and provide further analysis of each.
4.3 Scan Ratio
Scan Ratio is defined as the total CPU time spent on a database Scan operation, divided by the total CPU
time consumed by the query. The CPU time is reported by the query planner of the database engine. This time
is not the wall clock time and should not be confused with query runtime.
For TPC-DS queries, the Scan Ratio ranges from near 0% to ~93% on SPARK-SQL and up to nearly 100%
for Presto. In Figure 3, queries are sorted by their Scan Ratio, from left to right. Q9, with the highest Scan
Ratio, is furthest to the right. Notice that most CPU intensive queries have a small Scan Ratio, but not
all. Some complex queries, such as Q44, are both compute and I/O intensive.
A high Scan Ratio does not necessarily mean the query reads more data from storage; it only indicates that
time spent on I/O is higher relative to other query operations. For example, Query 45 has a total disk
read of ~1.3TB, yet its Scan Ratio is only 2.99%. For Query 9, which has the highest Scan Ratio of ~93%,
total disk read is only ~105GB. Although the total query runtime difference is not large (Q9, 212.36 sec.;
Q45, 176.02 sec.), the CPU cycles spent on non-I/O operations caused the Scan Ratio to be lower for Q45.
A high Scan Ratio indicates that a query is a strong candidate for computational storage optimization,
since its I/O operations are likely to be in its critical path, while a low Scan Ratio indicates that
operations other than I/O are the bottleneck.
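The Scan Ratio definition can be written as a small Python sketch. The CPU times below are hypothetical values, chosen only so that the resulting ratios match the 2.99% and ~93% quoted for Q45 and Q9; they are not measurements from the paper, and `scan_ratio` is a name introduced here for illustration.

```python
def scan_ratio(scan_cpu_time: float, total_cpu_time: float) -> float:
    """Scan Ratio = CPU time spent on Scan / total CPU time of the query."""
    if total_cpu_time <= 0:
        raise ValueError("total CPU time must be positive")
    return scan_cpu_time / total_cpu_time


# Hypothetical (scan CPU seconds, total CPU seconds) per query.
cpu_times = {
    "Q9": (930.0, 1000.0),
    "Q45": (29.9, 1000.0),
}

ratios = {query: scan_ratio(scan, total) for query, (scan, total) in cpu_times.items()}

# Rank queries by ascending Scan Ratio, the ordering used in Figure 3.
ranked = sorted(ratios, key=ratios.get)
for query in ranked:
    print(f"{query}: {ratios[query]:.2%}")
```

Because the ratio is built from CPU time rather than wall clock time or bytes read, a query that reads far more data (Q45) can still rank far below one that reads less (Q9).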
5 Offloading Model
Here, we explain how we selected each plan stage to be offloaded to computational storage, followed
by a detailed description of the model methodology used with both database engines. Notice that the
methods are somewhat different, which we chose to do in order to cover more aspects of the offloading
process.
Figure 3 - TPC-DS Scan Ratio and CPU utilization
5.1 Offloading components (or kernels)
In this section, we explore opportunities to offload operations from the host to computational storage. In
order to execute a query, data flows from the leaves of the plan to the root. Usually the leaves contain
some form of SCAN operation: table rows and columns are read in (usually from disk, unless this data
was previously cached). The SCAN operation usually includes some sort of data transformation, from
the format on disk to the one understood by the database engine. Once a table is scanned (or sometimes
while the table is being scanned), rows may be filtered or projected. The next plan steps may contain
aggregates, sorts, joins, window functions, or other advanced data transformations. Operations near the leaves
will generally be “easier” to push down to computational storage. Basic SCANs, FILTERs, and
PROJECTIONs may happen with virtually no change to the database engine query plan. More aggressive
pushdown optimizations are possible, but require the cooperation of the database engine, and refactoring of
the query plan.
For example, in Figure 1(B), we observe this pattern in both FACT table and DIMENSION table I/O. By
combining “Scan,” “Filter” and “Project” into a new building block, we can estimate the performance
benefit of offloading this new building block (“Scan/Filter”) to computational storage. Regardless, with
“Scan/Filter” offloading, the SPARK-SQL plan for Query 44 still looks the same.
5.2 SPARK-SQL model methodology
The performance estimate model for SPARK-SQL is based on how the database engine plan is executed
– in stages with dependencies. We assume there is no resource limitation on the number of stages that
can be executed concurrently.
For example, Figure 4 shows a generic query that involves 3 tables, 1 DIMENSION table and 2 FACT
tables. Stage-0 reads the content of the DIMENSION table, while reading FACT tables happens in Stage-1
and Stage-2. Then, Stage-3 and 4 sort the results from Stage-1 and 2. The results are subsequently
passed to Stage-5 for the final Join operation.
Figure 4 - Query Stage Scheduling
The first 3 stages (0, 1 and 2) include Scan/Filter/Project operations, as marked with light dot shade in Figure
4. The time spent on these operations is 1, 5 and 8 seconds respectively, and they could be offloaded to computational storage. The offloaded execution time is calculated as follows:
• Reserve 1 second for offloading-related handshaking. The reserved time is an arbitrary number.
• Assume that the Filter runtime on the device is at wire speed and can be omitted. This is an optimistic assumption that provides an upper bound for our analysis. The actual Filter runtime depends on the compute/IO capabilities of the device, and can be further improved with pre-processing in the device.
• Calculate the result data transfer time between the device and the host from the device Read bandwidth specification; in this paper, 3GB/sec has been used.
With these assumptions, the example execution time can be reduced from 18 seconds to 12 seconds
(see Figure 5).
Figure 5 - SPARK-SQL Offload Model
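The stage scheduling idea behind this example can be sketched as a critical-path calculation in Python. Only the scan durations (1, 5 and 8 seconds) and the 1-second handshake come from the text. The sort and join durations, the 1-second transfer time, and the assumption that Stage-5 also depends on Stage-0 are hypothetical, chosen so that the totals match the 18 and 12 seconds quoted above; they are not read from Figures 4 or 5.

```python
from typing import Dict, List


def finish_time(stage: str, duration: Dict[str, float], deps: Dict[str, List[str]]) -> float:
    """Earliest finish time of a stage when unlimited stages may run concurrently."""
    start = max((finish_time(d, duration, deps) for d in deps.get(stage, [])), default=0.0)
    return start + duration[stage]


def offload(duration: Dict[str, float], scan_stages: List[str],
            init: float = 1.0, transfer: float = 0.0) -> Dict[str, float]:
    """Replace each scan stage by handshake plus result transfer (Filter at wire speed)."""
    offloaded = dict(duration)
    for stage in scan_stages:
        offloaded[stage] = init + transfer
    return offloaded


# Stage dependencies: sorts follow the FACT scans, the join waits for everything (assumed).
deps = {"s3": ["s1"], "s4": ["s2"], "s5": ["s0", "s3", "s4"]}

# Scan times from the text; sort (s3, s4) and join (s5) times are hypothetical.
duration = {"s0": 1.0, "s1": 5.0, "s2": 8.0, "s3": 4.0, "s4": 4.0, "s5": 6.0}

baseline = finish_time("s5", duration, deps)
offloaded = finish_time("s5", offload(duration, ["s0", "s1", "s2"], transfer=1.0), deps)
print(baseline, offloaded)
```

The point of the sketch is that only stages on the critical path shorten the query: here the saving comes from Stage-2, while shortening Stage-0 and Stage-1 changes nothing.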
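As the most fundamental step in building the estimate model, we need to know the time spent on Scan/
Filter/Project in each SPARK-SQL query stage. Fortunately, the SPARK-SQL log file provides the
following key information (Figure 6):
• $\mathrm{MeasWClockTime}_{Stage}$: The stage wall clock runtime
• $\mathrm{ThrTime}_{Stage}$: Total execution time for the stage from all execution threads. This is not wall clock time
• $\mathrm{ThrTime}_{Scan}$: The execution time breakdown for Scan, Filter, Project
With the above information, the estimated time spent on Scan/Filter/Project can be calculated as
$$\mathrm{EstWClockTime}_{Scan} = \mathrm{MeasWClockTime}_{Stage} \times \frac{\mathrm{ThrTime}_{Scan}}{\mathrm{ThrTime}_{Stage}}$$
In addition to Scan time, we also consider the following:
$\mathrm{WClockTime}_{Init}$ - The time to initialize computational storage for offloading. We always
assume one second for the estimation calculation.
$\mathrm{WClockTime}_{Transfer}$ - The time required to transfer the results from the offloading device
back to the host. It is calculated based on the Read bandwidth of the computational device. In our model,
the filtered result is usually less than 0.5% of the unfiltered data. It would take only a fraction
of a second to read back to the host, therefore we ignore this time.
With the Parquet format, we assume that no separate Project operation is needed, so Project time is omitted.
With the above assumptions, the estimated stage runtime with offloading for SPARK-SQL is calculated as:
$$\mathrm{EstWClockTime}_{Offload} = \mathrm{WClockTime}_{Init} + \mathrm{EstWClockTime}_{NonScan}$$
where $\mathrm{EstWClockTime}_{NonScan} = \mathrm{MeasWClockTime}_{Stage} - \mathrm{EstWClockTime}_{Scan}$ is the estimated time of the remaining work in the stage.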
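The per-stage estimate can be sketched in Python as follows. The subscripts of the formulas were lost in this copy of the text, so the sketch rests on an assumed reading: the Scan/Filter/Project share of the stage wall clock time is proportional to its share of thread time, and the offloaded stage runtime is the initialization time plus whatever work remains in the stage. All function names and the stage numbers are hypothetical. The transfer example combines the ~105GB read reported for Query 9 with the 0.5% figure purely for illustration.

```python
def est_scan_wclock(meas_stage_wclock: float, thr_scan: float, thr_stage: float) -> float:
    """Scale the measured stage wall clock time by the Scan/Filter/Project share of thread time."""
    return meas_stage_wclock * thr_scan / thr_stage


def transfer_time(result_bytes: float, read_bw_bytes_per_s: float = 3e9) -> float:
    """Time to return filtered results to the host at the device Read bandwidth (3GB/sec)."""
    return result_bytes / read_bw_bytes_per_s


def est_offload_stage_wclock(meas_stage_wclock: float, thr_scan: float, thr_stage: float,
                             init: float = 1.0, transfer: float = 0.0) -> float:
    """Assumed reading: offloaded stage time = init + transfer + non-scan remainder."""
    scan = est_scan_wclock(meas_stage_wclock, thr_scan, thr_stage)
    return init + transfer + (meas_stage_wclock - scan)


# Hypothetical stage: 100 s wall clock, 4000 s of thread time, 3600 s of it in Scan/Filter/Project.
scan_estimate = est_scan_wclock(100.0, 3600.0, 4000.0)
offload_estimate = est_offload_stage_wclock(100.0, 3600.0, 4000.0)

# Illustrative transfer: 0.5% of a 105 GB read, sent back at 3 GB/s.
q9_transfer = transfer_time(105e9 * 0.005)
print(scan_estimate, offload_estimate, q9_transfer)
```

Under these assumptions the transfer term stays well below one second, which is consistent with the decision to ignore it in the model.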
Figure 6 - One SPARK-SQL Query Stage with Statistics
5.3 Presto Model methodology
To model pushdown benefits of Scan/Filter/Project operations, we create and populate smaller tables
we call “model tables.” These “model tables” contain only the rows and columns that would be selected
by a computational storage engine executing the Scan/Filter/Project operations defined by the query.
We repeat the query using the model table, and compare results against the same query using the
original tables – see Figure 7. For Presto, both original and model queries generate the same query plan.
Similar to our SPARK-SQL model, the performance difference is the upper bound of the speed-up that a
computational storage device would yield, because this model assumes that the storage device would
be capable of filtering and projecting rows and columns at wire speed. However, if we take into
consideration the higher internal flash storage bandwidth [4], this is a realistic approximation of the expected
speed-up.
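A model table of this kind could be produced with a CREATE TABLE ... AS SELECT statement, sketched below in Python. The text does not say how the model tables were actually populated, so this is an assumption. The table name `store_sales_model`, the chosen columns, the predicate, and the runtimes are hypothetical values for illustration, not results from the paper.

```python
from typing import List


def model_table_sql(model_table: str, source_table: str,
                    columns: List[str], predicate: str) -> str:
    """Build a CTAS statement keeping only the projected columns and filtered rows."""
    cols = ", ".join(columns)
    return (f"CREATE TABLE {model_table} AS "
            f"SELECT {cols} FROM {source_table} WHERE {predicate}")


def speedup(original_runtime_s: float, model_runtime_s: float) -> float:
    """Upper-bound speed-up: runtime on original tables over runtime on model tables."""
    return original_runtime_s / model_runtime_s


# Hypothetical example using a TPC-DS FACT table.
sql = model_table_sql(
    "store_sales_model",
    "store_sales",
    ["ss_item_sk", "ss_net_profit"],
    "ss_store_sk = 4",
)
print(sql)

# Hypothetical runtimes in seconds, for illustration only.
print(speedup(200.0, 50.0))
```

The query under study is then rerun with the model table substituted for the original table, and the two runtimes are compared.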
Figure 7 - Presto Offload Model.
6 Offloading Evaluation
Here, we describe in detail the query selection process, and give a high-level view of the results obtained
by the modeling of both database engines. Furthermore, we present side-by-side analysis of the
expected speed-up for a few selected queries.
6.1 The queries
In this study, we picked five queries from each configuration for deep analysis. The queries were selected
based on where they fall in the different quadrants of the Scan Ratio versus CPU utilization chart (see
Figure 8), to cover a wider range of characteristics. Because we focus on offloading Scan/Filter/Project,
we want queries that are I/O intensive and show high selectivity when filtering and projecting FACT
tables. That is, we look for queries of the “needle in the haystack” variety. Three of the queries (Q9, Q44,
and Q75) are found in both studies, while the other two are found exclusively in either SPARK-SQL or
Presto. We chose this approach because, due to their different architectures and optimizers, interesting
queries in one environment are not necessarily interesting, or possible, in the other. For example, Presto
cannot execute Q4 (out-of-memory error).
Using the chart in Figure 8, we selected the following five SPARK-SQL queries for analysis: Queries 9 and
44 have high Scan/Filter ratio; Query 4 has high CPU utilization; Query 72 has the longest