Querying Big Graphs within Bounded Resources查詢有界資源中的大圖_第1頁
Querying Big Graphs within Bounded Resources查詢有界資源中的大圖_第2頁
Querying Big Graphs within Bounded Resources查詢有界資源中的大圖_第3頁
Querying Big Graphs within Bounded Resources查詢有界資源中的大圖_第4頁
Querying Big Graphs within Bounded Resources查詢有界資源中的大圖_第5頁
已閱讀5頁,還剩17頁未讀 繼續(xù)免費閱讀

下載本文檔

版權說明:本文檔由用戶提供并上傳,收益歸屬內容提供方,若內容存在侵權,請進行舉報或認領

文檔簡介

QueryingBigGraphswithinBoundedResources1YinghuiWuUCSantaBarbaraWenfeiFanUniversityofEdinburghSouthwestJiaotongUniversityXinWangBigreal-lifegraphs2socialscale100B(1011)Webscale1T(1012)brainscale,100T(1014)Real-lifescope100M(108)USroadHumanConnectome,(TheHumanConnectomeProject,NIH)knowledgegraphBTCSemanticWebWebgraph(Google)Internet(Opteproject)AnNSABigGraphexperiment,P.Burkhardt,etal,US.NationalSecurityAgency,May2013Socialgraph(300PBuserdata)Queryingbiggraphs3GivenaqueryQandadatagraphG,findanswersQ(G)Graphpatternmatching:knowledgediscovery,socialrecommendation,drugdesigning…Reachability:cybersecurity,metabolicanalysis,softwareengineering,Internetofthings…ChallengesGraphsaretoobigHardtoreducecomputationcomplexityLimitedresourceState-of-the-artTractableapproachesSSDlinearscanfornodesearch:1PB->1.9days,1EB->5.28yrsIndexing&CompressionCanwestillanswerQwithlimitedresource?Outline4ResourceboundedqueryansweringLocalized:GraphPatternQueriesResourceboundedsimulationqueriesResourceboundedsubgraphisomorphismNon-localized:ReachabilityResourceboundedreachabilityExperimentalstudyConclusion&FutureworkQueriesanddatagraph5Localizedqueries:canbeansweredlocallyGraphpatternqueries:simulationqueries(personalizedsocialsearch,egonetworkanalysis…)matchingrelationoverdQ-neighborhood

ofapersonalizednodeNon-localizedqueriesReachabilityqueriesMichael(Personalizednode)hikinggroupcyclingclub

member?cyclingloversMichael(uniquematch)hikinggroup………cycling

club

membercyclingfanshgmhg1hg2cc1cc2cc3cl1cl2cln-1clnMichael:“findcyclingfanswhoknowbothmyfriendsincyclingclubandmyfriendsinhikinggroups”(IBMWatson,FacebookGraphSearch,AppleSiri,WolframAlphaSearch…)Michaelcc1cl3cl7cln-1cl4cl9cl5……cl6cl16Eric…CanwestillanswerQwithlimitedresource?Makingbiggraph“small”6Idea:usingasmallgraphinsteadofGtomakeitfeasibletoanswerexpensivequeriesinbiggraphs.

Reduction(boundedresources:time,space,energy…)queryexactresultsqueryapproximate

resultsApproximation(guaranteedquality:

accuracy,errorrate,…)biggraphsmallgraphexpensive!Resource-boundedqueryanswering7onlinereductionsize|GQ|<=α|G|visitα*c|G|amountofdata(α*c<1)queryresultsqueryresultsApproximationAccuracy>=ηbiggraphGsmallgraphGQexpensive!Resource-boundedalgorithmAforqueryclassL

andanyG:withresourceboundαhasaccuracyguaranteeη

Hardnessresults8Exactresource-boundedquerying:η=100%IntractabilityNP-hardforsimulationqueries(evenwhenQisapathandGisaDAG)ReductionfromSetCoverNP-hardforsubgraphqueriesImpossibilityForanyα,thereexistsNOalgorithmforreachabilityquerieswithresource-boundαand100%accuracybound

Resource-boundedsimulation9

Reductionsize|GQ|<=α|G|inO(dG|Q||GQ|)timeSimulationqueryresultsqueryresultsApproximation100%forα>=biggraphGsmallgraphGQO(|Q||G|+|G|2)dG:maximumdegreeofdQ-neighborhoodgraphofp-node;d:diameterofQ;l:distinctlabelsizeinQf:maxnumberofnodeswithasamelabel&neighborinQResource-boundedsimulation:dynamicreduction10Preprocessing(auxiliaryinformation)dynamicreduction(computereducedsubgraph)ApproximatequeryevaluationoverreducedsubgraphlocalauxiliaryinformationGBooleanguardedcondition:labelmatchingCostfunctionc(u,v)Potentialfunctionp(u,v),estimatedprobabilitythatvmatchesuboundb,determinesanupperboundofthenumberofnodestobevisitedQdegree|neighbor|<label,frequency>…uvuvlabelmatchDynamicallyupdatedauxiliaryinformationuv?Resource-boundedsimulation:dynamicreduction11preprocessingdynamicreduction(computereducedsubgraph)ApproximatequeryevaluationoverreducedsubgraphMichaelhikinggroupcyclingclub?cyclingloversMichaelhikinggroup…cycling

club

membercyclingfanshgmhg1hg2cc1cc2cc3cl1cl2cln-1clncyclingclub

cc1cc2cc3cycling

club

member?cyclingloverscln-1clncyclingfanshgmhikinggrouphikinggroupFALSE---TRUECost=1Potential=3Bound=2TRUECost=1Potential=2Bound=2bound=14visited=16Matchrelation:(Michael,Michael),(hikinggroup,hgm),(cyclingclub,cc1),(cyclingclub,cc3),(cyclinglover,cln-1),(cyclinglover,cln)Resource-boundedreachability12

Reductionsize|GQ|<=α|G|ReachabilityqueryresultsReachabilityqueryresultsApproximation(experimentallyVerified;nofalsepositive,intimeO(α|G|)biggraphGsmalltreeindexGQO(|G|)Preprocessing:landmarks13Preprocessingdynamicreduction(computelandmarkindex)ApproximatequeryevaluationoverlandmarkindexMichaelcc1cl3cl7cln-1cl4cl9cl5……cl6cl16Eric…LandmarksalandmarknodecoverscertainnumberofnodepairsReachabilityofthepairsitcoverscanbecomputedbylandmarklabelscc1“Icanreachcl3”cl3cln-1,“cl3canreachme”cl4…cl6cl16HierarchicallandmarkIndex14LandmarkIndexlandmarknodesareselectedtoencodepairwisereachabilityHierarchicalindexing:applymultipleroundsoflandmarkselectiontoconstructatreeoflandmarkscc1cl7cln-1Michaelcc1cl3cl7cln-1cl4cl9cl5……cl6cl16Eric……cl16cl3cl5cl6cl4cl9…Booleanguardedcondition(v,source,dst)Costfunctionc(v):sizeofunvisitedlandmarksinthesubtreerootedatvPotentialP(v),totalcoversizeofunvisitedlandmarksasthechildrenofvCoversizeLandmarklabels/encodingTopologicalrank/rangeResource-boundedreachability15Michaelcc1cl3cl7cln-1cl4cl9cl5……cl6cl16Ericcc1…cl7cln-1…cl16cl3cl5cl6cl4MichaelEric“drilldown”?cl9…localauxiliaryinformation“rollup”Preprocessingdynamicreduction(computelandmarkindex)Approximatequeryevaluationoverlandmarkindexbi-directedguidedtraversalCondition=FALSE--Condition=?Cost=9Potential=46Condition=?Cost=2Potential=9Condition=TRUE……ExperimentalStudy16DatasetYoutube(1.61millionnodes,4.51millionedges)

(http://netsg.cs.sfu.ca/youtubedata)

YahooWebgraph(3millionnodes,14.98millionedges)(/catalog.php?datatype=g)AlgorithmsGraphpatternmatching:ResourceboundedsimulationalgorithmRBSimOptimizedstrongsimulationpatternmatchingMatchOptResourceboundedsubgraphisomorphismRBSubOptimizedVF2Reachability:ResourceboundedreachabilityRBReachBFSandoptimizedBFSovercompressedgraphsLM:applyinglandmarkvectors(4*Log|V|landmarks)Efficiencyofresourceboundedsimulation17Varyingα(10-5),YahooRbsimis5.5timesfasterthanMatch-OPT;RBSubis6.25timesfasterthanVF2-OPTonaverageVaryingα(10-5),YoutubeAccuracy18Varyingα(10-5),accuracy,Yahoo89%-100%forsimulationqueriesbothachieves100%accuracywhenα>0.0015%,Efficiencyofresourceboundedreachability19RBreachis62.5timesfasterthanBFSand5.7timesfasterthanBFS-OPT

Varyingα(10-4),YahooVaryingα(10-4),YoutubeAccuracy20Varyingα(10-4),accuracy,Yahoo>=96%achieves100%accuracywhenα>0.05%,Conclusion21ResourceboundedqueryingforbiggraphprocessingDynamicreduction+approximatequeryansweringLocalqueries:strongsimulation,subgraphisomorphismNon-localqueries:reachabilitytunableperformance,abalanceofresourceandanswerqualityMoretobedone…Maximumaccuracyratioη

resourceboundedalgorithmscanguarantee?Graphquerypatternswithoutpersonalizednodes,moregraphqueryclasses…Distributeddeployment(MapReduce,GraphLab)Deploymentinemergingapplications(knowledgegraph,cyber

溫馨提示

  • 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
  • 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權益歸上傳用戶所有。
  • 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內容里面會有圖紙預覽,若沒有圖紙預覽就沒有圖紙。
  • 4. 未經(jīng)權益所有人同意不得將文件中的內容挪作商業(yè)或盈利用途。
  • 5. 人人文庫網(wǎng)僅提供信息存儲空間,僅對用戶上傳內容的表現(xiàn)方式做保護處理,對用戶上傳分享的文檔內容本身不做任何修改或編輯,并不能對任何下載內容負責。
  • 6. 下載文件中如有侵權或不適當內容,請與我們聯(lián)系,我們立即糾正。
  • 7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

評論

0/150

提交評論