版權說明:本文檔由用戶提供并上傳,收益歸屬內容提供方,若內容存在侵權,請進行舉報或認領
文檔簡介
QueryingBigGraphswithinBoundedResources1YinghuiWuUCSantaBarbaraWenfeiFanUniversityofEdinburghSouthwestJiaotongUniversityXinWangBigreal-lifegraphs2socialscale100B(1011)Webscale1T(1012)brainscale,100T(1014)Real-lifescope100M(108)USroadHumanConnectome,(TheHumanConnectomeProject,NIH)knowledgegraphBTCSemanticWebWebgraph(Google)Internet(Opteproject)AnNSABigGraphexperiment,P.Burkhardt,etal,US.NationalSecurityAgency,May2013Socialgraph(300PBuserdata)Queryingbiggraphs3GivenaqueryQandadatagraphG,findanswersQ(G)Graphpatternmatching:knowledgediscovery,socialrecommendation,drugdesigning…Reachability:cybersecurity,metabolicanalysis,softwareengineering,Internetofthings…ChallengesGraphsaretoobigHardtoreducecomputationcomplexityLimitedresourceState-of-the-artTractableapproachesSSDlinearscanfornodesearch:1PB->1.9days,1EB->5.28yrsIndexing&CompressionCanwestillanswerQwithlimitedresource?Outline4ResourceboundedqueryansweringLocalized:GraphPatternQueriesResourceboundedsimulationqueriesResourceboundedsubgraphisomorphismNon-localized:ReachabilityResourceboundedreachabilityExperimentalstudyConclusion&FutureworkQueriesanddatagraph5Localizedqueries:canbeansweredlocallyGraphpatternqueries:simulationqueries(personalizedsocialsearch,egonetworkanalysis…)matchingrelationoverdQ-neighborhood
ofapersonalizednodeNon-localizedqueriesReachabilityqueriesMichael(Personalizednode)hikinggroupcyclingclub
member?cyclingloversMichael(uniquematch)hikinggroup………cycling
club
membercyclingfanshgmhg1hg2cc1cc2cc3cl1cl2cln-1clnMichael:“findcyclingfanswhoknowbothmyfriendsincyclingclubandmyfriendsinhikinggroups”(IBMWatson,FacebookGraphSearch,AppleSiri,WolframAlphaSearch…)Michaelcc1cl3cl7cln-1cl4cl9cl5……cl6cl16Eric…CanwestillanswerQwithlimitedresource?Makingbiggraph“small”6Idea:usingasmallgraphinsteadofGtomakeitfeasibletoanswerexpensivequeriesinbiggraphs.
Reduction(boundedresources:time,space,energy…)queryexactresultsqueryapproximate
resultsApproximation(guaranteedquality:
accuracy,errorrate,…)biggraphsmallgraphexpensive!Resource-boundedqueryanswering7onlinereductionsize|GQ|<=α|G|visitα*c|G|amountofdata(α*c<1)queryresultsqueryresultsApproximationAccuracy>=ηbiggraphGsmallgraphGQexpensive!Resource-boundedalgorithmAforqueryclassL
andanyG:withresourceboundαhasaccuracyguaranteeη
Hardnessresults8Exactresource-boundedquerying:η=100%IntractabilityNP-hardforsimulationqueries(evenwhenQisapathandGisaDAG)ReductionfromSetCoverNP-hardforsubgraphqueriesImpossibilityForanyα,thereexistsNOalgorithmforreachabilityquerieswithresource-boundαand100%accuracybound
Resource-boundedsimulation9
Reductionsize|GQ|<=α|G|inO(dG|Q||GQ|)timeSimulationqueryresultsqueryresultsApproximation100%forα>=biggraphGsmallgraphGQO(|Q||G|+|G|2)dG:maximumdegreeofdQ-neighborhoodgraphofp-node;d:diameterofQ;l:distinctlabelsizeinQf:maxnumberofnodeswithasamelabel&neighborinQResource-boundedsimulation:dynamicreduction10Preprocessing(auxiliaryinformation)dynamicreduction(computereducedsubgraph)ApproximatequeryevaluationoverreducedsubgraphlocalauxiliaryinformationGBooleanguardedcondition:labelmatchingCostfunctionc(u,v)Potentialfunctionp(u,v),estimatedprobabilitythatvmatchesuboundb,determinesanupperboundofthenumberofnodestobevisitedQdegree|neighbor|<label,frequency>…uvuvlabelmatchDynamicallyupdatedauxiliaryinformationuv?Resource-boundedsimulation:dynamicreduction11preprocessingdynamicreduction(computereducedsubgraph)ApproximatequeryevaluationoverreducedsubgraphMichaelhikinggroupcyclingclub?cyclingloversMichaelhikinggroup…cycling
club
membercyclingfanshgmhg1hg2cc1cc2cc3cl1cl2cln-1clncyclingclub
cc1cc2cc3cycling
club
member?cyclingloverscln-1clncyclingfanshgmhikinggrouphikinggroupFALSE---TRUECost=1Potential=3Bound=2TRUECost=1Potential=2Bound=2bound=14visited=16Matchrelation:(Michael,Michael),(hikinggroup,hgm),(cyclingclub,cc1),(cyclingclub,cc3),(cyclinglover,cln-1),(cyclinglover,cln)Resource-boundedreachability12
Reductionsize|GQ|<=α|G|ReachabilityqueryresultsReachabilityqueryresultsApproximation(experimentallyVerified;nofalsepositive,intimeO(α|G|)biggraphGsmalltreeindexGQO(|G|)Preprocessing:landmarks13Preprocessingdynamicreduction(computelandmarkindex)ApproximatequeryevaluationoverlandmarkindexMichaelcc1cl3cl7cln-1cl4cl9cl5……cl6cl16Eric…LandmarksalandmarknodecoverscertainnumberofnodepairsReachabilityofthepairsitcoverscanbecomputedbylandmarklabelscc1“Icanreachcl3”cl3cln-1,“cl3canreachme”cl4…cl6cl16HierarchicallandmarkIndex14LandmarkIndexlandmarknodesareselectedtoencodepairwisereachabilityHierarchicalindexing:applymultipleroundsoflandmarkselectiontoconstructatreeoflandmarkscc1cl7cln-1Michaelcc1cl3cl7cln-1cl4cl9cl5……cl6cl16Eric……cl16cl3cl5cl6cl4cl9…Booleanguardedcondition(v,source,dst)Costfunctionc(v):sizeofunvisitedlandmarksinthesubtreerootedatvPotentialP(v),totalcoversizeofunvisitedlandmarksasthechildrenofvCoversizeLandmarklabels/encodingTopologicalrank/rangeResource-boundedreachability15Michaelcc1cl3cl7cln-1cl4cl9cl5……cl6cl16Ericcc1…cl7cln-1…cl16cl3cl5cl6cl4MichaelEric“drilldown”?cl9…localauxiliaryinformation“rollup”Preprocessingdynamicreduction(computelandmarkindex)Approximatequeryevaluationoverlandmarkindexbi-directedguidedtraversalCondition=FALSE--Condition=?Cost=9Potential=46Condition=?Cost=2Potential=9Condition=TRUE……ExperimentalStudy16DatasetYoutube(1.61millionnodes,4.51millionedges)
(http://netsg.cs.sfu.ca/youtubedata)
YahooWebgraph(3millionnodes,14.98millionedges)(/catalog.php?datatype=g)AlgorithmsGraphpatternmatching:ResourceboundedsimulationalgorithmRBSimOptimizedstrongsimulationpatternmatchingMatchOptResourceboundedsubgraphisomorphismRBSubOptimizedVF2Reachability:ResourceboundedreachabilityRBReachBFSandoptimizedBFSovercompressedgraphsLM:applyinglandmarkvectors(4*Log|V|landmarks)Efficiencyofresourceboundedsimulation17Varyingα(10-5),YahooRbsimis5.5timesfasterthanMatch-OPT;RBSubis6.25timesfasterthanVF2-OPTonaverageVaryingα(10-5),YoutubeAccuracy18Varyingα(10-5),accuracy,Yahoo89%-100%forsimulationqueriesbothachieves100%accuracywhenα>0.0015%,Efficiencyofresourceboundedreachability19RBreachis62.5timesfasterthanBFSand5.7timesfasterthanBFS-OPT
Varyingα(10-4),YahooVaryingα(10-4),YoutubeAccuracy20Varyingα(10-4),accuracy,Yahoo>=96%achieves100%accuracywhenα>0.05%,Conclusion21ResourceboundedqueryingforbiggraphprocessingDynamicreduction+approximatequeryansweringLocalqueries:strongsimulation,subgraphisomorphismNon-localqueries:reachabilitytunableperformance,abalanceofresourceandanswerqualityMoretobedone…Maximumaccuracyratioη
resourceboundedalgorithmscanguarantee?Graphquerypatternswithoutpersonalizednodes,moregraphqueryclasses…Distributeddeployment(MapReduce,GraphLab)Deploymentinemergingapplications(knowledgegraph,cyber
溫馨提示
- 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權益歸上傳用戶所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內容里面會有圖紙預覽,若沒有圖紙預覽就沒有圖紙。
- 4. 未經(jīng)權益所有人同意不得將文件中的內容挪作商業(yè)或盈利用途。
- 5. 人人文庫網(wǎng)僅提供信息存儲空間,僅對用戶上傳內容的表現(xiàn)方式做保護處理,對用戶上傳分享的文檔內容本身不做任何修改或編輯,并不能對任何下載內容負責。
- 6. 下載文件中如有侵權或不適當內容,請與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。
最新文檔
- 2024年城市別墅裝修改造服務協(xié)議
- 2024水電項目工程承包協(xié)議范本
- 2024年酒店用品買賣協(xié)議
- 2024年房屋租賃三方協(xié)議樣本
- 店鋪裝修設計與施工一體化協(xié)議模板
- 2024年度勞動力成本協(xié)議樣本
- DB11∕T 1697-2019 動力鋰離子蓄電池制造業(yè)綠色工廠評價要求
- 2024年度中央空調系統(tǒng)翻新工程協(xié)議
- 2024商業(yè)采購協(xié)議模板全面指南
- 2024年輔導班家長服務協(xié)議
- 水系統(tǒng)中央空調工程材料清單
- 小學六年級數(shù)學上冊口算題300道(全)
- 《干粉滅火器檢查卡》
- 校園監(jiān)控值班記錄表(共2頁)
- 試樁施工方案 (完整版)
- 走中國工業(yè)化道路的思想及成就
- ESTIC-AU40使用說明書(中文100版)(共138頁)
- 河北省2012土建定額說明及計算規(guī)則(含定額總說明)解讀
- Prolog語言(耐心看完-你就入門了)
- 保霸線外加電流深井陽極地床陰極保護工程施工方案
- 藍色商務大氣感恩同行集團公司20周年慶典PPT模板
評論
0/150
提交評論