




Modeling for Crime Busting
Introduction
A conspiracy has been found to embezzle a company's property and to steal money from the credit cards of the company's customers through the network. An organization, the Intergalactic Crime Modelers (ICM), has identified 7 known conspirators and 8 known non-conspirators and needs to find the other members and the leaders. The conspirators and the possible suspected conspirators all work for the same company in a large office complex. ICM has recently found a small set of messages from 83 workers in the company. The messages are represented as a network that shows the communication links and the types of messages: it has 83 nodes, 400 links (some involving more than one topic), over 21,000 words of message traffic, and 15 topics (3 of which have been deemed suspicious). The goal of the modeling effort is to identify the most likely conspirators in the office complex. Ideally, the model gives a priority list of the most likely conspirators, a discriminating line separating conspirators from non-conspirators, and the likely conspiracy leaders.
Assumptions
Nodes 4, 10, and 16 are the senior managers of the company.
Variable symbols
Variable        Definition
P               Possibility of being a conspirator
Pc              Suspicious rate of one's messages: the ratio of the number of one's messages involving Topics 7, 11, or 13 (considered conspiratorial in the following text) to the number of his/her total messages
wi              Weight determined by the suspicious degree of a certain message topic
Wequ            Equivalent weight: the sum of the weights of all one's message topics divided by the number of his/her total messages
Cnc             Value of network centrality
C'nc            Revised Cnc
K               A set of keywords
A               A set of words
ai              Any word i; an element of A
Si              Any sentence i
Sim(ai, aj)     Word similarity of words ai and aj
Sim(Si, Sj)     Sentence similarity of sentences Si and Sj
Dis(a1, a2)     Word distance between words a1 and a2
Len(Si)         Number of words in sentence Si
Analysis of the problem
To obtain a priority list of criminal suspicion, we analyze the network and the topics, and we validate the credibility of the results against the 7 known conspirators and the 8 known non-conspirators. Ascertaining whether the three senior managers are conspirators helps guide both the judgment of how suspicious each of the 15 topics is and the priority list itself. However, we cannot determine the weight of a topic from the status (known conspirator or known non-conspirator) of the persons involved. Instead, we determine the weights of the 15 topics by manual analysis or by semantic and text analysis. The higher the weight, the more suspicious the topic. Based on the network and the topic weights, we calculate the possibility that each person is a conspirator with the algorithms described below.
★ Task 1. The priority list of possible conspirators and leaders
Judge whether the three senior managers are conspirators
Node 4 is not a conspirator
Of the three senior managers, only Node 4 has not communicated with the 7 known conspirators. In addition, the suspicious rate of his messages is the lowest of the three: Pc4 = 0.286, Pc10 = 0.357, Pc16 = 0.625.
If Node 4 were a conspirator, the model would also classify Nodes 10 and 16 as conspirators, and a large portion of the company's workers would be conspirators. That situation is unlikely in reality, so Node 4 is considered a non-conspirator.
Node 10 is a conspirator
We find the people who discuss the suspicious topics with the 7 known conspirators in the network model. They are Nodes 7, 10, 13, 17, 28, 36, 38, 60, and 81; Nodes 7 and 10 have done so 6 and 4 times respectively, and each of the others once. Thus the senior manager Node 10 is considered a conspirator.
Node 16 is uncertain
The identity of Node 16 cannot be determined for the moment, so we make no assumption about it for now.
Model 1
Analyze the suspicious degree of the 15 topics and assign each a weight, giving W = [w1, w2, ..., w15].
Topics 7, 11, and 13 are known to be suspicious. Hence w7, w11, and w13 are all set to 0.9;
Because Node 10, a senior manager, is a conspirator, the topics he discusses most often are also highly suspicious. According to the statistics, Node 10 took part in 14 messages, in which the frequency of occurrence of Topics 4 and 6 is 0.429; and although only 3 of these messages involve two or more topics, Topics 4 and 6 occur together twice. Therefore w4 and w6 are 0.7;
Some messages in Topic 7 use Spanish words, possibly as codes. Topics 2 and 12 also involve Spanish, so w2 and w12 are 0.6;
Topic 5 involves several persons, including two known non-conspirators, and the consensus in the discussion is that the company has gone overboard on security to the detriment of operations. So the persons involved in this topic are likely to be non-conspirators. Thus w5 is 0.2;
The contents of the other topics (1, 3, 8, 9, 10, 14, and 15) are neutral, and we cannot determine whether the persons involved tend toward conspiracy. So their weights are all 0.4.
In sum, W = [0.4, 0.6, 0.4, 0.7, 0.2, 0.7, 0.9, 0.4, 0.4, 0.4, 0.9, 0.6, 0.9, 0.4, 0.4].
Find out conspirators with new network G
According to Section 2.1, we assign weights to the links in the network. If a link involves more than one topic, its weight is the highest of the weights of its topics. For a link l, W(l) is called the weight of l, with 0 < W(l) < 1.
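The link-weight rule can be sketched in a few lines of Python. The weight list is the manual W given in the text; the example links and the name `link_weight` are hypothetical and serve only as illustration.

```python
# Manual topic weights W from the text. Index 0 is a placeholder so that
# TOPIC_WEIGHTS[t] is the weight of topic t (topics are numbered 1..15).
TOPIC_WEIGHTS = [None, 0.4, 0.6, 0.4, 0.7, 0.2, 0.7, 0.9, 0.4,
                 0.4, 0.4, 0.9, 0.6, 0.9, 0.4, 0.4]


def link_weight(topics):
    """Return W(l): the highest weight among the topics carried by a link."""
    if not topics:
        raise ValueError("a link must carry at least one topic")
    return max(TOPIC_WEIGHTS[t] for t in topics)


# Hypothetical links (sender, receiver, topics); not taken from the data set.
example_links = [(1, 2, [5]), (2, 3, [4, 7]), (3, 1, [1, 12])]
weighted_links = [(i, j, link_weight(ts)) for i, j, ts in example_links]
```

A link that mixes a neutral topic with a suspicious one therefore inherits the suspicious weight, which matches the cautious stance taken elsewhere in the model.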
Meanwhile, to evaluate the possibility that each person is a conspirator, we assign weights to the nodes in the network, which gives a function P(v), 0 ≤ P(v) ≤ 1. Initially, P(v) = 1 for all known conspirators and P(v) = 0 for everyone else. Thus we obtain a new network G based on the original network.
For any node i, we define an iterative equation for its P function as follows:
$$P_i = \max\{\, P_j \times W(l_{ij}) \mid l(i,j) \in L \,\} \qquad (1)$$
With the iterative equation and the initial P values of the known conspirators, we obtain the P values of the others by iterative calculation. The rationale is straightforward: in a dialogue, if the possibility that the initiator is a conspirator is determined and the suspicious rate of the topic is known, then the possibility that the receiver is a conspirator can be evaluated. In addition, because a criminal investigation calls for prudence, we are conservative in the iteration: the P value of a node always keeps the highest value it has reached.
Because of the complexity of the network, a change in one P value during the iteration has aftereffects on other nodes. So we cannot solve for the P function of each node with an ordinary dynamic programming algorithm on a linear structure. But we can solve for it with an iterative method on the network, and we can prove that the P function converges during the iteration.
We call a successful iteration step a Relax operation. The P function has converged if and only if no Relax operation can be executed.
So we get Algorithm 1:
Algorithm 1:
Begin
  Initialize edge weights and node weights;
  Loop
    For each l(i,j) ∈ L do
      If P(i) < P(j) × W(l_ij) and neither node i nor node j is a known non-conspirator do
        P(i) = P(j) × W(l_ij); // Relax
  Until no Relax operation is executed
End
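A minimal Python sketch of Algorithm 1 follows. It rests on two assumptions that the text does not state outright: each link is relaxed in both directions, and known non-conspirators are held at P = 0 and skipped. The function name and the toy network are hypothetical.

```python
def propagate_suspicion(links, conspirators, non_conspirators=()):
    """Repeat P(i) = max(P(i), P(j) * W(l_ij)) until no Relax succeeds.

    links: list of (i, j, w) tuples with 0 < w < 1.
    Assumption: links are treated as undirected, and links touching a
    known non-conspirator are never relaxed.
    """
    links = list(links)
    frozen = set(non_conspirators)
    p = {}
    for i, j, _ in links:
        p.setdefault(i, 0.0)
        p.setdefault(j, 0.0)
    for v in frozen:
        p[v] = 0.0
    for v in conspirators:
        p[v] = 1.0

    relaxed = True
    while relaxed:
        relaxed = False
        for i, j, w in links:
            if i in frozen or j in frozen:
                continue
            for a, b in ((i, j), (j, i)):
                candidate = p[b] * w
                if p[a] < candidate:  # Relax: keep the highest value seen
                    p[a] = candidate
                    relaxed = True
    return p


# Toy network: node 1 is a known conspirator, node 5 a known non-conspirator.
toy_links = [(1, 2, 0.9), (2, 3, 0.9), (3, 4, 0.4), (1, 4, 0.2), (4, 5, 0.9)]
toy_p = propagate_suspicion(toy_links, conspirators={1}, non_conspirators={5})
```

Because every weight is below 1, a value can only shrink as it travels along a path, so P values never exceed 1 and the loop stops once the best path from a known conspirator to each node has been found.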
With this algorithm we obtain the final converged P value for every person, which is the possibility of being a conspirator listed in the following table. We sort the results from high to low by two keys. The first key is the possibility of being a conspirator (P); persons with the same P are sorted by the second key, the equivalent suspicious weight (Wequ), from high to low. This gives a complete ordering of the 83 persons from most to least suspicious.
Then we use the golden ratio to separate the conspirators from the innocent: the ratio of the number of innocents to the total number is 0.618:1, so the calculated number of conspirators is 83 × (1 − 0.618) ≈ 31.7. But as Table 1 shows, the persons ranked 32nd to 43rd share the same first key, so it is not appropriate to take the person ranked 32nd as a conspirator. Thus there are 31 conspirators in total. The initial results are shown in Table 1.
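The two-key sort and the golden-ratio cut can be sketched as follows. Rounding 31.7 to 32 and then backing off while the cut would split a group of equal P values is one reading of the rule described above, not a procedure spelled out in the text. The scores are made up.

```python
def rank_and_cut(scores, innocent_share=0.618):
    """Sort by (P, Wequ) descending and cut off the conspirators.

    scores: dict mapping node -> (P, Wequ).
    Returns (ranking, conspirators). The cut is moved up while it would
    separate two persons who share the same P (assumed tie rule).
    """
    ranking = sorted(scores, key=lambda v: scores[v], reverse=True)
    cut = round(len(ranking) * (1 - innocent_share))
    while 0 < cut < len(ranking) and \
            scores[ranking[cut - 1]][0] == scores[ranking[cut]][0]:
        cut -= 1
    return ranking, ranking[:cut]


# Hypothetical (P, Wequ) values for ten persons.
toy_scores = {
    "a": (1.0, 0.9), "b": (0.9, 0.5), "c": (0.9, 0.7), "d": (0.81, 0.6),
    "e": (0.81, 0.6), "f": (0.81, 0.4), "g": (0.5, 0.4), "h": (0.4, 0.4),
    "i": (0.3, 0.4), "j": (0.2, 0.4),
}
toy_ranking, toy_conspirators = rank_and_cut(toy_scores)
```

In the toy data the raw cut falls at 4, inside the group with P = 0.81, so it moves up to 3, in the same way that the paper moves from 32 to 31.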
Table 1
Initial results of the priority list of criminal suspicion
Notes: in the "rank" column, entries marked in black are conspirators and those in green are non-conspirators. Entries marked in pink are senior managers. The dashed line separates conspirators from non-conspirators.
Find out "Inez", "Bob" and "Carol" (correction strategy of Model 1)
To avoid a situation like "Carol", we raise the proportion of innocents relative to the given case: the proportion of innocents is 0.618 (greater than the 0.5 of the given case), which reduces the chance that an innocent person is wrongly accused.
To find people like "Inez" and "Bob", who hide deeply:
The message suspicious rate Pc of everyone is calculated. Nodes 7, 16, 21, 28, 33, 51, 54, 56, 57, 67, 72, 75, 79, 80, and 81 have Pc equal to or greater than 0.5. Comparing them with the 31 conspirators in Table 1, we find that Nodes 56, 57, 72, 75, 79, and 80 are left out. By tracking whom Nodes 56, 57, 72, 75, 79, and 80 have talked with and on which topics, we find that although Nodes 75 and 80 are involved in the suspicious topics, they were told about them by known non-conspirators, so their possibility of being conspirators is not high.
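The Pc screening step might look like the sketch below. It assumes that a person's messages are those he or she either sent or received, which the definition of Pc leaves open; the message list is hypothetical.

```python
SUSPICIOUS_TOPICS = {7, 11, 13}


def suspicious_rate(messages, node):
    """Pc: share of a node's messages that involve a suspicious topic.

    messages: list of (sender, receiver, topics).
    Assumption: both sent and received messages count as the node's own.
    """
    own = [topics for s, r, topics in messages if node in (s, r)]
    if not own:
        return 0.0
    hits = sum(1 for topics in own if SUSPICIOUS_TOPICS & set(topics))
    return hits / len(own)


# Hypothetical messages, not taken from the data set.
toy_messages = [(1, 2, [7]), (1, 3, [5]), (2, 3, [11, 4]), (3, 1, [1])]
toy_nodes = [1, 2, 3]
flagged = [n for n in toy_nodes if suspicious_rate(toy_messages, n) >= 0.5]
```

The flagged nodes are then compared by hand with the list produced by Algorithm 1, as the paragraph above describes.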
According to the results in Table 1, senior managers Node 10 and Node 16 are conspirators. Tracking the persons who talk about the suspicious Topics 7, 11, and 13 with them, we find Nodes 3, 6, 8, 11, 17, 30, 35, 49, and 51. Comparing them with the 31 conspirators in Table 1, we find that Nodes 8 and 35 are missing.
In sum, we think Nodes 8, 35, 56, 57, 72, and 79 are deeply hidden conspirators. Finally, we determine 37 conspirators and 46 innocents. The priority list is shown in Table 2.
Table 2
The priority list of criminal suspicion
Notes: in the "rank" column, entries marked in black are conspirators and those in green are non-conspirators. Entries marked in pink are senior managers. The dashed line separates conspirators from non-conspirators.
Validation of the results
To validate the credibility of Model 1, the P values of the 8 known non-conspirators are not held at 0 throughout the calculation. Under Algorithm 1, the suspicious possibilities of these 8 persons then increase whenever the Relax condition is satisfied. This gives a new list in which the ranks of the 8 persons are as shown in Table 3.
Table 3
The rank of the 8 known non-conspirators in the priority list
Even when their P values are not restricted to 0, the P values of most of them remain small, so they rank low in the list and are still considered non-conspirators. The only exception is Node 2, whose P value is large and who is mistakenly considered a conspirator. Hence most of the known non-conspirators are still classified as non-conspirators when the restriction of their P values to 0 is lifted, which agrees with reality. This supports the feasibility of Model 1 and indicates that it has relatively high credibility.
In this case, ICM already knows 7 conspirators. We select one of them at random and suppose that his or her identity is unknown, so that only six conspirators are known. All else being equal, we calculate the possibilities of being a conspirator with Algorithm 1. Choosing each of Nodes 18, 21, 37, 43, 49, 54, and 67 in turn, we trace its rank when the program runs; the results are shown in Table 4.
Table 4
The rank of each known conspirator in the priority list when treated as unknown
When one of Nodes 18, 21, 37, 43, and 49 is chosen and the possibilities are calculated with Algorithm 1, its rank is above 40th place. Although the ranks of Nodes 37 and 43 do not strictly conform to the result of Task 1, the error is within a reasonable tolerance. Thus they can be judged accessories to the crime, which conforms to the facts. When Node 54 or 67 is chosen and the possibilities are calculated with Algorithm 1, the resulting rank is low. However, when the correction strategy of Model 1 is applied, we find that the message suspicious rates of Nodes 54 and 67, namely 0.8 and 0.867, are high. Therefore they too can be judged accessories to the crime. In consequence, the results of Model 1 are dependable.
Thus this model could serve as a guide in various fields and situations, not just crime.
We analyze the EZ case with the above model and algorithm; the resulting priority list of criminal suspicion is shown below.
Table 5
The priority list of criminal suspicion
From Table 5, the first 6 in the ranking, namely George, Dave, Ellen, Inez, Bob, and Carol, are the most likely conspirators. But Inez, Bob, and Carol have the same first and second sort keys.
We therefore need to analyze the text and meaning of the 27 messages further to decide their order.
Nominate the conspiracy leaders
We select the 16 persons whose P values are equal to or greater than 0.9 in Table 2; they constitute a special node set in the original network. Selecting these nodes and retaining the links between them, we get a sub-network G'. Its topological structure helps us analyze the organization and composition of the whole criminal gang.
Model 2: Find out the leader candidates
The node representing a leader should have a very high network centrality. A node with high network centrality must not only sit at the centre of the network structure but also be an important link to other nodes. We introduce three indicators from graph theory to assess network centrality.
Degree: a measure of node activity. It is used to judge whether the node is a central node of the criminal network.
The calculation of Degree is:
$$C_d(k) = \sum_{i=1}^{n} A(i,k) \qquad (2)$$
where A is a 0-1 matrix describing the link structure of G'.
Closeness: the sum of the shortest distances between this node and the other nodes. It represents how close this node is to the other nodes.
The calculation of Closeness is:
$$C_c(k) = \sum_{i=1}^{n} L(i,k) \qquad (3)$$
where L is a matrix whose entry L(i,k) is the length of the shortest path between nodes i and k in G'.
Betweenness: a measure of the extent to which a node connects the other nodes. It measures the ability of the node to act as a mediator: it occupies the position between two other nodes, and if it fails, those two nodes cannot reach each other. This indicator is important because it directly measures the status of the node in the communication of information through the network.
The calculation of Betweenness is:
$$C_b(k) = \sum_{i=1}^{n} \sum_{j=1}^{n} F_{ij}(k) \qquad (4)$$
where F_ij(k) denotes whether the shortest path between nodes i and j passes through node k: it is 1 when it does and 0 when it does not.
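The three indicators can be computed with breadth-first search alone, as in the sketch below. It assumes that G' is undirected and connected, that distances are hop counts, and that F_ij(k) = 1 whenever k lies on at least one shortest path between i and j; the text does not say how ties between several shortest paths are handled. The star graph is a made-up example.

```python
from collections import deque


def shortest_lengths(adj, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def centrality_indicators(adj):
    """Return (degree, closeness, betweenness) as defined in Eqs. (2)-(4).

    adj: dict node -> set of neighbours (undirected, connected graph).
    """
    dist = {u: shortest_lengths(adj, u) for u in adj}
    degree = {k: len(adj[k]) for k in adj}
    closeness = {k: sum(dist[i][k] for i in adj) for k in adj}
    betweenness = {k: 0 for k in adj}
    for i in adj:
        for j in adj:
            if i == j:
                continue
            for k in adj:
                # k is on a shortest i-j path when the detour costs nothing
                if k not in (i, j) and dist[i][k] + dist[k][j] == dist[i][j]:
                    betweenness[k] += 1
    return degree, closeness, betweenness


# Hypothetical star graph: node 0 in the middle, three leaves around it.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
deg, clo, btw = centrality_indicators(star)
```

Note that closeness as defined in Eq. (3) is a sum of distances, so a smaller value means a more central node, whereas larger degree and betweenness mean a more central node.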
To combine the three indicators, we define the value of network centrality as
$$C_{nc} = \alpha C_d + \beta C_c + \gamma C_b, \quad \alpha + \beta + \gamma = 1 \qquad (5)$$
Based on the different importance of the three indicators, we set α = 0.2, β = 0.3, γ = 0.5.
We select the 16 nodes with the highest possibilities of being conspirators in the original network, retain the links between them, and form the sub-network G'. With the above evaluation method, we get the Cnc values of the 16 nodes, as shown in Table 6(a).
Optimization of Model 2: ordering the leader possibilities
The above method searches for leaders based on the network structure of the criminal gang but neglects what is actually said, so the result is not entirely reasonable. Someone whose dialogues involve only a few suspicious topics cannot be an important person. For example, the worker represented by Node 18 has a very high Cnc, but he talks about only two of the three suspicious topics, so he cannot be a leader.
Therefore we use the message suspicious rate Pc defined earlier to modulate Cnc, and we remove the candidates who are not involved in all three suspicious topics.
The value of the new network centrality is:
$$C'_{nc} = C_{nc} \times P_c \qquad (6)$$
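Equations (5) and (6), together with the filter on the three suspicious topics, can be sketched as below. The text does not say whether the three indicators are normalized before they are combined, so the sketch uses them as given. The candidate records are hypothetical.

```python
def network_centrality(cd, cc, cb, alpha=0.2, beta=0.3, gamma=0.5):
    """Eq. (5): weighted sum of degree, closeness and betweenness."""
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("alpha + beta + gamma must equal 1")
    return alpha * cd + beta * cc + gamma * cb


def rank_leaders(candidates, required_topics=frozenset({7, 11, 13})):
    """Rank candidates by C'nc = Cnc * Pc (Eq. 6), highest first.

    candidates: dict node -> dict with keys cd, cc, cb, pc, topics.
    Candidates not involved in all required topics are dropped.
    """
    scored = {}
    for node, c in candidates.items():
        if not required_topics <= set(c["topics"]):
            continue
        cnc = network_centrality(c["cd"], c["cc"], c["cb"])
        scored[node] = cnc * c["pc"]
    return sorted(scored, key=scored.get, reverse=True), scored


# Hypothetical candidates; the numbers are not taken from Table 6.
toy_candidates = {
    "A": {"cd": 3, "cc": 5, "cb": 6, "pc": 0.5, "topics": {7, 11, 13}},
    "B": {"cd": 2, "cc": 4, "cb": 2, "pc": 0.8, "topics": {7, 11, 13}},
    "C": {"cd": 5, "cc": 3, "cb": 10, "pc": 0.9, "topics": {7, 11}},
}
leader_order, leader_scores = rank_leaders(toy_candidates)
```

Candidate "C" has the highest structural centrality in this toy data but is removed because it touches only two of the three suspicious topics, which mirrors the treatment of Node 18.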
Results analysis
After modulation, we get the order of suspicion for the five leader candidates, as shown in Table 6(b).
Table 6
The priority list of possible conspiracy leaders
Comparing the results before and after optimization, we find that the optimization changes the order of only a few individuals; the overall ranking trend and the persons nominated are basically the same. The optimized model improves the accuracy of the results.
★ Task 2
According to Requirement 2, Topic 1 is also connected to the conspiracy and Chris is one of the conspirators.
Find out conspirators with new network G
Because Topic 1 is also connected to the conspiracy, w1 is 0.9. The other weights do not change. Thus the new weight vector is W = [0.9, 0.6, 0.4, 0.7, 0.2, 0.7, 0.9, 0.4, 0.4, 0.4, 0.9, 0.6, 0.9, 0.4, 0.4].
As in Section 2.1 of Model 1, we assign weights to the links in the network. For a link l, W(l) is called the weight of l, with 0 < W(l) < 1. Meanwhile, we assign weights to the nodes in the network and get a function P(v), 0 ≤ P(v) ≤ 1. Initially, P(v) = 1 for all known conspirators and P(v) = 0 for the others. This gives a new network G.
Applying Model 1 and Algorithm 1 from Task 1 gives the results in Table 7.
Table 7
The initial results of the priority list
Notes: in the "rank" column, entries marked in black are conspirators and those in green are non-conspirators. Entries marked in pink are senior managers. The dashed line separates conspirators from non-conspirators.
Find out "Inez" and "Bob"
With the same method as in Task 1, we analyze those with a high message suspicious rate to find the conspirators who hide deeply, and we get Nodes 56, 57, 72, and 79. Then we search for those who have talked about the suspicious topics with the senior managers Nodes 10 and 16 and compare them with the 31 conspirators in Table 7; we find that Nodes 3, 8, 11, 30, and 35 are possibly conspirators.
In sum, we think Nodes 3, 8, 11, 30, 35, 56, 72, and 79 are conspirators who hide deeply. Finally, we determine 40 conspirators and 43 innocents; their priority list is shown in Table 8.
Table 8
Final result of the priority list of criminal suspicion
Notes: in the "rank" column, entries marked in black are conspirators and those in green are non-conspirators. Entries marked in pink are senior managers. The dashed line separates conspirators from non-conspirators.
Differences of results
In the result of Task 1, the conspirators in the original priority list include all those whose Pc equals 0.81, but in Task 2 only part of them are included, and the correction step finds some who might otherwise slip through the net. Therefore more conspirators are determined in Task 2, while most of the suspicious possibilities do not change appreciably.
★ Task 3
In the calculations of Tasks 1 and 2, the 400 links in the message network are reduced to 15 topics. In other words, W(l) takes no more than 15 distinct values over the whole network. Such simplification may weaken the exactness of the result, because different dialogues within a single topic may have different suspicious rates.
We observe that staff who have been judged non-conspirators also get involved in the suspicious topics, which causes more errors in the result. If the original messages could be obtained, the 400 messages would be dealt with individually. First the set of suspicious messages is narrowed down; then Text Analysis (TA) and Semantic Network Analysis (SNA) are applied to identify the suspicious rates of the other messages and to assign them different weights. This method expresses the diversity among the messages, enriches the values of the function W(l), and at the same time reduces the chance of misjudging the non-suspicious topics.
Semantic and text analyses of the content of the 400 messages
Initial processing of the 400 messages using semantic and text analysis
The stop words (such as "a" and "the") can be deleted by text-analysis software. The remaining meaningful words then form 400 sets of words (one set per message).
We introduce the SNA and TA techniques and a model of Sentence Similarity. We define the Sentence Similarity of two sentences S1 and S2 as a real number Sim(S1, S2) ranging from 0 to 1, which depends on the meanings and structures of the two sentences.
Word Similarity
Before explaining how to calculate Sentence Similarity, we define the method of computing Word Similarity. Word Similarity is a numerical value ranging from 0 to 1. The value is 1 when a word is compared with itself. Conversely, the value is 0 when two words cannot replace each other.
Another essential index of the relationship between two words is Word Distance. In general, the greater the word distance, the lower the word similarity. We can calculate the distance between two words directly from the tree in the existing HowNet dictionary.
Word Similarity is easily obtained from Word Distance. For two words a1 and a2, we denote Word Similarity by Sim(a1, a2) and Word Distance by Dis(a1, a2). Thus,
$$\mathrm{Sim}(a_1, a_2) = \frac{\alpha}{\mathrm{Dis}(a_1, a_2) + \alpha} \qquad (7)$$
where α is an adjustable parameter equal to the word distance at which the similarity is 0.5. Because HowNet's tree is no more than 10 levels deep, the farthest distance between two points does not exceed 20. So we set α = 3.
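A short sketch of the word-similarity formula, with α = 3 as chosen above. The distance is passed in directly, because looking it up in the HowNet tree is outside the scope of this sketch; the function name is hypothetical.

```python
ALPHA = 3.0  # word distance at which similarity equals 0.5


def word_similarity(distance, alpha=ALPHA):
    """Sim = alpha / (Dis + alpha) for a non-negative word distance."""
    if distance < 0:
        raise ValueError("word distance must be non-negative")
    return alpha / (distance + alpha)


same_word = word_similarity(0)      # identical words
half_point = word_similarity(3)     # distance equal to alpha
farthest = word_similarity(20)      # maximum distance in a 10-level tree
```

With α = 3 the similarity at the maximum distance of 20 is 3/23, about 0.13, so the formula approaches but never reaches 0.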
Algorithm for Sentence Similarity
From Word Similarity, we can design an algorithm to calculate Sentence Similarity.
Word-form Similarity
The word-form similarity of two sentences is measured by the number of synonyms they share.
First we delete the stop words in the sentences, and then we define the word-form similarity as
$$\mathrm{Sim}_1(S_1, S_2) = \frac{2 \times \mathrm{SameWord}(S_1, S_2)}{\mathrm{Len}(S_1) + \mathrm{Len}(S_2)} \qquad (7)$$
where SameWord(S1, S2) is the number of word pairs in S1 and S2 whose Word Similarity is equal to or greater than 0.7, and Len(S) is the number of words in sentence S.
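The word-form similarity can be sketched as follows. Two assumptions are made for illustration: each word takes part in at most one matched pair (so that the result stays within 0 to 1), and pairs are matched greedily. The stand-in `exact_match` replaces the HowNet-based word similarity, which is not reproduced here.

```python
def exact_match(a, b):
    """Stand-in word similarity: 1.0 for identical words, otherwise 0.0."""
    return 1.0 if a == b else 0.0


def word_form_similarity(s1, s2, word_sim=exact_match, threshold=0.7):
    """Sim1 = 2 * SameWord / (Len(S1) + Len(S2)).

    s1, s2: lists of words with stop words already removed.
    Assumption: greedy one-to-one matching of words with
    word_sim >= threshold.
    """
    if not s1 and not s2:
        return 0.0
    unused = list(s2)
    same = 0
    for a in s1:
        for b in unused:
            if word_sim(a, b) >= threshold:
                unused.remove(b)
                same += 1
                break
    return 2 * same / (len(s1) + len(s2))


# Hypothetical word sets.
sim1_example = word_form_similarity(
    ["budget", "meeting", "friday"],
    ["meeting", "budget", "review", "monday"],
)
```

Two of the seven words are matched on each side in this example, giving 2 × 2 / 7.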
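Semantic Similarity
To reflect the semantic similarity of two sentences, we define the Semantic Similarity as follows.
$$\mathrm{Sim}_2(S_1, S_2) = \frac{1}{2}\left( \frac{1}{m}\sum_{i=1}^{m} \max_{1 \le j \le n} \mathrm{Sim}(a_{1i}, a_{2j}) + \frac{1}{n}\sum_{i=1}^{n} \max_{1 \le j \le m} \mathrm{Sim}(a_{2i}, a_{1j}) \right) \qquad (8)$$
where sentences S1 and S2 consist of the words a11, a12, ..., a1m and a21, a22, ..., a2n, respectively.
Sentence Similarity
Using the two measures above, we calculate Sentence Similarity as follows:
$$\mathrm{Sim}(S_1, S_2) = \alpha\,\mathrm{Sim}_1(S_1, S_2) + \beta\,\mathrm{Sim}_2(S_1, S_2), \quad \alpha + \beta = 1 \qquad (9)$$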
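The semantic similarity and its combination with the word-form similarity might be coded as below. The default α = 0.2 and β = 0.8 are the values the paper adopts later for the topic texts; `exact_match` is again a stand-in for the HowNet-based word similarity, and the word lists are made up.

```python
def exact_match(a, b):
    """Stand-in word similarity: 1.0 for identical words, otherwise 0.0."""
    return 1.0 if a == b else 0.0


def semantic_similarity(s1, s2, word_sim=exact_match):
    """Sim2: average best-match similarity, taken in both directions."""
    if not s1 or not s2:
        return 0.0
    forward = sum(max(word_sim(a, b) for b in s2) for a in s1) / len(s1)
    backward = sum(max(word_sim(b, a) for a in s1) for b in s2) / len(s2)
    return 0.5 * (forward + backward)


def sentence_similarity(sim1, sim2, alpha=0.2, beta=0.8):
    """Sim = alpha * Sim1 + beta * Sim2, with alpha + beta = 1."""
    if abs(alpha + beta - 1.0) > 1e-9:
        raise ValueError("alpha + beta must equal 1")
    return alpha * sim1 + beta * sim2


# Hypothetical word lists: one of two words matches forward,
# one of three matches backward.
sim2_example = semantic_similarity(["a", "b"], ["b", "c", "d"])
combined_example = sentence_similarity(0.4, sim2_example)
```

Taking the best match in both directions keeps the measure symmetric, so Sim2(S1, S2) equals Sim2(S2, S1) whenever the word similarity itself is symmetric.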
Using Sentence Similarity to assign weights to message content
Suppose it is already known which messages are suspicious; they form a set of suspicious messages U. We introduce W (the weight of suspicious topics) in order to calculate Pc (the suspicious rate of a message). The suspicious rate of a message is obtained by comparing its sentence similarity with the suspicious messages.
For message i:
1) If it is already considered conspiratorial, wi = 0.9;
2) Otherwise,
$$w_i = \max\{\, 0.9 \times \mathrm{Sim}(S_i, S_j) \mid j \in U \,\} \qquad (10)$$
Then we can assign weights to all links reasonably according to the original messages, and get the priority list of criminal suspicion for everyone by using Algorithm 1 from Task 1.
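The weighting rule for individual messages can be sketched as follows. The similarity function is passed in as a parameter; the Jaccard overlap used in the example is only a stand-in for the Sentence Similarity defined above, and the messages are invented.

```python
def jaccard(a, b):
    """Stand-in sentence similarity: shared words over all distinct words."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def message_weights(messages, suspicious_ids, similarity=jaccard, base=0.9):
    """Assign w_i = base to known suspicious messages, otherwise
    w_i = max over j in U of base * Sim(S_i, S_j).

    messages: dict message id -> list of words.
    suspicious_ids: the set U of message ids already judged suspicious.
    """
    weights = {}
    for i, words in messages.items():
        if i in suspicious_ids:
            weights[i] = base
        else:
            weights[i] = max(
                (base * similarity(words, messages[j]) for j in suspicious_ids),
                default=0.0,
            )
    return weights


# Hypothetical messages; message 1 is the only known suspicious one.
toy_msgs = {
    1: ["transfer", "account", "tonight"],
    2: ["transfer", "account", "lunch"],
    3: ["lunch", "menu"],
}
toy_weights = message_weights(toy_msgs, suspicious_ids={1})
```

The resulting weights can replace the per-topic weights on the links before Algorithm 1 is run, which is the step the paragraph above describes.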
The application of Model 3 in Task 1
Text analysis
The text-analysis software outputs 15 sets of words automatically when the 15 topics are input.
Table 9
15 sets of words output by the software
Note: in the topic number column, the entries marked in blue are the known suspicious topics, which are used as the basis for judging the other topics.
Calculate Sentence Similarity
Compared with whole messages, the content of a topic is fairly short. So emphasis should be given to the semantic analysis of each sentence when calculating Sentence Similarity. Here we set α = 0.2, β = 0.8.
Assign weights to topics
As required, Topics 7, 11, and 13 are considered suspicious. Thus they serve as the references for judging the other topics. We then use Sentence Similarity to identify the suspicious rate of each topic.
For topic i:
1) If it is already considered suspicious, wi = 0.9;
2) Otherwise,
$$w_i = \max\{\, 0.9 \times \mathrm{Sim}(S_i, S_j) \mid j \in \{7, 11, 13\} \,\} \qquad (11)$$
Using the SNA technique we get a new
W = [0.626, 0.409, 0.298, 0.439, 0.608, 0.608, 0.900, 0.337, 0.379, 0.418, 0.900, 0.419, 0.900, 0.371, 0.364].
Find out the conspirators
Substituting the weights of the 15 topics from Section 2.3 into the calculation of Model 1, we get the graph of each node and its suspicious rate as follows:
Figure 1. The difference in results between semantic network analysis and manual analysis
In Figure 1, the red polyline shows the values obtained with weights assigned by manual analysis. The blue polyline shows the values obtained with weights assigned by TA and SNA. The trends of the two lines are similar, with only minor differences at several points, which indicates that both methods are reasonable. But the method applied in Model 3 is more accurate than manual analysis.
★ Task 4
A general method
In reality, we may need to face more complex problems, such as:
a more complex network;
a vast amount of data;
no key messages provided.
In this situation W cannot be identified by the method of Model 3, because there is no reference message from which to assign weights to the other messages. So the algorithm needs to be improved by other means. We give a new method for assigning weights to messages under the following assumptions:
1) All contents of the conversations are honest and true;
2) Some details about the conspiracy are known (such as motives of the crime, targets, etc.).
All messages can be assigned weights by fully exploiting the details of the crime at hand. We also bring the concept of Word Similarity into this model.
Extracting keywords
To abstract the details of the crime, we need to extract some characteristic keywords.
Principles for extracting keywords:
1) Keywords should be content words such as nouns and verbs;
2) The meanings of the words should be consistent;
3) Subject to the principles above, the number of keywords should be as large as possible.
Thus we get a set of keywords K.
Assigning weights to messages
After