Incremental Learning System for False Alarm Reduction

Preprocess – Feature Construction

Asset Information
- Records the internal device information, including:
  - IP address
  - Operating system
  - Device type (router, computer, server, ...)
- Lets analysts record their own assets.

Snapshot – Asset Information
- (Figure.)

Feature Extractor
- Constructs the features by extracting attributes from the asset information and from the alert information generated by the IDS.
- We choose 12 features plus 1 label feature.
- 6 alert-information features:
  - Signature name (1)
  - Source and destination IP address (2)
  - Source and destination port (2)
  - Protocol type (1)
- 6 asset-information features:
  - Intranet or external network of the corresponding source or destination IP (2)
  - Operating system of the corresponding source or destination IP, if it can be known (2)
  - Device type of the corresponding source or destination IP, if it can be known (2)

Feature Analysis on the DARPA 1999 Alerts
- Information gain is used to rank the discriminating ability of each attribute (a small code sketch of this ranking follows the summary below).
- Ranked attributes (gain, index, name):
    0.79092   1  sig_name
    0.78711   2  ip_src
    0.77314   5  layer4_sport
    0.73      6  layer4_dport
    0.57678   3  ip_dst
    0.44365   4  ip_proto
    0.30369  11  dst_os
    0.01934   8  src_os
    0.01922   9  src_devtype
    0.01909   7  src_intranet
    0.00538  12  dst_devtype
    0.00538  10  dst_intranet
- Selected attributes: 1, 2, 5, 6, 3, 4, 11, 8, 9, 7, 12, 10 (12 in total).

Summary
- The source IP clearly gets a good score, which suggests that a blacklist can contribute strongly to discrimination.
- The destination OS has the best score among the extra asset-information features. This may indicate that, when attackers know the target host's information, they usually focus on a specific OS and inject a specific attack.
- A statistical model should therefore give reasonable results, since several features have a high discriminating capability.
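As a concrete illustration of the ranking above, here is a minimal Python sketch of an information-gain computation over categorical alert attributes. The slides only show the ranked output, so the record layout, the toy data, and the `rank_by_information_gain` helper below are illustrative assumptions rather than the tool actually used:

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy reduction of the label obtained by splitting on one feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_by_information_gain(records, feature_names, label_name):
    """Return (gain, feature) pairs sorted from most to least discriminative."""
    labels = [r[label_name] for r in records]
    ranking = [(information_gain([r[f] for r in records], labels), f)
               for f in feature_names]
    return sorted(ranking, reverse=True)

# Toy example with two of the 12 features and a relevant/irrelevant label;
# the real input would be the extracted alert/asset records.
alerts = [
    {"sig_name": "A", "src_intranet": 1, "relevant": 1},
    {"sig_name": "A", "src_intranet": 0, "relevant": 1},
    {"sig_name": "B", "src_intranet": 1, "relevant": 0},
    {"sig_name": "B", "src_intranet": 0, "relevant": 0},
]
print(rank_by_information_gain(alerts, ["sig_name", "src_intranet"], "relevant"))
```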
Future Work
- Actively scan the internal devices (e.g., with Nessus).
- Outgoing traffic analysis.
- Extend with more features.

Incremental Learning Engine
- An incremental strategy for conquering concept drift.

Preliminary Incremental Ensemble Algorithm
- Incremental learning (inspired by [6,7,8]): incrementally tune the weights of the models in the committee.
- Advantages:
  - Continuous learning [11].
  - No need to see old examples again for re-training.
  - Faces the concept drift problem.
  - Boosts weak learners by re-weighting the training examples.

Preliminary Incremental Ensemble Algorithm (Cont.)
- Strategies:
  - Learning strategy: create a new learner for each newly incoming chunk of data. "Failure is the mother of success", "use the past as a mirror for the present": learn from the previous prediction mistakes in each round.
  - Forgetting strategy: the committee grows over time, so we have to forget unhelpful information. "Performance rating": count the number of erroneous predictions (accuracy < 1/2). "Aging out": aging for gradual forgetting.
  - Validation test strategy: "survival of the fittest": optima selection keeps the committee with the best performance on the validation set.
  - Committee decision: "majority vote": majority voting to reduce variance and bias.

Notation
- A chunk of data X_T arrives at each time T = t_1, t_2, ..., t_k, ...:
    X_{t1} = {(x_{1,1}, w_{1,1}, z_{1,1}), (x_{1,2}, w_{1,2}, z_{1,2}), ..., (x_{1,N}, w_{1,N}, z_{1,N})}
    X_{t2} = {(x_{2,1}, w_{2,1}, z_{2,1}), (x_{2,2}, w_{2,2}, z_{2,2}), ..., (x_{2,N}, w_{2,N}, z_{2,N})}
    ...
    X_{tk} = {(x_{k,1}, w_{k,1}, z_{k,1}), (x_{k,2}, w_{k,2}, z_{k,2}), ..., (x_{k,N}, w_{k,N}, z_{k,N})}
  or, in general, X_T = {(x_{t,n}, w_{t,n}, z_{t,n})}, n = 1..N.
- n = 1..N indexes the training examples within a chunk of size N.
- x_{t,n} is the input vector of the n-th example in the t-th chunk.
- w_{t,n} is x_{t,n}'s corresponding weight.
- z_{t,n} is x_{t,n}'s corresponding class label.

Initial Example Weight in Each Chunk
- w_n is set to 1/N by default (with class costs C_-1 = 1 and C_+1 = 1).
- If the classes carry different costs (for example, relevant alerts make up only a small proportion of all alerts), we may give the relevant examples a different weight, such as C_-1 = 1 and C_+1 = 50.

Sigmoid Aging Function for Forgetting
- Inspired by the aging method and the sigmoid function φ(v, a, b); a sketch follows the forgetting-curve figure below.
- The proposed sigmoid aging is a modified sigmoid mapped into the range 0~1.
- V = {v_{t1}, v_{t2}, ..., v_{tk}, ...} counts, for each learner, the number of erroneous predictions (accuracy < 1/2) before each learning round.
- v_t's initial value is 0; a is the slope of the sigmoid function; b is the right-shift parameter that sets a different starting point.

Sigmoid Function
- (Figure: the original sigmoid function, e.g. with a = 1.)

Sigmoid Aging – Forgetting Curve
- (Figure: the proposed sigmoid aging function, e.g. with a = 2, b = 4.)
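The slides describe the sigmoid aging function only through its parameters (slope a, right shift b) and the example curve with a = 2, b = 4. The sketch below is one plausible Python reading, assuming the forgetting weight is the complement of a shifted logistic in the error count v, so a fresh learner starts near weight 1 and decays towards 0 once its error count passes b; the exact formula in the original work may differ:

```python
import math

def sigmoid(x, a=1.0):
    """Standard logistic function with slope parameter a."""
    return 1.0 / (1.0 + math.exp(-a * x))

def sigmoid_aging(v, a=2.0, b=4.0):
    """Assumed aging weight in (0, 1) for a committee member with error count v.

    A fresh learner (v = 0) keeps a weight close to 1; as the number of rounds
    with accuracy < 1/2 grows past the right-shift b, the weight decays towards
    0, which gradually 'forgets' the learner.
    """
    return 1.0 - sigmoid(v - b, a)

if __name__ == "__main__":
    # Forgetting curve for the slide's example a = 2, b = 4.
    for v in range(9):
        print(v, round(sigmoid_aging(v), 4))
```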
Step 1: Begin to Learn
- Set the initial example weights {w_{1,n}}, n = 1..N, in the chunk X_{t1} and train the learner y_1.
- We then obtain the committee Y_T and the voting weight α_1 of y_1.

Step 2: Get New Chunk Data
- A new chunk X_{t2} = {(x_{2,n}, w_{2,n}, z_{2,n})}, n = 1..N, arrives.

Step 3: Count the Times of Error Prediction
- Each existing committee member (here y_1) predicts the new chunk X_{t2}.
- For each learner, if its accuracy < 1/2, then v_t = v_t + 1.
Step 4: Calculate the Example Weights
- Calculate the example weights from the previous mistakes (a boosting-style sketch follows Step 5 below).
- Set the initial weights {w_{2,n}} and pass the chunk through the old committee Y_{t1} to obtain the new weights {w'_{2,n}}.

Step 5: Train the New Learner
- Train the learner y_2 on X_{t2} = {(x_{2,n}, w'_{2,n}, z_{2,n})}, n = 1..N.
- We then obtain the new committee Y_T^new and the voting weight α_2 of y_2.
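The slides do not spell out the re-weighting rule itself; the following Python sketch is a boosting-style interpretation of Steps 3–5, assuming examples the old committee misclassifies are up-weighted with an AdaBoost-like exponential factor before the new learner is trained. The function name, the clamping, and the normalization are assumptions of this sketch:

```python
import math

def reweight_examples(chunk, committee_predict):
    """Boosting-style re-weighting of one chunk before training a new learner.

    chunk is a list of (x, w, z) triples and committee_predict(x) returns the
    old committee's predicted class for x.  Misclassified examples get a larger
    weight, correctly classified ones a smaller weight, and the weights are
    re-normalized to sum to 1.
    """
    weights = [w for (_, w, _) in chunk]
    # Weighted error of the old committee on the new chunk.
    err = sum(w for (x, _, z), w in zip(chunk, weights)
              if committee_predict(x) != z) / sum(weights)
    err = min(max(err, 1e-6), 1 - 1e-6)          # keep the log well-defined
    alpha = 0.5 * math.log((1 - err) / err)      # AdaBoost-style voting weight
    new_weights = [w * math.exp(alpha if committee_predict(x) != z else -alpha)
                   for (x, _, z), w in zip(chunk, weights)]
    total = sum(new_weights)
    return [w / total for w in new_weights], alpha
```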
Step 6: Validation Test
- Use a validation set recorded from the recent data held in a buffer.
- Optima selection keeps the best-performing committee as the winner between Y_T and Y_T^new.

Step 7: Repeat Until Terminated
- Repeat this process whenever a new chunk arrives (go to Step 2), until the process is terminated.

Committee Decision Function
- The decision function incorporates aging.
- When a new alert instance is generated, its predicted class is determined by the committee.
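The decision function itself appears only as a figure; the sketch below shows one way an aging-weighted majority vote could be realized, assuming each member votes with its learning weight alpha scaled by the sigmoid aging factor of its accumulated error count v. The `committee_decide` helper and the (learner, alpha, v) triple layout are assumptions of this sketch:

```python
import math

def sigmoid_aging(v, a=2.0, b=4.0):
    """Assumed aging factor in (0, 1), as in the earlier sketch."""
    return 1.0 - 1.0 / (1.0 + math.exp(-a * (v - b)))

def committee_decide(committee, x, a=2.0, b=4.0):
    """Weighted majority vote with aging.

    committee is a list of (learner, alpha, v) triples: learner(x) returns a
    class label, alpha is the learner's voting weight, and v is its error
    count.  Old, frequently wrong members gradually lose influence.
    """
    scores = {}
    for learner, alpha, v in committee:
        label = learner(x)
        scores[label] = scores.get(label, 0.0) + alpha * sigmoid_aging(v, a, b)
    return max(scores, key=scores.get)
```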
Experiment (1)
- Motivation: how much training data is enough to train a classifier? And how long until the model becomes useless or unreliable?
- Experiment design: DARPA 1999 alert data.

Different Training Sizes with Different Models
- (Figure.)

Summary
- In this experiment, most models reach good accuracy once more than 30% of the whole data set is used as training data.
- That means a model might predict well on roughly 3 times the amount of data it was trained on.
- So we arrange the next experiment to determine when is the better time to invoke a new learning process.

Experiment (2)
- Motivation: understand the effect of the concept drift problem.
- Experiment design: calculate the accuracy on each chunk of data using the committee available at that time (as sketched below).
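A minimal sketch of that per-chunk, test-then-train evaluation, assuming a stream of labelled chunks, a committee decision function, and an `update_committee` callback standing in for Steps 2–7; all three names are placeholders rather than the thesis' actual interfaces:

```python
def per_chunk_accuracy(chunks, committee, decide, update_committee):
    """Accuracy of the current committee on each incoming chunk.

    chunks yields lists of (x, z) pairs; decide(committee, x) is the committee
    decision function; update_committee(committee, chunk) runs one incremental
    round (Steps 2-7) and returns the new committee.
    """
    accuracies = []
    for chunk in chunks:
        correct = sum(1 for x, z in chunk if decide(committee, x) == z)
        accuracies.append(correct / len(chunk))         # test on the chunk first,
        committee = update_committee(committee, chunk)  # then learn from it
    return accuracies
```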
Accuracy Comparison in Each Chunk (Decision Stump)
- (Figure series, one plot per setting:)
- Employing only the last model to predict the current chunk of data.
- Compared with holding all previously learned model(s) for prediction.
- Incremental learning without the validation strategy.
- Taking advantage of pruning valueless newly learned models.
Experimental Result (Decision Stump, chunk size 2200)

  Model Type                                  | Chunk Ave. Accuracy (%) | Accuracy (%)
  Leave Recent 1 Model                        | 87.1972 (Std. 0.2826)   | 87.6192
  Entire Models                               | 35.4277 (Std. 0.4325)   | 65.7924
  IL without Validation Strategy (a,b)=(2,3)  | 75.5721 (Std. 0.3766)   | 88.0392
  IL Strategies (a,b)=(2,3)                   | 88.4456 (Std. 0.2556)   | 92.09

Summary
- Clearly, our algorithm has the characteristic of reducing both variance and bias, and it can reduce the effect of concept drift when drift happens.
- There are four gaps in the diagram, which correspond to the concept changes.
- In the first gap, our algorithm balances between the most recent model and the entire set of models.
- In the second gap, our validation strategy succeeds in jumping the gap by pruning the useless newly learned model.
- In the third and fourth gaps, the algorithms with and without the validation strategy show different abilities to reduce bias and variance in a good way.
- In the little tail part, the balanced model can take advantage of the entire models' accuracy, so we obtain better performance.
Accuracy Comparison in Each Chunk (C4.5)
- (Figure series, one plot per setting:)
- Employing only the last model to predict the current chunk of data.
- Compared with holding all previously learned model(s) for prediction.
- Incremental learning without the validation strategy.
- Taking advantage of pruning valueless newly learned models.
- Compared with the batch model re-trained on the entire previous data.
Experimental Result (C4.5, chunk size 2200)

  Model Type                                  | Chunk Ave. Accuracy (%) | Accuracy (%)
  Leave Recent 1 Model                        | 82.3623 (Std. 0.3347)   | 76.3236
  Entire Models                               | 50.7340 (Std. 0.4322)   | 72.4316
  IL without Validation Strategy (a,b)=(2,3)  | 80.0802 (Std. 0.3321)   | 81.1242
  IL Strategies (a,b)=(2,3)                   | 88.1399 (Std. 0.2557)   | 93.6834
  Re-train                                    | 88.01 (Std. 0.2337)     | 93.99

Experiment (3)
- Motivation: compare the performance of batch training with the performance of incremental training.
- Experiment design (see the sketch after this list):
  - Batch training: use the entire previous data as the training data to re-train the model.
  - Incremental training: employ our proposed algorithm.
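A small sketch of the contrast between the two settings, assuming `train_model` stands in for the base learner and `update_committee` for one round of the proposed algorithm (both are placeholders); it only makes explicit that the batch baseline re-reads every previous chunk, while the incremental path touches the newest chunk only:

```python
def batch_retrain(previous_chunks, train_model):
    """Batch baseline: re-train one model on every example seen so far."""
    all_examples = [example for chunk in previous_chunks for example in chunk]
    return train_model(all_examples)

def incremental_step(committee, new_chunk, update_committee):
    """Incremental setting: only the newest chunk updates the committee."""
    return update_committee(committee, new_chunk)
```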
Accuracy Comparison in Each Chunk (Decision Stump)
- (Figure series: comparing our model with the batch-learning model, i.e. the model re-trained on the entire previous data.)
Experimental Result (Decision Stump, chunk size 2200)

  Model Type                  | Chunk Ave. Accuracy (%) | Accuracy (%)
  IL Strategies (a,b)=(2,3)   | 88.4456 (Std. 0.2556)   | 92.09
  Re-train                    | 93.9735 (Std. 0.1848)   | 92.03
  Re-train with AdaBoost      | 93.7417 (Std. 0.1960)   | 94.22
Experimental Result (Naive Bayes, chunk size 2200)

  Model Type                  | Chunk Ave. Accuracy (%) | Accuracy (%)
  IL Strategies (a,b)=(2,3)   | 84.4670 (Std. 0.3051)   | 92.3855
  Re-train                    | 90.5612 (Std. 0.2355)   | 94.75
  Re-train with AdaBoost      | 88.2039 (Std. 0.2561)   | 93.32

Summary
- Although our model achieves somewhat lower performance than the models re-trained on the entire previous data, its performance is very close to theirs.
- The point is that our model never sees the entire data, which saves many resources, especially in the training phase.

Experiment (4)
- Motivation: for the methodology of combining models, which is the better strategy: holding all trained models, or holding only the recent model(s)?
- Experiment design:
  - Hold the entire set of models in the committee for prediction.
  - Hold a constant number of recent models in the committee for prediction.
  - Hold a dynamic number of models in the committee (our proposed algorithm).
  - Observe the results with different base models.
Performance Comparison (Decision Stump)

  Model Type                    | Accuracy (%) | # of Active Models | # of Entire Models
  Leave Recent 1                | 87.6192      | 1                  | 25
  Leave Recent 3                | 86.0509      | 3                  | 25
  Entire Models                 | 65.7924      | 25                 | 25
  IL without Validation (2,1)   | 84.1184      | 9                  | 24
  IL without Validation (2,3)   | 88.0392      | 10                 | 23
  ILS (2,1)                     | 91.2516      | 4                  | 10
  ILS (2,3)                     | 92.09        | 4                  | 10
  Re-Train                      | 92.03        | 1                  | 1

Summary
- The best result is obtained with our full set of IL strategies.
- In particular, the whole process uses fewer than half as many models as the Entire Models approach, so it is fast and reduces memory exhaustion.

Question & Answer

1. What are the disadvantages of a rule-based system?
- Weak at detecting novel types of data.
- Difficult to maintain.
- Rule overlap.
- Hard for analysts to represent their knowledge as rules.
- Throughput.

2. Is the training data batch data or stream data?
- Batch data: the analyst needs to respond with feedback, so it can be batch data, but the size of each batch chunk may differ between responses.
- Workflow: new alerts are raised by the IDS; the analyst checks whether each is a TP or an FP (gives feedback); the chunk of data is then submitted for incremental learning.

3. Why use a boosting approach as the incremental kernel approach?
- It reduces variance and bias.
- In a frequently changing environment, the learning model should learn fast and keep memory exhaustion as low as possible, so weak learners should be boosted.
- Evidently, the results show that this kind of approach can reduce the effect of concept drift, especially at the change points.

4. Does the DARPA 1999 dataset include concept drift or concept change?
- Concept drift (context change) in the fourth and fifth weeks' data: according to the proposal of the dataset, the novel attacks were invoked during the fourth and fifth weeks.
- Concept change: if we observe the data over time, there are also new, different behaviors, such as the attack-free weeks (weeks 1 and 3) and the novel attacks (weeks 4 and 5). But if i