arXiv:2305.06360v2 [cs.LG] 14 May 2023

Exploring the Landscape of Machine Unlearning: A Comprehensive Survey and Taxonomy

Thanveer Shaik, Xiaohui Tao, Haoran Xie, Lin Li, Xiaofeng Zhu, and Qing Li

Thanveer Shaik and Xiaohui Tao are with the School of Mathematics, Physics and Computing, University of Southern Queensland, Queensland, Australia (e-mail: Thanveer.Shaik@.au, Xiaohui.Tao@.au).
Haoran Xie is with the Department of Computing and Decision Sciences, Lingnan University, Tuen Mun, Hong Kong (e-mail: hrxie@.hk).
Lin Li is with the School of Computer and Artificial Intelligence, Wuhan University of Technology, China (e-mail: cathylilin@).
Xiaofeng Zhu is with the University of Electronic Science and Technology of China (e-mail: seanzhuxf@).
Qing Li is with the Department of Computing, Hong Kong Polytechnic University, Hong Kong Special Administrative Region of China (e-mail: qing-prof.li@.hk).

Abstract—Machine unlearning (MU) is gaining increasing attention due to the need to remove or modify predictions made by machine learning (ML) models. While training models have become more efficient and accurate, the importance of unlearning previously learned information has become increasingly significant in fields such as privacy, security, and fairness. This paper presents a comprehensive survey of MU, covering current state-of-the-art techniques and approaches, including data deletion, perturbation, and model updates. In addition, commonly used metrics and datasets are also presented. The paper also highlights the challenges that need to be addressed, including attack sophistication, standardization, transferability, interpretability, training data, and resource constraints. The contributions of this paper include discussions about the potential benefits of MU and its future directions. Additionally, the paper emphasizes the need for researchers and practitioners to continue exploring and refining unlearning techniques to ensure that ML models can adapt to changing circumstances while maintaining user trust. The importance of unlearning is further highlighted in making Artificial Intelligence (AI) more trustworthy and transparent, especially with the increasing importance of AI in various domains that involve large amounts of personal user data.

Index Terms—Machine Unlearning, Privacy, Right to be Forgotten, Data Deletion, Differential Privacy, Model Update, Adversarial Attacks

I. INTRODUCTION

Machine learning (ML) refers to the process of training an algorithm to make predictions or decisions based on data [1]. ML has become increasingly important in applications such as health, higher education, and other relevant domains. In healthcare, ML models can be used to predict patient outcomes, identify high-risk patients, and personalize treatment plans [2]. For higher education, ML has been used to improve student outcomes and enhance the learning experience, or even to analyze student data and predict their online class engagement. In ML, an algorithm is trained on a dataset to learn patterns and relationships in the data. Once the algorithm has been trained, it can be used to make predictions on new data. Thus, the goal of ML is to create accurate models that generalize well onto new data [3].

On the other hand, machine unlearning (MU) is the process of removing certain data points or features from a trained ML model without affecting its performance [4]. MU is a relatively new and challenging field of research that is concerned with developing techniques for removing sensitive or irrelevant data from trained models. The goal of MU is to ensure that trained models are free from biases and sensitive information that could lead to negative outcomes [5].

MU was first introduced by Cao et al. [6], who recognized the need for a “forgetting system” and developed one of the initial unlearning algorithms, called machine unlearning. This approach efficiently removes data traces by converting learning algorithms into a summation form, which can help counter data pollution attacks. The increasing need for regulatory compliance with modern privacy regulations led to the establishment of MU, which involves deleting data not only from storage archives, but also from ML models [7]. Existing studies update the model weights for unlearning using either the whole training data, a subset of the training data, or some metadata stored during training [8]. Although strict regulatory compliance requires the timely deletion of data, there are instances where data pertaining to the training process may not be available for unlearning purposes. Companies and organizations commonly employ user data to train ML models, but legal frameworks like GDPR, CCPA, and CPPA demand that user data be erased when requested [9]. The question is whether merely deleting the data is sufficient, or if the models trained on this data should also be adjusted [10]. However, straightforward techniques like retraining models from scratch or check-pointing can be computationally costly and require significant storage resources [11]. With MU, we can modify models to exclude specific data points more efficiently [12].

Various techniques have been suggested for managing user data deletion requests, such as optimization, clustering, and regression methods. Conducting a comprehensive survey of existing literature on managing user data deletion requests can support the identification of gaps and trends in the field, which will guide future research and provide insights for organizations handling such requests. In this study, we aim to address the following research questions.

1) What are the most effective techniques for unlearning data from ML models?

2) How can the impact of unlearning on model performance be measured and evaluated?

3) What are the challenges in MU, and how can these challenges be addressed?


The contributions of this study are as follows:

● A comprehensive and up-to-date taxonomy of the emerging field of MU, including an explanation of its importance and potential applications.

● A detailed taxonomy of the various techniques and approaches that have been developed for unlearning data from ML models, such as data deletion, data perturbation, and model update techniques.

● A discussion of different evaluation methods for assessing the effectiveness of MU techniques, such as measuring the degree of forgetting or their impact on model performance.

● A taxonomy of several key challenges in the field of MU, including attack sophistication, standardization, transferability, interpretability, training data, and resource constraints.

● Finally, a discussion of the potential benefits of MU and its future directions in natural language processing (NLP), computer vision, and recommender systems.

The remainder of the paper is organized as follows. Section II outlines the aims and objectives of MU. In Section III, we delve into data deletion, data perturbation, and model update techniques in greater depth. Section IV details the evaluation metrics of MU, while Section V discusses the challenges associated with the field and proposes potential solutions. In Section VI, we explore the future directions of MU in NLP, computer vision, and recommender systems. Finally, Section VII concludes the paper.

II. OVERVIEW OF MACHINE UNLEARNING

The concept of the “right to be forgotten” refers to the ability to have personal information removed from online search results and directories in certain situations [13]. However, there is no consensus on its definition, or whether it should be classified as a human right. Nevertheless, various institutions and governments, such as those in Argentina, the European Union (EU), and the Philippines, are beginning to propose regulations around this issue.¹

Information and events from an individual's past can continue to carry a stigma and lead to negative consequences, even after a considerable amount of time has passed. For example, in July 2018, Disney fired writer and director James Gunn, who was credited for much of the studio's success with films such as “Guardians of the Galaxy”, after old tweets resurfaced containing dark humor about pedophilia and rape. Fans and actors rallied to Gunn's defense, with an open letter calling for his reinstatement and petitions to rehire him [14]. However, deleting what a person has posted on social media platforms such as Facebook and Instagram may not entirely remove the data from the Internet. Although Facebook launched a tool called “Off-Facebook Activity” to help users delete data that third-party apps and websites share with Facebook, it only de-links the data from the user. In 2014, a Spanish court ruled in favor of a man who requested that Google remove certain information about him from its search results [15]. The court found that the information was no longer relevant or adequate, as the debt had been paid a long time before. The EU court also ruled that Google needed to remove the search results.

¹ https://link.library.eui.eu/portal/The-Right-To-Be-Forgotten–A-Comparative-Study/tw0VHCyGcDc/

The “right to be forgotten” is used to describe an individual's right to request that their personal information be removed from the Internet, particularly search engine results, in certain cases [16]. Supporters of this right argue that it is necessary to protect individuals from having past mistakes or personal information used against them, such as in cases of revenge porn, petty crimes, or unpaid debts. However, critics of this right claim that it infringes upon freedom of expression and the right to criticize. The EU has tried to address these concerns by striking a balance between the right to privacy and freedom of expression [17]. The issue is further complicated by the use of ML, which can collect and analyze vast amounts of data indefinitely. This data can then be used in applications such as insurance, medical, and loan evaluations, leading to potential harm and amplifying existing biases. As such, it is important to consider the ethical implications of ML models and data collection in these contexts. We generate a definition of MU based on a comprehensive review of existing research literature, including the studies in [6], [8], [18]–[22]:

General Definition: Machine unlearning is a concept that refers to the process of removing or “forgetting” previously learned information from a machine learning model. In essence, it is the opposite of machine learning; while machine learning is all about training models to recognize patterns and make predictions based on data, machine unlearning aims to undo or reverse that process, by removing previously learned patterns or predictions that are no longer relevant or accurate.

MU is an emerging field within the realm of artificial intelligence (AI) that seeks to remove specific data points from a model without compromising its performance. This technique, also known as selective amnesia [23], has a variety of potential applications, including enabling individuals to exercise their “right to be forgotten” and preventing AI models from inadvertently leaking sensitive information. MU can also help combat data poisoning and adversarial attacks. Through its application, MU has the following objectives:

● To address privacy concerns in ML by eliminating sensitive or personal data from the model without significantly reducing its performance. It is different from ML, which focuses on training models to predict outcomes based on input data. The works cited in this context include research on novel techniques for privacy-preserving ML, statistical methods for data protection, and adaptive algorithms that adjust to changing data privacy requirements [24]–[34].

● To improve the accuracy and fairness of ML models by removing biases or correcting errors that may have been introduced during the learning process. This is typically done by analyzing the model's performance on various metrics and identifying areas where improvement is needed. These works cover research on mitigating bias in machine learning models and techniques for improving fairness in algorithmic decision-making [35]–[45].


● To improve the performance of ML models over time by allowing them to adapt to changing data and circumstances. By unlearning outdated or irrelevant information, ML models can become more accurate, efficient, and adaptable to new situations. The references cited in this context include studies on transfer learning, which involves applying knowledge from previously learned tasks to new problems [46]–[53].

Despite the significant investment that companies make in training and deploying large AI models, regulators in both the EU and the United States are cautioning that models trained on sensitive data may need to be removed. In a report focused on AI frameworks, the UK government explained that ML models may be subject to data deletion under the General Data Protection Regulation (GDPR). For instance, Paravision was recently found to have collected millions of facial photos inappropriately and was required by the US Federal Trade Commission to delete both the data and any trained models that relied on it.² The most straightforward strategy for removing a data point from training data and updating the model is to conduct retraining [54]. Unfortunately, this procedure incurs considerable costs, as exemplified by OpenAI's reported expenses of up to 20 million dollars to train GPT-3 [55]. Hence, there is a need for more cost-effective and efficient methods to address data point removal in ML models.

The challenge is to balance privacy and the right to expression to prevent the right to be forgotten from becoming a form of censorship [56]. Balancing privacy and the right to expression is crucial in implementing the right to be forgotten without this process being misused. The emergence of new technologies, such as blockchain, presents new challenges in maintaining this balance. Furthermore, the increasing public sensitivity toward data privacy has prompted many companies to prioritize user privacy. For example, Google recently announced an expanded policy for US citizens to remove personal data from search results.³

However, when data points are eliminated, the AI models trained on them need to be appropriately cleaned up to avoid perpetuating biased or sensitive information. While MU is a complex challenge, various approaches are being tested and developed to address this issue. As regulations on data privacy increase, MU is expected to play a critical role in ensuring that AI models are transparent and ethical.

III. TECHNIQUES AND APPROACHES

This section categorizes the MU techniques into three groups, namely Data Deletion, Data Perturbation, and Model Update techniques, as illustrated in the taxonomy in Fig. 1. The first research question will also be addressed in this section.

A. Data Deletion

In this subsection, we define the data deletion techniques such as data poisoning, data subsampling, and data shuffling.

² /story/startup-nix-algorithms-ill-gotten-facial-data/
³ /2022/09/28/google-rolls-out-tool-to-request-removal-of-personal-info-from-search-results-will-later-add-proactive-alerts/

1) Data poisoning: Data poisoning is a technique used in MU to intentionally introduce incorrect or misleading data into the training dataset. The goal of data poisoning is to degrade the accuracy of the ML model, often with malicious intent. This technique is often used in attacks on privacy-preserving systems or to manipulate the results of automated decision-making processes.

Suppose we have a dataset D = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)} and a malicious adversary wants to inject a backdoor into the model by modifying a fraction of the training data. The attacker adds a poisoned data point (x_i, y_target) to the training data, with the goal of making the model predict the specific target label y_target instead of the true label y_i. The poisoned dataset can be written as:

D' = {(x_1, y_1), (x_2, y_2), ..., (x_i, y_target), ..., (x_n, y_n)}    (1)

where (x_i, y_target) is the poisoned data point.

The attacker's objective is to minimize the loss function L(θ; D') subject to the constraint that the model accuracy on the original training data D = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}, denoted by Acc(θ; D), does not fall below a predefined threshold Acc_0, which can be written as:

minimize_θ  L(θ; D')    subject to    Acc(θ; D) > Acc_0    (2)

The process of data poisoning involves an attacker identifying vulnerabilities in the data collection process and then submitting maliciously crafted data into the system. The malicious data is often designed to look like legitimate data to better evade detection. Once the malicious data is introduced, the ML model can become biased or produce incorrect results.
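To make Eqs. (1)–(2) concrete, the following is a minimal label-flipping sketch in Python with scikit-learn. It is an illustration rather than a method from the surveyed works; the poison fraction, target label, and accuracy threshold Acc_0 are assumed values.

```python
# Minimal label-flipping poisoning sketch (illustrative; not from the surveyed works).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Clean dataset D, with a held-out split used to check Acc(theta; D) in Eq. (2).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Build the poisoned dataset D' of Eq. (1): flip the labels of a small
# fraction of training points to the attacker's target label.
poison_fraction, y_target = 0.05, 1          # assumed attack parameters
n_poison = int(poison_fraction * len(y_train))
poison_idx = rng.choice(len(y_train), size=n_poison, replace=False)
y_poisoned = y_train.copy()
y_poisoned[poison_idx] = y_target            # (x_i, y_target) replaces (x_i, y_i)

# Train on D' and verify the stealth constraint Acc(theta; D) > Acc_0.
model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)
acc_clean = model.score(X_test, y_test)
acc_0 = 0.80                                 # assumed accuracy threshold
print(f"accuracy on clean data: {acc_clean:.3f}, constraint satisfied: {acc_clean > acc_0}")
```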

A projected gradient descent (PGD) solution to the data poisoning problem is formulated by Marchant et al. [57]. Their article discusses the challenge of complying with data protection regulations, such as the right to erasure, when it comes to trained ML models. They identified a new vulnerability in ML systems, namely “poisoning attacks” that slow down unlearning. Their suite of experiments explores the effects of these attacks in various settings and highlights the risks of deploying approximate unlearning algorithms with data-dependent runtimes. Marchant et al. [57] call into question the extent to which unlearning improves performance over full retraining, showing that data poisoning can harm computation beyond accuracy, similar to conventional denial-of-service attacks. Sun et al. [58] discussed the threat posed by attackers who exploit federated learning to launch data poisoning attacks on different nodes. These authors demonstrated a data poisoning attack on Federated Multitask Learning [59] by formulating an optimal attack strategy as a general bi-level optimization problem. They also defined three attacks: a direct attack, an indirect attack, and a hybrid attack. In a direct attack, all the target nodes are directly injected with poisoned data during training, whereas in the indirect mode of attack, the attackers target related devices through communication protocols. In hybrid attack mode, the attackers adopt both direct and indirect attacks. To overcome these attacks, the authors proposed an Attack on Federated Learning (AT2FL) framework wherein implicit gradients of poisoned data can be computed inside the source attacking nodes.

Fig. 1. Machine Unlearning - Taxonomy. [Figure: a taxonomy tree grouping machine unlearning techniques into Data Deletion (Data Poisoning, Data Subsampling, Data Shuffling, Inverse Data Generation), Data Perturbation (Data Anonymization, Differential Privacy), and Model Update Techniques (Regularization, Transfer Learning, Distillation, Model Pruning, Model Inversion), with representative works cited under each leaf.]

Data poisoning can be used to manipulate training through adversarial attacks, such as random label-flipping and distance-based label-flipping attacks. In their study, Yerlikaya et al. [60] conducted empirical experiments to evaluate the performance of six ML algorithms under these two adversarial attacks. The authors used spam, botnet, malware, and cancer detection datasets to evaluate the algorithms by launching adversarial attacks on them. The results showed that algorithm behavior depends on the type of dataset.

Poisoning attacks typically involve maliciously altering the training dataset to decrease classification accuracy or to misclassify specific inputs when the model is deployed. Thus, hash-based ensemble approaches have been proposed as a solution to counteract poisoning attacks, but their effectiveness in different scenarios, such as tabular datasets and ensemble-based ML algorithms (e.g., Random Forests), has not been fully evaluated. The robustness of a hash-based ensemble approach against data poisoning in a tabular dataset was evaluated by Anisetti et al. [61] using a Random Forest (RF) algorithm as a worst-case scenario. Their results showed that even small ensembles can protect against poisoning and that plain RFs are highly sensitive to label flipping, but almost insensitive to other perturbations.

In data poisoning circumstances, selecting the hyperparameters of deep learning (DL) models is critical to maintaining or enhancing the performance metrics. Maabreh et al. [62] proposed developing DL models that are optimized using the nature-inspired particle swarm optimizer (PSO) [63] while some of the training data samples are fake, i.e., poisoned. The results showed that an increase in the poisoning rate decreases all the performance metrics, such as accuracy, recall, precision, and F1-score. PSO can recommend different values for important parameters and improve model performance, even with a high poisoning rate. However, caution should be taken when using PSO, as it may temporarily hide the existence of fake samples and fail when there is a significant concentration of poison in the dataset.

There are several approaches to defend against data poisoning attacks, including robust training methods that can identify and remove malicious data, as well as techniques that can detect changes in the data distribution. However, sophisticated attackers can also bypass these techniques, so ongoing research is needed to develop more effective defenses.
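To make the hash-based ensemble defense mentioned above more concrete, below is a minimal sketch of one way such an ensemble could be assembled, assuming scikit-learn and a simple hash-of-index sharding rule. It illustrates the general idea of isolating poisoned points within individual shards; it is not the exact construction evaluated by Anisetti et al. [61].

```python
# Hash-partitioned ensemble sketch (illustrative; not the exact method of [61]).
import hashlib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=3000, n_features=20, random_state=1)

n_shards = 5
# Assign every training point to one shard via a hash of its index, so a
# poisoned point can only influence the single base model trained on its shard.
shard_of = np.array(
    [int(hashlib.sha256(str(i).encode()).hexdigest(), 16) % n_shards for i in range(len(y))]
)

base_models = []
for s in range(n_shards):
    mask = shard_of == s
    clf = RandomForestClassifier(n_estimators=50, random_state=s)
    clf.fit(X[mask], y[mask])
    base_models.append(clf)

def ensemble_predict(X_new):
    # Majority vote across shard models dilutes the effect of any poisoned shard.
    votes = np.stack([m.predict(X_new) for m in base_models])
    return np.round(votes.mean(axis=0)).astype(int)

print(ensemble_predict(X[:5]), y[:5])
```

Because each base model sees only its own shard, a poisoned shard can be outvoted by the clean ones; the same sharding idea also underlies shard-based unlearning approaches such as SISA training, which appears in Fig. 1.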

2) Data Subsampling: Data subsampling is a technique used in MU to reduce the amount of data used in the model training process. In this technique, a subset of the original data is randomly selected, and only that subset is used to train the model. This technique can be useful in cases where the original dataset is very large and the computational resources required for training the model on the full dataset are prohibitive.

Let X = {x_1, x_2, ..., x_n} be the original training dataset with corresponding labels Y = {y_1, y_2, ..., y_n}. We randomly select a subset S of size m < n, such that S = {s_1, s_2, ..., s_m}. We remove the selected subset from X and Y to create new training sets X' and Y':

X' = {x_1, x_2, ..., x_n} \ S
Y' = {y_1, y_2, ..., y_n} \ S    (3)
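As a simple illustration of Eq. (3), the sketch below (an assumed setup, not taken from the surveyed works) removes a randomly selected subset S from the training set and retrains the model from scratch on the remaining data X', Y'.

```python
# Subsample-and-retrain sketch for Eq. (3) (illustrative).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

# Original training set (X, Y).
X, Y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Randomly choose the subset S to remove (here m = 50 points, an assumed size).
m = 50
S = rng.choice(len(Y), size=m, replace=False)

# Build X' and Y' by removing S, then retrain from scratch on the remainder.
keep = np.setdiff1d(np.arange(len(Y)), S)
X_prime, Y_prime = X[keep], Y[keep]

model = LogisticRegression(max_iter=1000).fit(X_prime, Y_prime)
print(f"retrained on {len(Y_prime)} of {len(Y)} points after removing |S| = {m}")
```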


TABLE I
DATA DELETION TECHNIQUES FOR MACHINE UNLEARNING

Technique | Reference | Problem | Proposed Framework
Data Poisoning | Marchant et al. [57] | Complying with data protection regulations | Projected gradient descent (PGD)
Data Poisoning | Sun et al. [58] | Data poisoning attack on Federated Learning | Attack on Federated Learning (AT2FL) framework
Data Poisoning | Yerlikaya et al. [60] | Adversarial attacks on machine learning algorithms | Evaluation of machine learning algorithm performances in attacks using data poisoning
Data Poisoning | Anisetti et al. [61] | Poisoning a tabular dataset | Hash-based ensemble approach (evaluated with Random Forest)
