Adversarial Example Detection
姜育剛、馬興軍、吳祖煊

Recap: Week 3
1. Adversarial Examples
2. Adversarial Attacks
3. Adversarial Vulnerability Understanding

In-class Adversarial Attack Competition
https://codalab.lisn.upsaclay.fr/competitions/15669?secret_key=77cb8986-d5bd-4009-82f0-7dde2e819ff8
In-class Adversarial Attack Competition (accounts for 30% of the course grade)
- You must register for the competition with your university email address; otherwise your result will not be counted.
- Schedule: Phase 1: October 1 – October 28; Phase 2: evaluation phase (students do not participate).
- Students without a GPU can use Google Colab.
- Scores are assigned by ranking: 1st place receives 30 points, last place receives 15 points.

Adversarial Example Detection (AED)
- A binary classification problem: clean (y=0) or adversarial (y=1)?
- An anomaly detection problem: benign (y=0) or abnormal (y=1)?
Principles for AED
- All binary classification methods can be applied to AED (a minimal detector sketch follows below).
- All anomaly detection methods can be applied to AED.
- Use as much information as you can: input statistics, manual features, training data, attention maps, transformations, mixup, denoising, …, activations, deep features, probabilities, logits, gradients, loss landscape, uncertainty, …
- Leverage the unique characteristics of adversarial examples: they are "twins" of clean samples in input space (extremely close to the clean sample) but "strangers" in prediction (far away in prediction).
- Build detectors based on existing understandings of adversarial vulnerability: high-dimensional pockets, local linearity, boundary tilting.
- It is still feature engineering!
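Since any binary classifier can serve as a detector once informative features are extracted, here is a minimal sketch of AED as binary classification. It assumes feature vectors (e.g., logits or deep-layer activations) have already been extracted for known clean and adversarial samples; the random data and variable names are illustrative placeholders, not from the original slides.

```python
# A minimal sketch of AED as binary classification on pre-extracted features
# (e.g., logits or deep features); the random feature matrices are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
clean_feats = rng.normal(0.0, 1.0, size=(500, 64))   # placeholder clean features
adv_feats   = rng.normal(0.5, 1.2, size=(500, 64))   # placeholder adversarial features

X = np.vstack([clean_feats, adv_feats])
y = np.concatenate([np.zeros(len(clean_feats)), np.ones(len(adv_feats))])  # 0 = clean, 1 = adv

detector = LogisticRegression(max_iter=1000).fit(X, y)
scores = detector.predict_proba(X)[:, 1]              # probability of being adversarial
print("training AUC:", roc_auc_score(y, scores))
```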
Challenges in AED
- The diversity of the adversarial examples used to train a detector determines its detection performance.
- Detectors are themselves machine learning models, so they are also vulnerable to adversarial attacks.
- Detectors need to detect both existing and unknown attacks.
- Detectors need to be robust to adaptive attacks.
Existing Methods
1. Secondary Classification Methods
2. Principal Component Analysis (PCA)
3. Distribution Detection Methods
4. Prediction Inconsistency
5. Reconstruction Inconsistency
6. Trapping-Based Detection
Secondary Classification Methods: Adversarial Retraining
- Take adversarial examples as a new class and retrain the classifier with an extra output class (a sketch follows below).
- Grosse et al. "On the (Statistical) Detection of Adversarial Examples." arXiv:1702.06280.
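A minimal sketch of the retraining idea, assuming a PyTorch classifier whose final layer is widened by one class; the `make_adv` attack, the tiny backbone, and the training loop are illustrative placeholders rather than the exact setup of the paper.

```python
# A minimal sketch of adversarial retraining: adversarial examples are assigned
# an extra (K+1)-th class. The backbone, attack, and data are placeholders.
import torch
import torch.nn as nn

num_classes = 10
backbone = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU())
head = nn.Linear(256, num_classes + 1)          # extra output for the "adversarial" class
model = nn.Sequential(backbone, head)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

def make_adv(x):
    # placeholder attack: in practice generate FGSM/PGD examples against the model
    return (x + 0.1 * torch.randn_like(x)).clamp(0, 1)

def train_step(x_clean, y_clean):
    x_adv = make_adv(x_clean)
    y_adv = torch.full((len(x_adv),), num_classes)   # label adversarial inputs as class K
    x, y = torch.cat([x_clean, x_adv]), torch.cat([y_clean, y_adv])
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
    return loss.item()

# usage: train_step(torch.rand(32, 1, 28, 28), torch.randint(0, num_classes, (32,)))
```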
Secondary Classification Methods: Adversarial Classification
- Treat clean samples as class 0 and adversarial samples as class 1, and train a separate binary detector.
- Gong et al. "Adversarial and clean data are not twins." arXiv:1704.04960.
Secondary Classification Methods: Cascade Classifiers
- Train a detector for each intermediate layer of the network (a per-layer sketch follows below).
- Metzen, Jan Hendrik, et al. "On detecting adversarial perturbations." arXiv:1702.04267 (2017).
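A minimal sketch of per-layer detection using forward hooks on a PyTorch model; the chosen layers, the global-average-pooled features, and the logistic-regression detectors are illustrative simplifications, not the exact detector subnetworks of the paper.

```python
# A minimal sketch of per-layer detectors: collect intermediate activations with
# forward hooks and fit one binary detector per layer (illustrative choices).
import numpy as np
import torch
import torchvision
from sklearn.linear_model import LogisticRegression

model = torchvision.models.resnet18(weights=None).eval()
layers = {"layer2": model.layer2, "layer3": model.layer3}
acts = {}

def hook(name):
    def fn(module, inp, out):
        # global average pooling turns (B, C, H, W) activations into (B, C) features
        acts[name] = out.mean(dim=(2, 3)).detach().numpy()
    return fn

for name, module in layers.items():
    module.register_forward_hook(hook(name))

def layer_features(x):
    with torch.no_grad():
        model(x)
    return {name: acts[name] for name in layers}

# placeholder clean / adversarial batches
x_clean, x_adv = torch.rand(64, 3, 224, 224), torch.rand(64, 3, 224, 224)
f_clean, f_adv = layer_features(x_clean), layer_features(x_adv)

detectors = {}
for name in layers:
    X = np.vstack([f_clean[name], f_adv[name]])
    y = np.r_[np.zeros(64), np.ones(64)]
    detectors[name] = LogisticRegression(max_iter=1000).fit(X, y)
# at test time, an input can be flagged if any per-layer detector fires
```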
Principal Component Analysis (PCA)
- The coefficients on the last few principal components differentiate adversarial examples from clean ones (a scoring sketch follows below).
- Hendrycks, Dan, and Kevin Gimpel. "Early methods for detecting adversarial images." arXiv:1608.00530 (2016).
- Carlini and Wagner. "Adversarial examples are not easily detected: Bypassing ten detection methods." AISec 2017.
- [Figure: PCA coefficients of a clean sample (blue) vs. an adversarial example (yellow).]
- The separation on the last components is largely an artifact caused by the black background.
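A minimal sketch of a PCA-coefficient score, assuming flattened image vectors; using the energy on the last 20% of components as the detection score is an illustrative choice, not the exact statistic from the cited papers.

```python
# A minimal sketch of PCA-based detection: fit PCA on clean training images and
# score a test input by the energy of its coefficients on the last (low-variance)
# principal components. The 20% tail fraction is an illustrative choice.
import numpy as np
from sklearn.decomposition import PCA

def fit_pca(train_images):                       # train_images: (N, D) flattened clean data
    pca = PCA(n_components=train_images.shape[1])
    pca.fit(train_images)
    return pca

def tail_energy_score(pca, x, tail_frac=0.2):
    coeffs = pca.transform(x.reshape(1, -1))[0]  # coefficients on all principal components
    k = int(len(coeffs) * (1 - tail_frac))
    return float(np.sum(coeffs[k:] ** 2))        # large tail energy -> likely adversarial

# usage (placeholder data):
train = np.random.rand(1000, 784)
pca = fit_pca(train)
print(tail_energy_score(pca, np.random.rand(784)))
```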
Dimensionality Reduction
- Train the classifier on PCA-reduced data as a defense.
- Bhagoji, Arjun Nitin, Daniel Cullina, and Prateek Mittal. "Dimensionality reduction as a defense against evasion attacks on machine learning classifiers." arXiv:1704.02654 (2017).
Distribution Detection: Maximum Mean Discrepancy (MMD)
- Grosse et al. "On the (Statistical) Detection of Adversarial Examples." arXiv:1702.06280.
- Given two datasets X1 = {x_1, …, x_n} and X2 = {y_1, …, y_m}, MMD measures the distance between their distributions:
  MMD(F, X1, X2) = sup_{f in F} ( (1/n) Σ_i f(x_i) - (1/m) Σ_j f(y_j) )
- A statistical two-sample test on the MMD decides whether a suspicious batch comes from the same distribution as the clean data.
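A minimal sketch of the (biased) kernel-MMD estimator with a Gaussian kernel; the bandwidth is an illustrative choice, and the permutation test used to turn the statistic into a decision is omitted.

```python
# A minimal sketch of the biased kernel-MMD estimator with a Gaussian (RBF) kernel:
# MMD^2 = mean k(x, x') + mean k(y, y') - 2 * mean k(x, y).
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # pairwise squared distances between rows of A and rows of B
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * sigma**2))

def mmd2(X, Y, sigma=1.0):
    return (rbf_kernel(X, X, sigma).mean()
            + rbf_kernel(Y, Y, sigma).mean()
            - 2 * rbf_kernel(X, Y, sigma).mean())

# usage: a batch drawn from a shifted distribution yields a larger MMD^2
X = np.random.randn(200, 32)
print(mmd2(X, np.random.randn(200, 32)))          # similar distributions -> small value
print(mmd2(X, np.random.randn(200, 32) + 0.5))    # shifted batch -> larger value
```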
Distribution Detection: Kernel Density Estimation (KDE)
- Feinman, Reuben, et al. "Detecting adversarial samples from artifacts." arXiv:1703.00410 (2017).
- Adversarial examples lie in low-density regions of the data distribution.
- Estimate a kernel density in deep feature space using the training points of the predicted class; a low density score indicates a likely adversarial example (a sketch follows below).
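A minimal sketch of a class-conditional KDE score on deep features, assuming features have already been extracted; the Gaussian bandwidth and the random placeholder features are illustrative.

```python
# A minimal sketch of the KDE detection score: log-density of a test feature under
# a Gaussian KDE fitted on training features of the predicted class.
import numpy as np
from sklearn.neighbors import KernelDensity

def fit_class_kdes(train_feats, train_labels, bandwidth=1.0):
    kdes = {}
    for c in np.unique(train_labels):
        kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth)
        kdes[c] = kde.fit(train_feats[train_labels == c])
    return kdes

def kde_score(kdes, feat, predicted_class):
    # low log-density under the predicted class -> likely adversarial
    return float(kdes[predicted_class].score_samples(feat.reshape(1, -1))[0])

# usage (placeholder features/labels):
feats = np.random.randn(1000, 64)
labels = np.random.randint(0, 10, 1000)
kdes = fit_class_kdes(feats, labels)
print(kde_score(kdes, np.random.randn(64), predicted_class=3))
```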
Bypassing Ten Detection Methods
- "Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods." Carlini and Wagner, AISec 2017.
- Adaptive attacks that are aware of the detector can bypass all ten detection methods evaluated.
Local Intrinsic Dimensionality (LID)
- "Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality." Ma et al. ICLR 2018.
- Adversarial examples lie in high-dimensional subspaces surrounding the data submanifold.
- Definition (Local Intrinsic Dimensionality): for a reference point x, let F(r) be the distribution of distances from x to other points; then LID_F(r) = r · F'(r) / F(r), and the LID of x is the limit of LID_F(r) as r → 0.
- Adversarial subspaces and expansion dimension: LID generalizes the expansion dimension, which infers dimensionality from how quickly the number of neighbors grows as the radius around x increases.
Local Intrinsic Dimensionality (LID): Estimation
- "Estimating Local Intrinsic Dimensionality." Amsaleg et al. KDD 2015.
- Hill (MLE) estimator (Hill 1975; Amsaleg et al. 2015), using the distances r_1(x) ≤ … ≤ r_k(x) to the k nearest neighbors of x:
  LID(x) ≈ - ( (1/k) Σ_{i=1}^{k} ln( r_i(x) / r_k(x) ) )^(-1)
- Based on Extreme Value Theory: nearest-neighbor distances are extreme events, and the lower tail of the distance distribution follows a Generalized Pareto Distribution (GPD).
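A minimal sketch of the Hill (MLE) estimator of LID computed against a reference minibatch; the neighborhood size k and the random placeholder features are illustrative.

```python
# A minimal sketch of the Hill (MLE) estimator of LID: for each query point, use
# the distances to its k nearest neighbors within a reference batch,
# LID(x) ≈ -1 / mean_i( ln(r_i / r_k) ).
import numpy as np

def lid_mle(queries, reference, k=20):
    # pairwise Euclidean distances between query points and the reference batch
    d = np.linalg.norm(queries[:, None, :] - reference[None, :, :], axis=-1)
    d = np.sort(d, axis=1)[:, 1:k + 1]            # drop the zero self-distance
    r_k = d[:, -1:]                               # distance to the k-th neighbor
    return -1.0 / np.mean(np.log(d / r_k + 1e-12), axis=1)

# usage: adversarial features tend to receive higher LID estimates than clean ones
feats = np.random.randn(256, 64)                  # placeholder deep features (one batch)
print(lid_mle(feats[:5], feats, k=20))
```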
Local Intrinsic Dimensionality (LID): Interpretation
- LID directly measures the expansion rate of the local distance distribution around an example.
- The expansion rate of the adversarial subspace is higher than that of the normal data subspace.
- LID assesses the space-filling capability of the subspace, based on the distance distribution from the example to its neighbors.
- Empirically, the LID scores of adversarial examples (red) are higher than those of clean examples, and LID computed at deeper layers is more discriminative.
Local Intrinsic Dimensionality (LID): Experiments & Results (detection AUC, %)

Dataset    Feature   FGM     BIM-a   BIM-b   JSMA    Opt
MNIST      KD        78.12   98.14   98.61   68.77   95.15
           BU        32.37   91.55   25.46   88.74   71.30
           LID       96.89   99.60   99.83   92.24   99.24
CIFAR-10   KD        64.92   68.38   98.70   85.77   91.35
           BU        70.53   81.60   97.32   87.36   91.39
           LID       82.38   82.51   99.78   95.87   98.94
SVHN       KD        70.39   77.18   99.57   86.46   87.41
           BU        86.78   84.07   86.93   91.33   87.13
           LID       97.61   87.55   99.72   95.07   97.60
Local Intrinsic Dimensionality (LID): Generalization Across Attacks (detection AUC, %)

Train \ Test   FGM     BIM-a   BIM-b   JSMA    Opt
FGSM   KD      64.92   69.15   89.71   85.72   91.22
       BU      70.53   81.60   72.65   86.79   91.27
       LID     82.38   82.30   91.61   89.93   93.32

Detectors trained on the simple FGSM attack can also detect more complex attacks.
An Improved Detector of LID
- /pdf/2212.06776.pdf
Mahalanobis Distance (MD)
- Mahalanobis, Prasanta Chandra. "On the generalized distance in statistics." National Institute of Science of India, 1936.
- The Mahalanobis distance between two data points x and y with covariance matrix Σ: D_M(x, y) = sqrt( (x - y)^T Σ^{-1} (x - y) ).
- Lee et al. "A simple unified framework for detecting out-of-distribution samples and adversarial attacks." NeurIPS 2018.
- Fit class-conditional Gaussians (per-class means with a shared covariance) to the deep features of the training data; score a test input by its Mahalanobis distance to the closest class mean, computed at multiple layers and combined by a logistic regression detector (a sketch follows below).
- Experiments & Results: [tables not reproduced here.]
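A minimal sketch of the class-conditional Mahalanobis score at a single layer; the per-layer combination and input pre-processing described in Lee et al. are omitted, and the placeholder features/labels are illustrative.

```python
# A minimal sketch of the Mahalanobis-distance score at one layer: per-class means
# with a shared (tied) covariance estimated on training features; the score of a
# test feature is its distance to the closest class mean (higher -> more anomalous).
import numpy as np

def fit_gaussians(feats, labels):
    classes = np.unique(labels)
    means = {c: feats[labels == c].mean(axis=0) for c in classes}
    centered = np.vstack([feats[labels == c] - means[c] for c in classes])
    cov = np.cov(centered, rowvar=False) + 1e-6 * np.eye(feats.shape[1])
    return means, np.linalg.inv(cov)

def mahalanobis_score(feat, means, cov_inv):
    dists = [float((feat - mu) @ cov_inv @ (feat - mu)) for mu in means.values()]
    return min(dists)                              # distance to the closest class mean

# usage (placeholder features/labels):
feats = np.random.randn(2000, 64)
labels = np.random.randint(0, 10, 2000)
means, cov_inv = fit_gaussians(feats, labels)
print(mahalanobis_score(np.random.randn(64), means, cov_inv))
```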
Bayesian Uncertainty (BU)
- Feinman, Reuben, et al. "Detecting adversarial samples from artifacts." arXiv:1703.00410 (2017).
- Estimate predictive uncertainty with Monte Carlo dropout; adversarial examples tend to show higher uncertainty than clean inputs (a sketch follows below).
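A minimal sketch of a Monte Carlo dropout uncertainty score, assuming a PyTorch model that contains dropout layers; the tiny model, number of stochastic passes, and the variance-based score are illustrative rather than the paper's exact statistic.

```python
# A minimal sketch of Bayesian (MC-dropout) uncertainty: keep dropout active at
# test time, run several stochastic forward passes, and use the variance of the
# predicted probabilities as the uncertainty score.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(),
                      nn.Dropout(p=0.5), nn.Linear(256, 10))

def mc_dropout_uncertainty(model, x, n_passes=20):
    model.train()                                  # keep dropout stochastic at test time
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=1) for _ in range(n_passes)])
    return probs.var(dim=0).sum(dim=1)             # one uncertainty score per input

# usage: adversarial inputs tend to receive higher scores than clean ones
print(mc_dropout_uncertainty(model, torch.rand(8, 1, 28, 28)))
```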
Feature Squeezing
- Xu et al. "Feature squeezing: Detecting adversarial examples in deep neural networks." arXiv:1704.01155 (2017).
- Bit-depth reduction: squeeze both clean and adversarial examples.
- Reducing input dimensionality improves robustness.
- The prediction inconsistency before and after squeezing can be used to detect adversarial examples (a sketch follows below).
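A minimal sketch of detection by prediction inconsistency under bit-depth squeezing; the model, bit depth, and L1 threshold are illustrative choices.

```python
# A minimal sketch of feature-squeezing detection: compare the model's softmax
# output on the original input and on a bit-depth-reduced copy; a large L1 gap
# flags the input as adversarial. Bit depth and threshold are illustrative.
import torch
import torchvision

model = torchvision.models.resnet18(weights=None).eval()

def reduce_bit_depth(x, bits=4):
    levels = 2 ** bits - 1
    return torch.round(x * levels) / levels        # quantize pixels in [0, 1]

def squeeze_score(x):
    with torch.no_grad():
        p_orig = torch.softmax(model(x), dim=1)
        p_sq = torch.softmax(model(reduce_bit_depth(x)), dim=1)
    return (p_orig - p_sq).abs().sum(dim=1)        # L1 prediction inconsistency

def is_adversarial(x, threshold=1.0):
    return squeeze_score(x) > threshold

# usage: print(is_adversarial(torch.rand(4, 3, 224, 224)))
```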
Random Transformation
- Tian et al. "Detecting adversarial examples through image transformation." AAAI 2018.
- The predictions of adversarial examples tend to change after random image transformations, while the predictions of clean examples remain stable (a sketch follows below).
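A minimal sketch of transformation-based detection using random rotations; the transformation family, number of trials, and voting threshold are illustrative choices, not the paper's exact procedure.

```python
# A minimal sketch of random-transformation detection: apply several random
# rotations and count how often the predicted label changes; frequent changes
# flag the input as adversarial. Angles, trials, and threshold are illustrative.
import torch
import torchvision
import torchvision.transforms.functional as TF

model = torchvision.models.resnet18(weights=None).eval()

def flip_rate(x, n_trials=10, max_angle=20.0):
    with torch.no_grad():
        base = model(x).argmax(dim=1)
        flips = torch.zeros(len(x))
        for _ in range(n_trials):
            angle = float(torch.empty(1).uniform_(-max_angle, max_angle))
            pred = model(TF.rotate(x, angle)).argmax(dim=1)
            flips += (pred != base).float()
    return flips / n_trials                        # fraction of trials that changed the label

def is_adversarial(x, threshold=0.5):
    return flip_rate(x) > threshold

# usage: print(is_adversarial(torch.rand(4, 3, 224, 224)))
```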
Log-Odds
- Roth et al. "The odds are odd: A statistical test for detecting adversarial examples." ICML 2019.
- Add random noise to the input and test how the log-odds (pairwise logit differences) shift; adversarial examples exhibit a characteristic shift that a statistical test can pick up (a sketch follows below).
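A minimal sketch of the noise-perturbed log-odds statistic; the calibration against clean statistics and the thresholding used in the paper are omitted, and the noise level and number of noise draws are illustrative.

```python
# A minimal sketch of a log-odds statistic: how much the pairwise logit differences
# (relative to the predicted class) change under Gaussian input noise, averaged
# over several noise draws. Calibration and thresholding are omitted.
import torch
import torchvision

model = torchvision.models.resnet18(weights=None).eval()

def log_odds_shift(x, sigma=0.05, n_samples=16):
    with torch.no_grad():
        logits = model(x)                                   # (B, C)
        y = logits.argmax(dim=1)
        base = logits - logits.gather(1, y[:, None])        # log-odds vs. predicted class
        shift = torch.zeros_like(base)
        for _ in range(n_samples):
            noisy = model(x + sigma * torch.randn_like(x))
            shift += (noisy - noisy.gather(1, y[:, None])) - base
    return shift / n_samples                                # average shift per class

# usage: the statistics of this shift differ between clean and adversarial inputs
print(log_odds_shift(torch.rand(2, 3, 224, 224)).shape)
```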
Turning a Weakness into a Strength
- Hu et al. "A new defense against adversarial images: Turning a weakness into a strength." NeurIPS 2019.
- Principle 1: adversarial examples have more uniform gradients.
- Principle 2: adversarial examples are difficult to attack a second time.
- Test criterion 1: random noise does not change the prediction.
- Test criterion 2: attacking the input a second time requires more perturbation.