Adversarial Defense
姜育剛、馬興軍、吳祖煊

Recap (Week 4): Adversarial Example Detection
- Secondary Classification Methods
- Principal Component Analysis (PCA)
- Distribution Detection Methods
- Prediction Inconsistency
- Reconstruction Inconsistency
- Trapping-Based Detection

Adversarial Attack Competition
Link: https://codalab.lisn.upsaclay.fr/competitions/15669?secret_key=77cb8986-d5bd-4009-82f0-7dde2e819ff8
Adversarial Defense vs Detection
The weird relationship between defense and detection:
- Detection IS defense. But when we say defense, we (most of the time) mean the model itself is secured, and detection alone cannot achieve that.
- In survey papers: detection is defense. In technical papers: defense means defense, not detection.
Differences:
- Defense is to secure the model or the system.
- Detection is to identify potential threats, and should be followed by a defense strategy, e.g., query rejection (but this step is mostly ignored).
- By defense, we mostly mean robust training methods.

Defense Methods (outline)
- Early Defense Methods
- Early Adversarial Training Methods
- Later Adversarial Training Methods
- Remaining Challenges and Recent Progress

A Recap of the Timeline
- 2013: Biggio et al. and Szegedy et al. discover adversarial examples.
- 2014: Goodfellow et al. propose the fast single-step FGSM attack and adversarial training.
- 2015: Simple detection methods (PCA) and adversarial training methods.
- 2016: The min-max optimization framework for adversarial training is proposed.
- 2017: Many adversarial example detection methods and attack methods (BIM, C&W); 10 detection methods are broken.
- 2018: Physical-world attacks; upgraded detection methods; the PGD attack and PGD adversarial training; 9 defense methods are broken.
- 2019: TRADES and many other adversarial training methods; the first Science paper on the topic.
- 2020: The AutoAttack attack; Fast adversarial training.
- 2021: Adversarial training with larger models and more data; extensions to other domains.
- 2022: Problems remain unsolved: attacks keep multiplying, and defense keeps getting harder.
Principles of Defense
Block the attack (intervene at the input and output ends):
- Mask the input gradients
- Regularize the input gradients
- Distill the logits
- Denoise the input
Robustify the model (strengthen the middle, i.e., the model itself):
- Smooth the decision boundary
- Reduce the Lipschitzness of the model
- Smooth the loss landscape
Adversarial Attack
Szegedy C, Zaremba W, Sutskever I, et al. Intriguing properties of neural networks. ICLR 2014.
Goodfellow I J, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. ICLR 2015.
Model training: $\min_\theta \mathbb{E}_{(x,y)\sim\mathcal{D}}\,\mathcal{L}(f_\theta(x), y)$
Adversarial attack (a test-time attack): find $x' = x + \delta$ with a very small perturbation $\|\delta\| \le \epsilon$ that causes misclassification, $f_\theta(x') \ne y$.
Performance metrics: attack success rate under a fixed perturbation budget; other metrics: the maximum perturbation needed for a 100% attack success rate.
Defense Methods (outline)
- Early Defense Methods
- Early Adversarial Training Methods
- Advanced Adversarial Training Methods
- Remaining Challenges and Recent Progress
Defensive Distillation
Papernot et al. Distillation as a defense to adversarial perturbations against deep neural networks. S&P 2016.
Idea: make large logit changes become "small":
- Scale up the logits by a few orders of magnitude;
- Retrain the last layer with the scaled logits.
Distillation with temperature T: the softmax is computed on logits divided by T, $F_i(x) = e^{z_i(x)/T} / \sum_j e^{z_j(x)/T}$. The network is trained on soft labels at a high temperature; setting the temperature back to 1 at test time saturates the softmax and flattens the input gradients.
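To make the training recipe concrete, here is a minimal PyTorch sketch of the temperature-distillation loss; the function name and the choice T=20 are illustrative, not from the paper:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=20.0):
    """Cross-entropy between the teacher's temperature-softened labels and
    the student's temperature-softened predictions (defensive distillation
    trains both networks at the same high temperature T)."""
    soft_targets = F.softmax(teacher_logits / T, dim=1)   # teacher soft labels
    log_probs = F.log_softmax(student_logits / T, dim=1)  # student log-probs
    return -(soft_targets * log_probs).sum(dim=1).mean()
```

At test time the temperature is set back to 1, which is precisely what makes the softmax saturate and the input gradients vanish.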
Defensive Distillation Is Not Robust
Carlini, Nicholas, and David Wagner. "Defensive distillation is not robust to adversarial examples." arXiv preprint arXiv:1607.04311 (2016).
It can be evaded by attacking the distilled network at the same temperature T.
Lessons Learned
Carlini, Nicholas, and David Wagner. "Defensive distillation is not robust to adversarial examples." arXiv preprint arXiv:1607.04311 (2016).
- Distillation is not a good solution for adversarial robustness.
- Vanishing input gradients can still be recovered by a reverse operation.
- A defense should be evaluated against adaptive attacks to prove real robustness.
Input Gradients Regularization
Ross et al. "Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients." AAAI 2018.
Drucker, Harris, and Yann LeCun. "Improving generalization performance using double backpropagation." TNN, 1992.
Training objective: classification loss + input gradients regularization:
$$\min_\theta \mathbb{E}_{(x,y)}\big[\mathcal{L}(f_\theta(x), y) + \lambda \|\nabla_x \mathcal{L}(f_\theta(x), y)\|_2^2\big]$$
Related to the double backpropagation proposed by Drucker and LeCun (1992).
Issues: 1) limited adversarial robustness; 2) it hurts learning.
[Figure: input-gradient visualizations of distilled, adversarially trained, and gradient-regularized models.]
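A minimal sketch of the double-backpropagation objective in PyTorch, assuming 4D image inputs; `lam` is an illustrative regularization weight:

```python
import torch
import torch.nn.functional as F

def input_grad_reg_loss(model, x, y, lam=0.1):
    """Classification loss plus the squared L2 norm of the input gradient."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    # create_graph=True so the penalty itself can be differentiated
    # (the "double backpropagation" of Drucker and LeCun)
    (grad,) = torch.autograd.grad(loss, x, create_graph=True)
    return loss + lam * grad.pow(2).sum(dim=(1, 2, 3)).mean()
```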
Feature Squeezing
Xu et al. "Feature squeezing: Detecting adversarial examples in deep neural networks." NDSS 2018.
Compress the input space (e.g., reduce the color bit depth) and flag inputs whose predictions change noticeably after squeezing.
It also hurts performance on large-scale image datasets.
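A sketch of the core idea, assuming inputs in [0, 1]; the bit depth and any detection threshold are illustrative and would be tuned per dataset:

```python
import torch

def reduce_bit_depth(x, bits=4):
    """Squeeze the input space by quantizing pixel values to 2**bits levels."""
    levels = 2 ** bits - 1
    return torch.round(x * levels) / levels

def squeezing_score(model, x):
    """L1 distance between predictions on the original and squeezed input;
    large scores indicate likely adversarial examples."""
    p_orig = torch.softmax(model(x), dim=1)
    p_squeezed = torch.softmax(model(reduce_bit_depth(x)), dim=1)
    return (p_orig - p_squeezed).abs().sum(dim=1)
```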
Thermometer Encoding
Buckman et al. "Thermometer encoding: One hot way to resist adversarial examples." ICLR 2018.
Discretize the input to break small noise. Proposed: thermometer encoding, which replaces each pixel value with a cumulative ("thermometer") one-hot code, a discrete and non-differentiable representation.
Input Transformations
Guo et al. "Countering Adversarial Images using Input Transformations." ICLR 2018.
Five transformations: image cropping and rescaling, bit-depth reduction, JPEG compression, total variance minimization, and image quilting.
Obfuscated Gradients = Fake Robustness
Athalye et al. "Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples." ICML 2018.
Athalye et al. Synthesizing robust adversarial examples. ICML 2018.
Backward Pass Differentiable Approximation (BPDA): find a linear approximation of the non-differentiable operations, e.g., discretization, compression, etc.; it can break defenses based on non-differentiable operations.
Expectation Over Transformation (EOT): average gradients over T, a set of randomized transformations; it can break randomization-based defenses.
BPDA+EOT breaks 7 defenses published at ICLR 2018. We got a survivor! (The survivor: PGD adversarial training, covered below.)
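Both tricks are easy to express in PyTorch. Below is a sketch: a straight-through (identity) backward pass for a non-differentiable preprocessing step, and gradient averaging over random transformations. The class and function names, and the identity approximation, are illustrative choices:

```python
import torch

class BPDAIdentity(torch.autograd.Function):
    """Run the true non-differentiable preprocessing forward, but treat it
    as the identity on the backward pass (a common BPDA approximation)."""

    @staticmethod
    def forward(ctx, x, preprocess):
        return preprocess(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None  # gradient passes through unchanged

def eot_gradient(model, loss_fn, x, y, sample_transform, n=10):
    """EOT: average the loss over n random transformations t ~ T, then take
    the gradient of the average with respect to the input."""
    x = x.clone().requires_grad_(True)
    total = sum(loss_fn(model(sample_transform(x)), y) for _ in range(n))
    (total / n).backward()
    return x.grad
```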
How to Properly Evaluate a Defense?
Carlini, Nicholas, et al. "On evaluating adversarial robustness." arXiv preprint arXiv:1902.06705 (2019).
Athalye et al. "Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples." ICML 2018.
- Do not blindly apply multiple (similar) attacks.
- Try at least one gradient-free attack and one hard-label attack.
- Perform a transferability attack using a similar substitute model.
- For randomized defenses, properly ensemble over randomness.
- For non-differentiable components, apply differentiable techniques (BPDA).
- Verify that the attacks have converged under the selected hyperparameters.
- Carefully investigate attack hyperparameters and report those selected.
- Compare against prior work and explain important differences.
- Test broader threat models when proposing general defenses.
Robust Activation Functions
Xiao et al. "Enhancing Adversarial Defense by k-Winners-Take-All." ICLR 2020.
Block the internal activation: break the continuity with the k-Winners-Take-All (k-WTA) activation, which keeps only the k largest activations per sample and zeroes out the rest.
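A sketch of the k-WTA activation for a flattened feature tensor of shape [batch, d]; the layer placement and the choice of k follow the paper's idea, but the code itself is illustrative:

```python
import torch

def k_winners_take_all(x, k):
    """Keep the k largest activations per sample and zero out the rest,
    making the activation discontinuous in its input."""
    topk = torch.topk(x, k, dim=1)
    mask = torch.zeros_like(x).scatter_(1, topk.indices, 1.0)
    return x * mask
```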
Robust Loss Function
Pang et al. Rethinking softmax cross-entropy loss for adversarial robustness. ICLR 2020.
Max-Mahalanobis center (MMC) loss: pull features toward fixed, maximally separated class centers, as a replacement for the softmax cross entropy (SCE).
Robust Inference
Pang et al. Mixup Inference: Better exploiting mixup to defend adversarial attacks. ICLR 2020.
Mixup Inference (MI): at test time, mix the input with randomly sampled clean examples and average the predictions, which shrinks the effect of adversarial perturbations.
New Adaptive Attacks Break These Defenses
Tramèr et al. "On adaptive attacks to adversarial example defenses." NeurIPS 2020.
T1: Attack the full defense.
T2: Target important defense parts.
T3: Simplify the attack.
T4: Ensure a consistent loss function.
T5: Optimize with different methods.
T6: Use strong adaptive attacks.
How to Evaluate a Defense?
Croce and Hein. "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks." ICML 2020.
Gao et al. Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack. ICML 2022.
Zimmermann et al. "Increasing Confidence in Adversarial Robustness Evaluations." arXiv preprint arXiv:2206.13991 (2022).
Strong attacks:
- AutoAttack (one must-test attack)
- Margin Decomposition (MD) attack (better than AutoAttack on ViTs)
- Minimum-Margin (MM) attack (a new SOTA attack to test?)
Extra robustness tests:
- Attack unit tests (Zimmermann et al., 2022)
Adversarial Training
Goodfellow I J, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. ICLR 2015.
The idea is simple: just train on adversarial examples!
Adversarial training is a data augmentation method: original data -> adversarial attack -> adversarial examples -> model training.
Adversarial training produces a smooth decision boundary.
[Figure: the normal decision boundary, the generated adversarial examples, and the boundary after adversarial training.]
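A minimal sketch of this augmentation loop with the single-step FGSM attack; the pure-adversarial training step shown here is a simplification (Goodfellow et al. actually mix clean and adversarial losses):

```python
import torch
import torch.nn.functional as F

def fgsm_examples(model, x, y, eps=8/255):
    """Single-step FGSM: perturb along the sign of the input gradient."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    (grad,) = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def adversarial_training_step(model, optimizer, x, y):
    """Original data -> adversarial attack -> adversarial examples -> train."""
    x_adv = fgsm_examples(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```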
Early Adversarial Training Methods
Szegedy C, Zaremba W, Sutskever I, et al. Intriguing properties of neural networks. ICLR 2014.
Goodfellow I J, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. ICLR 2015.
In 2014, Szegedy et al. already explored adversarial training in the paper that introduced adversarial examples: the L-BFGS attack was used to generate adversarial examples against every layer of the network, and these were added to the training process. Finding: adversarial examples at deeper layers improve robustness more.
In 2015, Goodfellow et al. proposed training the network on adversarial examples generated by the (single-step) FGSM attack. They did not use intermediate-layer adversarial examples, as these were found to bring no improvement.
Min-Max Robust Optimization
Nokland et al. Improving back-propagation by adding an adversarial gradient. arXiv:1510.04189, 2015.
Huang et al. Learning with a strong adversary. ICLR 2016.
Shaham et al. Understanding adversarial training: Increasing local stability of neural nets through robust optimization. arXiv:1511.05432, 2015.
The first proposal of min-max optimization:
$$\min_\theta \mathbb{E}_{(x,y)\sim\mathcal{D}} \Big[\max_{\|\delta\|\le\epsilon} \mathcal{L}(f_\theta(x+\delta), y)\Big]$$
Inner maximization: find the worst-case perturbation. Outer minimization: train the model against it.
Virtual Adversarial Training (VAT)
Miyato et al. Distributional smoothing with virtual adversarial training. ICLR 2016.
VAT: a method to improve generalization.
Differences to adversarial training:
- L2-regularized perturbation;
- Uses both clean and adversarial examples for training;
- Uses KL divergence to generate adversarial examples.
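A sketch of how VAT finds its perturbation, assuming 4D image inputs; `xi`, `eps`, and the single power iteration are illustrative defaults:

```python
import torch
import torch.nn.functional as F

def vat_perturbation(model, x, xi=1e-6, eps=2.0, n_power=1):
    """Find the L2-bounded direction that most changes the output
    distribution (measured by KL), via power iteration; no labels needed."""
    with torch.no_grad():
        p = F.softmax(model(x), dim=1)
    d = torch.randn_like(x)
    for _ in range(n_power):
        d = xi * d / d.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
        d.requires_grad_(True)
        kl = F.kl_div(F.log_softmax(model(x + d), dim=1), p,
                      reduction="batchmean")
        (grad,) = torch.autograd.grad(kl, d)
        d = grad.detach()
    return eps * d / d.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
```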
Weaknesses of Early AT Methods
Miyato et al. Distributional smoothing with virtual adversarial training. ICLR 2016.
These methods are fast! They take only about 2x the time of standard training, but because the inner maximization is a weak single-step (or mildly regularized) attack, the resulting robustness does not hold up against stronger multi-step attacks.
PGD Adversarial Training
Athalye et al. "Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples." ICML 2018.
We got a survivor! PGD adversarial training was the one ICLR 2018 defense that withstood BPDA+EOT.
PGD Adversarial Training
Madry et al. "Towards Deep Learning Models Resistant to Adversarial Attacks." ICLR 2018.
A saddle point problem:
$$\min_\theta \mathbb{E}_{(x,y)\sim\mathcal{D}} \Big[\max_{\|\delta\|_\infty \le \epsilon} \mathcal{L}(f_\theta(x+\delta), y)\Big]$$
Inner maximization: generate the strongest perturbation. Outer minimization: update the model on it. This is a saddle point (constrained bi-level optimization) problem. In constrained optimization, Projected Gradient Descent (PGD) is the best first-order solver.
PGD Adversarial Training
Madry et al. "Towards Deep Learning Models Resistant to Adversarial Attacks." ICLR 2018.
Projected Gradient Descent (PGD): PGD is an optimizer; in the field of adversarial machine learning (AML) it is also known as an adversarial attack. Each step ascends the loss and then projects (clips) back into the constraint set:
$$x^{t+1} = \Pi_{\mathcal{B}_\epsilon(x)}\big(x^t + \alpha \cdot \mathrm{sign}(\nabla_x \mathcal{L}(f_\theta(x^t), y))\big)$$
PGD Adversarial Training
Madry et al. "Towards Deep Learning Models Resistant to Adversarial Attacks." ICLR 2018.
The PGD attack starts from a random initialization: the clean input plus uniform noise within the ε-ball.
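A minimal sketch of the L-infinity PGD attack with random start; the step size, step count, and budget are common CIFAR-10 defaults, used here illustratively:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Random start inside the eps-ball, then iterated gradient-sign ascent
    with projection (clipping) back into the ball after every step."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        (grad,) = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        # projection step: stay within the eps-ball and the valid pixel range
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()
```

PGD adversarial training then simply trains each minibatch on `pgd_attack` outputs instead of (or in addition to) the clean inputs.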
Characteristics of PGD Adversarial Training
Madry et al. "Towards Deep Learning Models Resistant to Adversarial Attacks." ICLR 2018.
Ilyas et al. "Adversarial examples are not bugs, they are features." NeurIPS 2019.
[Figures: decision boundaries and robust features under standard training vs. adversarial training, with adversarial examples marked.]
Dynamic Adversarial Training (DART)
Wang et al. "On the Convergence and Robustness of Adversarial Training." ICML 2019.
[Figures: the effect of PGD step size and of the number of PGD steps on robustness; weaker attacks are used early in training.]
How to measure the convergence of the inner maximization?
Definition (First-Order Stationary Condition, FOSC): $c(x) = \max_{x' \in \mathcal{X}} \langle x' - x, \nabla_x \ell(x) \rangle$, where $\mathcal{X}$ is the $\epsilon$-ball around the clean input and $\ell$ is the inner loss; $c(x) \ge 0$, and smaller values mean the inner maximization has converged better.
Dynamic Adversarial Training: weak attacks for early training, strong attacks for later training. Weak attacks improve generalization; strong attacks improve the final robustness.
Convergence analysis: DART improves robustness.
Robustness on CIFAR-10 with WideResNet: [see table in the slides].

TRADES
Zhang et al. "Theoretically principled trade-off between robustness and accuracy." ICML 2019.
Use a distribution loss (KL divergence) for both the inner and outer optimizations:
$$\min_\theta \mathbb{E}_{(x,y)}\Big[\mathrm{CE}\big(f_\theta(x), y\big) + \beta \max_{\|x'-x\|\le\epsilon} \mathrm{KL}\big(f_\theta(x)\,\|\,f_\theta(x')\big)\Big]$$
Winning solution of the NeurIPS 2018 Adversarial Vision Challenge.
Characteristics of TRADES:
- Uses KL divergence to supervise the generation of adversarial examples; robustness improves significantly.
- Clean examples also participate in training, which helps convergence and clean accuracy.
- KL-based adversarial example generation contains an adaptive process.
- Training yields a smoother decision boundary than PGD adversarial training.
- TRADES improves both the inner maximization and the outer minimization.
Experimental results of TRADES: [see tables in the slides].

TRADES vs VAT vs ALP
Zhang et al. "Theoretically principled trade-off between robustness and accuracy." ICML 2019.
Miyato et al. Distributional smoothing with virtual adversarial training. ICLR 2016.
Kannan, Harini, Alexey Kurakin, and Ian Goodfellow. "Adversarial logit pairing." arXiv preprint arXiv:1803.06373 (2018).
TRADES: clean cross-entropy plus a KL term between clean and adversarial predictions (formula above).
Virtual Adversarial Training: a KL smoothness penalty with an L2-bounded virtual perturbation.
Adversarial Logit Pairing: cross-entropy on adversarial examples plus an L2 penalty pairing clean and adversarial logits.
Similar optimization frameworks, different loss choices, and very different results.
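A sketch of the TRADES objective in PyTorch; β=6 and the PGD settings are the paper's common defaults, but treat the code as illustrative:

```python
import torch
import torch.nn.functional as F

def trades_loss(model, x, y, eps=8/255, alpha=2/255, steps=10, beta=6.0):
    """Clean cross-entropy plus beta * KL(clean || adversarial); the
    adversarial example itself is generated by maximizing the same KL."""
    p_clean = F.softmax(model(x), dim=1).detach()
    x_adv = (x + 1e-3 * torch.randn_like(x)).detach()  # small random start
    for _ in range(steps):                             # inner maximization
        x_adv.requires_grad_(True)
        kl = F.kl_div(F.log_softmax(model(x_adv), dim=1), p_clean,
                      reduction="batchmean")
        (grad,) = torch.autograd.grad(kl, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    robust_kl = F.kl_div(F.log_softmax(model(x_adv), dim=1),  # outer term
                         F.softmax(model(x), dim=1), reduction="batchmean")
    return F.cross_entropy(model(x), y) + beta * robust_kl
```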
MART: Misclassification Aware adveRsarial Training
Wang et al. "Improving adversarial robustness requires revisiting misclassified examples." ICLR 2020.
Adversarial examples are only defined on correctly classified examples; what about misclassified examples?
The influence of misclassified and correctly classified examples: misclassified examples have a significant impact on the final robustness!
On misclassified examples, different maximization techniques have a negligible effect, while different minimization techniques have a significant effect.
Misclassification-aware adversarial risk: decompose the adversarial risk over correctly classified and misclassified examples, and add a misclassification-aware regularization term for the latter.
Surrogate loss functions (existing methods and MART); a semi-supervised extension of MART.
Robustness of MART:
- White-box robustness: ResNet-18, CIFAR-10, ε=8/255.
- White-box robustness: WideResNet-34-10, CIFAR-10, ε=8/255.
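A sketch of the MART outer loss under the usual reading of the paper; λ=5 and the clamping constants are illustrative:

```python
import torch
import torch.nn.functional as F

def mart_loss(model, x, x_adv, y, lam=5.0):
    """Boosted CE on adversarial examples plus a KL term weighted by how
    likely the clean example is to be misclassified, i.e., 1 - p_y(x)."""
    logits_adv = model(x_adv)
    probs_adv = F.softmax(logits_adv, dim=1)
    probs = F.softmax(model(x), dim=1)
    # boosted cross-entropy: CE plus a margin term on the best wrong class
    wrong = probs_adv.clone().scatter_(1, y.view(-1, 1), 0.0)
    bce = F.cross_entropy(logits_adv, y) \
        - torch.log(1.0001 - wrong.max(dim=1).values).mean()
    # misclassification-aware regularization: KL(p(x) || p(x_adv))
    kl = (probs * (probs.clamp_min(1e-12).log()
                   - probs_adv.clamp_min(1e-12).log())).sum(dim=1)
    weight = 1.0 - probs.gather(1, y.view(-1, 1)).squeeze(1)
    return bce + lam * (kl * weight).mean()
```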
Using More Data to Improve Robustness
Alayrac et al.