Data and Model Security, Week 5: Adversarial Defense
姜育剛, 馬興軍, 吳祖煊

Recap: Week 4
1. Adversarial Example Detection
- Secondary classification methods
- Principal Component Analysis (PCA)
- Distribution detection methods
- Prediction inconsistency
- Reconstruction inconsistency
- Trapping-based detection

Adversarial Attack Competition
Link: https://codalab.lisn.upsaclay.fr/competitions/15669?secret_key=77cb8986-d5bd-4009-82f0-7dde2e819ff8
Adversarial Defense vs. Detection
The weird relationship between defense and detection:
- Detection IS defense. But when we say "defense", we usually mean the model itself is secured, which detection alone cannot achieve.
- In survey papers, detection counts as defense; in technical papers, defense means defense proper, not detection.

Differences:
- Defense secures the model or the system.
- Detection identifies potential threats, and should be followed by a defense strategy, e.g., query rejection (though that step is mostly ignored).
- By "defense", we mostly mean robust training methods.
Defense Methods
- Early defense methods
- Early adversarial training methods
- Later adversarial training methods
- Remaining challenges and recent progress

A Recap of the Timeline
- 2013: Biggio et al. and Szegedy et al. discover adversarial examples.
- 2014: Goodfellow et al. propose the fast single-step FGSM attack and adversarial training.
- 2015: simple detection methods (PCA) and adversarial training methods.
- 2016: the min-max optimization framework for adversarial training is proposed.
- 2017: a wave of adversarial example detection methods and attack methods (BIM, C&W); 10 detection methods are broken.
- 2018: physical-world attacks; upgraded detection methods; the PGD attack and PGD adversarial training; 9 defense methods are broken.
- 2019: TRADES and many other adversarial training methods; the first Science paper.
- 2020: the AutoAttack attack; Fast adversarial training.
- 2021: adversarial training with larger models and more data; extension to other domains.
- 2022: the problem remains unsolved; attacks keep multiplying and defense keeps getting harder.
Principles of Defense
Block the attack (at the input and output ends):
- Mask the input gradients
- Regularize the input gradients
- Distill the logits
- Denoise the input

Robustify the model (strengthen the middle):
- Smooth the decision boundary
- Reduce the Lipschitz constant of the model
- Smooth the loss landscape
Adversarial Attack: A Recap
Szegedy C, Zaremba W, Sutskever I, et al. Intriguing properties of neural networks. ICLR 2014. Goodfellow IJ, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. ICLR 2015.
- Model training: $\min_\theta \mathbb{E}_{(x,y)\sim\mathcal{D}}\,\mathcal{L}(f_\theta(x), y)$
- Adversarial attack (a test-time attack): find $x' = x + \delta$ with a very small perturbation $\|\delta\|_p \le \epsilon$ that causes misclassification, $f_\theta(x') \ne y$

Performance Metrics
Other metrics: the maximum perturbation required for a 100% attack success rate.
Defense Methods
- Early defense methods
- Early adversarial training methods
- Advanced adversarial training methods
- Remaining challenges and recent progress
Defensive Distillation
Papernot et al. Distillation as a defense to adversarial perturbations against deep neural networks. S&P 2016.
Make large logit changes look "small":
- scale up the logits by a few orders of magnitude;
- retrain the last layer with the scaled logits.
Distillation with temperature $T$: $\mathrm{softmax}(z; T)_i = \exp(z_i / T) / \sum_j \exp(z_j / T)$.
Training the distilled network at a high temperature and then deploying it at $T = 1$ scales the logits up by roughly a factor of $T$, which saturates the softmax and makes the input gradients vanish.
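A minimal PyTorch sketch of temperature-scaled distillation; the function names and $T = 20$ are illustrative, not values from the paper:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=20.0):
    """Cross-entropy between student and teacher distributions, both
    softened by dividing the logits by the temperature T."""
    soft_targets = F.softmax(teacher_logits / T, dim=1)
    log_probs = F.log_softmax(student_logits / T, dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()

# Defensive distillation pipeline (sketch):
# 1) train the teacher with softmax at temperature T on hard labels;
# 2) train a same-architecture "distilled" student on the teacher's
#    soft labels, also at temperature T;
# 3) deploy the student at T = 1: its logits are now effectively scaled
#    up by T, saturating the softmax and shrinking input gradients.
```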
Defensive Distillation Is Not Robust
Carlini, Nicholas, and David Wagner. "Defensive distillation is not robust to adversarial examples." arXiv:1607.04311, 2016.
It can be evaded by attacking the distilled network at the training temperature $T$: dividing the logits by $T$ restores the vanished input gradients.
Lessons Learned
Carlini, Nicholas, and David Wagner. "Defensive distillation is not robust to adversarial examples." arXiv:1607.04311, 2016.
- Distillation is not a good solution for adversarial robustness.
- Vanishing input gradients can still be recovered by a reverse operation (e.g., dividing the logits by $T$).
- A defense should be evaluated against the adaptive attack to prove real robustness.
Input Gradients Regularization
Ross et al. "Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients." AAAI 2018. Drucker, Harris, and Yann LeCun. "Improving generalization performance using double backpropagation." TNN 1992.
Classification loss plus an input-gradient regularizer:
$\min_\theta \mathbb{E}_{(x,y)}\big[\mathcal{L}(f_\theta(x), y) + \lambda\,\|\nabla_x \mathcal{L}(f_\theta(x), y)\|_2^2\big]$
Related to the double backpropagation proposed by Drucker and LeCun (1992).
Issues: 1) limited adversarial robustness; 2) hurts learning.
(Figure: input-gradient visualizations of the distilled, adversarially trained, and gradient-regularized models.)
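A minimal sketch of the gradient-regularized objective above; the weight `lam` is an illustrative hyperparameter:

```python
import torch
import torch.nn.functional as F

def double_backprop_loss(model, x, y, lam=0.1):
    """Classification loss plus the squared L2 norm of the input gradient.
    create_graph=True keeps the graph so the penalty itself can be
    backpropagated (the "double backpropagation" of Drucker & LeCun)."""
    x = x.clone().requires_grad_(True)
    ce = F.cross_entropy(model(x), y)
    (grad,) = torch.autograd.grad(ce, x, create_graph=True)
    penalty = grad.pow(2).flatten(1).sum(dim=1).mean()
    return ce + lam * penalty
```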
Feature Squeezing
Xu et al. "Feature squeezing: Detecting adversarial examples in deep neural networks." NDSS 2018.
- Compress the input space.
- It also hurts performance on large-scale image datasets.
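A sketch of one squeezer (bit-depth reduction) and the resulting detection signal; note that Xu et al. combine several squeezers and take the maximum distance, which is omitted here:

```python
import torch

def reduce_bit_depth(x, bits=4):
    """Bit-depth reduction squeezer: round inputs in [0, 1] to 2**bits levels."""
    levels = 2 ** bits - 1
    return torch.round(x * levels) / levels

def squeezing_score(model, x, bits=4):
    """L1 distance between predictions on the original and squeezed input;
    a large distance flags the input as likely adversarial."""
    p_orig = torch.softmax(model(x), dim=1)
    p_sq = torch.softmax(model(reduce_bit_depth(x, bits)), dim=1)
    return (p_orig - p_sq).abs().sum(dim=1)
```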
Thermometer Encoding
Buckman et al. "Thermometer encoding: One hot way to resist adversarial examples." ICLR 2018.
- Discretize the input to break small noise.
- Proposed thermometer encoding.
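A sketch of the encoding itself; $k = 16$ levels is an illustrative choice:

```python
import torch

def thermometer_encode(x, k=16):
    """Thermometer encoding sketch: each pixel value in [0, 1] becomes a
    k-dimensional cumulative code whose i-th entry is 1 if x > i / k.
    Unlike plain one-hot discretization it preserves ordering, but it is
    still non-differentiable in x, which is what breaks small noise."""
    thresholds = torch.arange(k, dtype=x.dtype, device=x.device) / k
    return (x.unsqueeze(-1) > thresholds).to(x.dtype)
```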
Input Transformations
Guo et al. "Countering Adversarial Images using Input Transformations." ICLR 2018.
- Image cropping and rescaling
- Bit-depth reduction
- JPEG compression
- Total variance minimization
- Image quilting
Obfuscated Gradients = Fake Robustness
Athalye et al. "Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples." ICML 2018. Athalye et al. Synthesizing robust adversarial examples. ICML 2018.

Backward Pass Differentiable Approximation (BPDA):
- find a linear approximation of the non-differentiable operations, e.g., discretization, compression, etc.;
- can break defenses based on non-differentiable operations.

Expectation Over Transformation (EOT):
- $T$: a set of randomized transformations; average gradients over them, using $\nabla_x\,\mathbb{E}_{t\sim T}\,\mathcal{L}(f(t(x)), y) = \mathbb{E}_{t\sim T}\,\nabla_x\,\mathcal{L}(f(t(x)), y)$;
- can break randomization-based defenses.

BPDA+EOT breaks 7 defenses published at ICLR 2018. We got a survivor!
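Both tricks are small autograd interventions. A sketch, where `g` stands for any non-differentiable preprocessing (e.g., a JPEG compression routine) and `transforms` for the randomized transformation set $T$; both are assumed, illustrative inputs:

```python
import random
import torch
import torch.nn.functional as F

class BPDAWrapper(torch.autograd.Function):
    """BPDA sketch: run the true (non-differentiable) preprocessing g on the
    forward pass, and approximate its Jacobian by the identity on the
    backward pass (reasonable whenever g(x) is close to x)."""
    @staticmethod
    def forward(ctx, x, g):
        return g(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None  # identity gradient; no gradient for g

def eot_gradient(model, x, y, transforms, n=30):
    """EOT sketch: average the loss gradient over n sampled differentiable
    transformations t ~ T (gradient of the expectation = expectation of
    the gradient)."""
    x = x.clone().requires_grad_(True)
    total = 0.0
    for _ in range(n):
        t = random.choice(transforms)
        total = total + F.cross_entropy(model(t(x)), y)
    (grad,) = torch.autograd.grad(total / n, x)
    return grad

# usage inside an attack step (illustrative):
#   logits = model(BPDAWrapper.apply(x_adv, jpeg_compress))
```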
How to Properly Evaluate a Defense?
Carlini, Nicholas, et al. "On evaluating adversarial robustness." arXiv:1902.06705, 2019. Athalye et al. "Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples." ICML 2018.
- Do not blindly apply multiple (similar) attacks.
- Try at least one gradient-free attack and one hard-label attack.
- Perform a transferability attack using a similar substitute model.
- For randomized defenses, properly ensemble over the randomness.
- For non-differentiable components, apply differentiable techniques (BPDA).
- Verify that the attacks have converged under the selected hyperparameters.
- Carefully investigate attack hyperparameters and report those selected.
- Compare against prior work and explain important differences.
- Test broader threat models when proposing general defenses.
Robust Activation Functions
Xiao et al. "Enhancing Adversarial Defense by k-Winners-Take-All." ICLR 2020.
- Block the internal activation gradients by breaking continuity: the k-Winners-Take-All (k-WTA) activation.
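A minimal sketch of the k-WTA activation; the sparsity ratio is illustrative:

```python
import torch

def kwta(x, sparsity=0.1):
    """k-Winners-Take-All sketch: per sample, keep the k largest activations
    and zero out the rest. The winner set changes discontinuously with the
    input, which is what breaks gradient-based attacks."""
    flat = x.flatten(1)
    k = max(1, int(sparsity * flat.size(1)))
    kth = flat.topk(k, dim=1).values[:, -1].unsqueeze(1)  # k-th largest value
    return (flat * (flat >= kth)).view_as(x)
```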
Robust Loss Function
Pang et al. Rethinking softmax cross-entropy loss for adversarial robustness. ICLR 2020.
- Replace softmax cross-entropy (SCE) with the Max-Mahalanobis center (MMC) loss.
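A minimal sketch of an MMC-style loss, assuming `centers` holds precomputed, maximally separated class centers in feature space; Pang et al.'s analytic construction of those centers is omitted here:

```python
import torch

def mmc_loss(features, labels, centers):
    """MMC-style loss sketch: squared L2 distance between the feature vector
    and the fixed center of the true class, replacing softmax cross-entropy.
    `centers` is a [num_classes, feature_dim] tensor."""
    return 0.5 * (features - centers[labels]).pow(2).sum(dim=1).mean()
```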
Robust Inference
Pang et al. Mixup Inference: Better exploiting mixup to defend adversarial attacks. ICLR 2020.
- Mixup Inference (MI).
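A rough sketch of the mixup-inference idea, assuming a pool of clean samples `clean_pool`; the paper's MI variants and hyperparameters differ in detail:

```python
import torch

def mixup_inference(model, x, clean_pool, lam=0.6, n=30):
    """Mixup Inference sketch: mix the (possibly adversarial) test input with
    randomly drawn clean samples and average predictions; mixing shrinks any
    adversarial perturbation by roughly a factor of lam."""
    probs = 0.0
    for _ in range(n):
        idx = torch.randint(0, clean_pool.size(0), (x.size(0),))
        x_mix = lam * x + (1 - lam) * clean_pool[idx]
        probs = probs + torch.softmax(model(x_mix), dim=1)
    return probs / n
```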
New Adaptive Attacks Break These Defenses
Tramer et al. "On adaptive attacks to adversarial example defenses." NeurIPS 2020.
- T1: Attack the full defense
- T2: Target important defense parts
- T3: Simplify the attack
- T4: Ensure a consistent loss function
- T5: Optimize with different methods
- T6: Use strong adaptive attacks
How to Evaluate a Defense?
Croce and Hein. "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks." ICML 2020. Gao et al. Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack. ICML 2022. Zimmermann et al. "Increasing Confidence in Adversarial Robustness Evaluations." arXiv:2206.13991, 2022.
Strong attacks:
- AutoAttack (a must-test attack)
- Margin Decomposition (MD) attack (better than AutoAttack on ViTs)
- Minimum-Margin (MM) attack (a new SOTA attack to test)
Extra robustness tests:
- Attack unit tests (Zimmermann et al., 2022)
Adversarial Training
Goodfellow IJ, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. ICLR 2015.
The idea is simple: just train on adversarial examples!
Adversarial training is a form of data augmentation: original data -> adversarial attack -> adversarial examples -> model training.
Adversarial training produces a smooth decision boundary.
(Figure: the normal decision boundary, the generated adversarial examples, and the boundary after adversarial training.)
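A minimal PyTorch sketch of FGSM-based adversarial training in the spirit of Goodfellow et al. (2015); the hyperparameters are illustrative:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """Single-step FGSM: perturb along the sign of the input gradient."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    (grad,) = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def adversarial_training_step(model, optimizer, x, y, eps=8/255):
    """One adversarial-training step: generate adversarial examples from the
    current batch, then train on them as augmented data."""
    x_adv = fgsm(model, x, y, eps)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```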
Early Adversarial Training Methods
Szegedy C, Zaremba W, Sutskever I, et al. Intriguing properties of neural networks. ICLR 2014. Goodfellow IJ, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. ICLR 2015.
- In 2014, Szegedy et al. already explored adversarial training in the paper that introduced adversarial examples: the L-BFGS attack was used to generate adversarial examples for every layer of the network, and these were added to the training process. Finding: adversarial examples for the deeper layers improve robustness more.
- In 2015, Goodfellow et al. proposed training neural networks on adversarial examples generated by the single-step FGSM attack. They did not use intermediate-layer adversarial examples, since these were found to bring no improvement.
Min-Max Robust Optimization
Nokland et al. Improving back-propagation by adding an adversarial gradient. arXiv:1510.04189, 2015. Huang et al. Learning with a strong adversary. ICLR 2016. Shaham et al. Understanding adversarial training: Increasing local stability of neural nets through robust optimization. arXiv:1511.05432, 2015.
The first proposals of min-max optimization:
$\min_\theta \mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\max_{\|x'-x\|\le\epsilon} \mathcal{L}(f_\theta(x'), y)\Big]$
- Inner maximization: find the loss-maximizing perturbed input $x'$.
- Outer minimization: train the model parameters $\theta$ on the worst case.
Virtual Adversarial Training
Miyato et al. Distributional smoothing with virtual adversarial training. ICLR 2016.
VAT: a method to improve generalization.
Differences to adversarial training:
- L2-regularized perturbation
- Uses both clean and adversarial examples for training
- Uses KL divergence to generate the adversarial examples
Weaknesses of Early AT Methods
Miyato et al. Distributional smoothing with virtual adversarial training. ICLR 2016.
These methods are fast: training takes only about 2x the time of standard training. However, the robustness they provide comes from weak single-step (or virtual) attacks and does not hold up against stronger iterative attacks.
PGD Adversarial Training
Athalye et al. "Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples." ICML 2018. Athalye et al. Synthesizing robust adversarial examples. ICML 2018.
We got a survivor! The one defense that survived the circumvention study was PGD adversarial training.
PGD Adversarial Training
Madry et al. "Towards Deep Learning Models Resistant to Adversarial Attacks." ICLR 2018.
A saddle point problem:
$\min_\theta \mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\max_{\|x'-x\|_\infty\le\epsilon} \mathcal{L}(f_\theta(x'), y)\Big]$
- Inner maximization: find the worst-case adversarial example.
- Outer minimization: train the model on it.
This is a saddle point (constrained bi-level optimization) problem. In constrained optimization, Projected Gradient Descent (PGD) is the best first-order solver.
PGD Adversarial Training
Madry et al. "Towards Deep Learning Models Resistant to Adversarial Attacks." ICLR 2018.
Projected Gradient Descent (PGD):
- PGD is an optimizer; in the field of AML, it is also known as an adversarial attack.
- Update with projection (clipping): $x^{t+1} = \Pi_{\mathcal{B}_\epsilon(x)}\big(x^t + \alpha\,\mathrm{sign}(\nabla_x \mathcal{L}(f_\theta(x^t), y))\big)$
- Random initialization with uniform noise: $x^0 = x + u$, $u \sim \mathcal{U}(-\epsilon, \epsilon)$
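A minimal sketch of the L-infinity PGD attack with random start and projection; step sizes and step count are illustrative:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L-infinity PGD sketch: random start in the eps-ball, signed gradient
    ascent steps, and projection (clipping) back onto the ball each step."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        (grad,) = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0, 1)
    return x_adv.detach()

# PGD adversarial training = standard training on pgd_attack(model, x, y)
# batches: the outer minimization consumes the inner maximizer's output.
```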
Characteristics of PGD Adversarial Training
Madry et al. "Towards Deep Learning Models Resistant to Adversarial Attacks." ICLR 2018. Ilyas et al. "Adversarial examples are not bugs, they are features." NeurIPS 2019.
(Figures: decision boundaries and learned features under standard vs. adversarial training, with adversarial examples marked; adversarial training yields more robust features.)
Dynamic Adversarial Training (DART)
Wang et al. "On the Convergence and Robustness of Adversarial Training." ICML 2019.
(Figures: the effect of PGD step size and PGD step count on robustness; simple attacks suffice early in training.)
How to measure the convergence of the inner maximization?
Definition (First-Order Stationary Condition, FOSC):
$c(x') = \epsilon\,\|\nabla_x \mathcal{L}(f_\theta(x'), y)\|_1 + \langle x - x',\ \nabla_x \mathcal{L}(f_\theta(x'), y)\rangle$
FOSC is non-negative, and $c(x') = 0$ means the inner maximization has converged to a first-order stationary point; smaller values indicate a better-solved inner problem.
Dynamic Adversarial Training: use a weak attack for early training and a strong attack for later training.
- Weak attacks improve generalization; strong attacks improve final robustness.
- Convergence analysis: DART improves robustness.
- Robustness on CIFAR-10 with WideResNet.
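A sketch of computing FOSC from the definition above, per example and assuming a cross-entropy inner loss:

```python
import torch
import torch.nn.functional as F

def fosc(model, x, x_adv, y, eps):
    """FOSC sketch (Wang et al., 2019): c(x') = eps * ||g||_1 + <x - x', g>,
    where g is the input gradient of the loss at the adversarial point x'.
    Returns one non-negative value per example; 0 means fully converged."""
    x_adv = x_adv.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    (g,) = torch.autograd.grad(loss, x_adv)
    g = g.flatten(1)
    diff = (x - x_adv.detach()).flatten(1)
    return eps * g.abs().sum(dim=1) + (diff * g).sum(dim=1)
```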
TRADES
Zhang et al. "Theoretically principled trade-off between robustness and accuracy." ICML 2019.
Use a distribution loss (KL divergence) for both the inner and the outer optimization:
$\min_\theta \mathbb{E}_{(x,y)}\Big[\mathrm{CE}\big(f_\theta(x), y\big) + \beta \max_{\|x'-x\|\le\epsilon} \mathrm{KL}\big(f_\theta(x)\,\|\,f_\theta(x')\big)\Big]$
The winning solution of the NeurIPS 2018 Adversarial Vision Challenge.
Characteristics of TRADES:
- Using KL divergence to supervise adversarial example generation improves robustness significantly.
- Clean examples also participate in training, which helps convergence and clean accuracy.
- The KL-based adversarial example generation has an adaptive flavor.
- Training yields a smoother decision boundary than PGD adversarial training.
- TRADES improves both the inner maximization and the outer minimization.
Experimental results of TRADES. (Tables from the slides.)

TRADES vs. VAT vs. ALP
Zhang et al. "Theoretically principled trade-off between robustness and accuracy." ICML 2019. Miyato et al. Distributional smoothing with virtual adversarial training. ICLR 2016. Kannan, Harini, Alexey Kurakin, and Ian Goodfellow. "Adversarial logit pairing." arXiv:1803.06373, 2018.
- TRADES: clean cross-entropy plus a KL term between clean and adversarial predictions.
- Virtual Adversarial Training: KL-based smoothing against L2-bounded virtual perturbations.
- Adversarial Logit Pairing: cross-entropy on adversarial examples plus an L2 penalty pairing clean and adversarial logits.
Similar optimization frameworks, different loss choices, very different results.
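A minimal sketch of the TRADES objective, mirroring the structure of the published method; $\beta = 6$ and the PGD-style step schedule are illustrative:

```python
import torch
import torch.nn.functional as F

def trades_loss(model, x, y, eps=8/255, alpha=2/255, steps=10, beta=6.0):
    """TRADES sketch: clean cross-entropy plus beta * KL(f(x) || f(x')),
    where x' is found by maximizing the KL term with PGD-style steps."""
    p_clean = F.softmax(model(x), dim=1).detach()
    # inner maximization: generate x' against the KL divergence
    x_adv = (x + 0.001 * torch.randn_like(x)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        kl = F.kl_div(F.log_softmax(model(x_adv), dim=1), p_clean,
                      reduction='batchmean')
        (grad,) = torch.autograd.grad(kl, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0, 1)
    # outer minimization: accuracy term on x plus robustness term on x'
    logits = model(x)
    robust_kl = F.kl_div(F.log_softmax(model(x_adv), dim=1),
                         F.softmax(logits, dim=1), reduction='batchmean')
    return F.cross_entropy(logits, y) + beta * robust_kl
```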
MART: Misclassification Aware adveRsarial Training
Wang et al. "Improving adversarial robustness requires revisiting misclassified examples." ICLR 2020.
Adversarial examples are only defined on correctly classified examples; what about misclassified examples?
The influence of misclassified and correctly classified examples:
- Misclassified examples have a significant impact on the final robustness!
- On misclassified examples, different maximization techniques have a negligible effect, while different minimization techniques have a significant effect.
Misclassification-aware adversarial risk: split the training data into correctly classified and misclassified examples, and define the adversarial risk so that the two subsets are treated differently.
Surrogate loss functions (existing methods and MART): MART combines a boosted cross-entropy on adversarial examples with a KL regularizer weighted by $1 - p_y(x)$, so misclassified examples (low $p_y$) receive a larger regularization weight.
Semi-supervised extension of MART: the misclassification-aware weighting also applies to unlabeled data.
Robustness of MART:
- White-box robustness: ResNet-18, CIFAR-10, $\epsilon = 8/255$.
- White-box robustness: WideResNet-34-10, CIFAR-10, $\epsilon = 8/255$.
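A sketch of the MART surrogate loss, assuming `x_adv` was generated by a standard PGD inner step; the boosted cross-entropy and the reweighted KL follow the form described above:

```python
import torch
import torch.nn.functional as F

def mart_loss(model, x, x_adv, y, beta=5.0):
    """MART sketch: boosted cross-entropy on adversarial examples plus a KL
    term reweighted by (1 - p_y(x)), emphasizing misclassified examples."""
    p_adv = F.softmax(model(x_adv), dim=1)
    p_clean = F.softmax(model(x), dim=1)
    py_adv = p_adv.gather(1, y.unsqueeze(1)).squeeze(1)
    # boosted CE: standard CE plus a margin term on the top wrong class
    top_wrong = p_adv.scatter(1, y.unsqueeze(1), 0.0).max(dim=1).values
    bce = -torch.log(py_adv + 1e-12) - torch.log(1.0 - top_wrong + 1e-12)
    # KL(p_clean || p_adv), reweighted by how poorly x is classified
    kl = (p_clean * (torch.log(p_clean + 1e-12)
                     - torch.log(p_adv + 1e-12))).sum(dim=1)
    py_clean = p_clean.gather(1, y.unsqueeze(1)).squeeze(1)
    return (bce + beta * kl * (1.0 - py_clean)).mean()
```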
Using More Data to Improve Robustness
Alayrac et al.