Data and Model Security, Courseware, Week 2: Explainability and Common Robustness

Uploaded by: y***; IP location: Shandong; Uploaded: 2024-10-12; Format: PPTX; Pages: 62; Size: 44.71 MB


Explainability & Common Robustness

Yu-Gang Jiang, Xingjun Ma, Zuxuan Wu

Recap: week 1
1. What is Machine Learning
2. Machine Learning Paradigms
3. Loss Functions
4. Optimization Methods

Machine Learning Pipeline
Set up the input
Set up the optimiser
Set up the loss
Regularization makes the decision region smoother
Landscape of a loss function: it varies w.r.t. the data and the function itself
Model?

Deep Neural Networks
/neural-network-zoo/; /articles/cc-machine-learning-deep-learning-architectures/

Feed-Forward Neural Networks
Feed-Forward Neural Networks (FNN)
Fully Connected Neural Networks (FCN)
Multilayer Perceptron (MLP)
The simplest neural network
Fully connected between layers
For data that has NO temporal or spatial order

Convolutional Neural Networks
For images or data with spatial order
Can stack >100 layers
Figure labels: neurons arranged in multiple dimensions; neurons in one flat layer

Recurrent Neural Networks
/~shervine/teaching/cs-230/cheatsheet-recurrent-neural-networks
Traditional RNN

Transformers
Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems 30 (2017)
Transformer: a new type of DNN based on attention
Encoder
Decoder

Self-Attention Explained
/illustrated-self-attention-2d627e33b20a

CNN Explained
Learns different levels of representations

A brief history of CNNs:
LeNet, 1990s
AlexNet, 2012
ZFNet, 2013
GoogLeNet, 2014
VGGNet, 2014
ResNet, 2015
InceptionV4, 2016
ResNeXt, 2017
ViT, 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, ICLR 2021

Explainable AI
Deep learning explainability
Learning mechanism: learning process, learning outcome
Inference mechanism: decision basis, inference mechanism
Generalization mechanism: reasons for generalization, conditions for generalization
Cognitive mechanism: cognitive science, cognition-inspired intelligence
Robustness: common robustness, adversarial robustness
We want to answer the following questions:
How do DNNs learn, what have they learned, what do they rely on to generalize, and under what conditions do they work or fail?
Is deep learning true intelligence, how does it compare with human intelligence, and what is its future?
Is there a unified theory that can not only explain deep learning but also improve it?

Methodological Principles
Visualization, Ablation, Contrast
Model, Component, Layer, Operation, Neuron
Superclass, Class
Training/Test set, Subset, Sample
Training, Inference, Transfer, Reverse

How to Understand Machine Learning
Learning is the process of empirical risk minimization (ERM)

Learning Mechanism
Training/Test Error/Accuracy
Prediction Confidence
Explanation via observation: just plot!
Wang et al. Symmetric Cross Entropy for Robust Learning with Noisy Labels, ICCV 2019.

Learning Mechanism
Parameter dynamics
Gradient dynamics
Explanation via dynamics and information
TRADI: Tracking deep neural network weight distributions, ECCV 2020;

Shwartz-Ziv R, Tishby N. Opening the black box of deep neural networks via information[J]. arXiv:1703.00810, 2017.

Learning Mechanism
Decision boundary, learning process visualization
Explanation via dynamics and information
https://distill.pub/2020/grand-tour/ (March 16, 2020)
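The observation-based explanations on the Learning Mechanism slides (training error and accuracy, prediction confidence, parameter and gradient dynamics) come down to logging a few scalars per epoch and plotting them. The sketch below is a minimal illustration of that idea, not the setup used in the cited papers: the 1-D logistic regression model, the synthetic two-class data, and the function name `train_and_log` are all assumptions made for the example.

```python
import math
import random


def train_and_log(n_epochs=50, lr=0.5, seed=0):
    """Train 1-D logistic regression with full-batch gradient descent and
    record per-epoch loss, accuracy, mean confidence, and gradient norm."""
    rng = random.Random(seed)
    # Hypothetical toy data: class 0 centred at -1, class 1 centred at +1.
    data = [(rng.gauss(-1.0, 1.0), 0) for _ in range(100)]
    data += [(rng.gauss(1.0, 1.0), 1) for _ in range(100)]
    n = len(data)
    w, b = 0.0, 0.0
    history = []
    for epoch in range(n_epochs):
        loss = grad_w = grad_b = confidence = 0.0
        correct = 0
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # P(y = 1 | x)
            p_true = p if y == 1 else 1.0 - p
            loss -= math.log(max(p_true, 1e-12))       # cross-entropy term
            grad_w += (p - y) * x
            grad_b += p - y
            confidence += max(p, 1.0 - p)              # confidence of the predicted class
            correct += int((p >= 0.5) == (y == 1))
        grad_w, grad_b = grad_w / n, grad_b / n
        # Log the state *before* the update so epoch 0 shows the initial model.
        history.append({
            "epoch": epoch,
            "loss": loss / n,
            "accuracy": correct / n,
            "confidence": confidence / n,
            "grad_norm": math.hypot(grad_w, grad_b),
            "w": w,
            "b": b,
        })
        w -= lr * grad_w
        b -= lr * grad_b
    return history


history = train_and_log()
for record in history[::10]:
    print(record["epoch"], round(record["loss"], 3), round(record["accuracy"], 3),
          round(record["confidence"], 3), round(record["grad_norm"], 4))
```

Each list of logged values can be passed directly to a plotting library; with a real network, the same loop would typically log test-set metrics alongside the training ones.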

Learning Mechanism
Data influence/valuation: how does a training sample impact the learning outcome?
Understanding Black-box Predictions via Influence Functions, ICML, 2017;

Pruthi G, Liu F, Kale S, et al. Estimating training data influence by tracing gradient descent. NeurIPS, 2020.
Data Shapley: Equitable valuation of data for machine learning, ICML, 2019.
Figure labels: Influence Function, Data Shapley

Influence Function
How would the model parameters change if a sample were removed from the training set?
Understanding Black-box Predictions via Influence Functions, ICML, 2017;

Goal:

Cook, R. D. and Weisberg, S. Residuals and influence in regression. New York: Chapman and Hall, 1982

Therefore:
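For reference, the influence function is usually written as follows in the cited paper, Understanding Black-box Predictions via Influence Functions. The notation ($\hat{\theta}$, $\epsilon$, $H_{\hat{\theta}}$, $L$) follows that paper and is an assumption about the symbols used on the slide itself.

```latex
% Empirical risk minimizer, and the minimizer after upweighting sample z by epsilon
\hat{\theta} = \arg\min_{\theta} \frac{1}{n}\sum_{i=1}^{n} L(z_i,\theta),
\qquad
\hat{\theta}_{\epsilon,z} = \arg\min_{\theta} \frac{1}{n}\sum_{i=1}^{n} L(z_i,\theta) + \epsilon\, L(z,\theta)

% Influence of upweighting z on the parameters
\mathcal{I}_{\text{up,params}}(z)
  = \left.\frac{d\hat{\theta}_{\epsilon,z}}{d\epsilon}\right|_{\epsilon=0}
  = -H_{\hat{\theta}}^{-1}\,\nabla_{\theta} L(z,\hat{\theta}),
\qquad
H_{\hat{\theta}} = \frac{1}{n}\sum_{i=1}^{n}\nabla_{\theta}^{2} L(z_i,\hat{\theta})

% Removing z corresponds to epsilon = -1/n
\hat{\theta}_{-z} - \hat{\theta} \approx -\frac{1}{n}\,\mathcal{I}_{\text{up,params}}(z)
```

The last line is what answers the slide's question: removing a sample is approximated by a single Hessian-inverse-vector product instead of retraining the model.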

Training Data Influence
How would the model loss on z' change if we update on sample z?
Pruthi G, Liu F, Kale S, et al. Estimating training data influence by tracing gradient descent. NeurIPS, 2020
First-order approximation of the above (assuming the one-step update is small)?
Checkpoints store the interim updates
Therefore:

Understanding the Learned Model
Loss Landscape
Deep features
t-SNE plot
Maaten et al. Visualizing data using t-SNE. JMLR, 2008.
https://distill.pub/2016/misread-tsne/?_ga=2.135835192.888864733.1531353600-1779571267.1531353600

Understanding the Learned Model
Class-wise Patterns
Intermediate Layer Activation Map
Activation/Attention Map
Li et al. Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Network, ICLR 2021;

Zhao et al. What do deep nets learn? Class-wise patterns revealed in the input space. arXiv:2101.06898 (2021).
One predictive pattern for each class

What do deep nets learn?
Zhao, Shihao, et al. "What do deep nets learn? Class-wise patterns revealed in the input space." arXiv:2101.06898 (2021).
Goal: understanding the knowledge learned by a model for a particular class.
Method: extract one single pattern for one class; what would this pattern be?

Other considerations: we need this in pixel space, so that the patterns are interpretable

How to Find the Class-wise Pattern
Canvas image
Patterns extracted on different canvases (red rectangles)

Class-wise Patterns Revealed
Patterns extracted on original, non-robust, and robust CIFAR-10, and patterns of adversarially trained models
Predictive power of different sizes of patterns

Inference Mechanism
Class Activation Map (Grad-CAM)
Guided Backpropagation
Selvaraju et al. Grad-CAM: Visual explanations from deep networks via gradient-based localization. ICCV 2017.
Springenberg et al. Striving for Simplicity: The All Convolutional Net, ICLR 2015.

Guided Backpropagation
Springenberg et al. Striving for Simplicity: The All Convolutional Net, ICLR 2015.
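Guided backpropagation differs from ordinary backpropagation and from the deconvnet approach only in how a gradient signal is passed backwards through a ReLU. The following is a minimal sketch of the three rules on flat lists of numbers; the function names are hypothetical, and a real implementation would apply the same masks to tensors inside a framework's backward pass.

```python
def relu_forward(x):
    """ReLU forward pass: keep positive inputs, zero out the rest."""
    return [max(v, 0.0) for v in x]


def backprop_relu(x, grad_out):
    """Ordinary backpropagation: pass the gradient where the forward input was positive."""
    return [g if v > 0 else 0.0 for v, g in zip(x, grad_out)]


def deconvnet_relu(x, grad_out):
    """Deconvnet rule: ignore the forward input, pass only positive gradients."""
    return [g if g > 0 else 0.0 for g in grad_out]


def guided_backprop_relu(x, grad_out):
    """Guided backpropagation: pass the gradient only where both the forward
    input and the incoming gradient are positive."""
    return [g if (v > 0 and g > 0) else 0.0 for v, g in zip(x, grad_out)]


x = [1.0, -1.0, 2.0, -3.0]          # inputs to the ReLU in the forward pass
grad_out = [-2.0, 3.0, 1.0, -1.0]   # gradient arriving from the layer above

print(relu_forward(x))                    # [1.0, 0.0, 2.0, 0.0]
print(backprop_relu(x, grad_out))         # [-2.0, 0.0, 1.0, 0.0]
print(deconvnet_relu(x, grad_out))        # [0.0, 3.0, 1.0, 0.0]
print(guided_backprop_relu(x, grad_out))  # [0.0, 0.0, 1.0, 0.0]
```

Combining both masks is what suppresses negative evidence and tends to give the sharper saliency maps associated with the method.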

/@chinesh4/generalized-way-of-interpreting-cnns-a7d1b0178709
ReLU forward pass
ReLU backward pass
Deconvolution for ReLU
Guided Backpropagation

Class Activation Mapping (CAM)
Zhou et al. Learning Deep Features for Discriminative Localization. CVPR, 2016.

/@chinesh4/generalized-way-of-interpreting-cnns-a7d1b0178709
GAP: Global Average Pooling

Grad-CAM
B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning Deep Features for Discriminative Localization. In CVPR, 2016;
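Grad-CAM reuses the global-average-pooling idea from CAM, but pools gradients rather than features: each feature map gets an importance weight equal to the spatial mean of the class-score gradient, and the heatmap is the ReLU of the weighted sum of feature maps. The sketch below assumes the activations and gradients of the last convolutional layer have already been obtained from a framework; the function names and the nearest-neighbour upsampling (a stand-in for the bilinear interpolation normally used) are assumptions made for illustration.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap.

    activations: K feature maps, each an H x W nested list (last conv layer).
    gradients:   same shape; gradients[k][i][j] is the derivative of the
                 class score with respect to activations[k][i][j].
    Returns (alphas, heatmap), where heatmap is H x W.
    """
    num_maps = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])
    # Neuron importance: global average pooling of the gradients per feature map.
    alphas = [sum(sum(row) for row in gradients[k]) / (height * width)
              for k in range(num_maps)]
    # Weighted combination of activation maps, followed by ReLU.
    heatmap = [[max(0.0, sum(alphas[k] * activations[k][i][j] for k in range(num_maps)))
                for j in range(width)]
               for i in range(height)]
    return alphas, heatmap


def upsample_nearest(heatmap, factor):
    """Resize the heatmap towards input resolution (nearest neighbour for brevity)."""
    return [[heatmap[i // factor][j // factor]
             for j in range(len(heatmap[0]) * factor)]
            for i in range(len(heatmap) * factor)]


# Two hypothetical 2x2 feature maps and their gradients.
activations = [[[1, 0], [0, 2]],
               [[0, 3], [1, 0]]]
gradients = [[[1, 1], [1, 1]],
             [[-1, -1], [-1, -1]]]

alphas, heatmap = grad_cam(activations, gradients)
print(alphas)                        # [1.0, -1.0]
print(heatmap)                       # [[1.0, 0.0], [0.0, 2.0]]
print(upsample_nearest(heatmap, 2))
```

Because only gradients are needed, this works for any architecture with a convolutional feature layer, whereas CAM requires the network to end in global average pooling followed by a linear classifier.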

/@chinesh4/generalized-way-of-interpreting-cnns-a7d1b0178709
Grad-CAM is a generalization of CAM
Compute neuron importance:
Weighted combination of activation maps, then interpolation:

LIME
Local Interpretable Model-agnostic Explanations (LIME)
Ribeiro et al. "Why should I trust you?" Explaining the predictions of any classifier. SIGKDD, 2016.
/marcotcr/lime
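The core of LIME can be sketched in a few steps: perturb the instance by switching interpretable features on and off, query the black-box model on each perturbation, weight the perturbations by their proximity to the original instance, and fit a weighted linear model whose coefficients serve as the explanation. The code below is a simplified illustration under stated assumptions and is not the API of the `lime` package: binary on/off features, a normalized Hamming distance with an exponential kernel, and a small ridge term are all choices made for this example.

```python
import math
import random


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def lime_explain(black_box, n_features, n_samples=500,
                 kernel_width=0.75, ridge=1e-3, seed=0):
    """Fit a locally weighted linear surrogate around the all-features-on instance.

    black_box takes a list of 0/1 flags (feature present or removed) and
    returns a number. Returns (feature_weights, intercept).
    """
    rng = random.Random(seed)
    d = n_features + 1                      # last column is the intercept
    xtwx = [[0.0] * d for _ in range(d)]
    xtwy = [0.0] * d
    for _ in range(n_samples):
        z = [rng.randint(0, 1) for _ in range(n_features)]
        # Proximity to the original instance (all ones): assumed kernel choice.
        dist = (n_features - sum(z)) / n_features
        weight = math.exp(-(dist ** 2) / (kernel_width ** 2))
        row = z + [1]
        y = black_box(z)
        for i in range(d):
            xtwy[i] += weight * row[i] * y
            for j in range(d):
                xtwx[i][j] += weight * row[i] * row[j]
    for i in range(n_features):
        xtwx[i][i] += ridge                 # light regularization on feature weights
    coef = solve_linear(xtwx, xtwy)
    return coef[:n_features], coef[-1]


# Hypothetical black box: features 0 and 2 matter, feature 1 is irrelevant.
weights, intercept = lime_explain(lambda z: 3.0 * z[0] - 2.0 * z[2] + 0.5, n_features=3)
print([round(w, 2) for w in weights], round(intercept, 2))
```

For images, the on/off features would be superpixels, and `black_box` would render the masked image and return the classifier's probability for the class being explained.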

Integrated Gradients
Sundararajan M, Taly A, Yan Q. Axiomatic attribution for deep networks, ICML, 2017.
/TianhongDai/integrated-gradient-pytorch
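Integrated Gradients attributes a prediction to each input feature by averaging the gradient along a straight path from a baseline to the input and scaling by the input's distance from the baseline. The sketch below approximates the path integral with a midpoint Riemann sum and uses finite differences in place of automatic differentiation; both choices, the toy function, and the function names are assumptions for illustration and not the code from the repository linked above.

```python
def numerical_gradient(f, x, eps=1e-6):
    """Central-difference gradient of a scalar function f at the point x."""
    grad = []
    for i in range(len(x)):
        up = x[:]
        down = x[:]
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def integrated_gradients(f, x, baseline, steps=50):
    """Approximate IG_i = (x_i - b_i) * integral over a in [0, 1] of
    df/dx_i evaluated at b + a * (x - b)."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = numerical_gradient(f, point)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


def toy_model(v):
    """Hypothetical scalar model: quadratic in the first feature, linear in the second."""
    return v[0] ** 2 + 3.0 * v[1]


x = [2.0, 1.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(toy_model, x, baseline)
print([round(a, 4) for a in attributions])
# Completeness check: attributions sum to f(x) - f(baseline).
print(round(sum(attributions), 4), toy_model(x) - toy_model(baseline))
```

The completeness check in the last line is the axiom that motivates the method: the attributions add up to the difference between the model output at the input and at the baseline.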

Integrate the gradients along the way

Cognitive Distillation
Huang et al. Distilling Cognitive Backdoor Patterns within an Image, ICLR 2023
Mask extracted by cognitive distillation

Useful and non-useful features
Useful features: highly correlated with the true label in expectation, so if removed, the prediction changes
A backdoor trigger is a useful feature
Non-useful features: not correlated with the prediction; if removed, the prediction does not change
Ilyas, Andrew, et al. "Adversarial examples are not bugs, they are features." NeurIPS 2019

Cognitive Distillation
Objective: distill the minimal essence of useful features
Terms of the objective: model, total variation loss, random noise vector, original image, mask, cognitive pattern

Cognitive Distillation
Distilled patterns on backdoored samples
Figure labels: x_cp, m, x

How to Verify Cognitive Patterns are Essential
Backdoored image, binarized mask {0,1}, original image
Construct simplified backdoor patterns:

Backdoor Patterns Can Be Made Simpler
Figure labels: x_cp, m, x, x_bd'

Backdoor Patterns Can Be Made Simpler
Simplified backdoor patterns also work!

L1 Norm Distribution of the Distilled Mask

Detect Backdoor Samples
Attacks: 12 backdoor attacks
Models: ResNet-18, Pre-Activation ResNet-101, MobileNet v2, VGG-16, Inception, EfficientNet-b0
Datasets: CIFAR-10/GTSRB/ImageNet subset
Evaluation metric: area under the ROC curve (AUROC)
Detection baselines: Anti-Backdoor Learning (ABL) [2], Activation Clustering (AC) [3], Frequency [4], STRIP [5], Spectral Signatures [6]
CD-L (logits layer) and CD-F (last activation layer)

Superb Detection Performance

Discover Biases in Facial Recognition Models
CelebA dataset: 40 binary facial attributes (gender, bald, and hair color)
Known bias between gender and blond hair
Apply CD in the same way as backdoor detection
Select the subset of samples with low L1 norm
Examine attributes of the subset
Calculate the distribution shift between the subset and the full dataset

Discover Biases in Facial Recognition Models
Masks distilled for predicting each attribute

Generalization Mechanism
Convergence
Generalization

Deep Learning Theory
Convergence
Convex (linear model)
Nonconvex (DNN)
Saddle point

Generalization
Training time: 'Cat'
Test time: 'Cat'?
Traditional theory: a simpler model is better, more data is better

Generalization Theory
/~ninamf/ML11/lect1117.pdf; /watch?v=zlqQ7VRba2Y

Components of Generalization Error Bounds
Generalization error, empirical error, hypothesis class complexity, confidence, sample size
RHS: for all terms, the lower the better:
Small training error
Simpler model class
More samples
Less confidence

Generalization Theory
Zhang et al. Understanding deep learning requires rethinking generalization. ICLR 2017.
Small training error ≠ low generalization error
Zero training error was achieved on purely random labels (meaningless learning)
0 training error vs. 0.9 test error

List of Existing Theories
Rademacher Complexity bounds (Bartlett et al. 2017)
PAC-Bayes bounds (Dziugaite and Roy 2017)
Information bottleneck (Tishby and Zaslavsky 2015)
Neural tangent kernel/Lazy training (Jacot et al. 2018)
Mean-field analysis (Chizat and Bach 2018)
Double Descent (Belkin et al. 2019)
Entropy SGD (Chaudhari et al. 2019)
/watch?v=zlqQ7VRba2Y

A few interesting questions:
Should we consider the role of data in generalization analysis?
Should representation quality appear in the generalization bound?
Is generalization about math (the function of the model) or knowledge?
How to visualize generalization?

Existing approaches
Test error
Visualization: loss landscape, prediction attribution, etc.
Training vs. test: distribution shift, out-of-distribution analysis
Noisy labels in test data: questioning data quality and reliable evaluation

The remaining questions: how does generalization happen?
Math ≠ Knowledge
Computation: finding patterns vs. understanding the underlying knowledge
What is the relation between computational generalization and human behavior?

Cognitive Mechanism
OpenAI reveals the multimodal neurons in CLIP
/blog/multimodal-neurons/; /blog/clip/

Cognitive Mechanism
Ritter et al. Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study, ICML, 2017
Cognitive psychology inspired evaluation of DNNs
Shape match probability means shape bias

Cognitive Mechanism
Geirhos, Robert, et al. "Shortcut learning in deep neural networks." Nature Machine Intelligence 2.11 (2020): 665-673.
Deep neural networks solve problems by taking shortcuts

Cognitive Mechanism
Rajalingham, Rishi, et al. "Large-scale, high-resolution comparison of the core visual object recognition behavior of humans, monkeys, and state-of-the-art deep artificial neural networks." Journal of Neuroscience 38.33 (2018): 7255-7269.

Rajalingham, Rishi, Kailyn Schmidt, and James J. DiCarlo. "Comparison of object recognition behavior in human and monkey." Journal of Neuroscience 35.35 (2015): 12127-121
