




Chapter 4 Partial Least Squares Analysis

4.1 Basic Concepts
4.2 NIPALS and SIMPLS Algorithms
4.3 Programming Method of Standard Partial Least Squares
4.4 Example Application
4.5 Stack Partial Least Squares
4.1 Basic Concepts
4.1.1 Partial Least Squares
Consider the general setting of a linear PLS algorithm to model the relation between two data sets (blocks of variables). Denote by X ? R^N an N-dimensional space of variables representing the first block, and similarly by Y ? R^M a space representing the second block of variables. PLS models the relations between these two blocks by means of score vectors. After observing n data samples from each block of variables, PLS decomposes the (n×N) matrix of zero-mean variables X and the (n×M) matrix of zero-mean variables Y into the form

X = TP^T + E
Y = UQ^T + F        (4.1.1)

Graphically, it can be shown as Fig. 4.1, where T and U are (n×p) matrices of the p extracted score vectors (components, latent vectors), the (N×p) matrix P and the (M×p) matrix Q represent matrices of loadings, and the (n×N) matrix E and the (n×M) matrix F are the matrices of residuals. The PLS method, which in its classical form is based on the nonlinear iterative partial least squares (NIPALS) algorithm, finds weight vectors w, c such that

[cov(t, u)]^2 = [cov(Xw, Yc)]^2 = max_{|r|=|s|=1} [cov(Xr, Ys)]^2
where cov(t, u) = t^T u/n denotes the sample covariance between the score vectors t and u.
The NIPALS algorithm starts with random initialization of the Y-space score vector u and repeats a sequence of the following steps until convergence:

(1) w = X^T u/(u^T u)
(2) w → w/|w|
(3) t = Xw
(4) c = Y^T t/(t^T t)
(5) c → c/|c|
(6) u = Yc
Note that u = y if M = 1; that is, Y is a one-dimensional vector that we denote by y. In this case the NIPALS procedure converges in a single iteration.
It can be shown that the weight vector w also corresponds to the first eigenvector of the following eigenvalue problem:

X^T Y Y^T X w = λw        (4.1.3)
The X- and Y-space score vectors t and u are then given as

t = Xw,  u = Yc
where the weight vector c is defined in steps (4) and (5) of NIPALS. Similarly, eigenvalue problems for the extraction of t, u or c estimates can be derived. The user then solves one of these eigenvalue problems, and the other score or weight vectors are readily computable using the relations defined in NIPALS.
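As an illustration of the loop above, here is a minimal NumPy sketch (our own rendering, not a reference implementation; initializing u from the first column of Y and stopping when u stabilizes are common choices):

import numpy as np

def nipals_once(X, Y, max_iter=500, tol=1e-06):
    """One NIPALS pass: weight vectors w, c and score vectors t, u."""
    u = Y[:, [0]]                               # initialize the Y-space score
    for _ in range(max_iter):
        w = np.dot(X.T, u) / np.dot(u.T, u)     # step (1)
        w = w / np.linalg.norm(w)               # step (2)
        t = np.dot(X, w)                        # step (3)
        c = np.dot(Y.T, t) / np.dot(t.T, t)     # step (4)
        c = c / np.linalg.norm(c)               # step (5)
        u_new = np.dot(Y, c)                    # step (6)
        if np.linalg.norm(u_new - u) < tol:     # convergence check on u
            return w, c, t, u_new
        u = u_new
    return w, c, t, u

For M = 1 the loop returns after a single pass, in line with the remark above.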
4.1.2 Forms of Partial Least Squares
PLS is an iterative process. After the extraction of the score vectors t, u, the matrices X and Y are deflated by subtracting their rank-one approximations based on t and u. Different forms of deflation define several variants of PLS.
Using equation (4.1.1), the vectors of loadings p and q are computed as coefficients of regressing X on t and Y on u, respectively:

p = X^T t/(t^T t),  q = Y^T u/(u^T u)
4.1.2.1 PLS Mode A
PLS Mode A is based on rank-one deflation of individual block matrices using the corresponding score and loading vectors. In each iteration of PLS Mode A, the X and Y matrices are deflated:

X = X ? tp^T,  Y = Y ? uq^T
4.1.2.2 PLS1, PLS2
PLS1 (one of the blocks of data consists of a single variable) and PLS2 (both blocks are multidimensional) are used as PLS regression methods. These variants of PLS are the most frequently used PLS approaches.
The relationship between X and Y is asymmetric. Two assumptions are made: (i) the score vectors {t_i}_{i=1}^p are good predictors of Y, where p denotes the number of extracted score vectors (PLS iterations); (ii) a linear inner relation between the score vectors t and u exists; that is,

U = TD + H        (4.1.5)

where D is the (p×p) diagonal matrix and H denotes the matrix of residuals.
The asymmetric assumption of the predictor/predicted variable(s) relation is transformed into a deflation scheme in which the score vectors {t_i}_{i=1}^p of the predictor space, say X, are good predictors of Y. The score vectors are then used to deflate Y; that is, a component of the regression of Y on t is removed from Y at each iteration of PLS:

X = X ? t t^T X/(t^T t),  Y = Y ? t t^T Y/(t^T t)
4.1.2.3 PLS-SB
As outlined at the end of the previous paragraph, the computation of all eigenvectors of equation (4.1.3) at once would define another form of PLS. This computation involves a sequence of implicit rank-one deflations of the overall cross-product matrix. This form of PLS is denoted PLS-SB, in accordance with the literature in which it was used. In contrast to PLS1 and PLS2, the extracted score vectors {t_i}_{i=1}^p are in general not mutually orthogonal.
4.1.3 PLS Regression
As mentioned in the previous section, PLS1 and PLS2 can be used to solve linear regression problems. Combining assumption (4.1.5) of a linear relation between the score vectors t and u with the decomposition of the Y matrix, equation (4.1.1) can be written as

Y = TDQ^T + (HQ^T + F)

This defines the equation

Y = TC^T + F*        (4.1.6)

where C^T = DQ^T now denotes the (p×M) matrix of regression coefficients and F* = HQ^T + F is the residual matrix. Equation (4.1.6) is simply the decomposition of Y using ordinary least squares regression with orthogonal predictors T.
We now consider orthonormalized score vectors t, that is, T^T T = I, and the matrix C = Y^T T of the weight vectors c not scaled to unit length. It is useful to redefine equation (4.1.6) in terms of the original predictors X. To do this, we use the relationship

T = XW(P^T W)^{-1}

where P is the matrix of loading vectors defined in equation (4.1.1). Plugging this relation into equation (4.1.6), we yield

Y = XW(P^T W)^{-1} C^T + F* = XB + F*
For a better understanding of these matrix equations, they are also given in graphical representation in Fig. 4.2.
where B represents the matrix of regression coefficients

B = W(P^T W)^{-1} C^T = X^T U(T^T X X^T U)^{-1} T^T Y

For the last equality, the relations among T, U, W and P are used [12,10,17]. Note that different scalings of the individual score vectors t and u do not influence the B matrix. For training data the estimate of PLS regression is

? = TT^T Y = XB
and for testing data we have

?_t = T_t T^T Y = X_t B

where X_t and T_t = X_t X^T U(T^T X X^T U)^{-1} represent the matrices of testing data and score vectors, respectively.
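The closed-form expression for B translates directly into NumPy. The following is a minimal sketch (our own illustration, assuming centered data and the score matrices T and U from a converged PLS fit):

import numpy as np

def pls_regression_matrix(X, Y, T, U):
    """B = X^T U (T^T X X^T U)^{-1} T^T Y, as in the expression above."""
    XXtU = np.dot(X, np.dot(X.T, U))            # X X^T U
    middle = np.linalg.inv(np.dot(T.T, XXtU))   # (T^T X X^T U)^{-1}
    return np.dot(np.dot(np.dot(X.T, U), middle), np.dot(T.T, Y))

# training estimate: Y_hat = np.dot(X, B); testing: np.dot(Xt, B)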
4.1.4 Statistics
From the matrices of residuals E_h and F_h, sums of squares can be calculated as follows: the total sum of squares over a matrix, the sums of squares over rows, and the sums of squares over columns. These sums of squares can be used to construct variance-like estimators. The statistical properties of these estimators have not undergone a rigorous mathematical treatment yet, but some properties can be understood intuitively.
Sums of squares over the columns indicate the importance of a variable for a certain component. Sums of squares over the rows indicate how well the objects fit the model. This can be used as an outlier detection criterion. Illustrations are given in Fig. 4.3(a) for variable statistics and in Fig. 4.3(b) for sample statistics.
An advantage of PLS is that these statistics can be calculated for every component. This is an ideal means of following the model-building process. The evolution of these statistics can be followed (as shown in Fig. 4.3(a) and (b)) as more and more components are calculated, so that an idea of how the different objects and variables fit can be obtained. In combination with a criterion for model dimensionality, the statistics can be used to estimate which objects and variables contribute mainly to the model and which contribute mainly to the residual.
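For concreteness, these sums of squares are simple to compute from a residual matrix E (objects in rows, variables in columns); a minimal NumPy sketch of our own:

import numpy as np

def residual_sums_of_squares(E):
    """Variance-like statistics from a matrix of residuals."""
    ss_total = np.sum(E ** 2)          # total sum of squares over the matrix
    ss_rows = np.sum(E ** 2, axis=1)   # per object: outlier screening
    ss_cols = np.sum(E ** 2, axis=0)   # per variable: importance for a component
    return ss_total, ss_rows, ss_cols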
4.2 NIPALS and SIMPLS Algorithms
4.2.1 NIPALS
In this section, the transpose of a matrix is represented by the superscript prime (').
4.2.1.1 Theory
The PLS algorithm as described in this section will be called the "standard" PLS algorithm. It has been presented in detail elsewhere [3-6]; for some alternative implementations of PLS see, e.g., references [7-9]. The first step in standard PLS is to center the data matrices X and Y, giving X0 and Y0, respectively. Then a set of A orthogonal X-block factor scores T = [t1, t2, …, tA] and companion Y-block factor scores U = [u1, u2, …, uA] are calculated factor by factor. The first PLS factors t1 and u1 are weighted sums of the centered variables: t1 = X0w1 and u1 = Y0q1, respectively. Usually the weights are determined via the NIPALS algorithm. This is the iterative sequence:

w1 ∝ X0'u1
t1 = X0w1
q1 ∝ Y0't1
u1 = Y0q1

where ∝ indicates that the left-hand side is normalized to unit length, and the sequence is repeated until convergence.
Once the first X-block factor t1 is obtained, one proceeds with deflating the data matrices. This yields new data sets X1 and Y1, which are the matrices of residuals obtained after regressing all variables on t1:

X1 = X0 ? t1(t1't1)^{-1}t1'X0
Y1 = Y0 ? t1(t1't1)^{-1}t1'Y0
4.2.1.2 NIPALS-PLS Factors in Terms of Original Variables
Each of the weight vectors wa, a = 2, 3, …, A, used for defining the associated factor scores applies to a different matrix of residuals X_{a-1}, and not to the original centered data X0. This obscures the interpretation of the factors, mainly because one loses sight of what is in the depleted matrices Xa as one goes to higher dimensions, a ≥ 1. Some X variables are used in the first factors, others only much later. The relation between factors and variables is better displayed by the loadings pa (a = 1, 2, …, A). Indeed, the weight vectors, collected in the p×A matrix W, have found less use in interpreting PLS regression models than the loading vectors.
It is therefore advantageous to re-express the NIPALS-PLS factors ta in terms of the original centered data X0, say

ta = X0ra        (4.2.13)

or, collecting the alternative weight vectors in a p×A matrix R = [r1, r2, …, rA],

T = X0R        (4.2.14)
The factor scores T computed via NIPALS-PLS, i.e. via depleted X matrices, can be expressed exactly as linear combinations of the centered X variables, since all deflated matrices Xa and factor scores ta, a = 1, 2, …, A, lie in the column space of X0. Thus R can be computed from the regression of T on X0:

R = (X0'X0)^?X0'T = X0^+T        (4.2.15)

where the superscript ? indicates any generalized inverse and + indicates the unique Moore-Penrose pseudo-inverse. With P = [p1, p2, …, pA] the (p×A) matrix of factor loadings, we also have the relation

P'R = IA

since rb'pa = rb'X0'ta/(ta'ta) = tb'ta/(ta'ta) = δab. Here IA is the (A×A) identity matrix and δab is Kronecker's delta. Thus R is a generalized inverse of P'. Another expression for R is

R = W(P'W)^{-1}        (4.2.17)

which follows from the observation that R and W share the same column space and that P'R should be equal to the identity matrix.
The explicit computation of the (pseudo-)inverse matrices in equations (4.2.15) and (4.2.17) detracts somewhat from the PLS-NIPALS algorithm, which is otherwise very straightforward. H?skuldsson gives the following recurrent relation:

ra = wa ? r_{a-1}(p_{a-1}'wa)

starting with r1 = w1. However, this relation depends on the tridiagonal structure of P'P and is only correct for univariate Y = y (m = 1, PLS1). Equations (4.2.19) and (4.2.20) form a set of updating formulas that is generally applicable:

ra = Ga wa        (4.2.19)
G_{a+1} = Ga ? ra pa'        (4.2.20)
starting with G1 = Ip. Note that the vectors ra are not normalized, in contrast to the weight vectors wa. Thus, in equation (4.2.13), neither ta nor ra are normalized.
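The updating formulas lend themselves to a direct implementation. A minimal sketch (ours), assuming the weight matrix W and loading matrix P have already been computed column by column:

import numpy as np

def weights_r_from_w(W, P):
    """R column by column: r_a = G_a w_a, then G_{a+1} = G_a - r_a p_a'."""
    p, A = W.shape
    G = np.eye(p)                           # G_1 = I_p
    R = np.zeros((p, A))
    for a in range(A):
        R[:, a] = np.dot(G, W[:, a])        # equation (4.2.19)
        G = G - np.outer(R[:, a], P[:, a])  # equation (4.2.20)
    return R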
When the R weights are available, a closed-form multiple regression-type prediction model can be obtained more readily:

? = X0 B_PLS

Here, B_PLS = R diag(b) Q' = W(P'W)^{-1} diag(b) Q' is the p×m set of biased multivariate regression coefficients obtained via PLS regression.
4.2.2 SIMPLS

The modification we propose leads to the direct computation of the weights R. In this way we avoid the construction of the deflated data matrices X1, X2, …, XA and Y1, Y2, …, YA, and we bypass the calculation of the weights W. The explicit computation of matrix inverses as in equation (4.2.15) or (4.2.17) is also circumvented. The newly defined R is similar, but not identical, to the "standard" R introduced in equation (4.2.14). In fact, our new R contains normalized weight vectors, just as W in standard PLS.
Thus, the task we face is to compute weight vectors ra and qa (a = 1, 2, …, A) which can be applied directly to the centered data:

ta = X0 ra,  ua = Y0 qa

The weights should be determined so as to maximize the covariance of the score vectors ta and ua under some constraints. (The term covariance will be used somewhat loosely and interchangeably with the terms cross-product or inner product; they merely differ by a scalar factor n?1.) Specifically, four conditions control the solution:

(1) maximization of covariance: ua'ta = qa'(Y0'X0)ra = max!
(2) normalization of weights ra: ra'ra = 1
(3) normalization of weights qa: qa'qa = 1
(4) orthogonality of t scores: tb'ta = 0 for a > b
4.2.2.2 SIMPLS Algorithm
It is expedient to compute S_{a+1} from its predecessor Sa. To achieve this, the projection onto the column space of Pa will be carried out as a sequence of orthogonal projections. For this we need an orthonormal basis of Pa, say Va = [v1, v2, …, va]. Va may be obtained from a Gram-Schmidt orthonormalization of Pa, i.e.,

va ∝ pa ? V_{a-1}(V_{a-1}'pa)        (4.2.32)
starting with V1 = v1 ∝ p1. An additional simplification is possible when the response is univariate (m = 1, PLS1). In this case one may employ the orthogonality properties of P, viz., pb'pa = 0 for b ≤ a?2. These properties carry over to the orthonormalized loadings V, i.e., pa'vb = 0 for b ≤ a?2. Thus, orthogonality of pa with respect to V_{a-2} is automatically taken care of, and equation (4.2.32) simplifies to

va ∝ pa ? v_{a-1}(v_{a-1}'pa)
The projection onto the subspace spanned by the first a loading vectors, Pa(Pa'Pa)^{-1}Pa', can now be replaced by VaVa', and the projection onto the orthogonal complement of Pa by Ip ? VaVa' = ∏_{b=1}^{a}(Ip ? vb vb'). Thus, utilizing the orthonormality of V, the product matrices Sa (a = 1, 2, …) are steadily depleted by projecting out the perpendicular directions va:

S_{a+1} = Sa ? va(va'Sa)
4.2.2.3 Fitting, Prediction and Residual Analysis
For the development of the theory and algorithm of SIMPLS it was convenient to choose normalized weight vectors ra. This choice, however, is in no way essential. We will now switch to a normalization of the scores ta instead, since this considerably simplifies some of the ensuing formulas. The code given in the Appendix already uses the latter normalization scheme. Thus we redefine ra = ra/|X0ra| and ta = ta/|ta|, giving unit-length score vectors ta and orthonormal T: T'T = IA.
Predicted values of the calibration samples are now obtained as

?0 = TT'Y0 = X0R(T'Y0) = X0B_PLS

giving

B_PLS = R(T'Y0)

For new objects we employ the straightforward prediction formula

?0*' = x0*'B_PLS
The factor scores ta* = x0*'ra and the leverage h* = Σ_a (ta*)^2 may be computed for diagnostic purposes, e.g., to assess whether or not the new object lies within the region covered by the training objects.
4.2.2.4 Detailed SIMPLS Algorithm
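The flow described above can be condensed into a short Python sketch (our own rendering, not de Jong's original listing; it assumes centered blocks X0, Y0, takes S1 = X0'Y0 as the initial cross-product matrix, obtains each ra as the dominant left singular vector of Sa, and uses the unit-length score normalization of Section 4.2.2.3):

import numpy as np

def simpls(X0, Y0, A):
    """SIMPLS sketch: returns weights R, scores T, loadings P, Q and B_PLS."""
    n, p = X0.shape
    m = Y0.shape[1]
    R, T = np.zeros((p, A)), np.zeros((n, A))
    P, Q, V = np.zeros((p, A)), np.zeros((m, A)), np.zeros((p, A))
    S = np.dot(X0.T, Y0)                     # S_1 = X0'Y0
    for a in range(A):
        U_, s_, Vt_ = np.linalg.svd(S, full_matrices=False)
        r = U_[:, 0]                         # dominant left singular vector
        t = np.dot(X0, r)
        norm_t = np.linalg.norm(t)
        t, r = t / norm_t, r / norm_t        # unit-length scores
        pa = np.dot(X0.T, t)                 # X loadings
        qa = np.dot(Y0.T, t)                 # Y loadings
        v = pa.copy()
        if a > 0:                            # Gram-Schmidt against previous v's
            v = v - np.dot(V[:, :a], np.dot(V[:, :a].T, pa))
        v = v / np.linalg.norm(v)
        S = S - np.outer(v, np.dot(v, S))    # S_{a+1} = S_a - v_a(v_a'S_a)
        R[:, a], T[:, a], P[:, a], Q[:, a], V[:, a] = r, t, pa, qa, v
    B = np.dot(R, Q.T)                       # B_PLS = R(T'Y0), since Q' = T'Y0
    return R, T, P, Q, B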
4.3 Programming Method of Standard Partial Least Squares
4.3.1 Cross-validation
Learning the parameters of a prediction function and testing it on the same data is a methodological mistake: a model that would just repeat the labels of the samples that it has just seen would have a perfect score but would fail to predict anything useful on yet-unseen data. This situation is called overfitting. To avoid it, it is common practice when performing a (supervised) machine learning experiment to hold out part of the available data as a test set X_test, Y_test. Note that the word "experiment" is not intended to denote academic use only, because even in commercial settings machine learning usually starts out experimentally.
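Holding out a test set is a one-liner with scikit-learn; a quick sketch with illustrative data:

import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.arange(20).reshape((10, 2)), np.arange(10)
# keep 40% of the samples aside as the test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0)
print(X_train.shape, X_test.shape)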
When evaluating different settings ("hyperparameters") for estimators, there is still a risk of overfitting on the test set because the parameters can be tweaked until the estimator performs optimally. This way, knowledge about the test set can "leak" into the model, and evaluation metrics no longer report on generalization performance. To solve this problem, yet another part of the dataset can be held out as a so-called "validation set": training proceeds on the training set, after which evaluation is done on the validation set, and when the experiment seems to be successful, final evaluation can be done on the test set.
However, by partitioning the available data into three sets, we drastically reduce the number of samples which can be used for learning the model, and the results can depend on a particular random choice for the pair of (train, validation) sets.
A solution to this problem is a procedure called cross-validation (CV for short). A test set should still be held out for final evaluation, but the validation set is no longer needed when doing CV.
4.3.1.1 Cross-validation Iterators for i.i.d. Data
The following cross-validators can be used in such cases.
Note: while i.i.d. data is a common assumption in machine learning theory, it rarely holds in practice. If one knows that the samples have been generated using a time-dependent process, it is safer to use a time-series-aware cross-validation scheme. Similarly, if we know that the generative process has a group structure (samples collected from different subjects, experiments, measurement devices), it is safer to use group-wise cross-validation.
1. K-Fold
In the basic approach, called K-Fold CV, the training set is split into k smaller sets (other approaches are described below, but generally follow the same principles). The following procedure is followed for each of the k "folds":

(1) a model is trained using k-1 of the folds as training data;
(2) the resulting model is validated on the remaining part of the data.
Example of 2-fold cross-validation on a dataset with 4 samples:
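A sketch using scikit-learn's KFold (each line prints the train and test index arrays of one fold):

from sklearn.model_selection import KFold

X = ["a", "b", "c", "d"]
kf = KFold(n_splits=2)
for train, test in kf.split(X):
    print("%s %s" % (train, test))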
Here is a visualization of the cross-validation behavior in Fig. 4.4. Note that K-Fold is not affected by classes or groups.
2. Repeated K-Fold
Repeated K-Fold repeats K-Fold n times. It can be used when one requires to run K-Fold n times, producing different splits in each repetition.
Example of 2-fold K-Fold repeated 2 times:
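A sketch using scikit-learn's RepeatedKFold (the random_state value is an arbitrary seed):

import numpy as np
from sklearn.model_selection import RepeatedKFold

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
rkf = RepeatedKFold(n_splits=2, n_repeats=2, random_state=12883823)
for train, test in rkf.split(X):
    print("%s %s" % (train, test))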
3. Leave One Out (LOO)
Leave One Out (or LOO) is a simple cross-validation. Each learning set is created by taking all the samples except one, the test set being the sample left out. Thus, for n samples, we have n different training sets and n different test sets. This cross-validation procedure does not waste much data, as only one sample is removed from the training set:
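A sketch using scikit-learn's LeaveOneOut:

from sklearn.model_selection import LeaveOneOut

X = [1, 2, 3, 4]
loo = LeaveOneOut()
for train, test in loo.split(X):
    print("%s %s" % (train, test))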
4. Leave P Out (LPO)
Leave P Out is very similar to Leave One Out as it creates all the possible training/test sets by removing p samples from the complete set. For n samples, this produces C(n, p) train/test pairs. Unlike Leave One Out and K-Fold, the test sets will overlap for p > 1.
Example of Leave-2-Out on a dataset with 4 samples:
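A sketch using scikit-learn's LeavePOut:

import numpy as np
from sklearn.model_selection import LeavePOut

X = np.ones(4)
lpo = LeavePOut(p=2)
for train, test in lpo.split(X):
    print("%s %s" % (train, test))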
5. Random Permutations Cross-validation a.k.a. Shuffle & Split
The ShuffleSplit iterator will generate a user-defined number of independent train/test dataset splits. Samples are first shuffled and then split into a pair of train and test sets.
It is possible to control the randomness for reproducibility of the results by explicitly seeding the random_state pseudo-random number generator.
Here is a usage example:
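A sketch using scikit-learn's ShuffleSplit (the split sizes shown are illustrative):

import numpy as np
from sklearn.model_selection import ShuffleSplit

X = np.arange(10)
ss = ShuffleSplit(n_splits=5, test_size=0.25, random_state=0)
for train_index, test_index in ss.split(X):
    print("%s %s" % (train_index, test_index))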
Here is a visualization of the cross-validation behavior in Fig. 4.5. Note that ShuffleSplit is not affected by classes or groups.
4.3.1.2 Cross-validation Iterators with Stratification Based on Class Labels
Some classification problems can exhibit a large imbalance in the distribution of the target classes: for instance there could be several times more negative samples than positive samples. In such cases it is recommended to use stratified sampling as implemented in StratifiedKFold and StratifiedShuffleSplit to ensure that relative class frequencies are approximately preserved in each train and validation fold.
1. Stratified K-Fold
Stratified K-Fold is a variation of K-Fold which returns stratified folds: each fold contains approximately the same percentage of samples of each target class as the complete set.
Example of stratified 3-fold cross-validation on a dataset with 10 samples from two slightly unbalanced classes:
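A sketch using scikit-learn's StratifiedKFold:

import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.ones(10)
y = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
skf = StratifiedKFold(n_splits=3)
for train, test in skf.split(X, y):
    print("%s %s" % (train, test))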
Here is a visualization of the cross-validation behavior in Fig. 4.6.
2. Stratified Shuffle Split
StratifiedShuffleSplit is a variation of ShuffleSplit which returns stratified splits, i.e., it creates splits by preserving the same percentage for each target class as in the complete set.
Here is a visualization of the cross-validation behavior in Fig. 4.7 (StratifiedShuffleSplit).
4.3.1.3 Cross-validation Iterators for Grouped Data
The i.i.d. assumption is broken if the underlying generative process yields groups of dependent samples. Such a grouping of data is domain-specific. An example would be medical data collected from multiple patients, with multiple samples taken from each patient. Such data is likely to be dependent on the individual group. In our example, the patient id for each sample will be its group identifier.
1. Group K-Fold
Group K-Fold is a variation of K-Fold which ensures that the same group is not represented in both the testing and training sets. For example, if the data is obtained from different subjects with several samples per subject, and if the model is flexible enough to learn from highly person-specific features, it could fail to generalize to new subjects. Group K-Fold makes it possible to detect this kind of overfitting situation.
Imagine you have three subjects, each with an associated number from 1 to 3:
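A sketch using scikit-learn's GroupKFold (X, y and the group labels are illustrative):

from sklearn.model_selection import GroupKFold

X = [0.1, 0.2, 2.2, 2.4, 2.3, 4.55, 5.8, 8.8, 9, 10]
y = ["a", "b", "b", "b", "c", "c", "c", "d", "d", "d"]
groups = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]
gkf = GroupKFold(n_splits=3)
for train, test in gkf.split(X, y, groups=groups):
    print("%s %s" % (train, test))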
Each subject is in a different testing fold, and the same subject is never in both testing and training. Notice that the folds do not have exactly the same size due to the imbalance in the data.
Here is a visualization of the cross-validation behavior in Fig. 4.8 (GroupKFold).
2. Leave One Group Out
LeaveOneGroupOut is a cross-validation scheme which holds out the samples according to a third-party provided array of integer groups. This group information can be used to encode arbitrary domain-specific pre-defined cross-validation folds.
Each training set is thus constituted by all the samples except the ones related to a specific group.
For example, in the case of multiple experiments, LeaveOneGroupOut can be used to create a cross-validation based on the different experiments: we create a training set using the samples of all the experiments except one:
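A sketch using scikit-learn's LeaveOneGroupOut (group labels stand in for experiment ids):

from sklearn.model_selection import LeaveOneGroupOut

X = [1, 5, 10, 50, 60, 70, 80]
y = [0, 1, 1, 2, 2, 2, 2]
groups = [1, 1, 2, 2, 3, 3, 3]
logo = LeaveOneGroupOut()
for train, test in logo.split(X, y, groups=groups):
    print("%s %s" % (train, test))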
3. Leave P Groups Out
LeavePGroupsOut is similar to LeaveOneGroupOut, but removes samples related to P groups for each training/test set.
4. Group Shuffle Split
The GroupShuffleSplit iterator behaves as a combination of ShuffleSplit and LeavePGroupsOut, and generates a sequence of randomized partitions in which a subset of groups are held out for each split.
Here is a usage example:
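A sketch using scikit-learn's GroupShuffleSplit (data and group labels are illustrative):

import numpy as np
from sklearn.model_selection import GroupShuffleSplit

X = [0.1, 0.2, 2.2, 2.4, 2.3, 4.55, 5.8, 0.001]
y = ["a", "b", "b", "b", "c", "c", "c", "a"]
groups = [1, 1, 2, 2, 3, 3, 4, 4]
gss = GroupShuffleSplit(n_splits=4, test_size=0.5, random_state=0)
for train, test in gss.split(X, y, groups=groups):
    print("%s %s" % (train, test))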
Here is a visualization of the cross-validation behavior in Fig. 4.9 (GroupShuffleSplit).
4.3.2 Procedure of NIPALS
4.3.2.1 Inner Loop of the Iterative NIPALS Algorithm
This provides an alternative to svd(X'Y): it returns the first left and right singular vectors of X'Y. See PLS for the meaning of the parameters. It is similar to the power method for determining the eigenvectors and eigenvalues of X'Y.
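A sketch of such an inner loop (our own rendering, in the spirit of the private helper used by scikit-learn's PLS classes; names and defaults are ours):

import numpy as np

def nipals_inner_loop(X, Y, max_iter=500, tol=1e-06):
    """Power-method-like alternation returning the first left and right
    singular vectors of X'Y (an alternative to svd(X'Y))."""
    y_score = Y[:, [0]]
    x_weights_old = 0.0
    for _ in range(max_iter):
        x_weights = np.dot(X.T, y_score) / np.dot(y_score.T, y_score)
        x_weights = x_weights / np.linalg.norm(x_weights)
        x_score = np.dot(X, x_weights)
        y_weights = np.dot(Y.T, x_score) / np.dot(x_score.T, x_score)
        y_score = np.dot(Y, y_weights) / np.dot(y_weights.T, y_weights)
        if np.sum((x_weights - x_weights_old) ** 2) < tol:
            break
        x_weights_old = x_weights
    return x_weights, y_weights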
4.3.2.2 Center X and Y
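Centering is a one-liner per block; a minimal sketch (ours):

import numpy as np

def center_xy(X, Y):
    """Subtract column means, giving the zero-mean blocks X0 and Y0."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    return X - x_mean, Y - y_mean, x_mean, y_mean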
4.3.2.3 NIPALS
This class implements the generic PLS algorithm; its constructor parameters allow one to obtain specific implementations.
This implementation uses the PLS Wold two-blocks algorithm based on two nested loops, as sketched after this list:
(i) The outer loop iterates over components.
(ii) The inner loop estimates the weight vectors. This can be done with two algorithms: (a) the inner loop of the original NIPALS algorithm, or (b) an SVD on the residual cross-covariance matrices.
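Schematically (a sketch of our own, not the library code), with the inner weights estimated via option (b), an SVD of the residual cross-covariance matrix:

import numpy as np

def pls_wold_two_blocks(X, Y, n_components):
    """Outer loop over components; inner estimation of the weight vectors,
    followed by rank-one deflation of both residual matrices."""
    Xk, Yk = X.copy(), Y.copy()
    Ws, Ts = [], []
    for _ in range(n_components):
        U_, s_, Vt_ = np.linalg.svd(np.dot(Xk.T, Yk), full_matrices=False)
        w = U_[:, :1]                           # inner loop, option (b)
        t = np.dot(Xk, w)
        p = np.dot(Xk.T, t) / np.dot(t.T, t)    # X loadings
        q = np.dot(Yk.T, t) / np.dot(t.T, t)    # regression of Y on t
        Xk = Xk - np.dot(t, p.T)                # deflate X
        Yk = Yk - np.dot(t, q.T)                # deflate Y
        Ws.append(w)
        Ts.append(t)
    return np.hstack(Ws), np.hstack(Ts)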
4.4 Example Application
4.4.1 Demo of PLS
The software environment is Python 2.7 on a Microsoft Windows 7 operating system. Cross-validation and train/test splitting are performed using the sklearn package. Dataset loading is done using the scipy package, and the other programs can be implemented by individuals.
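An end-to-end sketch of such a demo (the file name "corn.mat" and the variable keys m5spec/moisture are hypothetical placeholders; adapt them to the actual corn dataset layout):

import numpy as np
import scipy.io
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import KFold, train_test_split
from sklearn.metrics import mean_squared_error

data = scipy.io.loadmat("corn.mat")              # hypothetical file and keys
X, y = data["m5spec"], data["moisture"].ravel()

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# 10-fold CV over 1..15 latent variables, as in Section 4.4.2
best_lv, best_rmsecv = None, np.inf
for lv in range(1, 16):
    press = 0.0
    for tr, va in KFold(n_splits=10).split(X_train):
        pls = PLSRegression(n_components=lv, scale=False)  # mean-centering only
        pls.fit(X_train[tr], y_train[tr])
        press += np.sum((y_train[va] - pls.predict(X_train[va]).ravel()) ** 2)
    rmsecv = np.sqrt(press / len(y_train))
    if rmsecv < best_rmsecv:
        best_lv, best_rmsecv = lv, rmsecv

pls = PLSRegression(n_components=best_lv, scale=False).fit(X_train, y_train)
rmsep = np.sqrt(mean_squared_error(y_test, pls.predict(X_test).ravel()))
print(best_lv, best_rmsecv, rmsep)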
4.4.2 Corn Dataset
In this section the corn dataset is used for experiments. The number of latent variables of PLS is allowed to take values in the range [1, 15], and it is determined by 10-fold cross-validation. No pre-processing methods were used other than mean-centering. Table 4.1 shows the training error, cross-validation error, prediction error, and number of principal components of the PLS model for moisture, oil, protein, and starch content, directly using the corn dataset.
In this chapter, the number of principal components of the PLS algorithm is selected by the 10-fold cross-validation method. The RMSECV curves of the PLS models are given in Fig. 4.10 to Fig. 4.12 (Fig. 4.10: selection of the optimal number of latent variables for the PLS model on the m5spec instrument; Fig. 4.11: the same for the mp5spec instrument; Fig. 4.12: the same for the mp6spec instrument).