




Document summary
Chapter 6: Classification Mining (81 pages)

Typical applications of classification:
- credit approval
- target marketing
- medical diagnosis
- treatment effectiveness analysis

Training Data

NAME    RANK             YEARS   TENURED
Mike    Assistant Prof   3       no
Mary    Assistant Prof   7       yes
Bill    Professor        2       yes
Jim     Associate Prof   7       yes
Dave    Assistant Prof   6       no
Anne    Associate Prof   3       no

The classification algorithm turns the training data into a Classifier (Model):

IF rank = 'professor' OR years > 6 THEN tenured = 'yes'

Testing Data

NAME      RANK             YEARS   TENURED
Tom       Assistant Prof   2       no
Merlisa   Associate Prof   7       no
George    Professor        5       yes
Joseph    Assistant Prof   7       yes

Unseen Data: (Jeff, Professor, 4). Tenured?
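The model-construction and model-usage steps can be sketched in a few lines of Python. This is only an illustration: the function names `predict_tenured` and `accuracy` are made up here, and the rule is hard-coded from the slide rather than learned from the training data.

```python
# Rows are (name, rank, years, tenured), copied from the slides.
TRAIN = [
    ("Mike", "Assistant Prof", 3, "no"),
    ("Mary", "Assistant Prof", 7, "yes"),
    ("Bill", "Professor", 2, "yes"),
    ("Jim", "Associate Prof", 7, "yes"),
    ("Dave", "Assistant Prof", 6, "no"),
    ("Anne", "Associate Prof", 3, "no"),
]
TEST = [
    ("Tom", "Assistant Prof", 2, "no"),
    ("Merlisa", "Associate Prof", 7, "no"),
    ("George", "Professor", 5, "yes"),
    ("Joseph", "Assistant Prof", 7, "yes"),
]


def predict_tenured(rank, years):
    """Rule from the slide: IF rank = professor OR years > 6 THEN tenured = yes."""
    return "yes" if rank == "Professor" or years > 6 else "no"


def accuracy(rows):
    """Fraction of rows whose known label matches the rule's prediction."""
    hits = sum(predict_tenured(rank, years) == label for _, rank, years, label in rows)
    return hits / len(rows)


print("training accuracy:", accuracy(TRAIN))
print("test accuracy:", accuracy(TEST))
print("Jeff (Professor, 4) tenured?", predict_tenured("Professor", 4))
```

The rule fits every training row but misclassifies Merlisa in the testing data, which is why accuracy is estimated on rows that were not used to build the model.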
In unsupervised learning (clustering):
- The class labels of the training data are unknown.
- Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data.

Outline:
- Classification by backpropagation
- Other Classification Methods
- Prediction
- Classification accuracy
- Summary
- Generalize and/or normalize data
- Goodness of rules
  - decision tree size
  - compactness of classification rules

Training data set (this follows an example from Quinlan's ID3 algorithm); rows recoverable from the preview:

age      income   student   credit_rating   buys_computer
<=30     high     no        fair            no
>40      medium   no        fair            yes
>40      low      yes       fair            yes
>40      low      yes       excellent       no
31..40   low      yes       excellent       yes
<=30     medium   no        fair            no
>40      medium   yes       fair            yes
>40      medium   no        excellent       no

Decision tree for buys_computer:

age?
- <=30: student?
  - no: no
  - yes: yes
- 31..40: yes
- >40: credit rating?
  - excellent: no
  - fair: yes

Decision tree generation consists of two phases:
- Tree construction
  - At the start, all the training examples are at the root.
  - Partition the examples recursively based on selected attributes.
- Tree pruning
  - Identify and remove branches that reflect noise or outliers.

Use of a decision tree: classifying an unknown sample
- Test the attribute values of the sample against the decision tree.

Conditions for stopping partitioning:
- All samples for a given node belong to the same class.
- There are no remaining attributes for further partitioning; majority voting is employed for classifying the leaf.
- There are no samples left.

Let S contain p examples of class P and n examples of class N. The amount of information needed to decide if an arbitrary example in S belongs to P or N is defined as

$$I(p, n) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}$$

If attribute A partitions S into subsets S_1, ..., S_v, where S_i contains p_i examples of P and n_i examples of N, the expected information after the split is

$$E(A) = \sum_{i=1}^{v} \frac{p_i + n_i}{p + n}\, I(p_i, n_i)$$

The encoding information that would be gained by branching on A is

$$Gain(A) = I(p, n) - E(A)$$

For the training set (p = 9, n = 5), the attribute age gives:

age      p_i   n_i   I(p_i, n_i)
<=30     2     3     0.971
31..40   4     0     0
>40      3     2     0.971

$$E(age) = \frac{5}{14} I(2,3) + \frac{4}{14} I(4,0) + \frac{5}{14} I(3,2) = 0.69$$

$$Gain(age) = I(p, n) - E(age) = 0.94 - 0.69 = 0.25$$

Similarly:
- Gain(income) = 0.029
- Gain(student) = 0.151
- Gain(credit_rating) = 0.048
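The information-gain arithmetic for the age attribute can be checked with a short Python sketch. The function names `info`, `expected_info` and `gain` are illustrative choices, and the per-branch class counts (2,3), (4,0), (3,2) are taken from the E(age) expression on the slide.

```python
from math import log2


def info(p, n):
    """I(p, n): expected bits needed to decide whether an example is in P or N."""
    total = p + n
    # A zero count contributes nothing (the limit of x * log2(x) as x -> 0 is 0).
    return -sum(c / total * log2(c / total) for c in (p, n) if c > 0)


def expected_info(partitions):
    """E(A): weighted average of I(p_i, n_i) over the subsets produced by A."""
    total = sum(p + n for p, n in partitions)
    return sum((p + n) / total * info(p, n) for p, n in partitions)


def gain(partitions):
    """Gain(A) = I(p, n) - E(A), with p and n summed over the subsets."""
    p = sum(pi for pi, _ in partitions)
    n = sum(ni for _, ni in partitions)
    return info(p, n) - expected_info(partitions)


# (p_i, n_i) for age <=30, 31..40 and >40.
age = [(2, 3), (4, 0), (3, 2)]
print(round(info(9, 5), 3), round(expected_info(age), 3), round(gain(age), 3))
```

Without intermediate rounding the gain comes out near 0.247; the slide's 0.25 results from subtracting the rounded values 0.94 and 0.69.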
- Prepruning: do not split a node if this would result in the goodness measure falling below a threshold.
  - It is difficult to choose an appropriate threshold.
- Postpruning: remove branches from a "fully grown" tree to get a sequence of progressively pruned trees.
  - Use a set of data different from the training data to decide which is the "best pruned tree".

Decision tree induction:
- is convertible to simple and easy-to-understand classification rules
- can use SQL queries for accessing databases
- has classification accuracy comparable with other methods

RainForest (VLDB'98; Gehrke, Ramakrishnan & Ganti):
- separates the scalability aspects from the criteria that determine the quality of the tree
- builds an AVC-list (attribute, value, class label)

Bayesian classification:
- [...] probabilities
- Standard: even when Bayesian methods are computationally intractable, they can provide a standard of optimal decision making against which other methods can be measured.
- E.g. P(class=N | outlook=sunny, windy=true, ...)
- E.g. an object is round and red. Is it an apple?

The maximum a posteriori (MAP) class is

$$C_{MAP} = \arg\max_{C} P(C \mid X) = \arg\max_{C} P(X \mid C)\, P(C)$$

because, by Bayes' theorem,

$$P(C \mid X) = \frac{P(X \mid C)\, P(C)}{P(X)}$$

and P(X) is the same for every class.
- Choose the class C such that P(X|C) P(C) is maximum.
- Problem: computing P(X|C) directly is infeasible!

Given Z, X is independent of Y if P(X|Y,Z) = P(X|Z). In that case:

$$P(X, Y \mid Z) = \frac{P(X, Y, Z)}{P(Z)} = \frac{P(X, Y, Z)}{P(Y, Z)} \cdot \frac{P(Y, Z)}{P(Z)} = P(X \mid Y, Z)\, P(Y \mid Z) = P(X \mid Z)\, P(Y \mid Z)$$

- [...] density function
- Computationally easy in both cases

Outlook    Temperature   Humidity   Windy   Class
sunny      hot           high       false   N
sunny      hot           high       true    N
overcast   hot           high       false   P
rain       mild          high       false   P
rain       cool          normal     false   P
rain       cool          normal     true    N
overcast   cool          normal     true    P
sunny      mild          high       false   N
sunny      cool          normal     false   P
rain       mild          normal     false   P
sunny      mild          normal     true    P
overcast   mild          high       true    P
overcast   hot           normal     false   P
rain       mild          high       true    N

Conditional probabilities estimated from the table:

outlook       P(.|p)   P(.|n)
sunny         2/9      3/5
overcast      4/9      0
rain          3/9      2/5

temperature   P(.|p)   P(.|n)
hot           2/9      2/5
mild          4/9      2/5
cool          3/9      1/5

humidity      P(.|p)   P(.|n)
high          3/9      4/5
normal        6/9      1/5

windy         P(.|p)   P(.|n)
true          3/9      3/5
false         6/9      2/5

Class priors: P(p) = 9/14, P(n) = 5/14
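As a rough illustration of how these estimates are used, the Python sketch below counts relative frequencies in the 14-row weather table and scores a sample with P(X|C) P(C) under the naive independence assumption. The sample `X` is an assumption made for this example (the preview does not preserve the sample used on the slides), the helper names are invented, and no smoothing is applied, so an unseen value gives a probability of 0.

```python
from collections import Counter

# (outlook, temperature, humidity, windy, class), copied from the weather table.
ROWS = [
    ("sunny", "hot", "high", "false", "N"),
    ("sunny", "hot", "high", "true", "N"),
    ("overcast", "hot", "high", "false", "P"),
    ("rain", "mild", "high", "false", "P"),
    ("rain", "cool", "normal", "false", "P"),
    ("rain", "cool", "normal", "true", "N"),
    ("overcast", "cool", "normal", "true", "P"),
    ("sunny", "mild", "high", "false", "N"),
    ("sunny", "cool", "normal", "false", "P"),
    ("rain", "mild", "normal", "false", "P"),
    ("sunny", "mild", "normal", "true", "P"),
    ("overcast", "mild", "high", "true", "P"),
    ("overcast", "hot", "normal", "false", "P"),
    ("rain", "mild", "high", "true", "N"),
]


def train(rows):
    """Count class frequencies and (attribute index, value, class) frequencies."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = Counter()
    for *values, label in rows:
        for index, value in enumerate(values):
            value_counts[(index, value, label)] += 1
    return class_counts, value_counts


def score(sample, label, class_counts, value_counts):
    """Unnormalized posterior P(X|C) P(C), assuming independent attributes."""
    result = class_counts[label] / sum(class_counts.values())
    for index, value in enumerate(sample):
        result *= value_counts[(index, value, label)] / class_counts[label]
    return result


def classify(sample, class_counts, value_counts):
    """Return the class with the largest P(X|C) P(C)."""
    return max(class_counts, key=lambda c: score(sample, c, class_counts, value_counts))


class_counts, value_counts = train(ROWS)
X = ("rain", "hot", "high", "false")  # hypothetical unseen sample
for label in ("P", "N"):
    print(label, round(score(X, label, class_counts, value_counts), 6))
print("predicted class:", classify(X, class_counts, value_counts))
```

For this assumed sample the score for N is larger than the score for P, so the sketch predicts N (don't play).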
- Sample X is classified in class n (don't play).

- Bayesian networks, which combine Bayesian reasoning with causal relationships between attributes
- Decision trees, which reason on one attribute at a time, considering the most important attributes first

Bayesian Belief Networks

[Figure: network over FamilyHistory, Smoker, LungCancer, Emphysema, PositiveXRay, Dyspnea]

The conditional probability table for the variable LungCancer (LC), given FamilyHistory (FH) and Smoker (S):

        (FH, S)   (FH, ~S)   (~FH, S)   (~FH, ~S)
LC      0.8       0.5        0.7        0.1
~LC     0.2       0.5        0.3        0.9

Learning Bayesian networks:
- Given both network structure and all the variables: easy
- Given network structure but only some variables
- When the network structure is not known in advance

[Figure: network over Exercise (E), Food (D), Heart Disease (HD), Blood Pressure (BP), Angina (Hb), Chest Ache (CP)]

P(E = yes) = 0.7
P(D = healthy) = 0.25

P(HD = yes | E, D):
E = yes, D = healthy       0.25
E = yes, D = not healthy   0.45
E = no,  D = healthy       0.55
E = no,  D = not healthy   0.75

P(CP = yes | HD, Hb):
HD = yes, Hb = yes   0.8
HD = yes, Hb = no    0.6
HD = no,  Hb = yes   0.4
HD = no,  Hb = no    0.1

P(Hb = yes | D):
D = healthy       0.2
D = not healthy   0.85

P(BP = high | HD):
HD = yes   0.85
HD = no    0.2

Rules used: P(X,Y) = P(X|Y) P(Y) = P(Y|X) P(X), and P(X) = P(X,Y) + P(X,~Y).

No prior information:

$$P(HD{=}yes) = \sum_{\alpha \in \{yes,\,no\}} \; \sum_{\beta \in \{healthy,\,not\ healthy\}} P(HD{=}yes \mid E{=}\alpha, D{=}\beta)\, P(E{=}\alpha, D{=}\beta)$$

$$= 0.25 \times 0.7 \times 0.25 + 0.45 \times 0.7 \times 0.75 + 0.55 \times 0.3 \times 0.25 + 0.75 \times 0.3 \times 0.75 = 0.49$$

$$P(HD{=}no) = 1 - P(HD{=}yes) = 0.51$$

So HD = no is more probable.

Given BP = high:

$$P(BP{=}high) = \sum_{\gamma \in \{yes,\,no\}} P(BP{=}high \mid HD{=}\gamma)\, P(HD{=}\gamma) = 0.85 \times 0.49 + 0.2 \times 0.51 = 0.5185$$

$$P(HD{=}yes \mid BP{=}high) = \frac{P(BP{=}high \mid HD{=}yes)\, P(HD{=}yes)}{P(BP{=}high)} = \frac{0.85 \times 0.49}{0.5185} = 0.8033$$

$$P(HD{=}no \mid BP{=}high) = 1 - 0.8033 = 0.1967$$

So HD = yes is more probable.

Given BP = high, D = healthy, E = yes:

$$P(HD{=}yes \mid BP{=}high, D{=}healthy, E{=}yes) = \frac{P(BP{=}high \mid HD{=}yes)\, P(HD{=}yes \mid D{=}healthy, E{=}yes)}{\sum_{\gamma} P(BP{=}high \mid HD{=}\gamma)\, P(HD{=}\gamma \mid D{=}healthy, E{=}yes)}$$

$$= \frac{0.85 \times 0.25}{0.85 \times 0.25 + 0.2 \times 0.75} = 0.5862$$

$$P(HD{=}no \mid BP{=}high, D{=}healthy, E{=}yes) = 1 - 0.5862 = 0.4138$$

So exercise and healthy food can reduce the risk of heart disease.
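The three heart-disease queries can be reproduced with a small Python sketch. It assumes, as the network figure suggests, that Exercise and Food are independent root nodes (so P(E,D) = P(E) P(D)) and that Blood Pressure depends only on Heart Disease; the constant and function names are illustrative.

```python
# Conditional probability tables copied from the slides.
P_E_YES = 0.7
P_D_HEALTHY = 0.25
P_HD_YES = {
    ("yes", "healthy"): 0.25,
    ("yes", "not healthy"): 0.45,
    ("no", "healthy"): 0.55,
    ("no", "not healthy"): 0.75,
}
P_BP_HIGH = {"yes": 0.85, "no": 0.2}  # keyed by the value of HD


def p_parents(e, d):
    """P(E=e, D=d), assuming E and D are independent root nodes."""
    p_e = P_E_YES if e == "yes" else 1 - P_E_YES
    p_d = P_D_HEALTHY if d == "healthy" else 1 - P_D_HEALTHY
    return p_e * p_d


def prior_hd_yes():
    """P(HD=yes) with no evidence: marginalize over E and D."""
    return sum(p * p_parents(e, d) for (e, d), p in P_HD_YES.items())


def hd_yes_given_bp_high(p_hd_yes):
    """Bayes' rule for P(HD=yes | BP=high), given the current belief in HD=yes."""
    numerator = P_BP_HIGH["yes"] * p_hd_yes
    return numerator / (numerator + P_BP_HIGH["no"] * (1 - p_hd_yes))


no_evidence = prior_hd_yes()
with_bp = hd_yes_given_bp_high(no_evidence)
with_bp_diet_exercise = hd_yes_given_bp_high(P_HD_YES[("yes", "healthy")])
print(round(no_evidence, 4), round(with_bp, 4), round(with_bp_diet_exercise, 4))
```

The same Bayes-rule helper serves both conditional queries: only the belief in HD = yes that is passed in changes, from the marginal 0.49 to the table entry 0.25 for E = yes, D = healthy.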
[Figure: a neuron. Input vector x = (x0, x1, ..., xn), weight vector w = (w0, w1, ..., wn), weighted sum, activation function f, output y]

[Figure: multilayer feed-forward network. Input vector x_i, input nodes, hidden nodes, output nodes, output vector; weights w_ij]

- A multilayer feed-forward NN with a hidden layer can, given enough hidden units, approximate any continuous function to arbitrary accuracy.
- Hidden nodes and output nodes are called functional units.

For each unit in the hidden and output layers:
- Compute the net input to the unit
- Compute the output value using the activation function
- Compute the error
- Update the weights and the bias

- [...] network topology is often decided by the actual behavior

Net input and output of unit j, where O_i is the output of unit i in the previous layer:

$$I_j = \sum_i w_{ij} O_i + \theta_j$$

$$O_j = \frac{1}{1 + e^{-I_j}}$$

Error of an output unit j with target value T_j:

$$Err_j = O_j (1 - O_j)(T_j - O_j)$$

Error of a hidden unit j, where k ranges over the units in the next layer:

$$Err_j = O_j (1 - O_j) \sum_k Err_k\, w_{jk}$$

Weight and bias updates with learning rate l:

$$w_{ij} = w_{ij} + (l)\, Err_j\, O_i$$

$$\theta_j = \theta_j + (l)\, Err_j$$

Example: input units 1, 2, 3; hidden units 4, 5; output unit 6.

Initial input, weight, and bias values:

x1   x2   x3   w14   w15    w24   w25
1    0    1    0.2   -0.3   0.4   0.1

w34    w35   w46    w56    θ4     θ5    θ6
-0.5   0.2   -0.3   -0.2   -0.4   0.2   0.1

Class label = 1 (T6 = 1), learning rate l = 0.9

Forward pass:

$$I_4 = w_{14} O_1 + w_{24} O_2 + w_{34} O_3 + \theta_4, \qquad O_4 = \frac{1}{1 + e^{-I_4}}$$

$$I_5 = w_{15} O_1 + w_{25} O_2 + w_{35} O_3 + \theta_5, \qquad O_5 = \frac{1}{1 + e^{-I_5}}$$

$$I_6 = w_{46} O_4 + w_{56} O_5 + \theta_6, \qquad O_6 = \frac{1}{1 + e^{-I_6}}$$

Errors:

$$Err_6 = O_6 (1 - O_6)(T_6 - O_6)$$

$$Err_4 = O_4 (1 - O_4)\, Err_6\, w_{46}, \qquad Err_5 = O_5 (1 - O_5)\, Err_6\, w_{56}$$

Updates (the weights w14, w15, w24, w25, w34, w35 and the biases θ4, θ5 are updated in the same way):

$$w_{46} = w_{46} + (l)\, Err_6\, O_4, \qquad w_{56} = w_{56} + (l)\, Err_6\, O_5, \qquad \theta_6 = \theta_6 + (l)\, Err_6$$
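One training step of the six-unit example can be traced numerically with the Python sketch below. It uses the initial weights, biases, target and learning rate from the slides; the dictionary layout and variable names are only one possible arrangement, chosen for readability.

```python
from math import exp


def sigmoid(x):
    """Logistic activation O = 1 / (1 + e^-I)."""
    return 1.0 / (1.0 + exp(-x))


# Initial values from the slides: inputs are units 1-3, hidden 4-5, output 6.
out = {1: 1.0, 2: 0.0, 3: 1.0}
w = {
    (1, 4): 0.2, (1, 5): -0.3,
    (2, 4): 0.4, (2, 5): 0.1,
    (3, 4): -0.5, (3, 5): 0.2,
    (4, 6): -0.3, (5, 6): -0.2,
}
theta = {4: -0.4, 5: 0.2, 6: 0.1}
target, rate = 1.0, 0.9

# Forward pass: net input I_j, then output O_j, layer by layer.
for j, sources in ((4, (1, 2, 3)), (5, (1, 2, 3)), (6, (4, 5))):
    net = sum(w[(i, j)] * out[i] for i in sources) + theta[j]
    out[j] = sigmoid(net)

# Backward pass: output-unit error first, then hidden-unit errors,
# using the weights as they were before any update.
err = {6: out[6] * (1 - out[6]) * (target - out[6])}
for j in (4, 5):
    err[j] = out[j] * (1 - out[j]) * err[6] * w[(j, 6)]

# Update every weight and bias with learning rate l = 0.9.
new_w = {(i, j): value + rate * err[j] * out[i] for (i, j), value in w.items()}
new_theta = {j: value + rate * err[j] for j, value in theta.items()}

print({j: round(out[j], 3) for j in (4, 5, 6)})
print({j: round(err[j], 4) for j in (4, 5, 6)})
print({key: round(value, 3) for key, value in new_w.items()})
print({j: round(value, 3) for j, value in new_theta.items()})
```

The output of unit 6 starts at about 0.474 against a target of 1, so its error is positive and the update raises w46, w56 and the bias θ6; weights leaving input unit 2 do not move because its output is 0.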
- [...] based inference
- Distance-weighted nearest neighbor: weight each neighbor x_i of the query point x_q by

$$w \equiv \frac{1}{d(x_q, x_i)^2}$$

- [...] rules
  - e.g., IF A1 and Not A2 then C2 can be encoded as 100
  - e.g., IF Not A1 and Not A2 then C1 can be encoded as 001
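The weight 1 / d(x_q, x_i)^2 is the one used in distance-weighted k-nearest-neighbor voting, which can be sketched in Python as follows. The toy points, labels and the function name `weighted_knn` are hypothetical and chosen only to show a case where weighting changes the vote.

```python
from collections import defaultdict


def squared_distance(a, b):
    """Squared Euclidean distance d(a, b)^2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_knn(query, examples, k=3):
    """Vote among the k nearest examples, each weighted by 1 / d(x_q, x_i)^2."""
    nearest = sorted(examples, key=lambda ex: squared_distance(query, ex[0]))[:k]
    votes = defaultdict(float)
    for point, label in nearest:
        d2 = squared_distance(query, point)
        if d2 == 0:
            return label  # exact match: its weight would be unbounded
        votes[label] += 1.0 / d2
    return max(votes, key=votes.get)


# Hypothetical 2-D examples as (point, label).
EXAMPLES = [((0, 0), "A"), ((3, 0), "B"), ((0, 3), "B"), ((5, 5), "B")]
print(weighted_knn((1, 0), EXAMPLES, k=3))
```

With k = 3 the query (1, 0) has one neighbor labeled A at squared distance 1 and two labeled B at squared distances 4 and 10, so an unweighted majority would say B while the distance-weighted vote (1.0 against 0.35) says A.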
- e.g., income is mapped into the discrete categories low, medium, high, with fuzzy values calculated
- Classification based on concepts from association rule mining

- Classification refers to predicting categorical class labels.
- Prediction models continuous-valued functions.