Training, Validation and Test Data

Example:

(A) We have data on 16 items: their attributes and class labels. RANDOMLY divide them into 8 for training, 4 for validation and 4 for testing. The d attributes are known for all data items.

Training

Item No.   d-Attributes   Class
1          known          0
2          known          0
3          known          1
4          known          1
5          known          1
6          known          1
7          known          0
8          known          0

Validation

Item No.   d-Attributes   Class
9          known          0
10         known          0
11         known          1
12         known          0

Test

Item No.   d-Attributes   Class
13         known          0
14         known          0
15         known          1
16         known          1

(B) Next, suppose we develop three classification models A, B, C from the training data. Let the training errors of these models be as shown below (recall that the models do not necessarily give perfect results on the training data, nor are they required to).

Classification results from

Item No.   d-Attributes   True Class   Model A   Model B   Model C
1          known          0            0         1         1
2          known          0            0         0         0
3          known          1            0         1         0
4          known          1            1         0         1
5          known          1            0         0         0
6          known          1            1         1         1
7          known          0            0         0         0
8          known          0            0         0         0
Classification Error                   2/8       3/8       3/8

(C) Next, use the three models A, B, C to classify each item in the validation set based on its attribute values. Recall that we know their true labels as well. Suppose we get the following results:

Classification results from

Item No.   d-Attributes   True Class   Model A   Model B   Model C
9          known          0            1         0         0
10         known          0            0         1         0
11         known          1            0         1         0
12         known          0            0         1         0
Classification Error                   2/4       2/4       1/4

If we use minimum validation error as the model selection criterion, we select model C.

(D) Now use model C to determine class values for each data point in the test set. We do so by substituting the (known) attribute values into classification model C. Again, recall that we know the true label of each of these data items, so we can compare the values obtained from the classification model with the true labels to determine the classification error on the test set. Suppose we get the following results.

Classification results from

Item No.   d-Attributes   True Class   Model C
13         known          0            0
14         known          0            0
15         known          1            0
16         known          1            1
Classification Error                   1/4

(E) Based on the above, an estimate of the generalization error is 25%. What this means is that if we use Model C to classify future items for which only the attributes are known, not the class labels, we are likely to make incorrect classifications about 25% of the time.

(F) A summary of the above is as follows (classification error in %):

Model   Training   Validation   Test
A       25         50           -
B       37.5       50           -
C       37.5       25           25

Cross Validation

If available data are limited, we employ Cross Validation (CV). In this approach, the data are randomly divided into k sets of almost equal size. Training is done on (k-1) of the sets and the remaining set is used for testing. This process is repeated k times, with each set serving as the test set exactly once (k-fold CV). The average error over the k repetitions is used as a measure of the test error.

For the special case when k = n, the number of data items, so that each test set contains a single item, the above is called Leave-One-Out Cross-Validation (LOO-CV).

EXAMPLE:

Consider the above data consisting of 16 items.

(A) Let k = 4, i.e., 4-fold Cross Validation. Divide the data into four sets of 4 items each. Suppose the following set-up occurs and the errors obtained are as shown.

                             Set 1         Set 2               Set 3               Set 4
Training                     Items 1-12    Items 1-8, 13-16    Items 1-4, 9-16     Items 5-16
Test                         Items 13-16   Items 9-12          Items 5-8           Items 1-4
Error on test set (assume)   25%           35%                 28%                 32%

Estimated Classification Error (CE) = (25 + 35 + 28 + 32) / 4 = 30%

(B) LOO-CV

For this, the data are divided into 16 sets, each consisting of 15 training items and one test item.

           Set 1        Set 2             ...   Set 15           Set 16
Training   Items 1-15   Items 1-14, 16    ...   Items 1, 3-16    Items 2-16
Test       Item 16      Item 15           ...   Item 2           I…
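The hold-out example and the cross-validation splits above can be sketched in Python as follows. This is a minimal illustration, not part of the original notes. The predictions are copied from the tables in parts (B) to (D). The helper names `error_rate` and `k_fold_splits` are hypothetical. The folds are taken as contiguous blocks so that they match the example, whereas in practice the items would be shuffled first. The per-fold errors are the assumed values from the 4-fold table, not computed results.

```python
# True class labels for items 1-16, as listed in part (A).
true_class = {1: 0, 2: 0, 3: 1, 4: 1, 5: 1, 6: 1, 7: 0, 8: 0,
              9: 0, 10: 0, 11: 1, 12: 0, 13: 0, 14: 0, 15: 1, 16: 1}

# Model outputs copied from the tables in parts (B), (C) and (D).
# Only model C is applied to the test items 13-16.
predictions = {
    "A": {1: 0, 2: 0, 3: 0, 4: 1, 5: 0, 6: 1, 7: 0, 8: 0,
          9: 1, 10: 0, 11: 0, 12: 0},
    "B": {1: 1, 2: 0, 3: 1, 4: 0, 5: 0, 6: 1, 7: 0, 8: 0,
          9: 0, 10: 1, 11: 1, 12: 1},
    "C": {1: 1, 2: 0, 3: 0, 4: 1, 5: 0, 6: 1, 7: 0, 8: 0,
          9: 0, 10: 0, 11: 0, 12: 0, 13: 0, 14: 0, 15: 0, 16: 1},
}

train_items = range(1, 9)
val_items = range(9, 13)
test_items = range(13, 17)


def error_rate(pred, items):
    """Fraction of the given items whose predicted class differs from the true class."""
    wrong = sum(pred[i] != true_class[i] for i in items)
    return wrong / len(items)


# Parts (B) and (C): training and validation error of every model.
train_errors = {m: error_rate(p, train_items) for m, p in predictions.items()}
val_errors = {m: error_rate(p, val_items) for m, p in predictions.items()}

# Model selection by minimum validation error, then part (D): test error.
best_model = min(val_errors, key=val_errors.get)
test_error = error_rate(predictions[best_model], test_items)


def k_fold_splits(items, k):
    """Yield (train, test) lists using k contiguous folds of almost equal size."""
    items = list(items)
    n = len(items)
    bounds = [i * n // k for i in range(k + 1)]
    for j in range(k):
        test = items[bounds[j]:bounds[j + 1]]
        train = items[:bounds[j]] + items[bounds[j + 1]:]
        yield train, test


four_fold = list(k_fold_splits(range(1, 17), 4))       # 4-fold CV
leave_one_out = list(k_fold_splits(range(1, 17), 16))  # LOO-CV: k = n = 16

# Assumed per-fold test errors (in %) from the 4-fold table.
fold_errors = [25, 35, 28, 32]
cv_estimate = sum(fold_errors) / len(fold_errors)

print(train_errors)            # {'A': 0.25, 'B': 0.375, 'C': 0.375}
print(val_errors)              # {'A': 0.5, 'B': 0.5, 'C': 0.25}
print(best_model, test_error)  # C 0.25
print(cv_estimate)             # 30.0
```

Here the first fold holds out items 1-4, which corresponds to Set 4 in the 4-fold table. The numbering of the sets differs, but the four train/test partitions are the same.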