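UCI Database Usage Notes

Usage notes for the UCI data sets used in machine learning

This directory contains data sets and domain theories (the latter are annotated as such in the brief listing that follows) that have been or can be used to evaluate learning algorithms. Each data file (*.data) contains records of many individual samples, described as attribute-value pairs. The corresponding *.info file contains extensive documentation. (Some files _generate_ databases; they do not have *.data files.) In addition to the data sets and domain theories, the utilities directory contains material that is useful when working with these data sets.

The repository can be viewed and remotely copied over the web at /mlearn/MLRepository.html. Alternatively, the data can be obtained by ftp (ftp:/ .) using anonymous login; the files are in the pub/machine-learning-databases directory.

UCI is always looking for new data sets to add; these are written to the incoming subdirectory. Please contribute your data together with its documentation. Thank you. The DOC-REQUIREMENTS file describes the contribution procedure. At present, most data sets use the following format: one instance per line, no spaces, attribute values separated by commas (","), and missing values denoted by a question mark ("?"). After making a contribution, please notify the site administrator: ml-repository@ics.uci.edu

The IRIS data set serves as an example below. The iris directory (ucidatairis) contains three files:

    Index
    iris.data
    iris.names

Index is the directory listing; it names every file in the folder. For iris, its contents are:

    Index of iris

    18 Mar 1996      105 Index
    08 Mar 1993     4551 iris.data
    30 May 1989     2604 iris.names

iris.data is the iris data file. Its contents look like this:

    5.1,3.5,1.4,0.2,Iris-setosa
    4.9,3.0,1.4,0.2,Iris-setosa
    4.7,3.2,1.3,0.2,Iris-setosa
    7.0,3.2,4.7,1.4,Iris-versicolor
    6.4,3.2,4.5,1.5,Iris-versicolor
    6.9,3.1,4.9,1.5,Iris-versicolor
    6.3,3.3,6.0,2.5,Iris-virginica
    5.8,2.7,5.1,1.9,Iris-virginica
    7.1,3.0,5.9,2.1,Iris-virginica

As shown, attribute values are separated by commas with no spaces in between (5.1,3.5,1.4,0.2). The last column holds the value that the attributes on that line correspond to, that is, the decision attribute (the class).

iris.names describes the iris data: the title, sources, past usage, relevant information, number of instances, the attributes of each instance, and so on. An excerpt:

    7. Attribute Information:
       1. sepal length in cm
       2. sepal width in cm
       3. petal length in cm
       4. petal width in cm
       5. class:
          -- Iris Setosa
          -- Iris Versicolour
          -- Iris Virginica

    9. Class Distribution: 33.3% for each of 3 classes.

For examples of using this data set, refer to other papers or to the later content on this site.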
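The file format described above (one instance per line, comma-separated attribute values, "?" for a missing value, class label in the last column) can be read with a few lines of Python. The following is a minimal sketch, not part of the repository: the helper names `parse_uci_lines` and `class_distribution` are hypothetical, and it assumes that every attribute except the last is numeric, which holds for iris.data but not for every data set in the repository. The sample rows are the ones quoted from iris.data.

```python
from collections import Counter

# Sample rows copied from the iris.data listing quoted in this document.
SAMPLE = """\
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.9,3.1,4.9,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
7.1,3.0,5.9,2.1,Iris-virginica
"""


def parse_uci_lines(lines, missing="?"):
    """Parse comma-separated instances into (features, label) pairs.

    Assumption (hypothetical helper): all attributes except the last are
    numeric and the last column is the class label, as in iris.data.
    Missing values (written as "?") are returned as None.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines such as a trailing newline
        *raw_features, label = line.split(",")
        features = [None if v == missing else float(v) for v in raw_features]
        records.append((features, label))
    return records


def class_distribution(records):
    """Return the fraction of instances that carry each class label."""
    counts = Counter(label for _, label in records)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


records = parse_uci_lines(SAMPLE.splitlines())
distribution = class_distribution(records)
for label, share in distribution.items():
    print(f"{label}: {share:.1%}")
```

On these nine sample rows each class accounts for one third of the instances, which matches the "33.3% for each of 3 classes" line in iris.names. For a data set with symbolic attributes, the `float` conversion would have to be replaced by a per-column rule.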
This is the UCI Repository Of Machine Learning Databases and Domain Theories

4 December 1995
: pub/machine-learning-databases
/mlearn/MLRepository.html
Librarian: Patrick M. Murphy ( )
111 databases and domain theories (36MB)

This directory contains data sets and domain theories (the latter being annotated as such in the following brief listing) that have been or can be used to evaluate learning algorithms. Each data file (*.data) contains individual records described in terms of attribute-value pairs. The corresponding *.info file contains voluminous documentation. (Some files _generate_ databases; they do not have *.data files.) In addition to data sets and domain theories, the utilities/ directory contains utilities that you may find useful when using datasets in this repository.

The contents of this repository can be viewed and remotely copied over the web. The address is /mlearn/MLRepository.html. Alternatively, the contents of this repository can be remotely copied via ftp to . Enter "anonymous" for user id, and e-mail address (user@host) for password. These databases can be found by executing "cd pub/machine-learning-databases".

Notes:

1. We're always looking for additional databases, which can be written to the sub-directory named /incoming. Please send yours, with documentation. Thanks. See DOC-REQUIREMENTS for suggested documentation procedures. Presently, most databases have the following format: 1 instance per line, no spaces, commas separate attribute values, and missing values are denoted by "?". Also, please notify the site librarian ( ) after making a donation.

2. Ivan Bratko requested that the databases he donated from the Ljubljana Oncology Institute (e.g., breast-cancer, lymphography, and primary-tumor) have restricted access. We are allowed to share them with academic institutions upon request. These databases (like several others) require that proper citations be made in published articles that use them. Citation requirements are in each database's corresponding *.doc file. To access any of these databases, send email to . To aid you in deciding if you want any of these databases, the documentation files are available.

3. An archive server may now be used to receive via e-mail files in this repository. Installed on ics, it provides email access to files in our anonymous ftp/uucp area (ftp). If people have no other access to our archives, then they can send mail to:

   Commands to the server may be given in the body. Some commands are:

       help
       send
       find

   The help command replies with a useful help message.

If you publish material based on databases obtained from this repository, then, in your acknowledgements, please note the assistance you received by using this repository. Thanks - this will help others to obtain the same data sets and replicate your experiments. We suggest the following pseudo-APA reference format for referring to this repository (LaTeX'd):

    Murphy, P. M., & Aha, D. W. (1994). {\it UCI Repository of machine
    learning databases} /mlearn/MLRepository.html. Irvine, CA: University
    of California, Department of Information and Computer Science.

Patrick M. Murphy (Repository Librarian)

Brief Overview of Databases and Domain Theories:

Quick Listing:

1. annealing (David Sterling and Wray Buntine)
2. Artificial Characters Database & DT (donated by Attilio Giordana)
3-4. audiology (Ray Bareiss and Bruce Porter, used in Protos)
   1. Original Version
   2. Standardized-Attribute Version of the Original.
5. auto-mpg (from CMU StatLib library)
6. autos (Jeff Schlimmer)
7. badges (Haym Hirsh)
8. balance-scale (Tim Hume)
9. balloons (Michael Pazzani)
10. breast-cancer (Ljubljana Institute of Oncology, restricted access)
11. breast-cancer-wisconsin (Wisconsin Breast Cancer Dbase, Olvi Mangasarian)
   1. Original version
   2. Diagnostic data set
   3. Prognostic data set
12. bridges (Yoram Reich)
13-21. chess
   1. Partial generator of Quinlan's chess-end-game data (kr-vs-kn) (Schlimmer)
   2. Shapiro's endgame database (kr-vs-kp) (Rob Holte)
   3. king-rook-vs-king (Michael Bain, Arthur van Hoff)
   4-9. Six domain theories (Nick Flann)
22. Bach Chorales (time-series) database (Darrell Conklin)
23. Connect-4 Database (John Tromp)
24-25. Credit Screening Database
   1. Japanese Credit Screening Data and domain theory (Chiharu Sano)
   2. Credit Card Application Approval Database (Ross Quinlan)
26. Ein-Dor and Feldmesser's cpu-performance database (David Aha)
27. Diabetes Data (Serdar Uckun, AIM-94)
28. dgp-2 data generation program (Powell Benedict)
29. Document Understanding (Donato Malerba)
30. Nine small EBL domain theories and examples in sub-directory ebl
31. Evlin Kinney's echocardiogram database (Steven Salzberg)
32. flags (Richard Forsyth)
33. function-finding (Cullen Schaffer's 352 case studies)
34. glass (Vina Spiehler)
35. hayes-roth (from Hayes-Roth^2's paper)
36-39. heart-disease (Robert Detrano)
40. hepatitis (G. Gong)
41. horse colic database (Mary McLeish & Matt Cecile)
42. (Boston) Housing database (from CMU StatLib library)
43. ICU data (Serdar Uckun, AIM-94)
44. Image segmentation database (Carla Brodley)
45. ionosphere information (Vince Sigillito)
46. iris (R.A. Fisher, 1936)
47. isolet (Ron Cole and Mark Fanty's database donated by Tom Dietterich)
48. kinship (J. Ross Quinlan)
49. labor-negotiations (Stan Matwin)
50-51. led-display-creator (from the CART book)
52. lenses (Cendrowska's database donated by Benoit Julien)
53. letter-recognition database (created and donated by David Slate)
54. liver-disorders (BUPA Medical's database donated by Richard Forsyth)
55. logic-theorist (Paul O'Rorke)
56. lung cancer (Stefan Aeberhard)
57. lymphography (Ljubljana Institute of Oncology, restricted access)
58-59. mechanical-analysis (Francesco Bergadano)
   1. Original Mechanical Analysis Data Set
   2. PUMPS DATA SET
60. mobile robots (donated by Klingspor, Morik and Rieger)
61-64. molecular-biology
   1. promoter sequences (Towell, Shavlik, & Noordewier, domain theory also)
   2. splice-junction sequences (Towell, Noordewier, & Shavlik, domain theory also)
   3. protein secondary structure database (Qian and Sejnowski)
   4. protein secondary structure domain theory (Jude Shavlik & Rich Maclin)
65. MONK's Problems (donated by Sebastian Thrun)
66. Moral Reasoner Database (donated by James Wogulis)
67. mushroom (Jeff Schlimmer)
68. MUSK databases (donated by Tom Dietterich)
69. othello domain theory (Tom Fawcett)
70. Page Blocks Classification (Donato Malerba)
71. Pima Indians diabetes diagnoses (Vince Sigillito)
72. Postoperative Patient data (Jerzy W. Grzymala-Busse)
73. Primary Tumor (Ljubljana Institute of Oncology, restricted access)
74. Qualitative Structure Activity Relationships (QSARs) (Ross King)
75. Quadraped Animals (John H. Gennari)
76. Servo data (Ross Quinlan)
77. shuttle-landing-control (Bojan Cestnik)
78. solar flare (Gary Bradshaw)
79-80. soybean (from Ryszard Michalski's groups)
81. space shuttle databases (David Draper)
82. spectrometer (Infra-Red Astronomy Satellite Project Database, John Stutz)
83. Sponge Database (Iosune Uriz and Marta Domingo)
84. Statlog Project databases (from Ross King,.)
85. Student Loan relational database (from Michael Pazza