版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請(qǐng)進(jìn)行舉報(bào)或認(rèn)領(lǐng)
文檔簡(jiǎn)介
1、大數(shù)據(jù)科學(xué)與機(jī)器學(xué)習(xí)平臺(tái)介紹技術(shù)創(chuàng)新,變革未來Agenda 大綱v 數(shù)據(jù)科學(xué)和機(jī)器學(xué)習(xí)概要Data Science 101Machine Learning 101Data Science and ML Challengesv IBM 數(shù)據(jù)科學(xué)平臺(tái)介紹IBM Data Science ExperienceIBM Machine Learningv 數(shù)據(jù)科學(xué)和機(jī)器學(xué)習(xí)案例演示W(wǎng)hat is Data Science?Data science, also known as data-driven science, is an interdisciplinary field about scienti
2、fic methods, processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured,12 similar to Knowledge Discovery in Databases (KDD).Data science is a concept to unify statistics, data analysis and their related methods in order to understand and ana
3、lyze actual phenomena with data.3 It employs techniques and theories drawn from many fields within the broad areas of mathematics, statistics, information science, and computer science, in particular from the subdomains of machine learning, classification, cluster analysis, data mining, databases, a
4、nd visualization.From Wikipedia4Data Scientist: The Sexiest Job of the 21st CenturyWhat abilities make a data scientist successful?Think of him or her as a hybrid ofdata hackeranalystcommunicatortrusted adviserThe combination is extremely powerfuland rare.-Harvard Business Review Oct 2012 Issue數(shù)據(jù)科學(xué)家
5、的硬技能/thoughts/becoming-a-data-scien tist/機(jī)器學(xué)習(xí) 第三次浪潮What is Machine Learning?Computers that Learn without being explicitly programmed Grow and change when exposed to new dataDeliver personalized and optimized customer interactionsIdentify Patternsnot readily foreseen byhumansBuild Modelsof behavior f
6、rom those patternsAchieving Business Value through Watson Machine LearningChurn analysis helps identify the cause of the churn and implement effective strategies for retention.Detect and understand life- threatening medicalconditions and design ever more effective treatment programsLearn, predict we
7、ather patterns and energy production from renewable sources and integrate into grid more effectivelyProduct recommendation, nextpurchase prediction, targeted offers individual tailored shopping experience.Identify suspicious behavior, predict and prevent threats / fraud continually reduce business r
8、isks and costsCapabilitiesMachine Learning helpsConstantly learns and adaptsAvoids making the same mistakesFaster, deeper, improved insightsResulting inSmarter business outcomesLower business risks and costsNew business opportunities8Machine Learning 101 : Types of machine learningClassificationData
9、 points are labeled and are being used to predict a categoryTwo-class vs multi-classExample:Fraud detection (fraud vs non-fraud)Spam email detection (spam vs non-spam)RegressionWhen a value is being predictedExample:Stock prices predictionClusteringData points are not labeled.Goal is to group data i
10、nto clusters to better organize the data9Machine Learning 101 : feature engineeringA feature is a piece of information that might be useful for predictionExample, predict the churn probability of a customerLabeled data is the desired output dataExample, CHURN_LABEL false representing a churn sampleN
11、OT a featureFeatureFeatureFeatureTraining a modelFeature EngineeringFeature EngineeringScoringLabeled examplesTrainingScoringNew dataModelModelPredicted dataa TrainOps (DevOps) storyDeploy11Data ScientistOperational systemDevOpsWhat is Machine Learning (機(jī)器學(xué)習(xí)概要)The (incomplete) machine learning proce
12、ssTakes significant development, deployment and management effortsIngest DataExtract FeaturesTrain ModelDeploy ModelMake PredictionsHuman Intervention1212Choose Best ModelIdentify Model DegradationPrediction And ScoringManage Deployments數(shù)據(jù)科學(xué)及機(jī)器學(xué)習(xí)新挑戰(zhàn)降低數(shù)據(jù)科學(xué)入門門檻 (Citizen Data Scientist)管控機(jī)器學(xué)習(xí)全生命周期提高持續(xù)交
13、付能力數(shù)據(jù)科學(xué)的可重復(fù)性1133IngestionFeedbackMonitorDataTrainEvalDeployPredict/ActIngestionPrepScoreHistory dataTraining & Validation dataTest dataNew dataAgenda 大綱v 數(shù)據(jù)科學(xué)和機(jī)器學(xué)習(xí)概要Data Science 101Machine Learning 101Data Science and ML Challengesv IBM 數(shù)據(jù)科學(xué)平臺(tái)介紹IBM Data Science ExperienceIBM Machine Learningv 數(shù)據(jù)科學(xué)和機(jī)
14、器學(xué)習(xí)案例演示14IBM 數(shù)據(jù)科學(xué)工具箱v IBM SPSSv IBM Data Science Experiencev IBM Machine Learning1516IBM Data Science Experience社區(qū)教材與數(shù)據(jù)集連接數(shù)據(jù)科學(xué)家提問文章與論文開源Scala/Python/R/SQLJupyter and Zeppelin* NotebooksRStudio IDE and Shiny appsApache Spark復(fù)制與分享項(xiàng)目 Your favorite librariesIBM 提供的能力數(shù)據(jù)預(yù)處理/Pipeline UI *自動(dòng)數(shù)據(jù)準(zhǔn)備與建模*高級(jí)可視化*模型
15、管理與部署模型API文檔*Spark云服務(wù)/Packaged SparkData Science Experience (DSx) 主要特性v DSx Cloud Service v DSx Local EditionIBM Machine Learning for z/OS 組件Notebook 和可視化建立模型Cognitive Assistant for Data Scientists (CADS)模型部署模型管理持續(xù)監(jiān)控和反饋IBM Machine Learning for z/OS 企業(yè)級(jí)機(jī)器學(xué)習(xí)平臺(tái)Feature Highlights CADS 數(shù)據(jù)科學(xué)認(rèn)知助手18What is
16、CADS?Cognitive Assistant for Data Scientist which helps select the best fit algorithm for trainingWhy Data Scientists need CADS?Many algorithms for classification/regression tasks: SVM, Decision Trees/Forests, Nave Bayes, Logistic Regression, etc.Substantial cost in user and compute time to select t
17、he best algorithmUser spends time on trying various learnersComputational cost for training a single SVM can exceed 24hSelection commonly based on data scientist bias and experienceFeature Highlights CADS/HPOTraining DataLogistic RegressionRandom ForestDecision Tree500500Minimize amount of data to b
18、e considered to make an informed selection of most suitable learnerGiven a data set try to select best approach by directly considering part of actual data19Feature Highlights Integrated Notebook Interface with flexible APIsIngest data from DB2z tableData transformation and training20Feature Highlig
19、hts Data Visualization with Brunel (/Brunel-Visualization/Brunel)212122Feature Highlights Visual Model Builder, the guided Machine Learning InterfaceIngest data and transformTraining and evaluationFeature Highlights Model ManagementManage model, create deploymentManage deploymentFeature Highlights Easily cons
溫馨提示
- 1. 本站所有資源如無(wú)特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請(qǐng)下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請(qǐng)聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁(yè)內(nèi)容里面會(huì)有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
- 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
- 5. 人人文庫(kù)網(wǎng)僅提供信息存儲(chǔ)空間,僅對(duì)用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對(duì)用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對(duì)任何下載內(nèi)容負(fù)責(zé)。
- 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請(qǐng)與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時(shí)也不承擔(dān)用戶因使用這些下載資源對(duì)自己和他人造成任何形式的傷害或損失。
最新文檔
- 人工智能電氣設(shè)備安裝協(xié)議
- 拳擊教練聘用合同范本
- 節(jié)能項(xiàng)目招投標(biāo)合同樣本
- 水利工程電力電纜架設(shè)合同
- 建筑供沼氣聯(lián)合施工合同
- 木工工程協(xié)議書條款解釋
- 企業(yè)管理戰(zhàn)略規(guī)劃
- 網(wǎng)絡(luò)成癮的預(yù)防與治療
- 智慧金融服務(wù)的行業(yè)競(jìng)爭(zhēng)力
- 重點(diǎn)整改事項(xiàng)落實(shí)情況報(bào)告(3篇)
- 汽車尾氣排放檢測(cè)操作標(biāo)準(zhǔn)
- 塔吊基礎(chǔ)下?lián)Q填地基設(shè)計(jì)
- 《中醫(yī)基礎(chǔ)理論腎》PPT課件.ppt
- 顧問咨詢服務(wù)合同
- CNAS-EC-017_2017《認(rèn)證機(jī)構(gòu)認(rèn)可風(fēng)險(xiǎn)分級(jí)管理辦法》
- 事故安全培訓(xùn)案例(一)
- 考題六年級(jí)數(shù)學(xué)上冊(cè)看圖列方程計(jì)算專項(xiàng)北師大版
- 高壓線遷移施工方案
- 培智學(xué)校的心理健康教育模式探索
- 《數(shù)學(xué)家的故事》讀后感(7篇)
- 3、三院社會(huì)滿意度測(cè)評(píng)指標(biāo)體系
評(píng)論
0/150
提交評(píng)論