Learning with Bayesian Networks
David Heckerman
Presented by Colin Rickert

Introduction to Bayesian Networks
Bayesian networks represent an advanced form of general Bayesian probability. A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest [1]. The model has several advantages for data analysis over rule-based decision trees [1].

Outline
Bayesian vs. classical probability methods
Advantages of Bayesian techniques
The coin toss prediction model from a Bayesian perspective
Constructing a Bayesian network with prior knowledge
Optimizing a Bayesian network with observed knowledge (data)
Exam questions

Bayesian vs. the Classical Approach
The Bayesian probability of an event x represents a person's degree of belief or confidence in that event's occurrence, based on prior and observed facts. Classical probability refers to the true or actual probability of the event and is not concerned with observed behavior.

Bayesian vs. the Classical Approach
The Bayesian approach restricts its prediction to the next (N+1) occurrence of an event, given the observed previous N events. The classical approach is to predict the likelihood of any given event regardless of the number of occurrences.

Example
Imagine a coin with irregular surfaces, such that the probability of landing heads or tails is not equal. The classical approach would be to analyze the surfaces to create a physical model of how the coin is likely to land on any given throw. The Bayesian approach simply restricts attention to predicting the next toss based on previous tosses.

Advantages of Bayesian Techniques
How do Bayesian techniques compare to other learning models?
Bayesian networks can readily handle incomplete data sets.
Bayesian networks allow one to learn about causal relationships.
Bayesian networks readily facilitate the use of prior knowledge.
Bayesian methods provide an efficient way of preventing the over-fitting of data (there is no need for pre-processing).

Handling of Incomplete Data
Imagine a data sample where two attribute values are strongly anti-correlated. With decision trees, both values must be present to avoid confusing the learning model. Bayesian networks need only one of the values to be present and can infer the other: imagine two variables, one for gun-owner and the other for peace activist; the data should indicate that you do not need to check both values.
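
As a minimal sketch of this kind of inference (not from the original slides), assume a small joint distribution over the two hypothetical binary variables gun_owner and peace_activist; observing one variable lets us compute a belief about the other instead of requiring both values in every record. The probabilities below are made-up illustrative numbers.

```python
# Minimal sketch: inferring a missing attribute from an observed one.
# The joint probabilities are illustrative assumptions encoding a strong
# anti-correlation between the two variables.
joint = {
    # (gun_owner, peace_activist): probability
    (True, True): 0.02,
    (True, False): 0.38,
    (False, True): 0.45,
    (False, False): 0.15,
}

def conditional(observed_var, observed_val):
    """Return P(other variable = True | observed variable = observed_val)."""
    idx = 0 if observed_var == "gun_owner" else 1
    other = 1 - idx
    norm = sum(p for k, p in joint.items() if k[idx] == observed_val)
    return sum(p for k, p in joint.items()
               if k[idx] == observed_val and k[other]) / norm

# A record that only records gun ownership still yields a belief about
# peace activism: there is no need to observe both attributes.
print(conditional("gun_owner", True))   # P(activist | gun owner)     = 0.05
print(conditional("gun_owner", False))  # P(activist | not gun owner) = 0.75
```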

Learning about Causal Relationships
We can use observed knowledge to determine the validity of the acyclic graph that represents the Bayesian network. For instance, is running a cause of knee damage? Prior knowledge may indicate that this is the case; observed knowledge may strengthen or weaken this argument.

Use of Prior Knowledge and Observed Behavior
Construction of prior knowledge is relatively straightforward: construct "causal" edges between any two factors that are believed to be correlated. Causal networks represent prior knowledge, whereas the weights of the directed edges can be updated in a posterior manner based on new data.

Avoidance of Over-Fitting Data
Contradictions do not need to be removed from the data. Data can be "smoothed" so that all available data can be used.

The "Irregular Coin Toss" from a Bayesian Perspective
Start with the set of probabilities θ = (θ1, ..., θn) for our hypothesis. For the coin toss we have only one, θ, representing our belief that we will toss a "heads", and 1 − θ for tails.
Predict the outcome of the next (N+1) flip based on the previous N flips X1, ..., XN. The observed data is D = {X1 = x1, ..., XN = xN}, and we want to know the probability that XN+1 = xN+1 = heads. ξ represents the information we have thus far (our background knowledge); D is the observed data.

Bayesian Probabilities
Posterior probability, p(θ|D,ξ): the probability of a particular value of θ given that D has been observed (our final belief about θ).
Prior probability, p(θ|ξ): the probability of a particular value of θ given no observed data (our previous "belief").
Observed probability or "likelihood", p(D|θ,ξ): the likelihood of the sequence of coin tosses D being observed given that θ is a particular value.
p(D|ξ): the raw probability of D.

Bayesian Formulas for Weighted Coin Toss (Irregular Coin)
p(θ|D,ξ) = p(D|θ,ξ) p(θ|ξ) / p(D|ξ), where p(D|ξ) = ∫ p(D|θ,ξ) p(θ|ξ) dθ
*Only p(D|θ,ξ) and p(θ|ξ) need to be specified; the rest can be derived.
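
The normalization in this formula can be made concrete numerically. The sketch below is not from the slides; the grid resolution, the Beta(3, 3)-shaped prior, and the toss counts are illustrative assumptions. It multiplies likelihood by prior on a grid of θ values and normalizes, so p(D|ξ) never has to be written in closed form.

```python
import numpy as np

# Numerical illustration of p(theta | D, xi) ∝ p(D | theta, xi) * p(theta | xi).
theta = np.linspace(0.001, 0.999, 999)          # grid over heads-probabilities
dtheta = theta[1] - theta[0]

prior = theta**(3 - 1) * (1 - theta)**(3 - 1)   # Beta(3, 3)-shaped prior, unnormalized
prior /= prior.sum() * dtheta

heads, tails = 7, 3                              # observed data D
likelihood = theta**heads * (1 - theta)**tails   # p(D | theta, xi)

posterior = likelihood * prior                   # numerator of Bayes' rule
posterior /= posterior.sum() * dtheta            # divide by p(D | xi) via numeric integration

# Probability that the next toss is heads = posterior mean of theta.
print((theta * posterior).sum() * dtheta)        # ~ (3 + 7) / (6 + 10) = 0.625
```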

Integration
To find the probability that XN+1 = heads, we must integrate over all possible values of θ to find the average (expected) value of θ, which yields:
p(XN+1 = heads | D, ξ) = ∫ θ p(θ|D,ξ) dθ

Expansion of Terms
1. Expand the observed probability p(D|θ,ξ): for a sequence with h heads and t tails (h + t = N),
p(D|θ,ξ) = θ^h (1 − θ)^t
2. Expand the prior probability p(θ|ξ) as a Beta distribution with hyperparameters αh and αt (α = αh + αt):
p(θ|ξ) = Beta(θ | αh, αt) ∝ θ^(αh − 1) (1 − θ)^(αt − 1)
*The Beta function yields a bell-shaped curve that integrates to 1, a typical probability distribution; it can be viewed as our expectation of the shape of the curve.

Beta Function and Integration
Combining the product of both functions yields the posterior:
p(θ|D,ξ) ∝ θ^(αh + h − 1) (1 − θ)^(αt + t − 1) = Beta(θ | αh + h, αt + t)
Integrating gives the desired result:
p(XN+1 = heads | D, ξ) = (αh + h) / (α + N)
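
The closed-form result can be checked in a few lines. This is a sketch under the same Beta-prior assumption; the hyperparameters and coin-toss counts are made up for illustration and match the grid example above.

```python
from fractions import Fraction

def predict_heads(alpha_h, alpha_t, heads, tails):
    """Posterior-predictive probability of heads under a Beta(alpha_h, alpha_t) prior.

    Combining the Beta prior with the likelihood theta^h (1 - theta)^t gives a
    Beta(alpha_h + heads, alpha_t + tails) posterior; its mean is the prediction.
    """
    return Fraction(alpha_h + heads, alpha_h + alpha_t + heads + tails)

# Illustrative prior counts and data (not from the slides).
print(predict_heads(3, 3, heads=7, tails=3))   # 10/16 = 5/8, matching the grid computation
```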

Key Points
Multiply the result of the Beta function (prior probability) by the result of the coin toss function for θ (observed probability); the result is our confidence in that value of θ. Integrating the product of the two with respect to θ over all values 0 ≤ θ ≤ 1 gives the predictive probability for the next toss.
The same idea extends from a single coin to a network: for all nodes that do not have a causal link between them, we can check for conditional independence.

Example
Using the graph of expected causes from the original slide (over the variables f, a, s, g, and j), we can check for conditional independence of the following probabilities given initial sample data:
p(a|f) = p(a)
p(s|f,a) = p(s)
p(g|f,a,s) = p(g|f)
p(j|f,a,s,g) = p(j|f,a,s)
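
One way to make these checks concrete is to compare empirical conditional distributions estimated from sample data. The sketch below is my own illustration, not the slides' method; the column names f, a, s, g, j and the synthetic records are assumptions. It tests the first assertion, p(a|f) = p(a), by comparing frequencies.

```python
from collections import Counter
from itertools import product

# Synthetic sample data over binary variables f, a, s, g, j (illustrative only).
records = [
    {"f": 0, "a": 0, "s": 1, "g": 0, "j": 0},
    {"f": 0, "a": 1, "s": 0, "g": 0, "j": 1},
    {"f": 1, "a": 0, "s": 1, "g": 1, "j": 1},
    {"f": 0, "a": 1, "s": 1, "g": 0, "j": 0},
    {"f": 1, "a": 1, "s": 0, "g": 1, "j": 1},
    {"f": 0, "a": 0, "s": 0, "g": 0, "j": 0},
]

def p(var, val, given=None):
    """Empirical probability P(var = val | given), where `given` is a dict of conditions."""
    given = given or {}
    pool = [r for r in records if all(r[k] == v for k, v in given.items())]
    return sum(r[var] == val for r in pool) / len(pool) if pool else float("nan")

# Check p(a|f) = p(a): compare the conditional against the marginal for each value of f.
for f_val, a_val in product((0, 1), (0, 1)):
    cond = p("a", a_val, {"f": f_val})
    marg = p("a", a_val)
    print(f"P(a={a_val} | f={f_val}) = {cond:.2f}   P(a={a_val}) = {marg:.2f}")
# With real data, (near-)equality of the two columns supports dropping the arc f -> a.
```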

Construction of "Posterior" Knowledge Based on Observed Data
For every node i, we construct a vector of probabilities θij = (θij1, ..., θijn), where θij is represented as a row entry in a table of all possible configurations j of the parent nodes. The entries in this table are the weights that represent the degree of confidence that the parent nodes influence node i (though we don't know these values yet).

Determining Table Values for θi
How do we determine the values for θij? Perform multivariate integration to find the average θij for all i and j, in a manner similar to the coin toss integration:
Count all instances m that satisfy a configuration θijk; the observed probability (likelihood) for θijk then becomes θijk^m (1 − θijk)^(n − m).
Integrate over all vectors θij to find the average value of each θijk.
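
As an illustration (my own sketch, not the slides' notation), the per-configuration counts can be combined with a Beta prior exactly as in the coin-toss case, so each row of the table gets its own posterior mean. The node and parent names, the records, and the uniform Beta(1, 1) prior are assumptions.

```python
from collections import defaultdict
from fractions import Fraction

# Records over binary variables; we learn the table for node "g" with parent ("f",).
records = [
    {"f": 0, "g": 0}, {"f": 0, "g": 0}, {"f": 0, "g": 1},
    {"f": 1, "g": 1}, {"f": 1, "g": 1}, {"f": 1, "g": 0},
]

def learn_table(node, parents, data, alpha=1):
    """Posterior mean of theta_ijk per parent configuration j, under a Beta(alpha, alpha) prior.

    counts[j][k] plays the role of m in the likelihood theta_ijk^m (1 - theta_ijk)^(n - m).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for r in data:
        j = tuple(r[p] for p in parents)        # parent configuration
        counts[j][r[node]] += 1
    table = {}
    for j, ks in counts.items():
        n = sum(ks.values())
        table[j] = {k: Fraction(ks[k] + alpha, n + 2 * alpha) for k in (0, 1)}
    return table

for j, dist in learn_table("g", ("f",), records).items():
    print(f"parents f={j}: P(g=1) = {dist[1]}")
# parents f=(0,): P(g=1) = 2/5 ; parents f=(1,): P(g=1) = 3/5
```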

Question 1: What is Bayesian probability?
A person's degree of belief in the occurrence of an event, based on prior and observed knowledge.
