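Lecture topics and dates
- Background of computer vision and geometric foundations (2/13, week 1)
- Geometric camera calibration (3/6, week 4)
- Rigid-body motion and pose estimation (3/27, week 7)
- Pose estimation (II) (or the correspondence problem) (4/17, week 10)
- Applications (5/8, week 13)
Page 1 of 72.

Requirements
- Attend all 5 lectures, ask questions actively, and take part in the discussion (each lecture leaves about 15-20 minutes for questions and discussion).
- Complete at least one of the 3 experiments (program + report). The lab location will be fixed within the first two weeks; I will announce it then.
- Write one "academic" paper (related to the experiment).
- Final grade: undergraduates, 60% (experiment) + 40% (paper); graduate students, 40% (experiment) + 60% (paper).

Outline
- What is CV?
  - What is CV? When did it start to develop? What does it study? Which disciplines and fields is it related to?
- Some problems in CV and application prospects
- Geometric foundations
- Probabilistic foundations
- Some related resources

Definitions of CV (1)
“Today, the study of extracting 3-D information from video images and building a 3-D model of the scene, called computer vision or image understanding, is one of the research areas that attract the most attention all over the world.” From K. Kanatani, “Statistical Optimization for Geometric Computation: Theory and Practice”, 1996.

Definitions of CV (2)
“Vision is not only the sensing of light signals; it also covers the whole process of acquiring, transmitting, processing, storing, and understanding visual information. After signal processing theory and computers appeared, people tried to capture images of the environment with cameras, convert them into digital signals, and carry out the whole process of visual information processing by computer. In this way a new discipline took shape: computer vision.”
“The research goal of computer vision is to give computers the ability to perceive three-dimensional environment information from two-dimensional images.”
From Ma Songde and Zhang Zhengyou, “Computer Vision: Computational Theory and Algorithmic Foundations”, 1998.
“Computer vision is currently a very active area of computer science research. The discipline aims to develop, for computers and robots, visual capabilities comparable to those of humans. Researchers around the world began studying computer vision in the early 1960s, but most of the important advances in the underlying research were made after the 1980s.” From “/asia/news/displayArticle.aspx?id=1332”.

Research topics
Early work: low-level image processing, such as image transformation, image restoration, image enhancement, thresholding, region labelling, and shape characterization.
“Tried to identify and classify objects in images by techniques of Pattern Recognition, which had been developed for the purpose of recognizing 2-D characters and symbols by feature extraction and statistical decision making by learning”.

“Many pattern recognition researchers believed that the paradigm of pattern recognition would also lead to intelligent vision systems that could understand 3-D scenes”.
“However, they soon realized the crucial fact that 3-D objects look very different from viewpoint to viewpoint beyond the capability of 2-D feature-based learning; 3-D meanings of 2-D images cannot be understood unless some a priori knowledge about the scene is given. Thus, knowledge came to play an essential role”.

“This type of knowledge-based high-level reasoning is called the top-down (or goal-driven) approach.”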
“In a sense, this approach corresponds to the psychological view toward human perception that humans understand the environment by unconsciously matching the vast amount of knowledge accumulated from experience in the process of growth.” “This view can be compared to what is known as the Gestalt psychology, which regards human perception as integration of the environment and experience.”

Thus, the problem of how to represent and organize such knowledge became a major concern, and many symbolic schemes were derived. Establishing such symbolic representations is one of the central themes of artificial intelligence, and machine vision was regarded as problem solving by artificial intelligence.

“However, the inherent difficulty of this approach was soon realized: the amount of necessary knowledge, most of which has the form of ‘if ... then ... else ...’, is limitless, heavily depending on the domain of each application (‘office scene’, ‘outdoor scene’, etc.) and constantly changing (e.g., today, many telephones are no longer black and do not have dials). However large the amount of knowledge is, exceptions are bound to appear, and computation time blows up exponentially
as the amount of knowledge increases.”

Many combinatorial techniques were proposed so as to find plausible interpretation efficiently without doing exhaustive search. Such techniques include various types of heuristic search as well as special techniques such as constraint propagation and probabilistic relaxation.

“Realizing that such computational problems are inevitable as long as knowledge is directly matched with features extracted from raw images, researchers began to pay attention to ‘physical/optical laws’ governing 3-D scenes. In analyzing 2-D images, such laws can provide clues to the 3-D shapes and positions of objects.”

“For example, the surface gradients of objects can be estimated by analyzing shading intensities (shape from shading). The orientation of a surface in the scene can also be estimated by analyzing the perspective distortion of a texture on it (shape from texture). If objects are moving in the scene (or the camera is moving relative to the objects), the 3-D shapes of the objects and their 3-D motions (or the camera motion) can be computed (shape from motion or structure from motion).”

“Although such
analyses require appropriate assumptions about surface reflectance, illumination, perspective distortion, and rigid motion, they do not depend on specific application domains; they are called constraints in contrast to knowledge for the top-down approach. This approach is in line with the psychological view toward human vision that human perception occurs automatically when visual signals trigger computation in the brain and that this computational functionality is innate, acquired in the process of evolution.”

This view was asserted by J. J. Gibson, who had a great influence on not only psychologists but also machine vision researchers. Thus, a new paradigm was established. First, primitive features are extracted from raw images by edge detection and image segmentation, resulting in primal sketches; next, approximate shapes and surface orientations are estimated by applying available constraints (shading, texture, motion, stereo, etc.), resulting in 2.5-D sketches; then, appropriate 3-D models (e.g., generalized cylinders) are fitted to such data, resulting in a numerical and symbolic representation of the scene; finally, high-level inference is made from
such representations. This is called the bottom-up (or data-driven) approach, which is also known as the Marr paradigm after David Marr, who strongly endorsed this approach.

Marr's computational theory of vision
Taking the viewpoint of an information processing system, Marr held that a vision system should be studied at three levels: the computational theory level, the representation and algorithm level, and the hardware implementation level.
- The computational theory level answers what each part of the system computes and with what strategy: what its inputs and outputs are, and what transformation or constraint relates them.
- The representation and algorithm level specifies the inputs, outputs, and internal information representation of each part, together with the algorithms that achieve the goals set by the computational theory.
- The hardware implementation level answers how to implement those algorithms in hardware.

A major drawback of this approach is its susceptibility to noise. Computation solely based on physical/optical constraints is likely to produce meaningless interpretations in the presence of noise. This is because 3-D reconstruction from 2-D data is a typical inverse problem, for which solutions are known to be generally unstable with respect to noise.

“In order to cope with this inherent ill-posedness, many optimization techniques were devised so as to force the solution to have required properties. Such techniques are generally called regularization. Other types of optimization include a stochastic relaxation technique called simulated annealing, which was constructed by analogy with statistical mechanics, and the use of neural networks, which gave rise to a new view toward human cognition called connectionism.”

“Today, many attempts are being made to enhance the reliability of image data. One approach is to actively control the motion of the camera so that the resulting 3-D interpretation becomes stable (active vision). Another approach is using multiple sensors (stereo, range sensing, etc.) and fusing the data (sensor fusion).”

“In order to fuse data, the reliability of individual data must be evaluated in quantitative terms so that reliable data contribute more than unreliable data.” “Some researchers are attempting
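to use only minimum information that is enough to achieve a specific goal such as object avoidance (qualitative vision, purposive vision, etc.).” For detailed information, read “intro_KKanatani.doc”.

Related fields
- Mathematics, physics
- Brain science (or neurophysiology)
- Psychology, cognitive science, AI
“Computer vision has benefited from studies of animal visual systems in neurophysiology, psychology, and cognitive science, but it has developed its own independent body of computational theory and algorithms, and it does not deliberately try to simulate biological visual systems.”

Relations among related disciplines and courses
(Diagram; surviving labels only.) Disciplines: digital image processing, computer vision, pattern recognition, machine vision, computer graphics. Courses: linear algebra, set theory, high-level language programming, data structures, advanced algebra, optimization methods, signals and systems, computational geometry, special topics in computer vision (e.g., image and visual computing). Legend: order of study; foundational knowledge; the amount of overlap reflects the degree of relatedness.

Overview (1): Geometric foundations of computer vision
- Camera models
  - Single camera (pinhole model / perspective transformation)
  - Two cameras (epipolar geometry: fundamental/essential matrix)
  - Three or more cameras (multi-view geometry)
- Motion estimation
  - Correspondence problem
  - Optical flow computation
  - Rigid-body motion parameter estimation (minimal projective reconstruction)
    - 2-view, 7 points in correspondence (Faugeras)
    - 3-view, 6 points in correspondence (Quan Long)
    - 3-view, 8 points with one missing in one of the three views (Quan Long)
- Geometric reconstruction
  - Stereo vision
  - Shape from X (shading/motion/texture/contour/focus/de-focus/...)

Overview (2): Physical foundations of computer vision
- Cameras and the image formation process
  - Viewpoint, light sources, light in space, light at surfaces
  - Shading, shadow
- Light/color
  - Radiometry, irradiance, ...
- Object surface properties
  - Diffuse (isotropic) surface: Lambertian surface
  - BRDF (bi-directional reflectance distribution function)

Overview (3): Image-model foundations of computer vision
- Camera model and calibration
  - Intrinsic and extrinsic parameters
- Image features
  - Edges, corners, contours, texture, shape
- Image sequence features (motion)
  - Corresponding points, optical flow

Overview (4): Signal-processing levels of computer vision
- Low-level vision
  - Single image: filtering / edge detection / texture
  - Multiple images: geometry / stereo / affine or perspective structure from motion
- Mid-level vision
  - Clustering for segmentation; fitting lines, curves, and contours
  - Probabilistic clustering for segmentation and fitting
  - Tracking
- High-level vision
  - Matching
  - Pattern classification / aspect graph recognition
- Applications
  - Range data / image data retrieval / image-based rendering

Overview (5): Mathematical foundations of computer vision
- Projective and affine geometry, differential geometry
- Probability, statistics, and stochastic processes
- Numerical computation and optimization methods
- Machine learning
Basic analysis tools and mathematical models of computer vision
- Signal processing approach: FFT, filtering, wavelets, ...
- Subspace approach: PCA, LDA, ICA, ...
- Bayesian inference approach: EM, Condensation / sequential importance sampling (SIS), Markov chain Monte Carlo (MCMC), ...
- Machine learning approach: SVM / kernel machine, Boosting / AdaBoost, k-NN / regression, HMM, BN/DBN (Dynamic Bayesian Network), Gibbs, MRF, ...

Overview (6): Characteristics of computer vision problems
- The intrinsic dimension of high-dimensional data is very low, which makes modelling possible. High dimensional image/video data lie in a very low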
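dimensional manifold.
- Non-uniqueness of solutions
- Under-constrained inverse problems
- Optimization problems

Some problems in CV and application prospects
- The basic vision system: image -> feature detection -> low-level features -> Shape from X -> position and shape -> recognition -> object description.
- Research on modules and systems faces open problems and has produced some new ideas, such as "the task of a visual information processing system", "the modularity problem", "local versus global features", "object modelling", and so on.
- 3-D computer vision has very broad application prospects, for example: human-computer interaction; multimedia technology, databases, and image communication; production automation; medicine; autonomous navigation; 3-D scene modelling and visualization.

A brief introduction to projective geometry
Euclidean geometry:
- Rotations and translations are Euclidean transformations.
- The geometry that studies properties invariant under Euclidean transformations (Euclidean properties) is Euclidean geometry.
- Parallelism, length, and angle, for example, are Euclidean properties.
Projective geometry:
- Image formation in a camera is a projection (a perspective, or central, projection).
- It does not preserve Euclidean properties; for example, parallel lines no longer remain parallel.
- The geometry that studies properties of projective space invariant under projective transformations (projective properties) is projective geometry.

Elements at infinity
- Parallel lines meet at a point at infinity.
- Parallel planes meet in a line at infinity.
- A line has exactly one point at infinity; all the lines of a family of parallel lines share one point at infinity.
- In a plane, all the points at infinity form a line, called the line at infinity of that plane.
- In 3-dimensional space, all the points at infinity form a plane, called the plane at infinity of that space.

Projective space
Adding the elements at infinity to n-dimensional Euclidean space, and making no distinction between finite elements and elements at infinity, gives n-dimensional projective space.
- The 1-dimensional projective space is a projective line: the Euclidean line together with its point at infinity.
- The 2-dimensional projective space is a projective plane: the Euclidean plane together with its line at infinity.
- The 3-dimensional projective space is 3-dimensional Euclidean space together with the plane at infinity.

Homogeneous coordinates
Once a coordinate system is set up in Euclidean space, points and coordinates are in one-to-one correspondence. Points at infinity, however, have no such coordinates; homogeneous coordinates are introduced to describe them.
In n-dimensional Euclidean space with Cartesian coordinates, let a point have coordinates $(m_1, \ldots, m_n)$. Any $n+1$ numbers $x_1, \ldots, x_n, x_0$ satisfying $x_0 \neq 0$ and $x_i / x_0 = m_i$ $(i = 1, \ldots, n)$ are called homogeneous coordinates $(x_1, \ldots, x_n, x_0)$ of the point, and $(m_1, \ldots, m_n)$ are called its inhomogeneous coordinates.
For numbers $x_1, \ldots, x_n$ that are not all zero, $(x_1, \ldots, x_n, 0)$ is called the homogeneous coordinates of a point at infinity.
Example: let an ordinary point on the Euclidean line have coordinate $x$. Then any two numbers with $x_1 / x_0 = x$ give homogeneous coordinates $(x_1, x_0)$ of the point, and $x$ is its inhomogeneous coordinate. For any $x_1 \neq 0$, $(x_1, 0)$ is the homogeneous coordinates of the point at infinity.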
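The homogeneous-coordinate definitions can be tried out in a few lines of Python. This is only a sketch: the function names are made up for illustration, the tuple layout follows the slides (scale factor x0 last), and the `cross` helper relies on one assumption not stated in the slides, namely that a line a*x + b*y + c = 0 in the plane is stored as (a, b, c), so that the cross product of two lines gives their common point.

```python
from fractions import Fraction


def to_homogeneous(m, x0=1):
    """Return (x1, ..., xn, x0) with xi / x0 = mi for a finite point m."""
    if x0 == 0:
        raise ValueError("x0 must be nonzero for a finite point")
    return tuple(Fraction(mi) * x0 for mi in m) + (Fraction(x0),)


def is_at_infinity(x):
    """A homogeneous tuple whose last entry is 0 is a point at infinity."""
    if all(xi == 0 for xi in x):
        raise ValueError("(0, ..., 0) is not a valid homogeneous coordinate")
    return x[-1] == 0


def to_inhomogeneous(x):
    """Recover (m1, ..., mn) by dividing by the last entry x0."""
    if is_at_infinity(x):
        raise ValueError("a point at infinity has no inhomogeneous coordinates")
    return tuple(Fraction(xi) / Fraction(x[-1]) for xi in x[:-1])


def cross(a, b):
    """Cross product of two 3-tuples (assumed line coordinates (a, b, c))."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


# (2, 3) and any nonzero multiple of (2, 3, 1) describe the same point.
print(to_homogeneous((2, 3), x0=2))   # homogeneous coordinates with x0 = 2
print(to_inhomogeneous((4, 6, 2)))    # back to the inhomogeneous coordinates

# The parallel lines x - y = 0 and x - y + 1 = 0 meet at a point at infinity.
meet = cross((1, -1, 0), (1, -1, 1))
print(meet, is_at_infinity(meet))
```

Exact fractions are used instead of floats so that the test for a zero last coordinate is not affected by rounding.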
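Projective parameter

Cross ratio

Projective transformation

Duality in the projective plane
- "Point" and "line" are called dual elements of the projective plane.
- "Drawing a line through a point" and "taking a point on a line" are called dual constructions.
- Take a proposition in the projective plane that is built from points, lines, and their incidence and order relations. Replacing every element by its dual element and every construction by its dual construction yields another proposition; the two are called plane-dual propositions.
- Principle of duality: in the projective plane, if a proposition holds, then its dual proposition also holds.

Harmonic relation
If the cross ratio of the point pairs $(P_1, P_2)$ and $(P_3, P_4)$ is $-1$, that is, $(P_1, P_2; P_3, P_4) = -1$, then $(P_1, P_2)$ and $(P_3, P_4)$ are said to be harmonic.
The point pairs $(P_1, P_2)$ and $(P_3, P_4)$ are harmonic if and only if $(\lambda_1 + \lambda_2)(\lambda_3 + \lambda_4) = 2(\lambda_1 \lambda_2 + \lambda_3 \lambda_4)$, where $\lambda_i$ is the projective parameter of $P_i$ $(i = 1, \ldots, 4)$.

Harmonic relations in the complete quadrangle (quadrilateral)

Conics

Absolute conic

Poles and polars
For a conic $C$ and a point $A$ (a vector), the line (in line coordinates) determined by $L = CA$ is called the polar of the point $A$ with respect to the conic $C$. When A lies on the conic C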
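The harmonic condition and the polar-line formula L = CA can be checked numerically. The sketch below rests on assumptions: the slide defining the cross ratio did not survive, so `cross_ratio` uses one common convention in terms of projective parameters (called l1..l4 here); the conic C is assumed to be a symmetric 3x3 matrix acting on points written as (x, y, w); and all function names are illustrative.

```python
from fractions import Fraction


def cross_ratio(l1, l2, l3, l4):
    """Cross ratio (P1, P2; P3, P4) from projective parameters.

    Assumed convention: ((l1 - l3)(l2 - l4)) / ((l2 - l3)(l1 - l4)).
    """
    return Fraction((l1 - l3) * (l2 - l4), (l2 - l3) * (l1 - l4))


def is_harmonic(l1, l2, l3, l4):
    """Harmonic test from the slides: (l1 + l2)(l3 + l4) = 2(l1 l2 + l3 l4)."""
    return (l1 + l2) * (l3 + l4) == 2 * (l1 * l2 + l3 * l4)


def polar_line(C, A):
    """Polar of point A with respect to conic C, computed as L = C A."""
    return tuple(sum(C[i][j] * A[j] for j in range(3)) for i in range(3))


# Parameters 0, 3 and 1, -3 form a harmonic set: both tests agree.
print(cross_ratio(0, 3, 1, -3), is_harmonic(0, 3, 1, -3))

# Parameters 0, 3 and 1, 2 do not.
print(cross_ratio(0, 3, 1, 2), is_harmonic(0, 3, 1, 2))

# Unit circle x^2 + y^2 - w^2 = 0 as a matrix, and the point (2, 0).
unit_circle = ((1, 0, 0), (0, 1, 0), (0, 0, -1))
print(polar_line(unit_circle, (2, 0, 1)))   # line coordinates (a, b, c)
```

With the assumed convention, setting the cross ratio to -1 and clearing the denominator gives exactly the product condition used in `is_harmonic`, which is why the two checks agree on both examples.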