




Computer vision application

School (Department): School of Electronics and Information Engineering
Major: Integrated Circuit Engineering
Student: Lü Guangxing (呂廣興), 14S158054

Contents
1. The object of the project
2. The method and the principle applied to the project
 2.1 Platform
 2.2 The principle of transforming an RGB image to a gray image
 2.3 The principle of image enhancement
 2.4 The principle of thresholding
 2.5 The principle of the classifier
3. The content and the result of the project
 3.1 The main steps in the project
 3.2 Human body posture recognition
 3.3 Stereo vision
4. Reference

Report: Computer vision application

1. The object of the project

The object of the project is gesture recognition and localization of people in an indoor scene.

2. The method and the principle applied to the project

2.1 Platform

The platform is based on Visual Studio 2012 and OpenCV 2.4.10.

2.2 The principle of transforming an RGB image to a gray image

There are three major methods for transforming an RGB image to a gray image.

The first is the maximum-value method, which sets the gray value to the maximum of the three channels:
Gray = R = G = B = max(R, G, B)

The second is the mean-value method, which sets the gray value to the mean of the three channels:
Gray = R = G = B = (R + G + B) / 3

The third is the weighted-average method, which weights R, G, and B according to their perceptual importance and sums the three parts. The human eye is most sensitive to green, then red, and least sensitive to blue:
Gray = 0.30R + 0.59G + 0.11B

2.3 The principle of image enhancement

Image enhancement is the process of making images more useful. There are two broad categories of enhancement techniques. Spatial-domain techniques operate directly on the image pixels and include point processing and neighborhood operations. Frequency-domain techniques manipulate the Fourier transform or wavelet transform of an image.

The median filter replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel (the original value of the pixel is included in the computation of the median). It forces pixels with distinct gray levels to become more like their neighbors.

In addition, we apply morphological image processing after smoothing. Morphological image processing (or morphology) describes a range of image processing techniques that deal with the shape (morphology) of features in an image. The basic idea of morphology is to probe the input image with a structuring element in order to measure or extract the corresponding shapes or characteristics for further image analysis and object recognition. The mathematical foundation of morphology is set theory. There are two basic morphological operations: erosion and dilation.

2.4 The principle of thresholding

Thresholding is particularly useful for segmentation when we want to isolate an object of interest from the background, and it is usually the first step in a segmentation approach. The basic rule of threshold segmentation is: when the gray level is not greater than the threshold, the pixel is set to 0 (black); when the gray level is greater than the threshold, the pixel is set to 255 (white):

g(x, y) = 0 if f(x, y) <= T, and g(x, y) = 255 if f(x, y) > T

The threshold value T itself is obtained from the image histogram.

2.5 The principle of the classifier

A classifier is an algorithm or device that separates objects into different classes. A classifier usually consists of three parts. The first is the sensor, for instance an imaging device or a fingerprint reader. The second is the feature extractor, for example an edge detector or a property descriptor. The third is the classifier proper, which uses the extracted features for decision making, for example by Euclidean distance or other methods.

Features can be regarded as the descriptors introduced before, and they should be representative and useful for classification. The set of all possible feature vectors forms the feature space; each feature vector is a point in this space. Similar objects yield similar measurement results, so nearby points in feature space correspond to similar objects: distance in feature space is related to similarity, and points that belong to the same class form a cloud in feature space.

Divide the data set into a training set and a test set.
The performance of a classifier should be assessed by the classification error on an independent test set; this set must not contain objects that are included in the training set. Determine a decision boundary by minimizing the classification error on the training set, then determine the classifier performance by computing the classification error on the test set.

3. The content and the result of the project

3.1 The main steps in the project

The images provided are color images, so before we segment and classify the vessels we first convert the pictures to gray images. Because we use the method of SVM (Support Vector Machines), we divide these images into a training set and a test set. The next steps are image enhancement, then thresholding, then object extraction, and then feature extraction and vessel classification. We first train on the training set using representative features, and then run the test set to recognize the human gesture. The depth information of a person is obtained by binocular stereo vision; from it we obtain the position of the person and prepare for three-dimensional reconstruction.

3.2 Human body posture recognition

Three kinds of methods are most common:
1. Methods based on template matching.
2. Methods based on classification.
3. Prediction-based approaches.

The method based on template matching may be the most accurate of the three. However, it consumes a lot of time, so it is not real-time.
The method based on classification meets the accuracy requirements when dealing with small amounts of data and is straightforward to implement, so in a single scene this method is used for the time being.

About the third method: when the computer processes data from a complex scene, the amount of data grows geometrically, and dealing with this growth is one of the most difficult problems in artificial intelligence. In recent years, however, neural networks based on deep learning have shown clear advantages in speech recognition and image processing.

3.2.1 Foreground extraction

Moving-target detection is the basis of the whole target detection and tracking system. It operates on the video and is also the foundation for further processing (such as encoding, target tracking, target classification, and target behavior understanding). The purpose of moving-target detection is to extract the moving objects (such as people or vehicles) from the video images.

Three methods are commonly used: the frame difference method, the background subtraction method, and the optical flow method. There are many improved variants; one combines the inter-frame difference method with the background subtraction method and has achieved good results, although the detected object contours may still be incomplete and some target points may be lost.
For background subtraction, using an estimated background is better than using a directly captured one. A common way to obtain the background is the statistical averaging method, which averages a continuous image sequence to estimate the background image. To obtain a better background, R. T. Collins proposed a single-Gaussian background model, and Grimson et al. proposed an adaptive mixture-of-Gaussians background model that gives a more accurate background description for target detection. At the same time, to increase robustness and reduce the impact of environmental changes, it is important to keep the background updated, for example by recursive updating of the statistical average, in which a simple adaptive filter updates the background model.

In this project, following the algorithms proposed by KaewTraKulPong et al. [1] and Zivkovic et al. [2][3], we update the three parameters (weight, mean, and variance) of each Gaussian in the model, and implement the algorithm with basic OpenCV functions. The main steps are as follows:

1. Initially, the mean, variance, and weight of every Gaussian are set to 0.
2. The first T frames of the video are used to train the GMM. For each pixel, a mixture of at most GMM_MAX_COMPONT Gaussians is maintained; when the model for a pixel is first set up, its initial mean, variance, and weight are set to 1.
3. During training, each incoming pixel value is compared with the means of the existing Gaussians. If the pixel value lies within three times the standard deviation of a Gaussian's mean, the pixel is taken to belong to that Gaussian, and its parameters are updated with learning rate a (in the standard form: w <- (1 - a)w + a, u <- (1 - p)u + px, s^2 <- (1 - p)s^2 + p(x - u)^2, with p = a/w).
4. After the T training frames, the number of Gaussians per pixel is selected adaptively. The Gaussians are sorted by weight divided by standard deviation in descending order, and the first B Gaussians are selected such that the sum of their weights exceeds 1 - Cf, where Cf is generally set to 0.3. This eliminates the noise points introduced during training.
5. During the testing phase, each new pixel value is compared with the means of the B background Gaussians. If the difference is within two times the standard deviation of any of them, the pixel is classified as background; otherwise it is foreground. Foreground pixels are assigned 255 and background pixels 0, forming a binary image.
6. Because the foreground binary map contains a lot of noise, a morphological opening operation is applied to remove the noise, followed by a closing (reconstruction) operation to recover the edge information lost by the opening. This eliminates the small noise points.

The above is the general flow of the algorithm, but in the concrete implementation many details need attention, such as the choice of parameter values. After testing, the commonly used parameter values are set as follows:
- The learning rate for updating the three parameters is 0.005; that is, T equals 200.
- The maximum number of mixture Gaussians per pixel is 7.
- The first 200 frames of the video are used for training.
- Cf is 0.3; that is, the adaptive number B is the number of Gaussians whose cumulative weight exceeds 0.7.
- When a new Gaussian must be created during training, its weight is set equal to the learning rate, that is, 0.005.
- When a new Gaussian is created during training, its mean is set to the input pixel value and its variance to 15.

The following picture shows a dynamic background during the training process.

Figure 3.1: the result of foreground extraction
3.2.2 Feature extraction

After an image has been segmented into regions, representation and description should be considered; they make the data useful to a computer. A region can be represented in two ways:
1. In terms of its external characteristics (its boundary), focusing on shape characteristics.
2. In terms of its internal characteristics (its region), focusing on regional properties such as color and texture.
Sometimes both ways are needed. Choosing a representation scheme, however, is only part of the task of making the data useful to a computer; the next task is to describe the region based on the chosen representation. For example, for a boundary representation, descriptors include the length of the boundary, the orientation of the straight line joining its extreme points, and the number of concavities in the boundary.

To find the features of the target, we extract the contours of the target and separate the object from the background based on the area of each contour; the contour with the largest area is the target's physical contour. Here we use the functions below:

(1) Find contours
findContours(image, contours,        // the array of contours
             CV_RETR_EXTERNAL,       // retrieve only the external contours
             CV_CHAIN_APPROX_NONE);  // keep every pixel of each contour

(2) Draw contours
drawContours(result, contours, -1,   // draw all contours
             cv::Scalar(0),          // color: black
             2);                     // line width: 2

The following image shows the result of contour extraction and object extraction.

Figure 3.2: the result of extraction of contour and object

At last, we choose two characteristics, the length of the boundary and the height of the Feret box, for training and prediction. We also tested other characteristics, but none worked as well as these two.

3.2.3 Recognition and classification

3.2.3.1 Classifier

We use the SVM (support vector machine) classifier to recognize the ships. Support vector machines are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. There are four main steps to construct an SVM:

1. Take a given training set T = {(x1, y1), (x2, y2), ..., (xn, yn)}, with yi in {-1, +1}.
2. Solve the dual quadratic programming problem
   min over a: (1/2) sum_i sum_j ai aj yi yj (xi . xj) - sum_i ai
   subject to sum_i ai yi = 0 and ai >= 0,
   obtaining the solution a* = (a1*, ..., an*).
3. Calculate the parameter w* = sum_i ai* yi xi, and select a positive component aj* > 0 to compute b* = yj - sum_i ai* yi (xi . xj).
4. Construct the decision boundary w* . x + b* = 0; this gives the decision function f(x) = sign(w* . x + b*).

All of the above can be done with OpenCV functions, but we need to prepare the training file and the test file. The training file is used for learning; with the extracted features we can then classify the four kinds of vessels in the test file. In short, the SVM steps can be summarized as training (learning), testing, and predicting.

When choosing the images, we keep one thing in mind: in order to train the SVM effectively, we cannot choose vessel images for training freely; instead, we select vessel shapes with obvious characteristics that are representative of each vessel type. If the vessel shapes are too special or too similar, they interfere with the SVM's learning.
If the samples are too diverse, the differences between the feature vectors increase and the classification of objects deteriorates; as a result, the burden on SVM learning increases. The main SVM code used:

// Parameter settings of the support vector machine
CvSVMParams params;
params.svm_type = CvSVM::C_SVC;        // SVM type: C-support vector classification
params.kernel_type = CvSVM::LINEAR;    // kernel type: linear
params.term_crit = cvTermCriteria(CV_TERMCRIT_ITER, 100, 1e-6);
                                       // termination criterion: stop when the
                                       // number of iterations reaches the maximum

// SVM training
CvSVM SVM;                             // create an instance of the SVM model
SVM.train(trainingDataMat, labelsMat, Mat(), Mat(), params);
                                       // train the model; the parameters are:
                                       // input data, responses, ··, ··, parameters

Figure 3.4: the result of pattern recognition

3.2.3.2 Recognition results

Test results:

Test samples | Correct identifications | Precision
550          | 550                     | 100%

3.2.4 Conclusion

From the results above, we know that our method can distinguish the several ships, but errors still exist. Because the number of given pictures is not very large, errors are inevitable when testing the category of a single picture. Some categories of pictures also share common points, so their features are similar and they are hard to distinguish. Moreover, the SVM classifier itself makes some errors and cannot classify exactly. It may also be that the features we chose are not sufficient, so more work should be done.

3.3 Stereo vision

3.3.1 Stereopsis

Fusing the pictures recorded by our two eyes and exploiting the difference (or disparity) between them allows us to gain a strong sense of depth. This chapter is concerned with the design and implementation of algorithms that mimic our ability to perform this task, known as stereopsis. Reliable computer programs for stereoscopic perception are of course invaluable in visual robot navigation, cartography, aerial reconnaissance, and close-range photogrammetry. They are also of great interest in tasks such as image
segmentation for object recognition or the construction of three-dimensional scene models for computer graphics applications.

Figure 3.4: Left: the Stanford cart sports a single camera moving in discrete increments along a straight line and providing multiple snapshots of outdoor scenes. Center: the INRIA mobile robot uses three cameras to map its environment. Right: the NYU mobile robot uses two stereo cameras, each capable of delivering an image pair. As these examples show, although two eyes are sufficient for stereo fusion, mobile robots are sometimes equipped with three (or more) cameras. The bulk of this chapter is concerned with binocular perception, but stereo algorithms using multiple cameras are also discussed [4]. Photos courtesy of Hans Moravec, Olivier Faugeras, and Yann LeCun.

Stereo vision involves two processes: the fusion of features observed by two (or more) eyes and the reconstruction of their three-dimensional preimage.