Kinect-Based Point Cloud Acquisition System: Thesis Proposal
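1. Purpose and Significance of the Research

As the hardware and software for 3D laser scanning mature and become widely available, 3D point cloud acquisition has become an active research topic, with applications in computer animation, medical image processing, cultural heritage preservation, terrain surveying, game development, and digital media production. Point cloud acquisition offers advantages that conventional measurement techniques lack: it is fast, penetrating, non-contact, real-time, dynamic, active, fully digital, high-density, and efficient. It is therefore used in many fields and has broad prospects. In recent years 3D laser scanners have improved greatly in efficiency, accuracy, and price, and point cloud acquisition has become a research focus in response to demand.

In practice, because light travels in straight lines, a 3D laser scanner viewing a complex object from a single viewpoint has blind spots in some regions or on the back of the object, so a complete model requires measurements from several viewpoints. Because the scanner's measurement range is limited, a large object or a large scene cannot be measured in one pass and must be measured in blocks. The result is usually several noisy point clouds, each in its own coordinate system, which cannot fully meet the fidelity and real-time requirements placed on digital models. Research on 3D point cloud acquisition is therefore significant.

2. State of Research in China and Abroad

In China, government and research institutions have begun to pay close attention to big data. The Internet of Things "Twelfth Five-Year" plan issued by the Ministry of Industry and Information Technology lists information processing technology as one of four key technology innovation projects.

Many uncertain factors introduce noise during 3D data acquisition, and its effect on the data should be reduced as far as possible. Some researchers handle it by filtering. Chen Xiaoxia et al. screened data points in two steps: density clustering during preprocessing, followed by display through the real-time interactive display of VRML. Zhang Yi et al. removed noise from point cloud data by combining the K-neighborhood with a kernel function.

In point cloud registration, the most widely used algorithm is ICP. Since Besl and McKay proposed it in 1992, it has been widely applied to automatic registration. Traditional ICP, however, is inefficient, demanding of its initial values, and prone to converging to a local optimum, so many researchers have improved it. Hans Martin Kjer et al. used curvature-based sampling to speed up registration; Sun Qian et al. weighted correspondences by the inner product of normal vectors, although manual factors affected the final registration accuracy and efficiency; He Yongxing et al. proposed a registration method based on neighborhood features; Jiang Chengcheng et al. improved ICP using Delaunay triangulation.

To make the complexity of the data model fit limited computing resources, the model must be simplified. Simplification methods for 3D point clouds fall into two broad classes: those that partition the cloud topologically and simplify using the topological relations, and those that select representative points according to feature information. Curvature-based simplification is a typical representative-point method. Abroad, Martin et al. proposed the uniform grid method in 1997, but it often deletes feature data by mistake; Chen et al. simplified according to normal vectors, but the method places restrictive requirements on the point cloud; the improved method of Lee et al. helps preserve surface features but has a high time cost. In China, Zhang Liyan et al. also used normal vectors to simplify point clouds; the method preserves object features well, but some parameters are chosen empirically, which limits its practicality. Zhu Maomao et al. improved the result with a second simplification pass, giving a more reasonable reduction. The clustering-based algorithm of Shi Baoquan et al. preserves point cloud features well at the price of more computation. The hybrid algorithm of Du Xiaohui et al. achieves a high simplification rate, although its time performance degrades slightly.

3. Proposed Research Route

(1) Point cloud acquisition. This work proposes a design for a Kinect-based point cloud acquisition system. With the Kinect at its core, the design uses the generic acquisition interface provided by PCL (Point Cloud Library), a modular C++ template library, to obtain 3D information in real-world coordinates directly. The 3D coordinates are stored as point cloud data, which speeds up acquisition. In this design, one computer on the local network handles Kinect acquisition: it converts the captured depth image into the 3D coordinates of points in real space, relying on the Kinect depth imaging principle, and uses the OpenNI open natural interaction framework to grab point cloud data from the Kinect device.

(2) Point cloud denoising. Because of human disturbance, lighting, and defects in the scanning device, the captured data is contaminated by noise and must be denoised. Depending on how the noise diffuses in different directions, isotropic or anisotropic algorithms can be used.

(3) Parametric representation of the point cloud. Triangle mesh parameterization maps the data points of the original model onto a given parameter domain, establishing a correspondence G → Q between the point cloud data G and a new point set Q on the parameter domain, with the geometric distortion required to be minimal in a suitable sense.

(4) Point cloud visualization. Based on the geometric features of 3D models, we propose a triangle-simplification-based algorithm for generating multi-resolution complex 3D models, producing a multi-scale 3D point cloud data structure, and we build the corresponding multi-resolution texture model. Once geometric and texture scales are associated, an R+-tree index is used to store the 3D model in blocks, giving a static multi-level (LOD) blocked data structure. During 3D scene browsing, the server uses the client's view-range clipping and the network transmission efficiency to retrieve the appropriate static LOD model data quickly and transmit it adaptively in blocks to the client, which generates in real time a dynamic LOD model that meets visual requirements and gives the best visualization result.

4. Literature Review

Kinect is a motion-sensing peripheral developed by Microsoft, first released as an accessory for the Xbox 360 game console and initially limited to gaming. Because of its technology and the originality of the device, within two years of its release it began to be applied in many other fields. With the recent release of Kinect for Windows, a development device aimed at the Windows platform, artificial intelligence researchers, human-computer interaction and motion-interaction engineers, and research groups, especially outside China, have been exploring and developing Kinect applications.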
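Step (1) of the research route in section 3 converts a depth image into 3D coordinates in real space. The proposal delegates this to PCL and OpenNI and does not give the formula, so the following is only a minimal sketch of the standard pinhole back-projection that such a conversion typically relies on. The function name `depth_to_points`, the millimetre depth unit, and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are assumptions made for illustration; real values come from calibrating the specific device.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_points(
    depth_mm: Sequence[Sequence[int]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a depth image into camera-space points (metres).

    Assumptions (not from the proposal): depth is given in millimetres,
    a value of 0 means "no reading", and the camera follows the pinhole
    model with focal lengths (fx, fy) and principal point (cx, cy),
    all expressed in pixels.
    """
    points: List[Point] = []
    for v, row in enumerate(depth_mm):          # v: pixel row
        for u, d in enumerate(row):             # u: pixel column
            if d <= 0:
                continue                        # skip invalid pixels
            z = d / 1000.0                      # millimetres -> metres
            x = (u - cx) * z / fx               # pinhole model, X axis
            y = (v - cy) * z / fy               # pinhole model, Y axis
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Tiny 2 x 2 depth image with hypothetical intrinsics.
    cloud = depth_to_points([[0, 1000], [2000, 0]], fx=2.0, fy=2.0, cx=0.5, cy=0.5)
    for p in cloud:
        print(p)
```

Each valid pixel yields one point, so the list returned here plays the role of the point cloud that the acquisition step stores; invalid pixels are simply dropped rather than filled in.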

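The original development codename of Kinect was Natal; it was later officially renamed Kinect. Kinect is a consumer product built on Microsoft's advanced research, a by-product of Microsoft's work on using artificial intelligence to solve complex problems. Beyond gaming, where motion-sensing devices are already common, practical and experimental Kinect applications are developing quickly. The existing applications below illustrate the range.

(1) Virtual applications. The European fashion retailer Topshop installed a new kind of fitting room in its Moscow flagship store. This virtual fitting room combines augmented reality (AR) with the Microsoft Kinect, so that customers can see how a garment looks on them without trying it on.

(2) 3D modeling and sculpture tool. Experiments by several groups abroad show that multiple Kinects can serve as a 3D camera for 3D modeling. In a street experiment by a group called Blablab LAB, three Kinects scanned tourists to build models, and a RepRap 3D printer then produced miniature sculptures of them.

(3) Machine control and remote-controlled robots. A Kinect serves as the robot's head: it senses the surroundings and builds a 3D model that guides the robot's actions. Because robots have a very wide range of applications, low-cost Kinect robots can replace conventional machines for measurement and survey work that does not need high precision in dangerous areas or in harsh environments such as underground or at height.

(4) Virtual experiments and medicine. In medicine, Kinect can be used in procedures such as autopsies: researchers need only gesture in the air or speak to zoom, shrink, and rotate 3D images.

Depending on the type of measuring probe, methods for acquiring the surface data of an object divide into contact and non-contact measurement. The representative contact device is the coordinate measuring machine (CMM). Because a contact device touches the object, it inevitably deforms it, so the measurement error is relatively large. Non-contact methods apply optical and laser principles (laser scanning or optical scanning) and have no error from force-induced deformation. Most practical non-contact instruments use structured-light illumination: a projector emits a structured illumination beam and a receiver captures the light returned by the measured 3D surface. Because the 3D surface shape modulates the structured beam in space or time, the 3D shape data can be demodulated from the observed light field by a suitable method.

The data captured by a 3D laser scanner is a large set of 3D point coordinates. Because the number of points is so large, the data is vividly called point cloud data. The acquisition procedure is generally as follows. Connect the instrument to power and to the computer, switch it on, and open the data recognition and processing software. Set up fixed reference targets, enable the scanner's automatic coordinate-system recognition, and establish the 3D coordinate system. Within the current coordinate system, digitize the objects in range and build the 3D graphics. After one capture, move the instrument, re-identify the current coordinates from the fixed reference targets, capture again as many times as needed, and let the software merge the data spatially. Preprocess the scanned point cloud data, including segmenting, trimming, translating, rotating, and scaling the model. Finally, convert the model data through an open digital interface so that it is compatible with, and can be shared with, downstream 3D design and development software.

Point cloud data can also be acquired with an ATOS scanner. The ATOS 3D scanner is an optical 3D scanner with two CCD cameras and a central projection unit. The projection unit contains a white projection lamp and a complex grating that slides in regular steps. The sensor is mounted on a tripod and can be rotated conveniently about four axes. During measurement, the lamp projects the regularly varying grating onto the workpiece surface, producing moiré fringes. The CCD cameras record the changes in the fringes and send them to the computer, and processing yields two "3D" photographs, one from each CCD camera. Because the two CCD cameras can sense up to 440,000 pixels, each single image can capture 13,000 valid data points. The ATOS software processes these 13,000 points almost instantly and determines their 3D coordinates accurately.

During ATOS measurement, that is, during point cloud acquisition, errors are unavoidable, and once they accumulate beyond a certain level the accuracy requirement can no longer be met. The correct measurement order is therefore to start in the middle and extend gradually outward, which gives the smallest error.

Characteristics of point cloud acquisition work:

(1) Multiple views. Because of their limited measurement range, data acquisition systems measure point clouds one view at a time. A single view usually does not contain all the required points of the workpiece, so the complete point cloud has to be stitched together from the views of several measurements. Measurement systems generally offer automatic and manual stitching, with corresponding software. Automatic stitching can take place during measurement: the ATOS optical 3D scanning system, for example, stitches two adjacent views using their common reference points, or stitches single views containing specific reference points against a global reference-point set generated by digital camera positioning and TRITOP software processing. It can also take place after measurement using the surface features of the workpiece, for example the automatic stitching of multiple views in Geomagic software.

(2) Diversity of workpieces. The shape and size of a workpiece, and the differing accuracy required of its different parts, determine which measurement strategies and means are needed. The parts that usually require geometric reverse engineering, such as automotive sheet metal, casting exteriors, and injection-molded parts, differ in shape because they differ in function. Their model materials also differ (clay, plastic, steel, glass, sponge, or rubber), so their surface condition, function, and complexity differ as well.

(3) Variability of acquisition requirements. Before acquisition, it must be clear what the data is for, and which parts of the product the data relates to or which parts of which components serve as spatial reference points, in order to decide whether to capture a component on its own or within the product assembly together with related components. Settling this first reduces unnecessary rework. Different parts of a workpiece carry different accuracy requirements; when capturing an engine block, for example, the dimensional accuracy required of the locating holes is high. Workpieces made of deformable materials cannot be measured by contact methods and require non-contact measurement.

5. Foreign-Language Literature (Translation)

The Microsoft KINECT: A Novel Tool for Psycholinguistic Research

Rinus G. Verdonschot (1), Héloïse Guillemaud (2), Hobitiana Rabenarivo (2), Katsuo Tamaoka (3)
1. Waseda Institute for Advanced Study, Waseda University, Tokyo, Japan
2. Graduate School of Engineering, Nagoya University, Nagoya, Japan
3. Graduate School of Languages and Cultures, Nagoya University, Nagoya, Japan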

Received 29 May 2015; accepted 26 June 2015; published 30 June 2015

Abstract

The Microsoft KINECT is a 3D sensing device originally developed for the XBOX. The Microsoft KINECT opens up many exciting new opportunities for conducting experimental research on human behavior. We investigated some of these possibilities within the field of psycholinguistics (specifically: language production) by creating software, using C#, allowing for the KINECT to be used in a typical psycholinguistic experimental setting. The results of a naming experiment using this software confirmed that the KINECT was able to measure the effects of a robust psycholinguistic variable (word frequency) on naming latencies. However, although the current version of the software is able to measure psycholinguistic variables of interest, we also discuss several points where the software can still stand to be improved. The main aim of this paper is to make the software freely available for assessment and use by the psycholinguistic community and to illustrate the KINECT as a potentially valuable tool for investigating human behavior, especially in the field of psycholinguistics.

Keywords

Language Production, Psycholinguistics, KINECT, Psychological Research Tool

Introduction

The way we interact with technology is rapidly changing. While we were once limited to keyboards and point-and-click devices, we can now interact with technology using our whole body. The rapidly decreasing cost of 3D sensing technologies (such as the Microsoft KINECT) even allows us to interact with technology through facial expressions and voice information. Although this technology offers exciting new opportunities for experimental research on human behavior, the actual implementation of these novel technologies is still in its infancy.

This paper highlights a potentially important role for KINECT technology in a particular area concerning the study of human behavior, namely language production (a subfield of psycholinguistics). This paper is structured as follows: First, we provide a brief background on the existing research and theoretical models of language production, and summarize how dependent variables (such as naming latencies and accuracy) are usually obtained. Second, we introduce several important features of the KINECT sensor and review their potential applications within experimental psycholinguistic research. Subsequently, we discuss the C# software developed by our lab (all code freely downloadable), which implements the KINECT device to an experimental paradigm by depicting a characteristic experimental situation found in psycholinguistics. Next, we present experimental data within a genuine experimental setting by testing 34 participants on a word-frequency paradigm by using the KINECT and validate this data by using an established method in the field (i.e., by voice key). Finally, we point out particular shortcomings of the current version of the software and avenues for resolving these shortcomings and implementing the KINECT in future research, both on language production and in general.

1. Short Background on Language Production Research

Although the KINECT offers advancements for behavioral research in many fields, this paper focuses on how the KINECT can benefit research on language production (a part of experimental psycholinguistics). Within the language production literature, there are several theoretical models that describe the way speech is produced: starting from ideas in our head and ending with the actual pronunciation of words (e.g. Dell, 1986; Levelt, Roelofs, & Meyer, 1999). Most of the experimental data supporting these models comes from chronometric research (i.e. measuring reaction time latencies) using basic "triggering devices" such as buttons and voice keys (i.e. electronic circuits initiating a pulse if an input volume crosses a certain threshold). Typical experimental paradigms used in language production research either show a particular stimulus on the screen or present a stimulus auditorily and wait for the participant to name a particular target out loud. The time it takes from seeing (or hearing) the stimulus to naming it out loud is called the reaction time (RT) and serves as the main dependent variable together with the accuracy of the response.

However, classic lab equipment such as voice keys only capture RTs for the onset of a single word at a time, and the difference between speech and other (irrelevant) sounds (e.g. coughing) cannot be distinguished without time-consuming post-hoc (or online) manual response checking (although there is freely available software which substantially eases and optimizes this task, such as CheckVocal; Protopapas, 2007). This is because voice key triggering will simply occur if the input volume crosses a certain threshold. Additionally, data will usually be lost if the voice input does not exceed that threshold (e.g. when a participant speaks softly). Moreover, voice keys have no semantic capabilities, which again instigates a need for manual response checking. Finally, some questions have arisen about the reliability of voice keys. For example, when speaking, even after phonemes are produced it may take the voice key varying amounts of time to detect them, since some sounds take more or less time to initiate (e.g. /z/ versus /p/; see Kessler, Treiman, & Mullennix, 2002; Sakuma, Fushimi, & Tatsumi, 1997). It is therefore reasonable to state that paradigms found in experimental psycholinguistics can be limited by particular aspects of experimental equipment.

2. The Microsoft KINECT Device

In contrast to devices designed to be implemented for scientific use only, the KINECT is a device (costing roughly 200 USD) developed by Microsoft to be used with video games (e.g. on XBOX and Windows). The KINECT enables users to interact with a computer via gestures and voice commands.

The KINECT (v1) contains an infrared (IR) emitter and IR depth sensor (640 × 480 pixels) for 3D tracking, an RGB camera (1280 × 960 pixels) to acquire high-quality RGB color video (both the IR depth sensor and the RGB camera operate at 30 fps) and a microphone array, which contains four microphones for capturing sound. The IR emitter emits infrared light in a predetermined "speckle pattern" (which are in fact small dots of infrared light that fall on everything in front of the KINECT camera). The IR depth sensor perceives these patterns and determines depth by looking at the displacement of specific dot patterns (e.g. on objects close to the KINECT the dot pattern will be spread out, but on far objects the dot pattern will be much denser). Additionally, as there are four microphones, it is possible to accurately retrieve the spatial location of the sound source (e.g. a person speaking), as well as being able to record what is spoken. Furthermore, by using an accelerometer it is possible to determine the current orientation of the KINECT, and the integrated tilt motor can be used to track objects or people within the room.

For research in language production, one particularly important feature of KINECT is its ability to track the human face. Microsoft has made a so-called Software Development Kit (SDK; current version for KINECT v1 is 1.8) available which contains numerous programming routines to track a human face in real time. This SDK can measure roughly 100 points (including so-called "hidden points") resulting in real-time face-tracking. Thus, the KINECT is able to build a detailed model of the human face, called a face mesh, using sets of triangles and lines.

3. Opportunities Offered by the KINECT for Research in Psycholinguistics

Naturally, the most important issue for researchers is how the KINECT can contribute to their research. The following list, though incomplete, offers five potential ways we believe the KINECT could advance language production research:

1) The KINECT can track lip movements in real-time, allowing researchers to obtain detailed information on the speech planning process even before actual speech sounds are uttered. By focusing on the distances between particular points on the lips and face, in combination with the speech recognition pack (found in the SDK), it is possible to determine the onset and offset of individual words. In this paper we report our preliminary efforts to build a novel program that detects the beginning and end of individual words by tracking lip movements.
2) Another exciting feature is that the KINECT is able to track more than one person over time, which would allow for language experiments to take place in a more natural, conversational setting.
3) The KINECT has the potential to perform basic eye tracking, allowing researchers to assess approximately where participants are looking on a screen. Experimental paradigms may benefit from these additional behavioral measures, which could indicate, for instance, whether participants are engaged in the task at hand, and, if so, which parts of the screen they are mainly fixating on.
4) The KINECT comes with advanced voice recognition (including language packs for many major languages), allowing for automatic post-hoc accuracy checking (see examples in the SDK provided by Microsoft).
5) It has been shown that the KINECT is able to track and interpret body gestures and basic emotions on-line, allowing for another dimension to be added to the dependent measures in a psycholinguistic experiment.

4. A First Attempt to Implement the KINECT into the Area of Language Production

As far as we know, there is no previous language production literature that utilizes the KINECT. This paper therefore represents the first attempt in this field to integrate the KINECT into the daily practice of a psycholinguistics lab. In this paper we focus on implementing the first of the five abovementioned points, that is, the tracking of lip movements in real-time to gather information on the speech planning process.

As there are no previous instances for comparison (again, as far as we know), we set out to program a working version of the KINECT software (using C#) to display experimental stimuli and measure a psycholinguistic variable of interest. We aim to keep the code open and freely available for other researchers to use and adapt to their own insights. Obviously, when running the program, an attached KINECT for Windows, including the SDK, is required (and Visual Studio is needed when adapting the code). To accommodate those who do not have this setup we provide a short video demonstrating the program online. Furthermore, the program (executable and source code) is provided. Notice that we provide the complete working directory in this file to have everything available to experienced programmers (for those who simply want to run the program the executable can be found in /bin/x86/debug/FaceTrackingBasicsWPF.exe). The KINECT SDK v1.8 needs to be installed as well.

Although the KINECT is able to track more than one person, in this initial stage of program development, only a single person is tracked during an experiment. The current version of the program is able to:

1) Randomly display a word (taken from an Excel file) to a participant.
2) Use the KINECT to determine the visual on- and offset of the word relative to its initial presentation (i.e. lip/face points) in real-time.
3) Use the KINECT to detect the auditory on- and offset of the word relative to its
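
The background section describes a voice key as a trigger that fires when the input volume crosses a threshold, and the program features listed above report word on- and offsets. The excerpt does not show how the authors' C# software computes these, so the following Python sketch is not their method. It is a minimal illustration of threshold-based onset and offset detection on a sampled signal, which could be a volume envelope or a lip-aperture distance. The function name `detect_onset_offset`, the `min_run` debounce parameter, and the sample values are all hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple


def _run_holds(
    samples: Sequence[float],
    start: int,
    length: int,
    predicate: Callable[[float], bool],
) -> bool:
    """True if `length` consecutive samples from `start` all satisfy predicate."""
    window = samples[start:start + length]
    return len(window) == length and all(predicate(v) for v in window)


def detect_onset_offset(
    samples: Sequence[float],
    sample_rate_hz: float,
    threshold: float,
    min_run: int = 3,
) -> Optional[Tuple[float, float]]:
    """Return (onset_ms, offset_ms) of the first supra-threshold segment.

    Onset: first index where `min_run` consecutive samples are >= threshold.
    Offset: first later index where `min_run` consecutive samples are
    < threshold (or the end of the recording if that never happens).
    Times are measured from the first sample, assumed here to coincide
    with stimulus presentation. Returns None if no onset is found.
    """
    if sample_rate_hz <= 0 or min_run < 1:
        raise ValueError("sample_rate_hz must be > 0 and min_run >= 1")
    ms_per_sample = 1000.0 / sample_rate_hz

    onset = None
    for i in range(len(samples)):
        if _run_holds(samples, i, min_run, lambda v: v >= threshold):
            onset = i
            break
    if onset is None:
        return None

    offset = len(samples)
    for j in range(onset + min_run, len(samples)):
        if _run_holds(samples, j, min_run, lambda v: v < threshold):
            offset = j
            break
    return onset * ms_per_sample, offset * ms_per_sample


if __name__ == "__main__":
    # Hypothetical envelope sampled at 100 Hz (10 ms per sample).
    envelope = [0.0, 0.0, 0.1, 0.9, 0.8, 0.7, 0.9, 0.1, 0.0, 0.0, 0.0]
    print(detect_onset_offset(envelope, 100.0, threshold=0.5, min_run=2))
```

Requiring several consecutive samples above the threshold is one simple way to ignore a single brief spike; it does not address the deeper limitations the paper raises, such as soft speech that never reaches the threshold or sounds that are not speech at all.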
