




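Point Cloud Acquisition System Based on KINECT

1. Purpose and Significance of the Research
As the hardware and software for 3D laser scanning have matured and become widely accessible, 3D point cloud acquisition has become an increasingly active research topic, with applications in computer animation, medical image processing, cultural heritage preservation, terrain surveying, game development, and digital media creation. Point cloud acquisition is a recent focus of development in data acquisition because it offers advantages that conventional measurement techniques lack: it is fast, penetrating, contactless, real-time, dynamic, active, fully digital, high-density, and efficient. It is therefore used in many fields and has broad prospects and strong application demand. In recent years 3D laser scanning equipment has improved greatly in efficiency, accuracy, and price, and user demand has made 3D point cloud acquisition a research priority.
In actual measurement, because light travels in straight lines, a 3D laser scanner at a single viewpoint often has blind spots in certain regions or on the back of an object with a complex shape, so measurements from several viewpoints are needed to capture complete model data. Because the measurement range of a scanner is limited, a large object or a large scene cannot be measured completely in one pass and must be measured in blocks. As a result, the measurements usually consist of several noisy point clouds in different coordinate systems, which cannot fully meet the demand for realism and real-time performance in digital models. Research on 3D point cloud acquisition is therefore significant.

2. Research Status in China and Abroad
In China, the government and research institutions have begun to pay close attention to big data. The Internet of Things "Twelfth Five-Year" plan issued by the Ministry of Industry and Information Technology lists information processing technology as one of four key technology innovation projects.
During 3D data acquisition, many uncertain factors introduce noise, and the influence of this noise should be reduced as far as possible when the data are used. Some researchers handle it by filtering. Chen Xiaoxia et al. screened data points in two steps: they first preprocessed the data with density clustering and then presented the result through the real-time interactive display of VRML. Zhang Yi et al. combined the K-neighborhood with a kernel function to remove noise from point cloud data.
In current research on 3D point cloud registration, the most widely used algorithm is ICP. Since Besl and McKay proposed it in 1992, ICP has been widely applied in automatic registration. However, the traditional ICP algorithm is inefficient, depends heavily on the initial value, and easily converges to a local optimum, so many researchers have improved it. Hans Martin Kjer et al. used curvature-based sampling to speed up registration; Sun Qian et al. used normal-vector inner-product weighting, although manual factors affected the final registration accuracy and efficiency to some degree; He Yongxing et al. proposed a registration method based on neighborhood features; Jiang Chengcheng et al. improved ICP using Delaunay triangulation.
To make the complexity of a data model fit limited computing resources, the model must be simplified. Methods for simplifying 3D point cloud data fall into two main classes: one partitions the point cloud topologically and simplifies it using the topological relations; the other selects representative points according to feature information. Curvature-based simplification is a typical method of the second class. Abroad, Martin et al. proposed the uniform grid method in 1997, but this method often deletes feature data by mistake; Chen et al. simplified according to normal vectors, but their method places restrictive requirements on the point cloud data; the improved method of Lee et al. helps preserve surface features but is too time-consuming. In China, Zhang Liyan et al. also used normal vectors to simplify point clouds; the method preserves object features well, but some experimental parameters are chosen empirically, which makes it hard to apply in practice. Zhu Maomao et al. improved on this with a second simplification pass, which gives a more reasonable result. The clustering-based simplification algorithm of Shi Baoquan et al. preserves point cloud features well but increases the amount of computation. The hybrid algorithm of Du Xiaohui et al. simplifies efficiently, though its time performance is slightly worse.

3. Proposed Research Route
(1) Point cloud data acquisition. This paper proposes a design for a point cloud acquisition system based on Kinect. With the Kinect at its core, the design uses the generic acquisition interface provided by PCL (Point Cloud Library), a modular C++ template library, to obtain 3D information in real-world coordinates directly. The 3D coordinates are saved as point cloud data, which speeds up acquisition. In this design, one computer on the local area network is responsible for acquiring the Kinect point cloud data. It converts the captured depth image information into the 3D coordinates of points in real space, relying on the depth imaging principle of the Kinect, and uses the OpenNI open natural interaction framework to grab point cloud data from the Kinect device.
(2) Point cloud denoising. Owing to human disturbance, lighting, and defects of the scanning device itself, the acquired data are contaminated by noise and must be denoised. Depending on how the noise diffuses in different directions, either isotropic or anisotropic algorithms can be used.
(3) Parametric representation of point cloud data. Triangular mesh parameterization maps the data points of the original model onto a given parameter domain. This establishes a correspondence G → Q between the point cloud data G and a new point set Q on the parameter domain, with the requirement that geometric distortion be minimal in some sense.
(4) Visualization of point cloud data. Based on the geometric features of the 3D model, a multi-resolution generation algorithm for complex 3D models based on triangle simplification is proposed. It produces a multi-scale 3D point cloud data structure, and a corresponding multi-resolution texture feature model is built alongside it. Once geometry and texture scales have been associated, an R+-tree index is used to store the 3D model in blocks, giving a static multi-level (LOD) block data structure for the 3D model. During 3D scene browsing, based on client-side view-range clipping and network transmission efficiency, the server quickly retrieves the corresponding static LOD model data and adaptively transmits it in blocks to the client, which generates in real time a dynamic LOD 3D model that meets visual requirements and gives the best visualization.

4. Literature Review
Kinect is a motion-sensing peripheral developed by Microsoft. It was originally released as an accessory for the Xbox 360 game console, and its use was limited to gaming. Because of its technology and the creativity of the device itself, within two years of release it began to be applied in many other fields. With the recent release of Kinect for Windows, a development device for the Windows platform, artificial intelligence scientists, human-computer interaction and motion-sensing engineers, and research groups around the world, especially outside China, are exploring and developing Kinect applications.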
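The acquisition step in item (1) of the research route turns each depth pixel into a 3D coordinate in real space. In the proposed system this conversion is handled by PCL and OpenNI; the sketch below only illustrates the underlying idea in plain Python, assuming a standard pinhole camera model. The function name `depth_to_point_cloud` and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for this illustration, and the numeric values are placeholders rather than calibrated Kinect values.

```python
def depth_to_point_cloud(depth_m, fx, fy, cx, cy):
    """Back-project a depth image into a list of (x, y, z) points.

    depth_m: rows of depth values in metres; 0 (or less) marks a missing reading.
    fx, fy:  focal lengths in pixels; cx, cy: principal point in pixels.
    These intrinsics are assumed inputs; real values come from calibration.
    """
    points = []
    for v, row in enumerate(depth_m):        # v = pixel row
        for u, z in enumerate(row):          # u = pixel column
            if z <= 0:                       # skip pixels with no depth
                continue
            x = (u - cx) * z / fx            # pinhole model, horizontal axis
            y = (v - cy) * z / fy            # pinhole model, vertical axis
            points.append((x, y, z))
    return points


# Tiny 2 x 2 depth image with one missing pixel (illustrative values only).
depth = [[0.0, 2.0],
         [2.0, 2.0]]
cloud = depth_to_point_cloud(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(len(cloud))   # 3 valid points
print(cloud[0])     # (1.0, -1.0, 2.0)
```

Dropping pixels with zero depth at this stage keeps obviously invalid readings out of the cloud before the denoising step in item (2).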
The original development codename of Kinect was Natal; it was later officially renamed Kinect. Kinect grew out of advanced research at Microsoft and is a by-product of the company's efforts to solve complex problems with artificial intelligence. Beyond gaming, where motion-sensing devices are already common, Kinect applications and experimental applications are developing quickly. Existing applications in several fields illustrate its scope.
(1) Virtual applications. The European fashion retailer Topshop installed a new kind of fitting room in its Moscow flagship store. This virtual fitting room combines two advanced technologies, augmented reality (AR) and the Microsoft Kinect, so that customers can see a realistic fitting result without trying the clothes on.
(2) 3D modeling and sculpture tool. Experiments by several groups abroad show that multiple Kinects can serve as 3D cameras for 3D modeling. In a street experiment by a group called Blablab LAB, three Kinects were used to scan and model passers-by, and a RepRap 3D printer then produced a miniature sculpture of each.
(3) Mechanical control and remote-controlled robots. A Kinect serves as the head of a robot, sensing the surroundings and building a 3D model to guide the robot's actions. Because robots have a very wide range of applications, low-cost Kinect robots can replace traditional machines for measurement and survey work that does not require high accuracy, in dangerous areas or in harsh environments such as underground or at height.
(4) Virtual experiments and medicine. In medicine, Kinect can support autopsies without hands-on contact: researchers need only gesture in the air or speak to zoom and rotate 3D images.
Depending on the type of measuring probe, methods for acquiring the surface data of an object fall into two classes: contact and non-contact measurement. The representative contact device is the coordinate measuring machine (CMM). Because a contact device touches the measured object, it inevitably deforms it, so the measurement error is larger. Non-contact methods rely on optical and laser principles (laser scanning or optical scanning) and have no error caused by force-induced deformation. Most practical non-contact instruments use structured-light illumination: a projector emits a structured illumination beam, and a receiver captures the light signal returned from the measured 3D surface. Because the 3D surface shape modulates the structured beam in space or time, the 3D shape data can be demodulated from the observed light field by a suitable method.
The data collected by a 3D laser scanner is a large set of 3D point coordinates. Because the number of points is huge, the data is called point cloud data. Point cloud acquisition generally proceeds as follows. Connect the instrument to power and to the computer, switch it on, and start the data recognition and processing software. Set up fixed reference targets and enable automatic recognition of the scanner coordinate system to establish a 3D coordinate system. Within the current coordinate system, digitally capture the objects in range and build a 3D graphic. After one capture, move the instrument, re-identify the current coordinates from the fixed reference targets, capture again, and let the software merge the data in space automatically. Preprocess the scanned point cloud data, including segmenting, trimming, translating, rotating, and scaling the model. Finally, convert the current model data through an open digital interface so that it is compatible with, and can be shared with, downstream 3D design and development software.
Point cloud data can likewise be acquired with an Atos scanner. The Atos 3D scanner is an optical 3D scanner with two CCD cameras and a central projection unit. The projection unit has a white projection lamp and a complex grating that slides in a regular manner. The sensor is mounted on a tripod and can easily be rotated about four axes. During measurement, the lamp projects the regularly changing grating onto the workpiece surface, producing moiré fringes. The changes in the fringes are recorded by the CCD lenses and sent to the computer, which after processing yields two "3D" photos, one from each CCD lens. Because the two CCD lenses can sense up to 440,000 pixels, each single photo can capture 13,000 valid data points. The Atos software can process these 13,000 points almost instantly and determine their 3D coordinates accurately. During Atos measurement, that is, during point cloud acquisition, errors are unavoidable, and if they accumulate beyond a certain level the accuracy requirement can no longer be met. The correct measurement order is therefore to expand gradually from the middle outward, which gives the smallest error.
Characteristics of point cloud data acquisition:
(1) Multiple frames. Because the measurement range is limited, every acquisition system measures point clouds one frame at a time. In general one frame does not contain all the required points on the workpiece, so complete point cloud data has to be stitched together from the frames of several measurements. Stitching in measurement systems is either automatic or manual, and dedicated software exists for it. Automatic stitching may take place during measurement: for example, the ATOS optical 3D scanning system stitches two adjacent frames using their common reference points, or stitches single-frame point cloud data containing specific reference points to global reference point data produced by digital camera positioning and TRITOP software processing. Automatic stitching may also take place after measurement using surface features of the workpiece, as in the automatic stitching of multiple frames in Geomagic software.
(2) Diversity of workpieces. The shape and size of a workpiece, and the differing precision of its parts, determine which measurement strategies and tools are needed. In general, the parts of a workpiece that need geometric reverse engineering, such as automotive sheet metal, cast part exteriors, and injection-molded parts, differ in shape because their functions differ. Their materials also differ: some are sculpted in clay, some are plastic, steel, or glass, and some are sponge or rubber. Surface form, function, and complexity therefore vary widely.
(3) Variability of acquisition requirements. Before acquiring data, one must be clear about the purpose of the data, which parts of the product the data relates to, and which parts of which components serve as spatial reference points, in order to decide whether to acquire data for a component alone or within the product assembly including related components. Settling this before acquisition avoids unnecessary rework. Different parts of a workpiece have different accuracy requirements; for example, when acquiring data for an engine block, the locating holes need higher dimensional accuracy. Workpieces made of soft, deformable materials cannot be measured by contact and require non-contact measurement.

5. Foreign Literature Translation
The Microsoft KINECT: A Novel Tool for Psycholinguistic Research
Rinus G. Verdonschot 1, Héloïse Guillemaud 2, Hobitiana Rabenarivo 2, Katsuo Tamaoka 3
1. Waseda Institute for Advanced Study, Waseda University, Tokyo, Japan
2. Graduate School of Engineering, Nagoya University, Nagoya, Japan
3. Graduate School of Languages and Cultures, Nagoya University, Nagoya, Japan
Received 29 May 2015; accepted 26 June 2015; published 30 June 2015

Abstract
The Microsoft KINECT is a 3D sensing device originally developed for the XBOX. The Microsoft KINECT opens up many exciting new opportunities for conducting experimental research on human behavior. We investigated some of these possibilities within the field of psycholinguistics (specifically: language production) by creating software, using C#, allowing for the KINECT to be used in a typical psycholinguistic experimental setting. The results of a naming experiment using this software confirmed that the KINECT was able to measure the effects of a robust psycholinguistic variable (word frequency) on naming latencies. However, although the current version of the software is able to measure psycholinguistic variables of interest, we also discuss several points where the software can still stand to be improved. The main aim of this paper is to make the software freely available for assessment and use by the psycholinguistic community and to illustrate the KINECT as a potentially valuable tool for investigating human behavior, especially in the field of psycholinguistics.

Keywords
Language Production, Psycholinguistics, KINECT, Psychological Research Tool

Introduction
The way we interact with technology is rapidly changing. While we were once limited to keyboards and point-and-click devices, we can now interact with technology using our whole body. The rapidly decreasing cost of 3D sensing technologies (such as the Microsoft KINECT) even allows us to interact with technology through facial expressions and voice information. Although this technology offers exciting new opportunities for experimental research on human behavior, the actual implementation of these novel technologies is still in its infancy.
This paper highlights a potentially important role for KINECT technology in a particular area concerning the study of human behavior, namely language production (a subfield of psycholinguistics). This paper is structured as follows: First, we provide a brief background on the existing research and theoretical models of language production, and summarize how dependent variables (such as naming latencies and accuracy) are usually obtained. Second, we introduce several important features of the KINECT sensor and review their potential applications within experimental psycholinguistic research. Subsequently, we discuss the C# software developed by our lab (all code freely downloadable), which implements the KINECT device to an experimental paradigm by depicting a characteristic experimental situation found in psycholinguistics. Next, we present experimental data within a genuine experimental setting by testing 34 participants on a word-frequency paradigm by using the KINECT and validate this data by using an established method in the field (i.e., by voice key). Finally, we point out particular shortcomings of the current version of the software and avenues for resolving these shortcomings and implementing the KINECT in future research, both on language production and in general.

1. Short Background on Language Production Research
Although the KINECT offers advancements for behavioral research in many fields, this paper focuses on how the KINECT can benefit research on language production (a part of experimental psycholinguistics). Within the language production literature, there are several theoretical models that describe the way speech is produced: starting from ideas in our head and ending with the actual pronunciation of words (e.g. Dell, 1986; Levelt, Roelofs, & Meyer, 1999). Most of the experimental data supporting these models comes from chronometric research (i.e. measuring reaction time latencies) using basic “triggering devices” such as buttons and voice keys (i.e. electronic circuits initiating a pulse if an input volume crosses a certain threshold). Typical experimental paradigms used in language production research either show a particular stimulus on the screen or present a stimulus auditorily and wait for the participant to name a particular target out loud. The time it takes from seeing (or hearing) the stimulus to naming it out loud is called the reaction time (RT) and serves as the main dependent variable together with the accuracy of the response. However, classic lab equipment such as voice keys only capture RTs for the onset of a single word at a time, and the difference between speech and other (irrelevant) sounds (e.g. coughing) cannot be distinguished without time-consuming post-hoc (or online) manual response checking (although there is freely available software which substantially eases and optimizes this task, such as CheckVocal; Protopapas, 2007). This is because voice key triggering will simply occur if the input volume
crosses a certain threshold. Additionally, data will usually be lost if the voice input does not exceed that threshold (e.g. when a participant speaks softly). Moreover, voice keys have no semantic capabilities, which again creates a need for manual response checking. Finally, some questions have arisen about the reliability of voice keys. For example, when speaking, even after phonemes are produced it may take the voice key varying amounts of time to detect them, since some sounds take more or less time to initiate (e.g. /z/ versus /p/; see Kessler, Treiman, & Mullennix, 2002; Sakuma, Fushimi, & Tatsumi, 1997). It is therefore reasonable to state that paradigms found in experimental psycholinguistics can be limited by particular aspects of experimental equipment.

2. The Microsoft KINECT Device
In contrast to devices designed to be implemented for scientific use only, the KINECT is a device (costing roughly 200 USD) developed by Microsoft to be used with video games (e.g. on XBOX and Windows). The KINECT enables users to interact with a computer via gestures and voice commands.
The KINECT (v1) contains an infrared (IR) emitter and IR depth sensor (640 × 480 pixels) for 3D tracking, an RGB camera (1280 × 960 pixels) to acquire high-quality RGB color video (both the IR depth sensor and the RGB camera operate at 30 fps) and a microphone array, which contains four microphones for capturing sound. The IR emitter emits infrared light in a predetermined “speckle pattern” (in fact small dots of infrared light that fall on everything in front of the KINECT camera). The IR depth sensor perceives these patterns and determines depth by looking at the displacement of specific dot patterns (e.g. on objects close to the KINECT the dot pattern will be spread out, but on far objects the dot pattern will be much denser). Additionally, as there are four microphones, it is possible to accurately retrieve the spatial location of the sound source (e.g. a person speaking), as well as to record what is spoken. Furthermore, by using an accelerometer it is possible to determine the current orientation of the KINECT, and the integrated tilt motor can be used to track objects or people within the room.
For research in language production, one particularly important feature of KINECT is its ability to track the human face. Microsoft has
made a so-called Software Development Kit (SDK; current version for KINECT v1 is 1.8) available, which contains numerous programming routines to track a human face in real time. This SDK can measure roughly 100 points (including so-called “hidden points”), resulting in real-time face tracking. Thus, the KINECT is able to build a detailed model of the human face, called a face mesh, using sets of triangles and lines.

3. Opportunities Offered by the KINECT for Research in Psycholinguistics
Naturally, the most important issue for researchers is how the KINECT can contribute to their research. The following list, though incomplete, offers five potential ways we believe the KINECT could advance language production research:
1) The KINECT can track lip movements in real-time, allowing researchers to obtain detailed information on the speech planning process even before actual speech sounds are uttered. By focusing on the distances between particular points on the lips and face, in combination with the speech recognition pack (found in the SDK), it is possible to determine the onset and offset of individual words. In this paper we report our preliminary efforts to build a novel program that detects the beginning and end of individual words by tracking lip movements.
2) Another exciting feature is that the KINECT is able to track more than one person over time, which would allow for language experiments to take place in a more natural, conversational setting.
3) The KINECT has the potential to perform basic eye tracking, allowing researchers to assess approximately where participants are looking on a screen. Experimental paradigms may benefit from these additional behavioral measures, which could indicate, for instance, whether participants are engaged in the task at hand, and, if so, which parts of the screen they are mainly fixating on.
4) The KINECT comes with advanced voice recognition (including language packs for many major languages), allowing for automatic post-hoc accuracy checking (see examples in the SDK provided by Microsoft).
5) It has been shown that the KINECT is able to track and interpret body gestures and basic emotions online, allowing for another dimension to be added to the dependent measures in a psycholinguistic experiment.

4. A First Attempt to Implement the KINECT into the Area of Language Production
As far as we know, there is no previous language production literature that utilizes the KINECT. This paper therefore represents the first attempt in this field to integrate the KINECT into the daily practice of a psycholinguistics lab. In this paper we focus on implementing the first of the five abovementioned points, that is,
the tracking of lip movements in real-time to gather information on the speech planning process.
As there are no previous instances for comparison (again, as far as we know), we set out to program a working version of the KINECT software (using C#) to display experimental stimuli and measure a psycholinguistic variable of interest. We aim to keep the code open and freely available for other researchers to use and adapt to their own insights. Obviously, when running the program, an attached KINECT for Windows, including the SDK, is required (and Visual Studio is needed when adapting the code). To accommodate those who do not have this setup we provide a short video demonstrating the program online. Furthermore, the program (executable and source code) is provided. Notice that we provide the complete working directory in this file to have everything available to experienced programmers (for those who simply want to run the program the executable can be found in /bin/x86/debug/FaceTrackingBasicsWPF.exe). The KINECT SDK v1.8 needs to be installed as well.
Although the KINECT is able to track more than one person, in this initial stage of program development, only a single person is tracked during an experiment. The current version of the program is able to:
1) Randomly display a word (taken from an Excel file) to a participant.
2) Use the KINECT to determine the visual on- and offset of the word relative to its initial presentation (i.e. lip/face points) in real-time.
3) Use the KINECT to detect the auditory on- and offset of the word relative to its
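
The on- and offset detection in the list above, like the voice key described in Section 1, comes down to deciding when a measured signal crosses a threshold. The following Python sketch shows one simple way this could work; it is not the authors' C# implementation. The function name `detect_onset_offset`, the `min_frames` debounce parameter, and the sample values are all assumptions made for illustration. The signal could stand for a per-frame lip-opening distance or an audio volume, and at the 30 fps stated for the KINECT v1 one frame would last roughly 33 ms.

```python
def detect_onset_offset(signal, frame_ms, threshold, min_frames=2):
    """Return (onset_ms, offset_ms) for the first sustained activity, or None.

    signal:     one measurement per frame (e.g. lip opening or volume).
    frame_ms:   duration of one frame in milliseconds.
    threshold:  values at or above this count as "active".
    min_frames: frames in a row required before a change is accepted
                (hypothetical debounce, so single-frame spikes are ignored).
    """
    def run_start(start, above):
        # Index where `min_frames` consecutive frames first match `above`.
        run = 0
        for i in range(start, len(signal)):
            if (signal[i] >= threshold) == above:
                run += 1
                if run == min_frames:
                    return i - min_frames + 1
            else:
                run = 0
        return None

    onset = run_start(0, above=True)
    if onset is None:
        return None                      # threshold never exceeded for long enough
    offset = run_start(onset, above=False)
    if offset is None:
        offset = len(signal)             # still active when the recording ends
    return onset * frame_ms, offset * frame_ms


# Illustrative data: a one-frame spike at index 2, then a word from frame 4 to 8.
samples = [0, 0, 5, 0, 6, 7, 8, 1, 7, 0, 0, 0]
print(detect_onset_offset(samples, frame_ms=10, threshold=4))  # (40, 90)
```

Requiring several consecutive frames trades a small delay for robustness against brief noise. It does not address the other limitation noted in Section 1, namely that a response below the threshold is missed entirely; in the sketch that case returns `None`.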