Psycholinguistics, Chapter 3: Perception (PPT courseware)
Slide 1: Chapter 3: Speech Perception

Slide 2: Overview of Questions
- Can computers perceive speech as well as humans?
- Why does an unfamiliar foreign language often sound like a continuous stream of sound, with no breaks between words?
- Does each word that we hear have a unique pattern of air pressure changes associated with it?
- Are there specific areas in the brain that are responsible for perceiving speech?

Slide 3
- Speech perception refers to the processes by which humans are able to interpret and understand the sounds used in language.
- The study of speech perception is closely linked to the fields of phonetics and phonology in linguistics, and to cognitive psychology and perception in psychology.

Slide 4
- Research in speech perception seeks to understand how human listeners recognize speech sounds and use this information to understand spoken language.
- Speech research has applications in building computer systems that can recognize speech, as well as in improving speech recognition for hearing- and language-impaired listeners.

Slide 5: Speech Perception
The first step in comprehending spoken language is to identify the words being spoken. This is performed in multiple stages:
1. Phonemes are detected (/b/, /e/, /t/, /e/, /r/)
2. Phonemes are combined into syllables (/be/ /ter/)
3. Syllables are combined into words ("better")
4. Word meaning is retrieved from memory

Slide 6: Spectrogram: "I owe you a yo-yo"

Slide 7: Speech Perception: Two Problems
- Words are not neatly segmented (e.g., by pauses).
- Lack of phoneme invariance:
  - Coarticulation: consecutive speech sounds blend into each other due to mechanical constraints on the articulators.
  - Speaker differences: pitch is affected by age and sex; different dialects, talking speeds, etc.

Slide 8: The Speech Input
- Frequency range: 50-5600 Hz
- Critical band filters
- Dynamic range: 50 dB
- Temporal resolution: 10 ms
- Smallest detectable change in F0: 2 Hz
- Smallest detectable change in F1: 40 Hz; in F2: 100 Hz; in F3: 150 Hz

Slide 9: The Speech Stimulus
- Phoneme: the smallest unit of speech that changes the meaning of a word.
- In English there are 47 phonemes: 23 major vowel sounds and 24 major consonant sounds.
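Looking back at the multi-stage identification on slide 5, the sequence can be sketched as a toy pipeline. All of the lookup tables below are hypothetical stand-ins for perceptual and lexical knowledge, not a model from the chapter:

```python
# Toy sketch of the four stages: phonemes -> syllables -> words -> meaning.
# Every table here is a hypothetical stand-in, not data from the text.

SYLLABLES = {("b", "e"): "be", ("t", "e", "r"): "ter"}  # stage 2 knowledge
WORDS = {("be", "ter"): "better"}                        # stage 3 knowledge
MEANINGS = {"better": "comparative of 'good'"}           # stage 4 lexicon

def identify_word(phonemes):
    """Run detected phonemes through the syllable, word, and meaning stages."""
    syllables, buffer = [], ()
    for p in phonemes:                   # stage 2: group phonemes into syllables
        buffer += (p,)
        if buffer in SYLLABLES:
            syllables.append(SYLLABLES[buffer])
            buffer = ()
    word = WORDS.get(tuple(syllables))   # stage 3: combine syllables into a word
    return word, MEANINGS.get(word)      # stage 4: retrieve meaning from memory

print(identify_word(["b", "e", "t", "e", "r"]))  # ('better', "comparative of 'good'")
```

A real perceptual system faces the segmentation and invariance problems from slide 7; this sketch sidesteps both by assuming the phonemes arrive cleanly detected.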

- The number of phonemes in other languages varies: 11 in Hawaiian and 60 in some African dialects.

Slide 10: Table 13.1 Major consonants and vowels of English and their phonetic symbols.

Slide 11: The Acoustic Signal
- Produced by air that is pushed up from the lungs through the vocal cords and into the vocal tract.
- Vowels are produced by vibration of the vocal cords and changes in the shape of the vocal tract.
- These changes in shape cause changes in the resonant frequency and produce peaks in pressure at a number of frequencies called formants.

Slide 12: Figure 13.1 The vocal tract includes the nasal and oral cavities and the pharynx, as well as components that move, such as the tongue, lips, and vocal cords.

Slide 13: The Acoustic Signal (continued)
- The first formant has the lowest frequency, the second the next highest, and so on.
- Sound spectrograms show the changes in frequency and intensity for speech.
- Consonants are produced by a constriction of the vocal tract.
- Formant transitions: rapid changes in frequency preceding or following consonants.

Slide 14: Figure 13.3 Spectrogram of the word "had" showing the first (F1), second (F2), and third (F3) formants for the vowel /ae/. (Spectrogram courtesy of Kerry Green.)

Slide 15: Figure 13.4 Spectrogram of the sentence "Roy read the will," showing formants such as F1, F2, and F3, and formant transitions such as T2 and T3. (Spectrogram courtesy of Kerry Green.)

Slide 16: The Relationship Between the Speech Stimulus and Speech Perception
- The segmentation problem: there are no physical breaks in the continuous acoustic signal, so how do we segment the individual words?
- The variability problem: there is no simple correspondence between the acoustic signal and individual phonemes.
  - Variability from a phoneme's context.
  - Coarticulation: overlap between the articulation of neighboring phonemes.

Slide 17: Figure 13.5 Spectrogram of "I owe you a yo-yo." This spectrogram does not contain pauses or breaks that correspond to the words that we hear. The absence of breaks in the acoustic signal creates the segmentation problem. (Spectrogram courtesy of David Pisoni.)

Slide 18: Figure 13.6 Hand-drawn spectrograms for /di/ and /du/. (From "Perception of the Speech Code," by A. M. Liberman, 1967, Psychological Review, 74, 431-461, Figure 1. Copyright 1967 by the American Psychological Association. Reprinted by permission of the author.)
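Spectrograms like those in Figures 13.5 and 13.6 are built by short-time Fourier analysis: slice the waveform into brief windows and take the magnitude spectrum of each slice. Here is a minimal NumPy sketch; the window and hop sizes are illustrative choices, not the values used to produce the book's figures:

```python
# Minimal spectrogram: overlapping short windows, magnitude spectrum of each.
import numpy as np

def spectrogram(signal, sample_rate, window_ms=25.0, hop_ms=5.0):
    """Short-time magnitude spectra of a waveform, one row per time step."""
    win = int(sample_rate * window_ms / 1000)   # samples per analysis window
    hop = int(sample_rate * hop_ms / 1000)      # how far each window is displaced
    taper = np.hanning(win)                     # tapering reduces spectral leakage
    frames = [np.abs(np.fft.rfft(signal[i:i + win] * taper))
              for i in range(0, len(signal) - win + 1, hop)]
    return np.array(frames)                     # shape: (time steps, frequency bins)

# A steady 200 Hz tone should show one dominant frequency band over time.
sr = 8000
t = np.arange(sr) / sr                          # 1 second of samples
spec = spectrogram(np.sin(2 * np.pi * 200 * t), sr)
bin_hz = sr / int(sr * 0.025)                   # 25 ms window -> 40 Hz per bin
print(np.argmax(spec[0]) * bin_hz)              # strongest bin lies at 200.0 Hz
```

The window length trades time resolution against frequency resolution, which is why real speech analyses distinguish wideband from narrowband spectrograms.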

Slide 19: The Relationship Between the Speech Stimulus and Speech Perception (continued)
- Variability from different speakers: speakers differ in pitch, accent, speed of speaking, and pronunciation.
- This acoustic signal must be transformed into familiar words.
- People perceive speech easily in spite of the segmentation and variability problems.

Slide 20: Figure 13.7 (a) Spectrogram of "What are you doing?" pronounced slowly and distinctly. (b) Spectrogram of "What are you doing?" as pronounced in conversational speech. (Spectrogram courtesy of David Pisoni.)

Slide 21: Stimulus Dimensions of Speech Perception
- Invariant acoustic cues: features of phonemes that remain constant.
- Short-term spectrograms are used to investigate invariant acoustic cues.
- A sequence of short-term spectra can be combined to create a running spectral display.
- From these displays, some invariant cues have been discovered.

Slide 22: Figure 13.8 Left: a short-term spectrum of the acoustic energy in the first 26 ms of the phoneme /ga/. Right: sound spectrogram of the same phoneme. The sound for the first 26 ms is indicated in red. The peak in the short-term spectrum, marked A, corresponds to the dark band of energy, marked A, in the spectrogram. The minimum in the short-term spectrum, marked B, corresponds to the light area, marked B, in the spectrogram. The spectrogram on the right shows the energy for the entire 500 ms duration of the sound, whereas the short-term spectrum shows only the first 26 ms at the beginning of this signal. (Courtesy of James Sawusch.)

Slide 23: Figure 13.9 Running spectral displays for /pi/ and /da/. These displays are made up of a sequence of short-term spectra, like the one in Figure 13.8. Each of these spectra is displaced 5 ms on the time axis, so that each step we move along this axis indicates the frequencies present in the next 5 ms. The low-frequency peak (V) in the /da/ display is a cue for voicing. (From "Time-Varying Features of Initial Stop Consonants in Auditory Running Spectra: A First Report," by D. Kewley-Port and P. A. Luce, 1984, Perception and Psychophysics, 35, 353-360, Figure 1. Copyright 1984 by Psychonomic Society Publications. Reprinted by permission.)

Slide 24: Categorical Perception
- Categorical perception occurs when a wide range of acoustic cues results in the perception of a limited number of sound categories.
- An example comes from experiments on voice onset time (VOT): the time delay between when a sound starts and when voicing begins.
- The stimuli are /da/ (VOT of 17 ms) and /ta/ (VOT of 91 ms).
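The /da/-/ta/ identification along a VOT continuum can be caricatured in a few lines. The boundary value below is an assumed illustration (somewhere between the 17 ms and 91 ms endpoints), not a measured constant:

```python
# Toy sketch of categorical perception along the VOT continuum.
# PHONETIC_BOUNDARY_MS is a hypothetical boundary chosen for illustration.

PHONETIC_BOUNDARY_MS = 35.0

def identify(vot_ms):
    """Category label for a stimulus with the given voice onset time."""
    return "da" if vot_ms < PHONETIC_BOUNDARY_MS else "ta"

def discriminate(vot_a, vot_b):
    """Categorical listeners tend to judge 'different' only across the boundary."""
    return "different" if identify(vot_a) != identify(vot_b) else "same"

# A continuum of VOTs is not heard as gradual change, but as a sudden flip:
print([identify(v) for v in range(10, 100, 20)])  # ['da', 'da', 'ta', 'ta', 'ta']
print(discriminate(40, 80))   # same side of the boundary -> 'same'
print(discriminate(20, 50))   # opposite sides            -> 'different'
```

Real identification curves are steep but not perfectly step-like; the hard threshold here is the idealization that categorical perception approximates.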

Slide 25: Categorical Perception (continued)
- Computers were used to create stimuli with a range of VOTs from long to short.
- Listeners do not hear the incremental changes; instead they hear a sudden change from /da/ to /ta/ at the phonetic boundary.
- Thus, we experience perceptual constancy for the phonemes within a given range of VOT.

Slide 26: Figure 13.10 Spectrograms for /da/ and /ta/. The voice onset time (the time between the beginning of the sound and the onset of voicing) is indicated at the beginning of the spectrogram for each sound. (Spectrogram courtesy of Ron Cole.)

Slide 27: Figure 13.11 The results of a categorical perception experiment indicate that /da/ is perceived for VOTs to the left of the phonetic boundary, and that /ta/ is perceived for VOTs to the right of the phonetic boundary. (From "Selective Adaptation of Linguistic Feature Detectors," by P. Eimas and J. D. Corbit, 1973, Cognitive Psychology, 4, 99-109, Figure 2. Copyright 1973 Academic Press, Inc. Reprinted by permission.)

Slide 28: Figure 13.12 In the discrimination part of a categorical perception experiment, two stimuli are presented, and the listener indicates whether they are the same or different. The typical result is that two stimuli with VOTs on the same side of the phonetic boundary (solid arrows) are judged to be the same, and that two stimuli on different sides of the phonetic boundary (dashed arrows) are judged to be different.

Slide 29: Figure 13.13 Perceptual constancy occurs when all stimuli on one side of the phonetic boundary are perceived to be in the same category even though their VOT is changed over a substantial range. This diagram symbolizes the constancy observed in the Eimas and Corbit (1973) experiment, in which /da/ was heard on one side of the boundary and /ta/ on the other side.

Slide 30: Speech Perception Is Multimodal
- Auditory-visual speech perception: the McGurk effect.
- The visual stimulus shows a speaker saying "ga-ga".
- The auditory stimulus has a speaker saying "ba-ba".
- An observer watching and listening hears "da-da", which is the midpoint between "ga" and "ba".
- An observer with eyes closed will hear "ba".

Slide 31: McGurk Effect

Slide 32: Figure 13.14 The McGurk effect. The woman's lips are moving as if she is saying /ga-ga/, but the actual sound being presented is /ba-ba/. The listener, however, reports hearing the sound /da-da/. If the listener closes his eyes, so that he no longer sees the woman's lips, he hears /ba-ba/. Thus, seeing the lips moving influences what the listener hears.

Slide 33: Cognitive Dimensions of Speech Perception
- Top-down processing, including the knowledge a listener has about a language, affects perception of the incoming speech stimulus.
- Segmentation is affected by context and meaning: "I scream, you scream, we all scream for ice cream."

Slide 34: Figure 13.15 Speech perception is the result of top-down processing (based on knowledge and meaning) and bottom-up processing (based on the acoustic signal) working together.

Slide 35: Meaning and Phoneme Perception
- Experiment by Turvey and Van Gelder: short words (sin, bat, and leg) and short nonwords (jum, baf, and teg) were presented to listeners.
- The task was to press a button as quickly as possible upon hearing a target phoneme.
- On average, listeners were faster with words (580 ms) than with nonwords (631 ms).

Slide 36: Meaning and Phoneme Perception (continued)
- Experiment by Warren: listeners heard a sentence that had a phoneme covered by a cough.
- The task was to state where in the sentence the cough occurred.
- Listeners could not correctly identify the position, and they also did not notice that a phoneme was missing: the phonemic restoration effect.

Slide 37: Phonemic Restoration

  Auditory presentation                            Perception
  legislature                                      legislature
  legi_lature                                      legi lature
  legi*lature                                      legislature
  It was found that the *eel was on the axle.      wheel
  It was found that the *eel was on the shoe.      heel
  It was found that the *eel was on the orange.    peel
  It was found that the *eel was on the table.     meal

(Warren, R. M. (1970). Perceptual restorations of missing speech sounds. Science, 167, 392-393.)

Slide 38: Meaning and Word Perception
- Experiment by Miller and Isard: stimuli were three types of sentences:
  - normal grammatical sentences
  - anomalous sentences that were grammatical
  - ungrammatical strings of words
- Listeners were to shadow (repeat aloud) the sentences as they heard them through headphones.

Slide 39: Meaning and Word Perception (continued)
- Results showed that listeners were 89% accurate with normal sentences, 79% accurate with anomalous sentences, and 56% accurate with ungrammatical word strings.
- The differences were even larger if background noise was present.

Slide 40: Speaker Characteristics
- Indexical characteristics: characteristics of the speaker's voice such as age, gender, emotional state, level of seriousness, etc.
- Experiment by Palmeri, Goldinger, and Pisoni: listeners were to indicate when a word was new in a sequence of words.
- Results showed that they were much faster if the same speaker was used for all the words.

Slide 41: Speech Perception and the Brain
- Broca's aphasia: individuals have damage in Broca's area (in the frontal lobe); their speech is labored and stilted, with short sentences, but they understand others.
- Wernicke's aphasia: individuals have damage in Wernicke's area (in the temporal lobe); they speak fluently, but the content is disorganized and not meaningful, and they also have difficulty understanding others.

Slide 42: Figure 13.16 Broca's and Wernicke's areas, which are specialized for language production and comprehension, are located in the left hemisphere of the brain in most people.

Slide 43: Speech Perception and the Brain (continued)
- Measurements from cat auditory fibers show that the pattern of firing mirrors the energy distribution in the auditory signal.
- Brain scans of humans show that there are areas of the human "what" stream that are selectively activated by the human voice.

Slide 44: Figure 13.17 (a) Short-term spectrum for /da/. This curve indicates the energy distribution in /da/ between 20 and 40 ms after the beginning of the signal. (b) Nerve firing of a population of cat auditory nerve fibers to the same stimulus. (From "Encoding of Speech Features in the Auditory Nerve," by M. B. Sachs, E. D. Young, and M. I. Miller, 1981. In R. Carlson and B. Granstrom (Eds.), The Representation of Speech in the Peripheral Auditory System, pp. 115-130. Copyright 1981 by Elsevier Science Publishing, New York. Reprinted by permission.)

Slide 45: Experience-Dependent Plasticity
- Before age 1, human infants can tell the difference between the sounds that make up all languages.
- The brain becomes "tuned" to respond best to the speech sounds that are in the environment.
- Differentiation of other sounds disappears when there is no reinforcement from the environment.

Slide 46: Motor Theory of Speech Perception
- Liberman et al. proposed that the motor mechanisms responsible for producing sounds activate the mechanisms for perceiving sound.
- Evidence from monkeys comes from the existence of mirror neurons.
- Experiment by Watkins et al.: participants had the motor cortex for face movements stimulated by transcranial magnetic stimulation (TMS).

Slide 47: Motor Theory of Speech Perception (continued)
- Results showed small movements of the mouth, called motor evoked potentials (MEPs).
- This response became larger when the person listened to speech or watched someone else's lip movements.
- In addition, the "where" stream may work with the "what" stream for speech perception.

Slide 48: Figure 13.18 The transcranial magnetic stimulation experiment that provides evidence for a link between speech perception and production in humans. See text for details.
