CLEC: Chinese Learner English Corpus
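CLEC contains more than one million words of text written by five groups of students: secondary school students, College English Band 4 and Band 6 students, and lower-year and upper-year English majors. Speech errors in the texts are annotated. The aim is to observe the characteristics of each group's English and the errors they make, to describe Chinese learners' English fairly precisely by quantitative and qualitative methods, and to give useful feedback for English teaching in China.

Table 1. Distribution of the CLEC data

| Type | Word tokens |
|---|---|
| ST2 | 208088 |
| ST3 | 209043 |
| ST4 | 212855 |
| ST5 | 214510 |
| ST6 | 226106 |
| Total | 1070602 |

Error annotation

Principles

1. Simple, reasonable, and easy to apply systematically. Many people take part in the annotation, and an overly elaborate classification table would be hard to master. We use a two-level classification. The first level has 11 classes: word form (fm), verb phrase (vp), noun phrase (np), pronoun (pr), adjective phrase (aj), adverb (ad), prepositional phrase (pp), conjunction (cj), word (wd), collocation (cc), and sentence (sn). Each class is subdivided by number. For example, cc marks improper collocation: cc1 is noun/noun collocation, cc2 is noun/verb collocation, cc3 is verb/noun collocation, and so on.

2. Moderate granularity. A coarse table is easy to apply consistently but carries too little information for analysing learners' errors. A fine table is hard to apply consistently, and the same kind of error tends to end up in different categories. Our current approach is to classify common errors finely (vp and np each have 9 subclasses) and rare errors coarsely (cj has only two subclasses). The current table has 61 error codes, which makes it a medium-sized scheme.

3. Sufficient error information (the error itself, its type, and its scope). The tag is written in square brackets and placed after the error, for example:

   In the past, people are [vp6,4-] kind to each other.

   Here [vp6,4-] marks "are" as a vp (verb phrase) error of type 6 (tense). "4-" is the scope of the error: "-" stands for the position of the error and 4 means that four words precede it. Only by taking those four words into account can one judge that "are" is wrong.

4. Openness. Researchers may add error types or subdivide existing ones as needed. For example, sn8 marks a structural deficiency in the sentence, and a researcher can split it into finer classes. This means retrieving every sn8 error and then defining third-level categories such as sn81, sn82, and so on.

5. Register and the source of an error are not annotated for now, because they call for more subjective judgement from the annotators and are even harder to apply consistently.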
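The tag format in principle 3 lends itself to mechanical reading. The Python sketch below is only an illustration, not part of CLEC's tooling: it assumes tags are written in square brackets directly after the error word, as in [vp6,4-]. The names `parse_tags` and `ErrorTag` are hypothetical, and treating digits after the dash as a count of following words is an assumption the text does not confirm.

```python
import re
from typing import List, NamedTuple

# Assumed tag shape: [<two-letter class><digits>,<words before>-<words after>]
TAG_RE = re.compile(r"\[\s*([a-z]{2})(\d+)\s*,\s*(\d*)\s*-\s*(\d*)\s*\]")
WORD_RE = re.compile(r"[A-Za-z']+")


class ErrorTag(NamedTuple):
    code: str                # full code, e.g. "vp6"
    category: str            # first-level class, e.g. "vp"
    words_before: int        # scope to the left of the error
    words_after: int         # scope to the right (assumed meaning)
    error_word: str          # the word immediately before the tag
    left_context: List[str]  # the words_before words preceding the error


def parse_tags(sentence: str) -> List[ErrorTag]:
    """Extract every error tag in a sentence together with its left context."""
    tags = []
    for match in TAG_RE.finditer(sentence):
        category, number, before, after = match.groups()
        # Words to the left of this tag, with earlier tags stripped out.
        left_text = TAG_RE.sub(" ", sentence[:match.start()])
        words = WORD_RE.findall(left_text)
        error_word = words[-1] if words else ""
        n_before = int(before) if before else 0
        context = words[:-1][-n_before:] if n_before else []
        tags.append(ErrorTag(category + number, category, n_before,
                             int(after) if after else 0, error_word, context))
    return tags


if __name__ == "__main__":
    for tag in parse_tags("In the past, people are [vp6,4-] kind to each other."):
        print(tag.code, tag.error_word, tag.left_context)
```

Because the digits after the class letters are kept as part of the code, a third-level code such as sn81 would be parsed the same way.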

Error classification table (total: 61 codes)

| Word form (fm) | Verb phrase (vp) | Noun phrase (np) | Pronoun (pr) |
|---|---|---|---|
| fm1 spelling | vp1 pattern | np1 pattern | pr1 reference |
| fm2 word building | vp2 set phrase | np2 set phrase | pr2 anticipatory it |
| fm3 capitalization | vp3 agreement | np3 agreement | pr3 agreement |
| | vp4 finite/non-finite | np4 case | pr4 case |
| | vp5 non-finite | np5 countability | pr5 wh- |
| | vp6 tense | np6 number | pr6 indefinite |
| | vp7 voice | np7 article | |
| | vp8 mood | np8 quantifiers | |
| | vp9 modal/auxiliary | np9 other determiners | |

| Adjective phrase (aj) | Adverb (ad) | Prepositional phrase (pp) | Conjunction (cj) |
|---|---|---|---|
| aj1 pattern | ad1 order | pp1 pattern | cj1 pattern |
| aj2 set phrase | ad2 modification | pp2 set phrase | cj2 set phrase |
| aj3 degree | ad3 degree | | |
| aj4 -ed/-ing confusion | | | |
| aj5 predicative/attributive | | | |

| Word (wd) | Collocation (cc) | Sentence (sn) |
|---|---|---|
| wd1 order | cc1 noun/noun | sn1 run-on sentence |
| wd2 part of speech | cc2 noun/verb | sn2 sentence fragment |
| wd3 substitution | cc3 verb/noun | sn3 dangling modifier |
| wd4 absence | cc4 adj/noun | sn4 illogical comparison |
| wd5 redundancy | cc5 verb/adv | sn5 topic prominence |
| wd6 repetition | cc6 adv/adj | sn6 coordination |
| wd7 ambiguity | | sn7 subordination |
| | | sn8 structural deficiency |
| | | sn9 punctuation |

Annotation notes

| Code | Class | Type | Description |
|---|---|---|---|
| fm1 | word | Spelling | spelling, coinage, abbreviation, apostrophe |
| fm2 | word | Word building | derivation, inflection, compounding, plurality (noun), irregularity (verb), 3rd person singular form (verb), syllabification, hyphenation, word division or fusion |
| fm3 | word | Capitalization | lower initial letter for upper initial letter or vice versa |
| vp1 | vb phr | Pattern (transitivity pattern) | error in transitivity (vi as vt or vice versa), transitive verb pattern; grammatical (cf. Oxford Advanced Learner's Dictionary of Current English, ed. A. S. Hornby) |
| vp2 | vb phr | Set phrase | phrasal verb and verbal phrase: error in form or use |
| vp3 | vb phr | Agreement | number agreement with its subject (noun or pronoun) |
| vp4 | vb phr | Finite/non-finite | finite verb for non-finite verb or vice versa |
| vp5 | vb phr | Non-finite | infinitive error: form and use; infinitive for participle or vice versa; -ed participle for -ing participle or vice versa |
| vp6 | vb phr | Tense | error in tense use within a sentence; the sequence of tenses between sentences |
| vp7 | vb phr | Voice | error in the use of voice: active for passive or vice versa |
| vp8 | vb phr | Mood | error in the use of mood: imperative, subjunctive; improper structure of conditional sentences |
| vp9 | vb phr | Modal/auxiliary | misuse of modal/auxiliary verbs; wrong form of modal verb (or auxiliary verb) and verb combination (e.g. tense form, voice form, etc.) |
| np1 | nn phr | Pattern (noun pattern) | error in combination with other words; grammatical |
| np2 | nn phr | Set phrase | omission or replacement of a fixed element that goes after a certain noun |
| np3 | nn phr | Agreement | number agreement of a noun with its determiner or a word that refers to it |
| np4 | nn phr | Case | possessive case error: form or use |
| np5 | nn phr | Countability | uncountable noun used as countable noun |
| np6 | nn phr | Number | countable noun used with no determiner or -s; a or -s with plural noun |
| np7 | nn phr | Article | a/an confusion or definite/indefinite confusion |
| np8 | nn phr | Quantifiers | misuse or confusion between many/much, (a) few/(a) little, some/any, etc. |
| np9 | nn phr | Other determiners | misuse or confusion of demonstratives, wh- determiners, numerals, etc. |
| pr1 | pron | Reference | incorrect/ambiguous pronoun reference; anaphoric |
| pr2 | pron | Anticipatory it | improper or wrong use of anticipatory it; it replaced by a demonstrative, etc. |
| pr3 | pron | Agreement | number agreement with a noun it refers to |
| pr4 | pron | Case | case error of any personal pronoun |
| pr5 | pron | wh- (wh- pronouns) | misuse or confusion of interrogative, relative and conjunctive pronouns |
| pr6 | pron | Indefinite (indefinite pronouns) | misuse or confusion of indefinite pronouns such as all/both, few/little, some/any, either/neither, etc. |
| aj1 | adj | Pattern (adjective pattern) | error in the combination with other words; grammatical |
| aj2 | adj | Set phrase | error in the idiomatic use of an adjectival phrase; omission or replacement of a fixed element that goes after a certain adjective |
| aj3 | adj | Degree | adjective degree error: form and use |
| aj4 | adj | -ed/-ing confusion | -ed adjective for -ing adjective or vice versa |
| aj5 | adj | Predicative/attributive | predicative adjective used as attributive adjective |
| ad1 | adv | Order | improper adverb placement; wrong position |
| ad2 | adv | Modification | adjective modifier used as verb modifier; other kinds of confusion |
| ad3 | adv | Degree | adverb degree error: form and use |
| pp1 | prep | Pattern (preposition pattern) | unacceptable combination with other words; grammatical |
| pp2 | prep | Set phrase | error in the formation or use of an idiomatic prepositional phrase |
| cj1 | conj | Pattern (conjunction pattern) | unacceptable combination with other words; grammatical |
| cj2 | conj | Set phrase | error in the formation or use of a phrase functioning as a conjunction |
| wd1 | word | Order | misplacement of any word other than an adverb |
| wd2 | word | Part of speech | error in part of speech: right root but wrong word class |
| wd3 | word | Substitution | error in word choice: right word class but wrong selection (any part of speech) |
| wd4 | word | Absence | omission of a word (any part of speech) |
| wd5 | word | Redundancy | oversuppliance of a word (any part of speech) |
| wd6 | word | Repetition | unnecessary repeating of a word |
| wd7 | word | Ambiguity | unclear word meaning; semantic |
| cc1 | notional | n/n collocation | improper noun (phrase) and noun (phrase) combination; semantic |
| cc2 | notional | n/v collocation | improper noun (phrase) and verb (phrase) combination; semantic |
| cc3 | notional | v/n collocation | improper verb and noun (phrase) combination; semantic |
| cc4 | notional | a/n collocation | improper adjective and noun (phrase) combination; semantic |
| cc5 | notional | v/ad collocation | improper verb and adverb (or ad/v) combination; semantic |
| cc6 | notional | ad/a collocation | improper adverb and adjective combination; semantic |
| sn1 | sentence | Run-on sentence | improper addition of clauses; fused sentence |
| sn2 | sentence | Sentence fragment | subordinate clause as a sentence; any phrase as a sentence |
| sn3 | sentence | Dangling modifier | illogical adverbial modification of a clause |
| sn4 | sentence | Illogical comparison | error in the comparison of words or phrases in a sentence which cannot be compared |
| sn5 | sentence | Topic prominence | the co-occurrence of an initial noun phrase and its equivalent (usually a pronoun) in the same sentence |
| sn6 | sentence | Coordination | faulty parallelism of clauses (or words/phrases) in a sentence |
| sn7 | sentence | Subordination | faulty attachment of a subordinate clause to the main clause |
| sn8 | sentence | Structural deficiency | error in the grammatical construction of a sentence: improper splitting, pattern shifting, confusing structure, etc. |
| sn9 | sentence | Punctuation | overuse, absence, choice, apostrophe, comma splice, etc. |
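The two-level scheme and its open third level can be captured in a small lookup. The Python sketch below is a hypothetical helper, not something the corpus provides: it rebuilds the 61 codes from the subclass counts in the classification table and maps a third-level code such as sn81 back to its second-level parent. It assumes every second-level code is two letters plus a single digit, which holds for the current table.

```python
import re

# Number of subclasses per first-level class, taken from the classification table.
SUBCLASS_COUNTS = {
    "fm": 3, "vp": 9, "np": 9, "pr": 6, "aj": 5, "ad": 3,
    "pp": 2, "cj": 2, "wd": 7, "cc": 6, "sn": 9,
}

# All second-level codes: fm1..fm3, vp1..vp9, and so on.
ALL_CODES = [
    f"{cls}{i}"
    for cls, count in SUBCLASS_COUNTS.items()
    for i in range(1, count + 1)
]
_CODE_SET = set(ALL_CODES)


def is_valid(code: str) -> bool:
    """True if the code is one of the 61 second-level codes."""
    return code in _CODE_SET


def parent_code(code: str) -> str:
    """Map a researcher-defined third-level code (e.g. sn81) to its parent (sn8).

    A second-level code is returned unchanged.
    """
    match = re.fullmatch(r"([a-z]{2}\d)(\d*)", code)
    if match is None or match.group(1) not in _CODE_SET:
        raise ValueError(f"unknown error code: {code!r}")
    return match.group(1)


if __name__ == "__main__":
    print(len(ALL_CODES))
    print(parent_code("sn81"))
```

Keeping the counts rather than a hand-typed list of codes makes the total easy to check against the figure of 61 stated in the table heading.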

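Normalized frequency and percentage of each error type

| Error type | st2 | st3 | st4 | st5 | st6 | Total | Percentage (%) |
|---|---|---|---|---|---|---|---|
| fm1 | 1928.8 | 2877.4 | 2112.6 | 1826.7 | 1686.7 | 10432.2 | 17.47 |
| fm2 | 349.3 | 448.9 | 438.9 | 226.9 | 328.7 | 1792.7 | 3 |
| fm3 | 1474.4 | 731.8 | 405.8 | 694.1 | 174.6 | 3480.7 | 5.83 |
| vp1 | 259.4 | 325.9 | 498.4 | 103.4 | 200.8 | 1387.9 | 2.32 |
| vp2 | 179 | 139.3 | 61.2 | 104.2 | 22.1 | 505.8 | 0.85 |
| vp3 | 374 | 524.6 | 785.2 | 273.1 | 327 | 2283.9 | 3.82 |
| vp4 | 140.8 | 159.1 | 110.8 | 63.9 | 51.6 | 526.2 | 0.88 |
| vp5 | 140 | 118.7 | 107.4 | 89.9 | 46.7 | 502.7 | 0.84 |
| vp6 | 1165.7 | 356 | 311.6 | 379.8 | 215.6 | 2428.7 | 4.07 |
| vp7 | 172.7 | 104.1 | 98.4 | 63.9 | 46.7 | 485.8 | 0.81 |
| vp8 | 27.1 | 16.3 | 8.3 | 25.2 | 11.5 | 88.4 | 0.15 |
| vp9 | 111.4 | 274.3 | 278.5 | 42.9 | 86.1 | 793.2 | 1.33 |
| np1 | 46.9 | 33.5 | 28.9 | 16.8 | 10.7 | 136.8 | 0.23 |
| np2 | 24.7 | 22.4 | 17.4 | 19.3 | 2.5 | 86.3 | 0.14 |
| np3 | 202.1 | 247.7 | 249.6 | 210.9 | 186 | 1096.3 | 1.84 |
| np4 | 66.8 | 55.9 | 26.4 | 22.7 | 21.3 | 193.1 | 0.32 |
| np5 | 58.9 | 98 | 71.9 | 60.5 | 84.4 | 373.7 | 0.63 |
| np6 | 374 | 654.4 | 481 | 358.8 | 354.1 | 2222.3 | 3.72 |
| np7 | 237.9 | 107.5 | 89.3 | 174.8 | 54.9 | 664.4 | 1.11 |
| np8 | 35 | 65.4 | 47.9 | 13.4 | 7.4 | 169.1 | 0.28 |
| np9 | 6.4 | 41.3 | 12.4 | 7.6 | 5.7 | 73.4 | 0.12 |
| pr1 | 82 | 236.5 | 205 | 89.9 | 18.9 | 632.3 | 1.06 |
| pr2 | 16.7 | 78.3 | 23.1 | 4.2 | 0 | 122.3 | 0.2 |
| pr3 | 52.5 | 54.2 | 172.7 | 28.6 | 60.6 | 368.6 | 0.62 |
| pr4 | 74.8 | 37 | 20.7 | 48.7 | 10.7 | 191.9 | 0.32 |
| pr5 | 26.3 | 53.3 | 14.1 | 7.6 | 10.7 | 112 | 0.19 |
| pr6 | 9.5 | 2.6 | 5 | 3.4 | 0 | 20.5 | 0.03 |
| aj1 | 6.4 | 18.9 | 15.7 | 5 | 9 | 55 | 0.09 |
| aj2 | 9.5 | 3.4 | 9.9 | 5.9 | 7.4 | 36.1 | 0.06 |
| aj3 | 38.2 | 39.6 | 32.2 | 43.7 | 97.5 | 251.2 | 0.42 |
| aj4 | 16.7 | 2.6 | 22.3 | 12.6 | 5.7 | 59.9 | 0.1 |
| aj5 | 0.8 | 3.4 | 7.4 | 1.7 | 0 | 13.3 | 0.02 |
| ad1 | 35.8 | 96.3 | 39.7 | 27.7 | 15.6 | 215.1 | 0.36 |
| ad2 | 42.2 | 37.8 | 12.4 | 9.2 | 4.9 | 106.5 | 0.18 |
| ad3 | 7.2 | 12 | 9.9 | 1.7 | 2.5 | 33.3 | 0.06 |
| pp1 | 136.1 | 98 | 43 | 169.7 | 28.7 | 475.5 | 0.8 |
| pp2 | 25.5 | 262.3 | 143.8 | 37 | 27.9 | 496.5 | 0.83 |
| cj1 | 27.8 | 20.6 | 18.2 | 21.8 | 12.3 | 100.7 | 0.17 |
| cj2 | 4 | 7.7 | 13.2 | 5.9 | 4.9 | 35.7 | 0.06 |
| wd1 | 43.8 | 151.3 | 114.1 | 25.2 | 37.7 | 372.1 | 0.62 |
| wd2 | 324.6 | 929.6 | 772.8 | 226.9 | 242.6 | 2496.5 | 4.18 |
| wd3 | 1102 | 1634.7 | 1815 | 757.1 | 359.8 | 5668.6 | 9.49 |
| wd4 | 585.6 | 829.8 | 443.8 | 403.3 | 427 | 2689.5 | 4.5 |
| wd5 | 410.6 | 613.1 | 518.2 | 265.5 | 171.3 | 1978.7 | 3.31 |
| wd6 | 27.1 | 37 | 22.3 | 34.5 | 29.5 | 150.4 | 0.25 |
| wd7 | 261.8 | 430.8 | 261.2 | 228.6 | 209.8 | 1392.2 | 2.33 |
| cc1 | 72.4 | 65.4 | 76 | 23.5 | 36.1 | 273.4 | 0.46 |
| cc2 | 35 | 177.1 | 49.6 | 6.7 | 21.3 | 289.7 | 0.49 |
| cc3 | 168.7 | 514.2 | 417.4 | 75.6 | 112.3 | 1288.2 | 2.16 |
| cc4 | 64.5 | 94.6 | 134.7 | 42 | 39.3 | 375.1 | 0.63 |
| cc5 | 23.9 | 40.4 | 29.8 | 5 | 4.1 | 103.2 | 0.17 |
| cc6 | 17.5 | 12 | 6.6 | 2.5 | 1.6 | 40.2 | 0.07 |
| sn1 | 419.3 | 596.8 | 576.9 | 118.5 | 42.6 | 1754.1 | 2.94 |
| sn2 | 424.9 | 389.6 | 303.3 | 132.8 | 76.2 | 1326.8 | 2.22 |
| sn3 | 10.3 | 20.6 | 17.4 | 2.5 | 10.7 | 61.5 | 0.1 |
| sn4 | 17.5 | 24.9 | 6.6 | 20.2 | 4.9 | 74.1 | 0.12 |
| sn5 | 9.5 | 14.6 | 17.4 | 2.5 | 4.9 | 48.9 | 0.08 |
| sn6 | 84.3 | 41.3 | 39.7 | 41.2 | 1.6 | 208.1 | 0.35 |
| sn7 | 49.3 | 55.9 | 63.6 | 23.5 | 3.3 | 195.6 | 0.33 |
| sn8 | 1103.6 | 446.3 | 862.1 | 493.2 | 231.9 | 3137.1 | 5.25 |
| sn9 | 861.7 | 573.6 | 337.2 | 649.5 | 322.9 | 2744.9 | 4.6 |
| Total | 14105.2 | 16160.6 | 13935.9 | 8883.4 | 6633.8 | 59718.9 | 100 |

Errors ranked by major category

| Category | st2 | st3 | st4 | st5 | st6 | Total | Percentage | Cumulative percentage |
|---|---|---|---|---|---|---|---|---|
| Word form (fm) | 3752.5 | 4058.1 | 2957.3 | 2747.7 | 2190 | 15705.6 | 26.299 | 26.299 |
| Word (wd) | 2755.5 | 4626.3 | 3947.4 | 1941.1 | 1477.7 | 14748 | 24.696 | 50.995 |
| Sentence (sn) | 2980.4 | 2163.6 | 2224.2 | 1483.9 | 699 | 9551.1 | 15.993 | 66.988 |
| Verb (vp) | 2570.1 | 2018.3 | 2259.8 | 1146.3 | 1008.1 | 9002.6 | 15.075 | 82.063 |
| Noun (np) | 1052.7 | 1326.1 | 1024.8 | 884.8 | 727 | 5015.4 | 8.398 | 90.461 |
| Collocation (cc) | 382 | 903.7 | 714.1 | 155.3 | 214.7 | 2369.8 | 3.968 | 94.429 |
| Pronoun (pr) | 261.8 | 461.9 | 440.6 | 182.4 | 100.9 | 1447.6 | 2.424 | 96.853 |
| Preposition (pp) | 161.6 | 360.3 | 186.8 | 206.7 | 56.6 | 972 | 1.628 | 98.481 |
| Adjective (aj) | 71.6 | 67.9 | 87.5 | 68.9 | 119.6 | 415.5 | 0.696 | 99.177 |
| Adverb (ad) | 85.2 | 146.1 | 62 | 38.6 | 23 | 354.9 | 0.594 | 99.771 |
| Conjunction (cj) | 31.8 | 28.3 | 31.4 | 27.7 | 17.2 | 136.4 | 0.228 | 99.999 |
| Total | 14105.2 | 16160.6 | 13935.9 | 8883.4 | 6633.8 | 59718.9 | 99.999 | |
| Share of all errors | 0.24 | 0.27 | 0.23 | 0.15 | 0.11 | | | |

Most common errors of Chinese learners

| Type | st2 | st3 | st4 | st5 | st6 | Total | Percentage |
|---|---|---|---|---|---|---|---|
| fm1 | 1928.8 | 2877.4 | 2112.6 | 1826.7 | 1686.7 | 10432.2 | 17.47 |
| wd3 | 1102 | 1634.7 | 1815 | 757.1 | 359.8 | 5668.6 | 9.49 |
| fm3 | 1474.4 | 731.8 | 405.8 | 694.1 | 174.6 | 3480.7 | 5.83 |
| sn8 | 1103.6 | 446.3 | 862.1 | 493.2 | 231.9 | 3137.1 | 5.25 |
| sn9 | 861.7 | 573.6 | 337.2 | 649.5 | 322.9 | 2744.9 | 4.6 |
| wd4 | 585.6 | 829.8 | 443.8 | 403.3 | 427 | 2689.5 | 4.5 |
| wd2 | 324.6 | 929.6 | 772.8 | 226.9 | 242.6 | 2496.5 | 4.18 |
| vp6 | 1165.7 | 356 | 311.6 | 379.8 | 215.6 | 2428.7 | 4.07 |
| vp3 | 374 | 524.6 | 785.2 | 273.1 | 327 | 2283.9 | 3.82 |
| np6 | 374 | 654.4 | 481 | 358.8 | 354.1 | 2222.3 | 3.72 |
| wd5 | 410.6 | 613.1 | 518.2 | 265.5 | 171.3 | 1978.7 | 3.31 |
| fm2 | 349.3 | 448.9 | 438.9 | 226.9 | 328.7 | 1792.7 | 3 |
| sn1 | 419.3 | 596.8 | 576.9 | 118.5 | 42.6 | 1754.1 | 2.94 |
| wd7 | 261.8 | 430.8 | 261.2 | 228.6 | 209.8 | 1392.2 | 2.33 |
| vp1 | 259.4 | 325.9 | 498.4 | 103.4 | 200.8 | 1387.9 | 2.32 |
| sn2 | 424.9 | 389.6 | 303.3 | 132.8 | 76.2 | 1326.8 | 2.22 |
| cc3 | 168.7 | 514.2 | 417.4 | 75.6 | 112.3 | 1288.2 | 2.16 |
| np3 | 202.1 | 247.7 | 249.6 | 210.9 | 186 | 1096.3 | 1.84 |
| vp9 | 111.4 | 274.3 | 278.5 | 42.9 | 86.1 | 793.2 | 1.33 |
| np7 | 237.9 | 107.5 | 89.3 | 174.8 | 54.9 | 664.4 | 1.11 |
| pr1 | 82 | 236.5 | 205 | 89.9 | 18.9 | 632.3 | 1.06 |

The table above shows the following:

1. All three word-form errors (spelling, word building, capitalization) appear in the list. Spelling ranks first, at 17.47% of all errors. Together the three account for 26.30% (17.47 + 3.00 + 5.83); the figure of 20.57% given in the source does not match the table.

2. Five of the seven word errors appear (substitution, absence, part of speech, redundancy, ambiguity), accounting for 23.81% of all errors.

3. Four of the nine sentence errors appear (structural deficiency, punctuation, run-on sentence, sentence fragment), accounting for 15.01%.

4. Four of the nine verb phrase errors appear (tense, agreement, transitivity pattern, modal/auxiliary), accounting for 11.54%.

5. Three of the nine noun phrase errors appear (number, agreement, article), accounting for 6.67%.

6. The remaining errors (verb/noun collocation and pronoun reference) account for 3.22%.

Most common spelling errors of Chinese learners

| Frequency | Word | Frequency | Word | Frequency | Word | Frequency | Word |
|---|---|---|---|---|---|---|---|
| 379 | MORTALITY | 23 | THEMSELVES | 15 | LIMITED | 12 | WRITING |
| 113 | KNOWLEDGE | 21 | FESTIVAL | 15 | NOTICE | 11 | ARTICLE |
| 78 | POLLUTION | 20 | BELIEVE | 15 | OURSELVES | 11 | CONTRARY |
| 76 | | | | | | | |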
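The percentage and cumulative-percentage columns of the category ranking can be recomputed from the category totals alone. The Python sketch below is a hypothetical check, not the authors' procedure: it copies the eleven totals from the ranking table and derives each category's share and the running sum. The names `CATEGORY_TOTALS` and `shares` are illustrative.

```python
# Category totals copied from the table of errors ranked by major category.
CATEGORY_TOTALS = [
    ("word form", 15705.6), ("word", 14748.0), ("sentence", 9551.1),
    ("verb", 9002.6), ("noun", 5015.4), ("collocation", 2369.8),
    ("pronoun", 1447.6), ("preposition", 972.0), ("adjective", 415.5),
    ("adverb", 354.9), ("conjunction", 136.4),
]


def shares(totals):
    """Return the grand total and (name, percentage, cumulative percentage) rows."""
    grand_total = sum(value for _, value in totals)
    rows = []
    running = 0.0
    # Rank categories from most to least frequent before accumulating.
    for name, value in sorted(totals, key=lambda item: item[1], reverse=True):
        pct = 100.0 * value / grand_total
        running += pct
        rows.append((name, round(pct, 3), round(running, 3)))
    return grand_total, rows


if __name__ == "__main__":
    grand_total, rows = shares(CATEGORY_TOTALS)
    print(f"grand total: {grand_total:.1f}")
    for name, pct, cumulative in rows:
        print(f"{name:<12} {pct:7.3f} {cumulative:8.3f}")
```

Accumulating unrounded shares brings the final cumulative value to 100, whereas summing the already rounded percentages, as the table appears to do, stops at 99.999.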
