2019第十二屆一階段優(yōu)秀5258隊(duì)題_第1頁
2019第十二屆一階段優(yōu)秀5258隊(duì)題_第2頁
2019第十二屆一階段優(yōu)秀5258隊(duì)題_第3頁
2019第十二屆一階段優(yōu)秀5258隊(duì)題_第4頁
2019第十二屆一階段優(yōu)秀5258隊(duì)題_第5頁
已閱讀5頁,還剩26頁未讀, 繼續(xù)免費(fèi)閱讀

下載本文檔

版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請(qǐng)進(jìn)行舉報(bào)或認(rèn)領(lǐng)

文檔簡(jiǎn)介

編號(hào)專用頁參賽隊(duì)伍的參賽隊(duì)號(hào):(請(qǐng)各個(gè)參賽隊(duì)提前填寫好競(jìng)賽統(tǒng)一編號(hào)(由競(jìng)賽送至評(píng)委團(tuán)前編號(hào)競(jìng)賽評(píng)閱編號(hào)(由競(jìng)賽評(píng)委團(tuán)評(píng)閱前進(jìn)行編號(hào)2019年第十二屆“認(rèn)證杯”數(shù)學(xué)中國(guó) 關(guān)鍵 外星語KMP算法相似字符串搜索匹配模糊匹配算 要KMPCherryKlingon鍵盤語言對(duì)外星語的構(gòu)詞形式作出初步20C++15-21庫(kù),將這些單詞分為高頻詞匯和低頻詞匯,之后針對(duì)高低頻詞匯以不同概率隨機(jī)從詞305000-8000KMP非常少,由此我們對(duì)過程中產(chǎn)生的問題進(jìn)行了分析。在主串和子串的匹配過程中,若出現(xiàn)匹配失敗的情況,子串會(huì)向右“滑動(dòng)”一段盡可能遠(yuǎn)的距離以減少一些不必要的比較過程。然而,在滑動(dòng)的過程中,由于主串不發(fā)生回溯,損失了很多有效的匹配位KMP(填寫(填寫所選題目: 題英Inthispaper,wesearchforthesamealphabeticsequencefragmentprobleminthetextcomposedofdifferentaliensegmentsintheerrortolerancerange.AsimilarstringsearchmatchingalgorithmmodelbasedonKMPalgorithmisestablished,whichisfoundinaspecificnumberoftextstrings.Substrings,andthealgorithmistestedby pileddata,andtheiradvantagesanddisadvantagesareyzedandjudged,andoptimizationysisismade.Model1:Forthebasiclanguagemodelofalienlanguages,weconstructalexicallibraryalienlanguagesthroughhypothesesandogymodelsandrandomlyextractthevocabularyintotextsforysis.Firstly,accordingtoCherry'sKlingonkeyboardlanguagedesignedforaliens,preliminarydesignoftheformationoftheforeignlanguageismade.Secondly,combinedwithourownreasonableguess,20Englishlettersrepresentingtheaminoacidnameareusedasthebasicelementsofthealienlanguage.Inaddition,aconversiontableforEnglishandalienlanguagesisdesigned.Consideringthecombinationofvowelsandconsonantsintheformofwordformation,alargenumberofwordswithlengthsof15-21lettersarerandomlygeneratedbyC++softwaretoformathesaurus,whicharedividedintohighfrequencyvocabularyandlowfrequencyvocabulary.Then,forhighandlowfrequencyvocabulary,werandomlyextractvocabularyfromthelexiconwithdifferentprobabilitiestoform30segmentsoftextcontaining5000-8000letters,useitasthetextdataforthenextysis,andfinallycountthefrequencyofoccurrenceofeachletterinthetexttocheckwhetheritisconsistentwithnormaldistribution.Model2:Fortheproblemofstringmatchinginthetitle,weusetheKMP-basedsimilarstringsearchmatchingalgorithmmodeltosearchforthequalifiedsequenceoflettersdifferentsegmentsoftheFirst,wetestedthegeneratedtextsusingtheexistingKMPalgorithmandfoundthatthenumberofsequencesthatmetthecriteriawasverysmall,soweyzedtheproblemsgeneratedintheprocess.Inthematchingprocessbetweenthemainstringandthesubstring,ifamatchfails,thesubstringwill"slide"totherightasfaraspossibletoreducesomeunnecessarymanyeffectivematchingsitesarelost,resultingininaccurateresults.Throughtheimprovementofthemodel,weconstructasimilarstringsearchmatchingalgorithmmodelbasedonKMPalgorithm,andperformacertaindegreeofbacktrackingonthemainstring,whichisbeneficialtothesubstringtoformamatchwiththemainstringwithintheerrortolerance.Inthisway,theaccuracyoftheresultsisensured,thetimecomplexityisminimized,andtheefficiencyofthealgorithmisimproved.Inthetestprocessofthemodel,wediscusstheadvantagesanddisadvantagesoftheabovemodels,andproposethescopeandfieldofeachmodel.一、問題重 相關(guān)背 要解決的問 二、問題分 三、模型假 四、符號(hào)說明及名詞定 五、模型的建立與求 模型一:文本的生成模 外星字母單元的基本形 外星語字母與英語字母的轉(zhuǎn)換及元輔音分 隨機(jī)文本的生 字母頻數(shù)的統(tǒng)計(jì)學(xué)檢 模型二:基于KMP的相似字符串搜索匹配算 現(xiàn)有KMP算法的簡(jiǎn) 基于KMP的相似字符串搜索匹配模型的建 結(jié)果分 模型的優(yōu)化分 六、模型的評(píng) 模型一:文本生成模 模型的優(yōu) 模型的缺 模型二:基于KMP的相似字符串搜索匹配算法模 模型的優(yōu) 模型的缺 七、模型的推 八、參考文 九、附 生成30段文本的C++程序代 截取的部分詞匯 30段文本其中一段文本(為例 基于KMP的模糊匹配算法C++代 基于KMP改進(jìn)后的模糊匹配算法C++代

一、問題重目前,我們發(fā)現(xiàn)了一種未知的語言,了解到其文字是由20個(gè)字母組成。現(xiàn)已獲得15–21二、問題分30段文本查找符合條件的片段。無論是理論分析和字符串查找匹配是題目的要求。在不同段文本中搜索相同的字母序列片段可80003015-21知的字符串序列在文本串中進(jìn)行定位,而這里的序列片段是未知的。針對(duì)此問題,我600015-211500450080%20%30三、模型假四、符號(hào)說明及名詞Si表示文本串中第iPjjnext[j]表示模式串中下標(biāo)j之前的最長(zhǎng)等值前后綴的長(zhǎng)度;

五、模型的建立與求1CherryKlingon鍵盤語言單位形成,可能會(huì)采用自己所熟悉的形式創(chuàng)建語言模式,從而利用代表20種氨基120序號(hào)123456789字母ACDEFGHIKL序號(hào)字母MNPQRSTVWY21ZA234(55678D9FHKLMNP個(gè)QRSTVWY326305000-800031234567892文本文本的生成模統(tǒng)計(jì)學(xué)檢生抽類構(gòu)單詞前綴與后英(總詞匯60004500,低頻詞元輔音分猜想及假庫(kù)抽出高頻詞和低頻30段包含80003KMPKMPKMPS(主串)匹配到iP(子串)j匹配過程執(zhí)行到比較字符SiPj

0in1,0

jm1(2)SiPjj0時(shí),則對(duì)Si1Pj即把子串和主串向右移動(dòng)一位再?gòu)淖哟牡谝粋€(gè)字符進(jìn)行匹配[2];當(dāng)1jmnext[jijnextj]PSjnext

時(shí)執(zhí)行SiPnextjjm1或in1nextj

-j,P[0k1]P[jkj0

j0

(5-經(jīng)判斷,KMPO(mn,空間復(fù)雜度為O(m)[3]檢查Si1和P是否jnext[jSiSi結(jié)令i不j=1j<j=重復(fù)以上操模式串P匹配到j(luò)位其中,next數(shù)組為KMP算法的,具體求解步驟如下P[j]?假設(shè)根據(jù)定義5nextKMPKMP012345678adc0a1g2g3v4caahcde6KMP6204adcaggvca01234ahcdadcaggvca01234ahcde主子7KMP7,S[2,3,4,5P[0,1,2,3在容錯(cuò)范圍內(nèi)可以進(jìn)行相互匹配,但當(dāng)主串i6j4012345678主adcaggvca01234子ahcde8KMP8,KMPi6j0,6030KMP的結(jié)果出現(xiàn)比較大的偏差,精確度不夠。012345678adca0g1g2v3c4aahcde9KMP9,KMPKMP匹配失敗時(shí),主串i3,j0iij1。這樣,算法的準(zhǔn)字符子串,而是一個(gè)字符片段。運(yùn)行算法,我們得到匹配結(jié)果。內(nèi)容為程序運(yùn)行時(shí)匹配出來的一些相似片段,由于篇幅有限,未全部截取。通過這個(gè)相似片段我們可以發(fā)現(xiàn)在外星語中出現(xiàn)頻率高的子串。104KMPKMPnext假如在搜索的過程中得到next[jk,在模式串中存在SjSk,則當(dāng)目標(biāo)串中的字SiPjSiPk一定也不相同,即不需要再將Si再進(jìn)行比較,直接比較Si和Pnext[j],由此next[j]修改為nextval[j]。其具體的義是nextval[01PjPnext[j]nextval[jnextval[next[jnextval[j]next[j]。通過將next替換為nextvalKMP在此基礎(chǔ)上,我們更進(jìn)一步分析發(fā)現(xiàn)由于容錯(cuò)性的存在,nextKMPKMPKMP該模型今后進(jìn)行優(yōu)化的關(guān)鍵。六、模型的評(píng)KMP七、模型的推過類比現(xiàn)有語言并在其他語言的基礎(chǔ)上進(jìn)行合理的創(chuàng)新,可以開發(fā)出方便且實(shí)用的新語言。KMP法極大地提高了相似字符串搜索匹配的準(zhǔn)確度,可以應(yīng)用到文獻(xiàn)檢索、語言識(shí)別、文字編輯處理[5]等多個(gè)方面。另外,此模型還可運(yùn)用到檢測(cè)系統(tǒng)中。隨著網(wǎng)絡(luò)的迅速發(fā)展,大流量網(wǎng)絡(luò)環(huán)境的需求嚴(yán)重影響了基于模式匹配技術(shù)的網(wǎng)絡(luò)應(yīng)用的性能。因此,將該算法模型應(yīng)用到檢測(cè)系統(tǒng)的檢測(cè)引擎中會(huì)大大提高匹配檢測(cè)的速度和性能,為之類的工作提供更加智能化、自動(dòng)化的服務(wù)。八、參考文[1],英設(shè)計(jì)程序可翻譯外星人語言,科技視界[J],(30):26-[2],,,檢測(cè)系統(tǒng)中模式匹配算法的研究,電子學(xué)報(bào)34(z1):2488-,字符串模式匹配算法的研究及改進(jìn),電子測(cè)試[J],(20):64-,數(shù)據(jù)結(jié)構(gòu),:KMP[J],西南民族大學(xué)學(xué)報(bào)·自然科學(xué)九、附30C++程序代碼#include"iostream"#include<string>#include<fstream>usingnamespacestd;60006000 {stringJiyuan[5]={"a","e","i","ai","ei"};stringfu[21]={"ch","ng","d","f","gh","h","sh","k","l","m","n","p","q","r","s","t","v","w","tlh","y","wh"};stringpiviot[13]={"circum","anti","dism","extra","hyper","momo","super","auto","micro","trans","cor","multi","post"};stringpostfix[13]= for(inti=0;i<6000;{

intwordLength=((double)rand()/RAND_MAX)*6+intjiyuanNumber=((double)rand()/RAND_MAX)*1+1; //取1~2個(gè)元音intpiviotNumber=((double)rand()/RAND_MAX)*2; //取0~2個(gè)前綴intpostfixNumber=((double)rand()/RAND_MAX)*2; //取0~2個(gè)后綴intfuLength;intselectedpiviotIndex=((double)rand()/RAND_MAX)*12; intselectedpostfixIndex=((double)rand()/RAND_MAX)* intselectedFuIndex=((double)rand()/RAND_MAX)*20;//被選中的輔音的下標(biāo)為0~20intselectedYuanIndex=((double)rand()/RAND_MAX)*4; charJiLength[5]={Jiyuan[0].length(),Jiyuan[1].length(),Jiyuan[2].length(),Jiyuan[3].length(),Jiyuan[4].length()charFulength[21]={fu[2].length(),fu[3].length(),fu[4].length(),fu[5].length(),fu[6].length(),fu[7].length(),fu[8].length()u[16].length(),fu[17].length(),fu[18].length(),fu[19].length(),fu[20].length()charpiviotlength[13]={piviot[2].length(),piviot[3].length(),piviot[4].length(),piviot[5].length(),piviot[6].length(),piviot[7].length(),piviot[8].length(),piviot[9].length(),piviot[10].length(),piviot[11].length(),piviot[12].length()};charpostfixlength[13]={th()stringselectedfu=""; stringselectedJi=""; stringselectedpiviot=""; stringselectedpostfix intcount intcount10;count{//fuLength=wordLength-jiyuanNumber-piviotNumber-while(jiyuanNumber>{

selectedJi=selectedJi+selectedYuanIndexdouble)randRAND_MAX)* count1=count1+JiLength[selectedYuanIndex];}while(piviotNumber>{

selectedpiviot=selectedpiviotIndexdouble)randRAND_MAX* count1=count1+piviotlength[selectedpiviotIndex];}while(postfixNumber>{

selectedpostfix=selectedpostfix+selectedpostfixIndexdouble)randRAND_MAX)* count1=count1+postfixlength[selectedpostfixIndex];}fuLength=wordLength-count1;while(count<fuLength){if(count==fuLength-{selectedFuIndex=((double)rand()/RAND_MAX)*20;{selectedFuIndex=((double)rand()/RAND_MAX)*}selectedfu=fu[selectedFuIndex]+selectedfu;count=count+Fulength[selectedFuIndex];}if(count==fuLength-{selectedFuIndex=((double)rand()/RAND_MAX)*5;while(Fulength[selectedFuIndex]!=1){selectedFuIndex=((double)rand()/RAND_MAX)*}selectedfu=fu[selectedFuIndex]+selectedfu;count=count+Fulength[selectedFuIndex];}if(count<fuLength-{selectedfu=fu[selectedFuIndex]+selectedfu;selectedFuIndex=((double)rand()/RAND_MAX)*20;count=count+Fulength[selectedFuIndex];}}}word=word+selectedpiviot+selectedJi+selectedfu+selectedpostfix+"}return}{stringtext="";stringcompare="";intnumber=((double)rand()/RAND_MAX)*3000+5000; intmove=0;Sleep(((double)rand()/RAND_MAX)*300+intstartIndex=((double)rand()/RAND_MAX)*75000+0;for(intcount=0;count<number;count=count+move){move=((double)rand()/RAND_MAX)*6+15; if(startIndex==0){for(intk=startIndex;k<startIndex+move;{text=text+}}{

//startIndex=startIndex+intflag=startIndex=startIndex+for(intk=startIndex;k<startIndex+flag+move;{if(word[k]=={k=k+1;}text=text+}}//startIndex=((double)rand()/RAND_MAX)*75000+}return}int{//intstringword,text1, text3,text4,text5,text6,text7,text8,text9,text10,text11,text12,text13,text14,text15,text16,text17,text18,text19,text20,text21,text22,text23,text24,text25,text26,text27,text28,text29,word=out<<word<<endl;text1=createText(word);ofstreamout1("text1.txt");out1<<text1<<endl;//number=((double)rand()/RAND_MAX)*900+text2=createText(word);ofstreamout2("text2.txt");out2<<text2<<endl;//number=((double)rand()/RAND_MAX)*900+100;text3=createText(word);ofstreamout3("text3.txt");out3<<text3<<endl;text4=createText(word);ofstreamout4("text4.txt");out4<<text4<<endl;text5=createText(word);ofstreamout5("text5.txt");out5<<text5<<endl;text6=createText(word);ofstreamout6("text6.txt");out6<<text6<<endl;text7=createText(word);ofstreamout7("text7.txt");out7<<text7<<endl;text8=createText(word);ofstreamout8("text8.txt");out8<<text8<<endl;text9=createText(word);ofstreamout9("text9.txt");out9<<text9<<endl;text10=createText(word);ofstreamout10("text10.txt");out10<<text10<<endl;text11=createText(word);out11<<text11<<endl;text12=createText(word);ofstreamout12("text12.txt");out12<<text12<<endl;text13=createText(word);ofstreamout13("text13.txt");out13<<text13<<endl;text14=createText(word);ofstreamout14("text14.txt");out14<<text14<<endl;text15=createText(word);ofstreamout15("text15.txt");out15<<text15<<endl;text16=createText(word);ofstreamout16("text16.txt");out16<<text16<<endl;text17=createText(word);ofstreamout17("text17.txt");out17<<text17<<endl;text18=createText(word);ofstreamout18("text18.txt");out18<<text18<<endl;text19=createText(word);ofstreamout19("text19.txt");out19<<text19<<endl;text20=createText(word);ofstreamout20("text20.txt");out20<<text20<<endl;text21=createText(word);ofstreamout21("text21.txt");out21<<text21<<endl;text22=createText(word);ofstreamout22("text22.txt");out22<<text22<<endl;text23=createText(word);ofstreamout23("text23.txt");out23<<text23<<endl;text24=createText(word);ofstreamout24("text24.txt");out24<<text24<<endl;text25=createText(word);ofstreamout25("text25.txt");out25<<text25<<endl;text26=createText(word);ofstreamout26("text26.txt");out26<<text26<<endl;text27=createText(word);ofstreamout27("text27.txt");out27<<text27<<endl;text28=createText(word);ofstreamout28("text28.txt");out28<<text28<<text29=createText(word);ofstreamout29("text29.txt");out29<<text29<<endl;text30=createText(word);ofstreamout30("text30.txt");out30<<text30<<endl;cout傳輸成功!endl;return}30(為例)KMPC++#include<stdio.h>#include<string.h>#include<stdlib.h>#include<Windows.h>#include<time.h>#defineMaxSize10000typedefstruct{chardata[MaxSize'0MaxSizeintlength0;//}voidStrAssign(MyString&s,char //s為型參{inti=for(i=0;cstr[i]!='\0';i++)s.data[i]=cstr[i];s.length=}voidGetNextval(MyStringtint //由模式串tnextval{intj=0,k=-1;nextval[0]=-1;while(j<t.length){if(k==-1||t.data[j]=={j++;if(t.data[j]!=t.data[k])nextval[j]=k;}

nextval[j]=k=}}intKMPIndex(MyStringsMyStringtKMP{intcurrentPostion=//s為主串t為子串intnextval[MaxSize];inti=0;intj=0;intk=intcount=0;inttimes=intkmpIndex[MaxSize0用來存放找到的位置。intflag=0;i=j0;//j0({if(j==-1||s.data[i]==t.data[j]||{if(count<4&&s.data[i]!=t.data[j])}{

j=nextval[j];j=0;ifit.lengths.length)//ST{}count0;//如果說設(shè)置容錯(cuò)后,仍然會(huì)運(yùn)行到這里,那么就需要重新把count}if(j>={ printf("\t1次//for循環(huán)用來判斷inttempK=for(inttemp=0;temp<tempK;{if((i-t.length)==kmpIndex[temp]&&0!=(i-{flag=1;}}if(flag=={kmpIndex[currentPostion]=(i- printf("1次,下標(biāo)為:%d子串為:kmpIndex[currentPostion]);if(currentPostion!=0){//kmpIndex[currentPostion]printf("匹配結(jié)果:在主串中起始下標(biāo)為:%d,kmpIndex[currentPostion]);printf("內(nèi)容為:");//fort.length長(zhǎng)度的字符串for(intprintI=0;printI<t.length;printI++) printf("%c",s.data[currentPostion+printI]);printf("%c",s.data[(i-t.length)+printI]);}//currentPostion1}flag=//for,此時(shí)相當(dāng)于多了一個(gè)匹配成功的且不是重復(fù)匹配的count=0;j0;j0,}}return}void{intnext[MaxSize],nextval[MaxSize];inti=0;intj=longstringLength;inttimes;intrandomNumber;FILE*pFile;char*buffer;size_tresult;MyStrings,t;pFile=fopen("text3.txt","rb");if(pFile==NULL){fputs("Fileerror",stderr);}fseek(pFile,0,SEEK_END);stringLength=fl(pFile);/*分配內(nèi)存整個(gè)文件buffer=(char*)malloc(sizeof(char)*stringLength);if(buffer==NULL){fputs("Memoryerror",stderr);}/*bufferresult=fread(buffer,1,stringLength,pFile);if(result!=stringLength){fputs("Readingerror",stderr);}StrAssign(sbuffer);//分配成功后,把它放入sfor(i=0;*buffer!='\n'&&i<stringLength-21;{//trandomNumber=((double)rand()/RAND_MAX)*6+15;printf("randomNumber:%d\n",randomNumber);for(j=0;j<randomNumber;{t.data[j]=printf("%c",}t.lengthj;//t.lengthj15times=KMPIndex(s,t);iftimes1&×0)//{}{}

printf("與該子串%s1次\n",t.data);printf("總共匹配成功:%d次\n",times);}}KMPC++#include<stdio.h>#include<string.h>#include<stdlib.h>#include<Windows.h>#include<time.h>#defineMaxSize10000typedefstruct{chardata[MaxSize'0MaxSizeintlength0;//}voidStrAssign(MyString&s,char //s為型參{inti=for(i=0;cstr[i]!='\0';i++)s.data[i]=cstr[i];s.length=}intKMPIndex(MyStringsMyStringtKMP{intcurrentPostion=//s為主串t為子串intnextval[MaxSize];inti=0;intj=0;intk=intcount=0;inttimes=intkmpIndex[MaxSize0用

溫馨提示

  • 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請(qǐng)下載最新的WinRAR軟件解壓。
  • 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請(qǐng)聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
  • 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會(huì)有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
  • 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
  • 5. 人人文庫(kù)網(wǎng)僅提供信息存儲(chǔ)空間,僅對(duì)用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對(duì)用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對(duì)任何下載內(nèi)容負(fù)責(zé)。
  • 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請(qǐng)與我們聯(lián)系,我們立即糾正。
  • 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時(shí)也不承擔(dān)用戶因使用這些下載資源對(duì)自己和他人造成任何形式的傷害或損失。

評(píng)論

0/150

提交評(píng)論