# Playing the Game of Flappy Bird with Deep Reinforcement Learning

SHANGHAI JIAO TONG UNIVERSITY

- Project Title: Playing the Game of Flappy Bird with Deep Reinforcement Learning
- Group Number: g-07
- Group Members: Wang Wenqing (116032910080), Gao Xiaoning (116032910032), Qian Chen (116032910073)

## Contents

- 1 Introduction
- 2 Deep Q-learning Network
  - 2.1 Q-learning
    - 2.1.1 Reinforcement Learning Problem
    - 2.1.2 Q-learning Formulation
  - 2.2 Deep Q-learning Network
  - 2.3 Input Pre-processing
  - 2.4 Experience Replay and Stability
  - 2.5 DQN Architecture and Algorithm
- 3 Experiments
  - 3.1 Parameters Settings
  - 3.2 Results Analysis
- 4 Conclusion
- References
## Abstract

Letting machines play games has been one of the popular topics in AI today. Using game theory and search algorithms to play games requires specific domain knowledge and lacks scalability. In this project, we utilize a convolutional neural network to represent the environment of the game and update its parameters with Q-learning, a reinforcement learning algorithm. We call this overall algorithm deep reinforcement learning, or Deep Q-learning Network (DQN). Moreover, we only use the raw images of the game Flappy Bird as the input of the DQN, which guarantees scalability to other games. After training with some tricks, the DQN can greatly outperform human beings.

## 1 Introduction
Flappy Bird has been a popular game around the world in recent years. The player's goal is to guide the bird on screen through the gap between two pipes by tapping the screen. If the player taps the screen, the bird jumps up; if the player does nothing, the bird falls at a constant rate. The game is over when the bird crashes into a pipe or the ground, while the score increases by one each time the bird passes through a gap. Figure 1 shows three different states of the bird: (a) the normal flight state, (b) the crash state, and (c) the passing state.

Figure 1: (a) normal flight state, (b) crash state, (c) passing state

Our goal in this paper is to design an agent that plays Flappy Bird automatically from the same input a human player has, which means that we use raw images and rewards to teach our agent how to play this game. Inspired by [1], we propose a deep reinforcement learning architecture to learn and play this game.

In recent years, a huge amount of work has been done on deep learning in computer vision [6]. Deep learning extracts high-dimensional features from raw images. Therefore, it is natural to ask whether deep learning can be used in reinforcement learning. However, there are four challenges in doing so. Firstly, most successful deep learning applications to date have required large amounts of hand-labelled training data, whereas RL algorithms must be able to learn from a scalar reward signal that is frequently sparse, noisy and delayed. Secondly, the delay between actions and resulting rewards, which can be thousands of time steps long, seems particularly daunting when compared to the direct association between inputs and targets found in supervised learning. Thirdly, most deep learning algorithms assume the data samples to be independent, while in reinforcement learning one typically encounters sequences of highly correlated states. Furthermore, in RL the data distribution changes as the algorithm learns new behaviors, which can be problematic for deep learning methods that assume a fixed underlying distribution.

This paper demonstrates that a Convolutional Neural Network (CNN) can overcome the challenges mentioned above and learn successful control policies from raw image data in the game Flappy Bird. The network is trained with a variant of the Q-learning algorithm [6]. Using the Deep Q-learning Network (DQN), we construct an agent that makes the right decisions in Flappy Bird based merely on consecutive raw images.
## 2 Deep Q-learning Network

Recent breakthroughs in computer vision have relied on efficiently training deep neural networks on very large training sets. By feeding sufficient data into deep neural networks, it is often possible to learn better representations than handcrafted features [2][3]. These successes motivate us to connect a reinforcement learning algorithm to a deep neural network, which operates directly on raw images and efficiently updates its parameters using stochastic gradient descent. In the following section, we describe the Deep Q-learning Network (DQN) algorithm and how its model is parameterized.

### 2.1 Q-learning

#### 2.1.1 Reinforcement Learning Problem

Q-learning is a specific algorithm of reinforcement learning (RL). As Figure 2 shows, an agent interacts with its environment in discrete time steps. At each time $t$, the agent receives a state $s_t$ and a reward $r_t$. It then chooses an action $a_t$ from the set of available actions, which is subsequently sent to the environment. The environment moves to a new state $s_{t+1}$, and the reward $r_{t+1}$ associated with the transition $(s_t, a_t, s_{t+1})$ is determined [4].

Figure 2: Traditional reinforcement learning scenario

The goal of the agent is to collect as much reward as possible. The agent can choose any action as a function of the history, and it can even randomize its action selection. Note that in order to act near-optimally, the agent must reason about the long-term consequences of its actions (i.e., maximize the future income), even though the immediate reward associated with them might be negative.
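To make this interaction loop concrete, here is a minimal sketch in Python; the `env` and `agent` objects and their `reset()`/`step()`/`act()` methods are hypothetical placeholders of ours, not code from the report.

```python
# Minimal sketch of the agent-environment loop described above.
# `env` and `agent` are hypothetical objects with a gym-style interface;
# they are illustrative only and do not come from the report's code.

def run_episode(env, agent):
    state = env.reset()                   # receive the initial state s_0
    total_reward, done = 0.0, False
    while not done:
        action = agent.act(state)                      # choose a_t from the available actions
        next_state, reward, done = env.step(action)    # environment returns s_{t+1} and r_{t+1}
        agent.observe(state, action, reward, next_state, done)  # let the agent learn from the transition
        state = next_state
        total_reward += reward
    return total_reward
```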
#### 2.1.2 Q-learning Formulation [6]

In the Q-learning problem, the set of states and actions, together with the rules for transitioning from one state to another, make up a Markov decision process. One episode of this process (e.g., one game) forms a finite sequence of states, actions and rewards:

$$s_0, a_0, r_1, s_1, a_1, r_2, \dots, s_{n-1}, a_{n-1}, r_n, s_n$$

Here $s_i$ represents the state, $a_i$ is the action and $r_{i+1}$ is the reward received after performing action $a_i$. The episode ends with the terminal state $s_n$. To perform well in the long term, we need to take into account not only the immediate reward, but also the future rewards we are going to get. Define the total future reward from time point $t$ onward as

$$R_t = r_t + r_{t+1} + \dots + r_{n-1} + r_n \qquad (1)$$

In order to ensure convergence and to balance the immediate reward against future rewards, the total reward must use a discounted future reward:

$$R_t = r_t + \gamma r_{t+1} + \dots + \gamma^{n-t-1} r_{n-1} + \gamma^{n-t} r_n = \sum_{i=t}^{n} \gamma^{i-t} r_i \qquad (2)$$

Here $\gamma$ is the discount factor between 0 and 1; the further into the future a reward is, the less we take it into consideration. Rewriting equation (2) gives

$$R_t = r_t + \gamma R_{t+1} \qquad (3)$$

In Q-learning, define a function $Q(s_t, a_t)$ representing the maximum discounted future reward when we perform action $a_t$ in state $s_t$:

$$Q(s_t, a_t) = \max R_{t+1} \qquad (4)$$

It is called the Q-function because it represents the "quality" of a certain action in a given state. A good strategy for an agent is to always choose the action that maximizes the discounted future reward:

$$\pi(s_t) = \arg\max_a Q(s_t, a) \qquad (5)$$

Here $\pi$ represents the policy, i.e., the rule by which we choose an action in each state. Given a transition $(s_t, a_t, s_{t+1})$, equations (3) and (4) yield the following Bellman equation: the maximum future reward for this state and action is the immediate reward plus the maximum future reward for the next state:

$$Q(s_t, a_t) = r_t + \gamma \max_{a} Q(s_{t+1}, a) \qquad (6)$$

The only way to collect information about the environment is by interacting with it. Q-learning is the process of learning the optimal function $Q(s_t, a_t)$, which is stored as a table in memory. Here is the overall algorithm [1]:

Algorithm 1: Q-learning

1. Initialize Q[num_states, num_actions] arbitrarily
2. Observe the initial state s_0
3. Repeat:
   - Select and carry out an action a
   - Observe reward r and new state s'
   - Q(s, a) := Q(s, a) + α (r + γ max_{a'} Q(s', a') − Q(s, a))
   - s := s'
4. Until terminated
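As a concrete illustration of Algorithm 1, the following is a minimal tabular Q-learning sketch; the environment interface and the values of alpha, gamma and the episode count are our own illustrative assumptions, not settings from the report.

```python
import numpy as np

# Minimal tabular Q-learning corresponding to Algorithm 1.
# `env` is assumed to expose reset()/step() over discrete states and actions;
# alpha, gamma and the episode count are illustrative values only.
def q_learning(env, num_states, num_actions, alpha=0.1, gamma=0.99, episodes=1000):
    Q = np.zeros((num_states, num_actions))       # initialize Q arbitrarily (here: zeros)
    for _ in range(episodes):
        s = env.reset()                           # observe the initial state
        done = False
        while not done:
            a = int(np.argmax(Q[s]))              # greedy action (exploration omitted for brevity)
            s_next, r, done = env.step(a)         # observe reward and new state
            # Bellman update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
            Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
            s = s_next
    return Q
```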
### 2.2 Deep Q-learning Network

In Q-learning, the state space is often too big to fit into main memory. A game frame of 80x80 binary pixels has $2^{6400}$ possible states, which is impossible to represent with a Q-table. What's more, during training, when Q-learning encounters a state it has never seen, it can only perform a random action, meaning that it does not generalize. To overcome these two problems, we approximate the Q-table with a convolutional neural network (CNN) [7][8]. This variation of Q-learning is called the Deep Q-learning Network (DQN) [9][10]. After training the DQN, the multilayer neural network approaches the traditional optimal Q-table:

$$Q(s_t, a_t; \theta) \approx Q^*(s_t, a_t) \qquad (7)$$

For playing Flappy Bird, the screenshot $s_t$ is fed into the CNN, and the outputs are the Q-values of the actions, as shown in Figure 3.

Figure 3: In DQN, the CNN's input is the raw game image and its outputs are the Q-values $Q(s_t, a)$, with one output neuron corresponding to one action's Q-value.

To update the CNN's weights, we define the cost function and the gradient update as [9][10]:

$$L = \frac{1}{2}\left[ r + \gamma \max_{a'} Q(s_{t+1}, a'; \theta^-) - Q(s_t, a_t; \theta) \right]^2 \qquad (8)$$

$$\nabla_{\theta} L = -\left[ r + \gamma \max_{a'} Q(s_{t+1}, a'; \theta^-) - Q(s_t, a_t; \theta) \right] \nabla_{\theta} Q(s_t, a_t; \theta) \qquad (9)$$

$$\theta := \theta - \eta \nabla_{\theta} L \qquad (10)$$

Here $\theta$ are the DQN parameters that get trained and $\theta^-$ are the non-updated parameters of the target Q-value function. During training, equations (9) and (10) are used to update the weights of the CNN.

Meanwhile, obtaining the optimal reward in every episode requires a balance between exploring the environment and exploiting past experience. The ε-greedy approach achieves this: during training we select a random action with probability ε and otherwise choose the optimal action $a = \arg\max_{a'} Q(s_t, a'; \theta)$. The value of ε anneals linearly to zero as the number of updates increases.
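The ε-greedy rule with linear annealing can be sketched as follows; the `q_values` array stands for the CNN's output for the current state, and the annealing constants shown mirror the settings later listed in Table 2.

```python
import random
import numpy as np

# Epsilon-greedy action selection with linear annealing, as described above.
# The constants mirror the report's Table 2 (initial/final epsilon, explore steps).
INITIAL_EPSILON, FINAL_EPSILON, EXPLORE_STEPS = 0.1, 0.001, 3_000_000

def select_action(q_values, step):
    # Linearly anneal epsilon from INITIAL_EPSILON to FINAL_EPSILON over EXPLORE_STEPS.
    frac = min(step / EXPLORE_STEPS, 1.0)
    epsilon = INITIAL_EPSILON + frac * (FINAL_EPSILON - INITIAL_EPSILON)
    if random.random() < epsilon:
        return random.randrange(len(q_values))   # explore: random action
    return int(np.argmax(q_values))              # exploit: action with the max Q-value
```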
### 2.3 Input Pre-processing

Working directly with raw game frames, which are 288x512 pixel RGB images, can be computationally demanding, so we apply a basic preprocessing step aimed at reducing the input dimensionality.

Figure 4: Pre-processing of game frames. First convert the frames to gray-scale images, then downsample them to a fixed size. Afterwards, convert them to binary images and finally stack up the last 4 frames as a state.

To improve the accuracy of the convolutional network, the background of the game is removed and substituted with pure black to remove noise. As Figure 4 shows, the raw game frames are preprocessed by first converting their RGB representation to gray-scale and down-sampling it to an 80x80 image. The gray image is then converted to a binary image. In addition, the last 4 game frames are stacked up as one state for the CNN. The current frame is overlapped with the previous frames at slightly reduced intensities, and the intensity decreases as we move farther away from the most recent frame. Thus, the input image gives good information about the trajectory the bird is currently on.
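A minimal sketch of this preprocessing pipeline, assuming OpenCV and NumPy are available; the threshold value and the stacking layout are illustrative choices rather than the report's exact code.

```python
import cv2
import numpy as np

def preprocess_frame(frame_rgb):
    """Convert a raw 288x512 RGB frame to an 80x80 binary image."""
    gray = cv2.cvtColor(frame_rgb, cv2.COLOR_RGB2GRAY)           # RGB -> gray-scale
    small = cv2.resize(gray, (80, 80))                           # downsample to 80x80
    _, binary = cv2.threshold(small, 1, 255, cv2.THRESH_BINARY)  # gray -> binary (threshold is illustrative)
    return binary

def make_state(last_four_frames):
    """Stack the last 4 preprocessed frames into one 80x80x4 state for the CNN."""
    assert len(last_four_frames) == 4
    return np.stack(last_four_frames, axis=-1)
```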
### 2.4 Experience Replay and Stability

By now we can estimate the future reward in each state using Q-learning and approximate the Q-function with a convolutional neural network. However, approximating Q-values with a non-linear function is not very stable. In Q-learning, the experiences recorded in sequential order are highly correlated. If we use them sequentially to update the DQN parameters, the training process might get stuck in a poor local minimum or even diverge.

To ensure the stability of DQN training, we use a trick called experience replay. During game play, a certain number of experiences $(s_t, a_t, r_{t+1}, s_{t+1})$ are stored in a replay memory. When training the network, random mini-batches from the replay memory are used instead of only the most recent transition. This breaks the similarity of subsequent training samples, which otherwise might drive the network into a local minimum. As a result of the randomness in the choice of the mini-batch, the data used to update the DQN parameters are likely to be de-correlated.

Furthermore, to improve the stability and convergence of the loss function, we use a clone of the DQN model with parameters $\theta^-$. The parameters $\theta^-$ are updated to $\theta$ after every C updates of the DQN.
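A replay memory of this kind can be sketched with Python's standard library; the class below is our illustration (the 50,000-transition capacity matches Table 2), not the report's implementation.

```python
import random
from collections import deque

class ReplayMemory:
    """Fixed-capacity store of (s, a, r, s_next, done) transitions."""

    def __init__(self, capacity=50_000):          # capacity as listed in Table 2
        self.buffer = deque(maxlen=capacity)      # old transitions are dropped automatically

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Uniformly sample a de-correlated mini-batch for one gradient step.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```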
### 2.5 DQN Architecture and Algorithm

As shown in Figure 5, we first grab a Flappy Bird game frame and, after the pre-processing described in Section 2.3, stack up the last 4 frames as a state. This state is fed as raw images into the CNN, whose output is the quality of each action in the given state. According to the policy $\pi(s_t) = \arg\max_a Q(s_t, a)$, the agent performs the greedy action with probability $1-\varepsilon$, and otherwise performs a random action. The current experience is stored in the replay memory; a random mini-batch of experiences is then sampled from the memory and used to perform a gradient descent step on the CNN's parameters. This interactive process continues until some stopping criterion is satisfied.

Figure 5: DQN's training architecture: the upper data flow shows the training process, while the lower data flow displays the interaction between the agent and the environment.

The complete DQN training process is shown in Algorithm 2. Note that the factor ε is set to zero during testing, while during training we use a decaying ε value to balance exploration and exploitation.

Algorithm 2: Deep Q-learning Network

1. Initialize replay memory D to a certain capacity N
2. Initialize the CNN with random weights θ
3. Initialize θ⁻ := θ
4. for games = 1 : maxGames do
   - for snapShots = 1 : T do
     - With probability ε select a random action a_t, otherwise select a_t := argmax_a Q(s_t, a; θ)
     - Execute a_t and observe r_{t+1} and the next state s_{t+1}
     - Store the transition (s_t, a_t, r_{t+1}, s_{t+1}) in replay memory D
     - Sample a mini-batch of transitions from D
     - for j = 1 : batchSize do
       - if the game terminates at the next state then Q_pred := r_j
       - else Q_pred := r_j + γ max_a Q(s_{j+1}, a; θ⁻)
       - end if
       - Perform gradient descent on L = ½ (Q_pred − Q(s_j, a_j; θ))² according to equation (10)
     - end for
     - Every C steps reset θ⁻ := θ
   - end for
5. end for
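To make the inner loop of Algorithm 2 concrete, the sketch below computes the target Q_pred for a sampled mini-batch with NumPy; `q_next` stands for the target network's output Q(s_{j+1}, ·; θ⁻), and the discount value shown is an illustrative assumption, since the report does not list γ explicitly.

```python
import numpy as np

def compute_targets(rewards, q_next, terminal, gamma=0.99):
    """Compute Q_pred for a mini-batch, as in the inner loop of Algorithm 2.

    rewards:  shape (B,)   rewards r_j of the sampled transitions
    q_next:   shape (B, A) target-network Q-values Q(s_{j+1}, a; theta^-)
    terminal: shape (B,)   True where the game ended at s_{j+1}
    gamma:    discount factor (illustrative value, not from the report)
    """
    rewards = np.asarray(rewards, dtype=np.float32)
    q_next = np.asarray(q_next, dtype=np.float32)
    terminal = np.asarray(terminal, dtype=bool)
    max_q_next = np.max(q_next, axis=1)                   # max_a Q(s_{j+1}, a; theta^-)
    # r_j if terminal, else r_j + gamma * max_a Q(s_{j+1}, a; theta^-)
    return rewards + gamma * max_q_next * (~terminal)
```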
## 3 Experiments

This section describes our algorithm's parameter settings and the analysis of the experimental results.

### 3.1 Parameters Settings

Figure 6 illustrates our CNN's layer settings. The neural network has 3 convolutional hidden layers followed by 2 fully connected hidden layers. Table 1 shows the detailed parameters of every layer. We only use max pooling after the first convolutional layer, and we use the ReLU activation function to produce the neural outputs.

Figure 6: The layer setting of the CNN: this CNN has 3 convolutional layers followed by 2 fully connected layers.

For training, we use the Adam optimizer to update the CNN's parameters.

Table 1: The detailed layer settings of the CNN

| Layer | conv1 | max_pool | conv2 | conv3 | fc4 | fc5 |
|---|---|---|---|---|---|---|
| Input | 80x80x4 | 20x20x32 | 10x10x32 | 5x5x64 | 5x5x64 | 512 |
| Filter size | 8x8 | 2x2 | 4x4 | 3x3 | - | - |
| Stride | 4 | 2 | 2 | 1 | - | - |
| Num filters / units | 32 | - | 64 | 64 | 512 | 2 |
| Activation | ReLU | - | ReLU | ReLU | ReLU | Linear |
| Output | 20x20x32 | 10x10x32 | 5x5x64 | 5x5x64 | 512 | 2 |
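The architecture in Table 1 can be written down in Keras-style code as follows; this is our reconstruction (using "same" padding so that the output sizes match the table), not the report's original implementation, and the final layer outputs one Q-value per action.

```python
import tensorflow as tf

def build_q_network(num_actions=2):
    """CNN matching Table 1: 3 conv layers (max pooling after the first) + 2 dense layers."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(80, 80, 4)),                                             # stacked binary frames
        tf.keras.layers.Conv2D(32, 8, strides=4, padding="same", activation="relu"),   # -> 20x20x32
        tf.keras.layers.MaxPool2D(pool_size=2),                                        # -> 10x10x32
        tf.keras.layers.Conv2D(64, 4, strides=2, padding="same", activation="relu"),   # -> 5x5x64
        tf.keras.layers.Conv2D(64, 3, strides=1, padding="same", activation="relu"),   # -> 5x5x64
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(512, activation="relu"),                                 # fc4
        tf.keras.layers.Dense(num_actions, activation="linear"),                       # fc5: one Q-value per action
    ])
```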
Table 2 lists the training parameter settings of the DQN. We use a decayed ε ranging from 0.1 to 0.001 to balance exploration and exploitation. Moreover, Table 2 shows that the mini-batch stochastic gradient descent optimizer is Adam with a batch size of 32. Finally, we also allocate a large replay memory.

Table 2: The training parameters of the DQN

| Parameter | Value |
|---|---|
| Observe steps | 100000 |
| Explore steps | 3000000 |
| Initial epsilon | 0.1 |
| Final epsilon | 0.001 |
| Replay memory | 50000 |
| Batch size | 32 |
| Learning rate | 0.000001 |
| FPS | 30 |
| Optimization algorithm | Adam |
### 3.2 Results Analysis

We train our model for about 4 million epochs. Figure 7 shows the weights and biases of the CNN's first hidden layer. The weights and biases finally concentrate around 0 with low variance, which directly stabilizes the CNN's output Q-value $Q(s_t, a_t)$ and reduces the probability of random actions. The stability of the CNN's parameters leads to obtaining the optimal policy.

Figure 7: The left (right) plot is the histogram of the weights (biases) of the CNN's first hidden layer.

Figure 8 shows the cost value of the DQN during training. The cost function has a slow downward trend and is close to 0 after 3.5 million epochs. This means that the DQN has learned the most common state subspace and will perform the optimal action when it comes across a known state. In a word, the DQN has obtained its best action policy.

Figure 8: DQN's cost function: the plot shows the training progress of the DQN. We trained our model for about 4 million epochs.

When playing Flappy Bird, if the bird gets through a pipe we give a reward of 1, if it dies we give -1, and otherwise we give 0.1.
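This reward scheme amounts to a one-line function; the sketch below is a trivial illustration with hypothetical event flags, not the report's game-engine API.

```python
def reward(passed_pipe: bool, died: bool) -> float:
    """Reward scheme used during training: +1 for passing a pipe, -1 for dying, 0.1 otherwise."""
    if died:
        return -1.0
    if passed_pipe:
        return 1.0
    return 0.1
```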
Figure 9 shows the average reward returned from the environment. The stability in the final training stage means that the agent can automatically choose the best action, and the environment gives the best reward in return. The agent and the environment have entered into a friendly interaction, guaranteeing the maximal total reward.

Figure 9: The average reward returned from the environment. We average the returned reward every 1000 epochs.

As Figure 10 shows, the maximum Q-value predicted by the CNN converges and stabilizes around a fixed value after about 100000 epochs. This means that the CNN can accurately predict the quality of the actions in a given state, and we can steadily perform the action with the maximum Q-value. The convergence of the maximum Q-values indicates that the CNN has explored the state space widely and approximated the environment well.

Figure 10: The average maximum Q-value obtained from the CNN's output. We average the maximum Q-value every 1000 epochs.

Figure 11 illustrates the DQN's action strategy. If the predicted maximum Q-value is high, we are confident that the bird will get through the gap when performing the action with the maximum Q-value, as at points A and C. If the maximum Q-value is relatively low and we perform the action, we might hit the pipe, as at point B. In the final stage of training, the maximum Q-value is dramatically high, meaning that we are confident of getting through the gaps when performing the actions with the maximum Q-value.

Figure 11: The leftmost plot shows the CNN's predicted maximum Q-value for a 100-frame segment of the game Flappy Bird. The three screenshots correspond to the frames labeled A, B, and C respectively.

## 4 Conclusion

We successfully use DQN to play Flappy Bird, and it can outperform human beings. DQN can automatically learn knowledge from the environment, using only raw images to play the game, without prior knowledge. This feature gives DQN the power to play almost all similar games.