Finance and Economics Discussion Series
Federal Reserve Board, Washington, D.C.
ISSN 1936-2854 (Print)
ISSN 2767-3898 (Online)

Manufacturing Sentiment: Forecasting Industrial Production with Text Analysis

Tomaz Cajner, Leland D. Crane, Christopher Kurz, Norman Morin, Paul E. Soto, Betsy Vrankovich

2024-026
Please cite this paper as:
Cajner, Tomaz, Leland D. Crane, Christopher Kurz, Norman Morin, Paul E. Soto, and Betsy Vrankovich (2024). “Manufacturing Sentiment: Forecasting Industrial Production with Text Analysis,” Finance and Economics Discussion Series 2024-026. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2024.026.
NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.
Manufacturing Sentiment: Forecasting Industrial Production with Text Analysis*

Tomaz Cajner, Leland D. Crane, Christopher Kurz, Norman Morin, Paul E. Soto, Betsy Vrankovich

April 2024

Abstract
This paper examines the link between industrial production and the sentiment expressed in natural language survey responses from U.S. manufacturing firms. We compare several natural language processing (NLP) techniques for classifying sentiment, ranging from dictionary-based methods to modern deep learning methods. Using a manually labeled sample as ground truth, we find that deep learning models—partially trained on a human-labeled sample of our data—outperform other methods for classifying the sentiment of survey responses. Further, we capitalize on the panel nature of the data to train models which predict firm-level production using lagged firm-level text. This allows us to leverage a large sample of “naturally occurring” labels with no manual input. We then assess the extent to which each sentiment measure, aggregated to monthly time series, can serve as a useful statistical indicator and forecast industrial production. Our results suggest that the text responses provide information beyond the available numerical data from the same survey and improve out-of-sample forecasting; deep learning methods and the use of naturally occurring labels seem especially useful for forecasting. We also explore what drives the predictions made by the deep learning models, and find that a relatively small number of words—associated with very positive/negative sentiment—account for much of the variation in the aggregate sentiment index.
JEL codes: C1, E17, O14
Keywords: Industrial Production, Natural Language Processing, Machine Learning, Forecasting
*All authors are at the Federal Reserve Board of Governors. We thank the Institute for Supply Management, including Kristina Cahill, Tom Derry, Debbie Fogel-Monnissen, Rose Marie Goupil, Paul Lee, Susan Marty, and Denis Wolowiecki, for access to and help with the manufacturing survey data that underlie the work described by this paper. We are thankful for comments and suggestions from Stephen Hansen, Andreas Joseph, Juri Marcucci, Arthur Turrell, and participants at the Society for Government Economists Annual Conference, the ESCoE Conference on Economic Measurement, the Government Advances in Statistical Programming Conference, the Society for Economic Measurement Conference, and the Nontraditional Data, Machine Learning, and Natural Language Processing in Macroeconomics Conference. The analysis and conclusions set forth here are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors.
1 Introduction
In recent years there has been an explosion of interest in natural language processing (NLP) within finance and macroeconomics. The use of text data to forecast and assist in model estimation is becoming increasingly commonplace. Still, there are many open questions around the use of NLP in empirical work. For example, which of the numerous available methods work best, and work best in specific contexts? Are off-the-shelf tools appropriate, or are there greater returns to specializing models to the data at hand? How useful is text for forecasting real output indicators, such as manufacturing output? What explains the predictions made by complicated NLP models? This paper addresses these questions, using a novel dataset and a variety of NLP methods ranging from traditional dictionaries to fine-tuned transformer neural networks.
Our primary data source is the monthly survey microdata underlying the Institute for Supply Management’s (ISM) Manufacturing Report on Business. The survey is taken by purchasing managers at a representative sample of U.S. manufacturing firms. Part of the survey consists of categorical-response questions about aspects of their current operations, including production, inventories, backlogs, employment, and new orders. The answers to these questions are of the form “worse/the same/better than last month”, and are aggregated into the widely-reported ISM diffusion indexes. But the survey also includes free-response text boxes, where purchasing managers can provide further comments either in general or about specific aspects of their businesses; these comments are a novel source of signal about the economy and our focus in this paper.¹
Our first step is to quantify the text into an economically important and interpretable measure. We focus on sentiment, given that waves of optimism and pessimism have historically been linked to business cycle fluctuations (Keynes, 1937). We begin by evaluating various NLP methods in terms of their ability to correctly classify the sentiment expressed in individual comments. Our context is fairly specific: the data are manufacturing-sector purchasing managers opining about the business outlook for their firm, without much discussion of financial conditions. While there are numerous sentiment classification models available, many were developed with other data in mind, such as social media posts (Nielsen, 2011). Even within economics and finance, most work has focused on finance-related language (Araci, 2019; Correa et al., 2021; Huang et al., 2022). The lack of results for manufacturing-specific datasets motivates our assessment of a variety of NLP techniques.

¹ While ISM collects these responses through the survey, this text is confidential and not incorporated into the publicized indexes. A sample of responses is published in the monthly ISM Report on Business (see /supply-management-news-and-reports/reports/ism-report-on-business/).
One common approach is to count the frequency of words within a sentiment dictionary. Economists initially used positive and negative words from the psychology literature, but have since moved on to using domain-specific words (e.g., Correa et al., 2021) and using simple word counts to measure other types of tone, such as uncertainty (see Baker et al., 2016 and Gentzkow et al., 2019). While this method is transparent, it may fail to capture negation and synonyms, and it often requires context-specific dictionaries that may not be available. More recently developed techniques employ deep learning methods that account for the nuances of language. We focus on variants of BERT (see Devlin et al., 2018), a precursor of popular large language models like ChatGPT. These models are pre-trained: the parameters are set by exposing the model to a large corpus of text—such as the entirety of Wikipedia—and attempting to predict missing words or the relationship between sentences. The pre-trained models can be used to classify sentiment directly, or they can be further trained (“fine-tuned”) on a specific dataset. The latter approach attempts to get the best of both worlds: a solid ability to parse language from the exposure to a large quantity of training data, plus the context-specific nuance from the fine-tuning data. While deep learning gets enormous attention, it is ex-ante unclear whether it should outperform carefully curated dictionaries in our context.
Comparing the accuracy of these different methods on a sample of hand-coded comments from our dataset, we find that deep learning does have an advantage on our data, in part because the brevity of the comments means that many comments have no overlap with dictionary terms. In addition, we find that there is value in specializing the models to our data: the models fine-tuned on our data have the highest sentiment classification accuracy on a hold-out sample. These results point to the advantages of using pre-trained models, as well as carefully specializing them to the task at hand. Our hope is that these results help guide other economists when deciding between NLP approaches.
The sentiment measures based on free-form textual responses in the ISM data aggregate into indexes that closely mirror both the diffusion index based on the responses to the categorical survey and aggregate manufacturing output, as measured by the manufacturing component of industrial production. We further investigate the relationship between the average sentiment expressed by purchasing managers and manufacturing output econometrically. Our baseline forecasting model asks whether sentiment can help forecast manufacturing output and includes—among other controls—some of the ISM diffusion indexes, so the test is whether the sentiment indexes have additional information beyond the ISM categorical responses data. We find that most dictionary-based text variables do not help predict manufacturing output, with the exception of a curated financial stability-specific dictionary. On the other hand, sentiment variables from the deep learning models are predictive of future manufacturing output. Out-of-sample forecasting exercises show that the financial stability dictionary and deep learning techniques significantly reduce the mean squared forecast errors as well. Overall, our results suggest that purchasing managers’ survey responses contain useful forward-looking information, and that sentiment-based measures can improve the accuracy of forecasts of manufacturing output.
The exercises described above rely on a manually-labeled sample of the data, both to assess the accuracy of different methods and to help fine-tune some of the deep-learning based methods. However, the panel microdata allow for a different approach. Since firms are in the survey for multiple months, we can link the text (and other) data from a given month to next month’s firm-level production data. Fitting a model to these data lets us forecast firm-level production using firm-level lagged information. This methodology has two advantages. First, it gives us a much larger training sample size as compared to the manually labeled data. Second, it aligns the training data objective very precisely with the aggregate forecasting objective. On this second point, we do our best when manually labeling data to discern whether the comment is indicative of rising or falling industrial production. But there are plenty of ambiguous cases, so there are some clear advantages to letting the data speak, and seeing what text is actually associated with future (firm-level) changes in production. We find that fine-tuning in this way is competitive with using the manual labels, and in some cases preferable.
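As a concrete illustration of how such “naturally occurring” labels can be constructed, the sketch below pairs each firm-month comment with the same firm’s production response one month later. It is a minimal sketch in Python/pandas under stated assumptions: the DataFrame layout and column names (firm_id, month, text, production) are hypothetical placeholders, not the paper’s actual variables.

```python
import pandas as pd

# Hypothetical panel: one row per firm-month, with the free-response text
# and the categorical production answer ("Higher"/"Same"/"Lower").
df = pd.DataFrame({
    "firm_id":    [1, 1, 1, 2, 2],
    "month":      pd.to_datetime(["2019-01", "2019-02", "2019-03",
                                  "2019-01", "2019-02"]),
    "text":       ["orders are brisk", "demand slowing", "stable",
                   "backlog growing", "shipments weak"],
    "production": ["Higher", "Lower", "Same", "Higher", "Lower"],
})

df = df.sort_values(["firm_id", "month"])
# "Naturally occurring" label: the same firm's production response one month
# ahead, aligned with this month's text. (For simplicity this sketch ignores
# gaps in a firm's response history.)
df["label_next"] = df.groupby("firm_id")["production"].shift(-1)

# Keep firm-months where the following month's response is observed.
train = df.dropna(subset=["label_next"])[["text", "label_next"]]
print(train)
```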
Finally, we make progress on the explainability of deep learning models. These models are notoriously opaque, a consequence of their very high parameter count and extremely nonlinear architecture. This can make it difficult to trust the outputs of such models, as it is not initially clear if the seemingly good predictions are based on solid foundations. We use a standard machine learning interpretability method—Shapley decompositions—to score the contribution of each individual word in each comment. Our results point to a sensible interpretation of our deep learning models. First, the score for each word is roughly constant over time: words do not dramatically change their average connotation (though the underlying deep learning model allows for this). Second, there are fat tails to the scores: most words have scores very close to zero (neutral), with a relatively small number of words having extreme sentiment. For example, the most positive words include “brisk”, “excellent”, “booming”, “improve”, and “efficient”; among the most negative words are “unstable”, “insufficient”, “fragile”, “inconsistent”, and “questionable”. The close-to-neutral words contribute very little to aggregate sentiment, even after accounting for the fact that they occur very frequently. Finally, we find that changes in our aggregated sentiment index are largely accounted for by changes in the frequency of the words with the most extreme (positive or negative) sentiment scores, with the vast majority of words playing little role. Thus, while it may be difficult to manually construct a domain-specific dictionary from scratch, it is possible to extract a fairly simple, interpretable dictionary from the deep learning model.
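For readers who want to reproduce this style of decomposition, the snippet below scores per-token contributions with the shap library’s text explainer wrapped around a generic Hugging Face classification pipeline. This is a minimal sketch, not the authors’ implementation: the public SST-2 model named here stands in for the paper’s fine-tuned classifier.

```python
import shap
from transformers import pipeline

# A public sentiment model stands in for the paper's fine-tuned classifier.
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    return_all_scores=True,  # shap expects scores for every class
)

# shap wraps the pipeline with a text masker and attributes the predicted
# class probabilities across the tokens of each comment.
explainer = shap.Explainer(classifier)
shap_values = explainer(["New orders are booming and deliveries are brisk."])

print(shap_values.data[0])    # the tokens of the first comment
print(shap_values.values[0])  # per-token Shapley values, one column per class
```

Averaging these per-token values over every occurrence of a word in the corpus yields the word-level sentiment scores discussed above.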
Our paper contributes to two strands of literature. First, our comparison of NLP techniques for measuring sentiment adds to the growing body of literature incorporating NLP into economic and financial research. Since the seminal work of Tetlock (2007), many studies have used dictionary-based methods (Baker et al., 2016; Hassan et al., 2019; Young et al., 2021; Cowhey et al., 2022), and refined lexicons for specific contexts have been shown to improve performance in measurement and forecasting (Correa et al., 2021; Gardner et al., 2022; Sharpe et al., 2023). Machine learning techniques have also been used to select word lists (Manela and Moreira, 2017; Soto, 2021). More recent papers incorporate more sophisticated machine learning methods to extract the tense and topic of texts (Angelico et al., 2022; Hanley and Hoberg, 2019; Hansen et al., 2018; Kalamara et al., 2022). Advances in NLP, particularly the use of deep learning techniques, have significantly improved sentiment classification (Heston and Sinha, 2017; Araci, 2019; Huang et al., 2022; Bybee, 2023; Jha et al., 2024).
Second, we contribute to the literature on forecasting industrial production (D’Agostino and Schnatz, 2012; Lahiri and Monokroussos, 2013; Ardia et al., 2019; Cimadomo et al., 2022; Andreou et al., 2017). Our analysis of the relationship between sentiment and industrial production provides new insights into the role of unstructured text data in economic forecasting (Marcucci, 2024). By comparing various NLP techniques, we are able to identify which methods are most effective for classifying sentiment and incorporating them into predictive models of industrial production.
The paper most similar to ours is Shapiro et al. (2022), who find that domain-specific dictionaries can improve predictions of human-rated sentiment. We find broadly similar results using a financial stability (rather than a general purpose) dictionary to measure sentiment, but move one step further by providing a robust comparison to large language models. Our paper differs from theirs in two important ways. First, we focus on creating a sentiment index from firm-level data, rather than beginning the analysis at an aggregate macroeconomic level. Instead of measuring consumer sentiment through newspaper articles, we measure manufacturing sentiment from a panel of survey responses. Our unique micro-level data allow us to understand the value of text beyond categorical responses and naturally occurring labels. Second, Shapiro et al. (2022) compare lexicon-based sentiment approaches only to baseline BERT, which at the time was the most developed transfer-learning based model. We also consider newer deep learning models based on BERT, particularly those fine-tuned on domain-specific and naturally occurring data. We apply interpretability techniques to these ‘black box’ models and show that aggregate sentiment indexes derived from deep learning hinge on the frequencies of relatively few words.
The remainder of the paper is structured as follows. Section 2 presents our data. Section 3 reviews how we measure sentiment from the textual survey data and Section 4 overviews the resulting indexes. Section 5 presents the empirical strategy and findings, and Section 6 evaluates the mechanisms through which firm survey responses predict industrial production. Section 7 concludes.
2 Data
The primary data for this study comes from the Institute for Supply Management (ISM). Each month, ISM conducts a survey of purchasing managers from a sample of manufacturing firms in the United States.² Diffusion indexes based on the responses (described below) are published very rapidly, and are closely watched by markets. As highlighted in Bok et al. (2018), not only does such survey data provide important signal about the state of the economy, but the ISM data in particular provides the “earliest available information for the national economy on any given quarter”. In addition, the ISM data have a long time series, which is conducive to time-series modeling.³ The timeliness and relevance of the data motivates our exploration of the free-response text.
The ISM survey includes a series of questions about the respondents’ operations, including their production levels, new orders, backlog, employment, supplier delivery times, input inventories, exports, and imports. These questions have a categorical response, where the purchasing managers specify whether these metrics have increased, decreased, or stayed the same between last month and the current month. The categorical responses are aggregated into publicly-released diffusion indexes, discussed more below. In addition to the categorical response, purchasing managers can provide further explanation in accompanying text boxes. There are free-response questions accompanying nearly every categorical question, asking for the reason for the response. In addition, there is a “General Remarks” field at the beginning, where the respondent can put any general remarks they wish. Ten to twelve of these text responses are featured in the ISM’s data release to provide context for the diffusion indexes, but otherwise are not released publicly.

² ISM also surveys non-manufacturing firms and hospitals separately.

³ ISM series extend back to 1948, but most statistical analyses use data that starts in 1972.
The ISM manufacturing survey dates back to the 1930s. The dataset we analyze covers firm-month observations from November 2001 to January 2020. Most recently, the sample covers roughly 350 responses per month. The dark-shaded area of Figure 1 shows the percentage of firms in the sample with text responses over time. The figure illustrates that the majority of respondents provide text in addition to their quantitative survey answers. The black line in Figure 1 presents the average word count over the sample period. The word counts range from 10 to 33 words on average per month. The mean word count appears to fluctuate over the business cycle and jumps dramatically in 2018. The sudden increase in word count in 2018 is mostly due to heightened tensions surrounding trade policy at the time. Indeed, after removing responses that contain the word “tariff,” we observe a smoother increase in word counts (see Figure A1 in the appendix for further details).
Table 1 provides a summary of the text responses. Nearly 49 percent of the general remarks sections contain text, while the next most common sections containing text are those related to employment, production, and new orders. The last row shows statistics for all the text fields concatenated together: 69 percent of firm-month observations have any text at all, and the text is about 17 words long on average. The average word count is highest for the General Remarks section, with an average of 8 words used in these responses. When considering only those responses that contain text, the average word count for the General Remarks section increases to 16 words.
Turning from ISM’s survey microdata, we use several time series in our forecasting exercises. Our focus is on forecasting the manufacturing industrial production (IP) index. We use real-time data on the right-hand side, reflecting what policymakers knew at the time, and forecast the fully revised series. In addition to IP series, we use the ISM diffusion indexes as regressors. The diffusion indexes are aggregations of the categorical response questions in the survey. For example, the production diffusion index is a weighted average of the responses to the production question (paraphrasing, “Is production higher/the same/lower than last month?”), with the “Higher” responses getting weight 100, “Same” responses getting weight 50, and “Lower” responses getting weight 0. The formula for the diffusion index in period t, with $N_t$ total firms responding, is shown in equation (1):

$$D_t = \frac{1}{N_t} \sum_{i=1}^{N_t} \Big[ 100 \cdot \mathbf{1}\{\text{Response}_i \text{ is ``Higher''}\} + 50 \cdot \mathbf{1}\{\text{Response}_i \text{ is ``Same''}\} \Big] \tag{1}$$

These diffusion indexes have values between 0 and 100, with 0 indicating that all respondents say things are worse and 100 indicating that all respondents say things are better.⁴ ISM publishes indexes for each question, as well as a “PMI Composite”, which is an equally-weighted average of the diffusion indexes for new orders, production, employment, supplier deliveries, and inventories.
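As a quick check on equation (1), the following minimal Python function computes a diffusion index from a toy list of categorical responses, using the paraphrased “Higher/Same/Lower” categories (see footnote 4 for the question-specific wording).

```python
def diffusion_index(responses):
    """Equation (1): the average of 100 per "Higher" response and 50 per
    "Same" response, so the index runs from 0 (all worse) to 100 (all better)."""
    weights = {"Higher": 100, "Same": 50, "Lower": 0}
    return sum(weights[r] for r in responses) / len(responses)

# Five respondents: two "Higher", two "Same", one "Lower" -> D_t = 60.0
print(diffusion_index(["Higher", "Higher", "Same", "Same", "Lower"]))
```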
3 Measuring Sentiment
Our goal is to extract useful information from the ISM survey text responses. We focus on sentiment analysis: measuring the extent to which the purchasing manager’s response is positive or negative. Even focusing on sentiment analysis, the wide range of NLP techniques available can make it challenging to choose an appropriate method. In this section we discuss the methods we use, leaving a complete description of the approaches to the Appendix.
3.1 Dictionaries
One of the simplest methods for measuring sentiment is dictionary-based analysis, which involves counting the frequency of a predetermined list of sentiment words in the text. We use common sentiment dictionaries such as the Harvard (Tetlock, 2007) and AFINN (Nielsen, 2011) word lists. However, we also recognize that certain words that may be considered negative in other contexts may not be considered negative in the context of finance, such as “taxing” or “liability”. As such, we also apply finance-specific word lists, including the sentiment word list from Loughran and McDonald (2011) (henceforth, “LM”) and the financial stability word list from Correa et al. (2021). For all dictionaries, we score comments on a scale of -1 to +1, using the percent of total words in the comment that are positive less the percent of total words that are negative. When we require discrete classifications, as in Figure 2, we classify the comment as positive if the score is greater than zero, negative if it is less than zero, and neutral if it equals zero.

⁴ The responses are “better”, “same”, or “worse” for the new orders question, production, and new export orders. For employment, inventories, prices, and imports the responses are “higher”, “same”, and “lower”. For backlogs the choices are “greater”, “same”, and “l(fā)ess”.
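To make the scoring rule concrete, here is a minimal sketch of the dictionary score and the discrete classification; the two word sets are tiny hypothetical stand-ins for the actual Harvard, AFINN, LM, or financial stability lexicons.

```python
import re

# Hypothetical stand-ins for a real sentiment dictionary.
POSITIVE = {"strong", "improve", "brisk", "excellent"}
NEGATIVE = {"weak", "decline", "unstable", "fragile"}

def dictionary_score(comment: str) -> float:
    """Share of words that are positive minus share that are negative,
    yielding a score in [-1, +1]."""
    words = re.findall(r"[a-z]+", comment.lower())
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)

def classify(comment: str) -> str:
    """Discrete label of the kind used for Figure 2-style comparisons."""
    s = dictionary_score(comment)
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"

# One positive hit ("brisk") vs. two negative hits -> score -0.1 -> "negative"
print(classify("Business is brisk but supply chains remain fragile and weak."))
```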
3.2 Deep Learning Models
Another approach to sentiment analysis involves fitting a model to the data. We try several variations on this theme. Unlike the dictionary methods, all of these approaches require labeled data: a sample of observations that have already been classified, which is used to fit the model and classify the remaining observations.
We create a labeled dataset from a randomly selected subsample of 1,000 responses with text from the individual questions.⁵ Each response was classified for sentiment by two economists using the following question as a guide: “Is this comment consistent with manufacturing IP rising month over month?” The classifications were either positive, neutral, or negative, where “neutral” includes cases where it is impossible to determine the sentiment. Both economists agreed on the sentiment classification for roughly 700 cases. This subsample is further split into a “training” dataset, used to fit the models, and a “test” dataset, used to assess the relative merits of the models.⁶
Deep learning models have gained popularity in recent years, driven by their impressive performance on language-related tasks. Much of the progress has occurred within a particular class of deep learning models called transformers (see, e.g., Devlin et al., 2018, Radford et al., 2018, Chung et al., 2022, Ouyang et al., 2022, and Touvron et al., 2023). The defining feature of transformers—relative to other neural network architectures—is a mechanism called attention: a way for the words within a sentence to interact, allowing the context of a particular word to influence its meaning. A full explanation of transformers and the attention mechanism is beyond the scope of this paper, but we do provide a brief summary in the Appendix. The important points are that (unlike dictionaries and bag-of-words approaches) transformers take into account interactions between words, word order, and context-dependent meanings (polysemy).
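For concreteness, the snippet below implements the standard scaled dot-product attention formula in numpy. It is a bare-bones illustration of the mechanism described above, not code from the paper; in a real transformer, Q, K, and V are learned linear projections of the token embeddings, and attention runs across many heads and layers.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each position's output is a weighted
    average of the value vectors V, with weights given by how strongly its
    query matches every position's key."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (seq, seq) match strengths
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V                              # context-weighted values

# Three tokens with 4-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
print(attention(x, x, x).shape)  # (3, 4): one context vector per token
```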
One notable transformer model is “BERT”, or Bidirectional Encoder Representations from Transformers, developed by Devlin et al. (2018). It is important to note that BERT is a pre-trained model: Devlin et al. (2018) specified the architecture and then trained the model on a corpus including the entirety of (English) Wikipedia and a number of books. The model is large by the standards of the economics literature, with roughly 110 million parameters.

⁵ Note that the categorical responses can be considered a kind of label for the corresponding text. In Section 4.1 we investigate how well models can predict the categorical response from the associated text.

⁶ The test data consists of observations from 2018m1 to 2020m1 and is not used by any of the models during training.
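To make the pre-training versus fine-tuning distinction concrete, the sketch below loads pre-trained BERT weights and attaches a fresh three-class head (negative/neutral/positive) with the Hugging Face transformers API. The toy dataset and hyperparameters are illustrative assumptions, not the paper’s actual configuration.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Pre-trained BERT body plus a randomly initialized 3-class head.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)  # 0=negative, 1=neutral, 2=positive

# Toy stand-in for the hand-labeled training split described above.
train_ds = Dataset.from_dict({
    "text": ["new orders are booming", "business is steady",
             "demand remains fragile"],
    "label": [2, 1, 0],
})

def tokenize(batch):
    # Survey comments are short, so a modest max_length suffices.
    return tokenizer(batch["text"], truncation=True, max_length=128)

train_ds = train_ds.map(tokenize, batched=True)

args = TrainingArguments(output_dir="ism-bert-sentiment",
                         num_train_epochs=3,
                         per_device_train_batch_size=16)
trainer = Trainer(model=model, args=args, train_dataset=train_ds,
                  tokenizer=tokenizer)  # tokenizer enables dynamic padding
trainer.train()
```

Fine-tuning on the naturally occurring labels works the same way, with the hand-coded labels replaced by next month’s categorical production responses.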