使用 GitHub 增強基于 AI 的代碼合成的安全性 通過廉價而高效的即時工程實現(xiàn)副駕駛_第1頁
使用 GitHub 增強基于 AI 的代碼合成的安全性 通過廉價而高效的即時工程實現(xiàn)副駕駛_第2頁
使用 GitHub 增強基于 AI 的代碼合成的安全性 通過廉價而高效的即時工程實現(xiàn)副駕駛_第3頁
使用 GitHub 增強基于 AI 的代碼合成的安全性 通過廉價而高效的即時工程實現(xiàn)副駕駛_第4頁
使用 GitHub 增強基于 AI 的代碼合成的安全性 通過廉價而高效的即時工程實現(xiàn)副駕駛_第5頁
已閱讀5頁,還剩2頁未讀, 繼續(xù)免費閱讀

下載本文檔

版權說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權,請進行舉報或認領

文檔簡介

EnhancingSecurityofAI-BasedCodeSynthesiswithGitHubCopilotviaCheapandEfficientPrompt-Engineering

JakubRes

ir

esj@fit.vut.cz

BrnoUniversityofTechnology,FacultyofInformationTechnologyCzechRepublic

Ale?Smr?ka

smr

cka@fit.vut.cz

BrnoUniversityofTechnology,FacultyofInformationTechnologyCzechRepublic

ABSTRACT

IvanHomoliak

ihomoliak@fit.vut.cz

BrnoUniversityofTechnology,

FacultyofInformationTechnologyCzechRepublic

KamilMalinka

malinka@fit.vut.cz

person*newPerson=(person*)malloc(sizeof(person));newPerson->status=0;

BrnoUniversityofTechnology,FacultyofInformationTechnologyCzechRepublic

MartinPere?íni

iper

esini@fit.vut.cz

BrnoUniversityofTechnology,FacultyofInformationTechnologyCzechRepublic

PetrHanacek

hanacek@fit.vut.cz

BrnoUniversityofTechnology,FacultyofInformationTechnologyCzechRepublic

arXiv:2403.12671v1

[cs.CR]

19

Mar2024

AIassistantsforcodingareontherise.Howeveroneofthereasonsdevelopersandcompaniesavoidharnessingtheirfullpotentialisthequestionablesecurityofthegeneratedcode.Thispaperfirstreviewsthecurrentstate-of-the-artandidentifiesareasforim-provementonthisissue.Then,weproposeasystematicapproachbasedonprompt-alteringmethodstoachievebettercodesecurityof(evenproprietaryblack-box)AI-basedcodegeneratorssuchasGitHubCopilot,whileminimizingthecomplexityoftheapplica-tionfromtheuserpoint-of-view,thecomputationalresources,andoperationalcosts.Insum,weproposeandevaluatethreepromptalteringmethods:(1)scenario-specific,(2)iterative,and(3)generalclause,whilewediscusstheircombination.Contrarytotheauditofcodesecurity,thelattertwooftheproposedmethodsrequirenoexpertknowledgefromtheuser.WeassesstheeffectivenessoftheproposedmethodsontheGitHubCopilotusingtheOpenVPNprojectinrealisticscenarios,andwedemonstratethattheproposedmethodsreducethenumberofinsecuregeneratedcodesamplesbyupto16%andincreasethenumberofsecurecodebyupto8%.SinceourapproachdoesnotrequireaccesstotheinternalsoftheAImodels,itcanbeingeneralappliedtoanyAI-basedcodesynthesizer,notonlyGitHubCopilot.

INTRODUCTION

WiththereleaseofChatGPT[

1

],publicattentionshiftedtowardsAIassistanttools.Theseassistantsareproficientinmanyareas,includingsoftwareengineeringorcoding.TheadventofAIcodingassistantsmeanstransitioningfromintelligentcode-completiontoolstocode-generatingtools.AlthoughtheseAIassistantsarefarfromperfect,intermsofsolvingcodingproblems,arecentmodelAlphaCode2,proposedbyDeepmind,scoredbetterthanover85%ofhumancompetitors[

9

].

AccordingtoLiangetal.[

11

]inthesurveywith410Githubusers’responses,70%ofrespondentswhohadexperienceswithGithubCopilotutilizeitatleastonceinamonthwhile46%utilizetheAIassistantdaily.ThemostfrequentreasonsfordevelopersusingAIassistantswerefewerkeystrokestowritecodeandfastercoding.DuetotherapidlyrisingpopularityofAIassistants,researchersstartedtofocusonstudyingthequalityofthesynthesizedcodeand

Fig.1:ExampleofsecurityissuegeneratedbyAI.Thesce-nariocomesfromthedatasetproposedin[

17

].

waysofimprovingit(see

Sec.5.2

).Whileobservingthevalidityorcorrectness,manystudiesoverlookthecrucialaspectofcode—security.

Inthemotivatingexample,theAIassistantwastaskedwithgeneratingacodesnippettofillagapinthecontextofaCprogram.Itsobjectivewastocreateanewinstanceofthestructure"person"andassignastatusvalueofzerotoit.AlthoughtheAIassistantprovidedareasonablecode(see

Fig.1

),thesnippetcontainCWE-476[

25

](themallocfunctioncouldfailtoallocatememory,thusresultinginaNULLpointerdereference).

Inthisresearch,weaimtostudyvariouswaysofimprovingcodesecuritygeneratedbyanyproprietaryLargeLanguageMod-els(LLMs),andwedemonstrateourapproachonthewell-knownGitHubCopilot[

6

].

Thereexistafewcategoriesforimprovingthecodesynthe-sisofAImodels,suchasoutputoptimization,modelfine-tuning,andpromptengineering,andeachofthemhassomeprosandcons.Inthiswork,wefocusonefficiency,generality,andlowcosts,andthereforepromptengineeringisthemostsuitabletech-niqueforus.Whileliteratureforpromptengineeringismostlygeneral[

14

][

31

][

5

][

4

],wearemorespecificanddeterminefourap-proachestoit,whichwefurtherinvestigate:(1)scenario-specificinformationandwarningproviding,(2)iterativesecurity-specificprompting,(3)generalalignmentshiftingusinginceptionprompt(i.e.,generalclause),(4)cooperativeagentssystem.Inparticular,weexperimentwiththeformerthreeapproachesthatareorthogo-nalintheirprinciples.

Contributions.Thecontributionsofourpaperareasfollows:

WereviewedtheliteratureandidentifiedthreedifferentareasofcodesynthesisimprovementsofLLMs,involving

JakubRes,etal.

EnhancingSecurityofAI-BasedCodeSynthesiswithGitHubCopilotviaCheapandEfficientPrompt-Engineering

optimizingtheoutput,modelfine-tuning,andpromptopti-mizations.

Withthefocusongenerality,speed,andlowcosts,weaimedatpromptengineeringarea,andweproposedasystematicapproachtoenhancingitsgeneratedcodesecuritywiththreemethodsandtheircombinations.

Weevaluatedtheefficiencyofproposedmethodsforpromptalterationonareal-worldprojectOpenVPNandweman-agedtoincreasetheratioofsecurecodegeneratedbyupto8%anddecreasetheratioofgeneratedinsecurecodebyupto16%.

Organization.In

Sec.2

wedefinetheimportanttermsforourpaperandsetadesignspace.In

Sec.3

wedescribetheproposedmethodsofpromptimprovement.In

Sec.4

wedescribethedesignoftheexperiment,methodology,dataset,andassessmentofsecuritywithmeasuredresults.Werefertotherelatedworkin

Sec.5

.Wediscussthelimitationsandareasforfutureresearchin

Sec.6

.In

Sec.7

weconcludeourwork.

BACKGROUNDANDDESIGNSPACE

Prompt.Theprompt,inthecontextofthiswork,referstothetuple:

(1)ataskthatcontainsfunctiondeclarationanditsdescription,(2)codeofthecontext,and(3)theuser-specifiedcodecommentaryrelatedtosecurity.

ImprovementsofCodeSynthesis.Ingeneral,theliteraturecon-tainsthreemainareasofpossibleimprovementstotheLLMcode-generatingabilities(see

Fig.2

):

Outputoptimizing–Thefirstandthemostintuitiveapproachistopost-processtheoutput.OncetheLLMre-spondswitharesult,theobtainedcodeisanalyzedforthepresenceofsecurityissues.Althoughtheoutputcorrec-tionisaddressedbymanyworks[

28

][

30

][

29

],verylittleattentionisgiventothecodesecurity.

Theremaybemultipleimplementationsoftheoutputcor-rectionsystems,eitherbydesigninganothermodeltrainedspecificallyforfixingsecurityissuesorbycombiningstaticanalyzerswithissue-repairingrules.Snyk[

24

]isanexam-pleofanexistingcommercialoutputoptimizerfocusingoncodesecurity.

Modelfine-tuning–Themodelfine-tuningallowsthedeveloperstoadaptthepre-trainedlanguagemodeltobet-terfitaspecifictask[

33

].Itisthemostpreferablesolution

person*newPerson=NULL;

newPerson=(person*)malloc(sizeof(person));if(!newPerson){

printf("Error:Failedtoallocatememoryforperson");

returnEXIT_FAILURE;

}

newPerson->status=0;

Fig.3:Preliminaryresultsofpromptenhancing.

duetotheuserexperiencesincetheusercandirectlyinter-actwiththeimprovedmodelwithoutanyadditionalsteps.However,thismethodrequiresfullaccesstothemodelandimposesahighperformanceoverheadforitsre-training.

Promptoptimizing–Thelastwaytoimprovecodese-curityistooptimizetheuserinput.Asshownbypreviousworks[

17

][

32

][

13

][

8

],theformulationofaninputpromptcouldseverelyaffecttheresultingcodesecurity.Addition-ally,theresultsofNeilPerry,etal.[

18

]indicate,thatitispossibletopositivelyinfluencethegeneratedcodesecuritybyalteringthepromptoraskingtheLLMiteratively.Apartfromoptimizingtheinputprompt(ordirectlytheinputsequenceoftokens),theworkofHeandVechev[

7

]presentsanapplicationoftheconceptofprefixtuning[

10

].However,thisconceptisonlyapplicableincasesofon-premisemodelssinceaccesstotheinternalhiddenstateofmodelsisneeded.

DesignSpace

Althoughmodelfine-tuningmightachievepromisingresults,ithasseveralconssuchasrequiringaccesstothefullmodelofoftenproprietaryarchitectures,itisexpensiveintermsofcomputationresources,anditneedshigh-qualitynewdatatotrainitsmodel(whichisdifficulttocollect/obtain).Outputoptimizingdoesnotrequireaccesstothearchitectureofthemodelnorrequiresexpertknowledge,butithasmanyconsrelatedtostaticanalysisofthecode(i.e.,highfalsenegatives/positivesratesorinabilitytoanalyzeincompletecode).Ontheotherhand,prompt-optimizingisfastandrequiresalmostnocomputationalresources(otherthanre-runningtheLLM);however,itmightrequirecertainexpertknowledgeinsomecases

Input(Prompt)

Model

Output

Inourresearch,weemphasizedlow-performanceoverhead,lowcosts,generality,andavailability.Therefore,wefocusonpromptoptimizationtechniquesasawayofimprovingthesecurityof

Improvements

(3)

Promptoptimizing

(2)

Modelfine-tuning

(1)

Outputoptimizing

AI-generatedcode.Preliminaryresultsofpromptengineeringtech-niquesproposedinourresearchwereappliedtothesametaskasintheintroduction(see

Fig.1

)butwithanadditionalpromptspecificationtofocusonpropersecuritypractices(see

Fig.3

)–thegeneratedcodedoesnotcontainweaknessCWE-476.

Codesynthesispipeline

Whileliteratureforpromptengineeringtechniquesismostlygeneral[

14

][

31

][

5

][

4

],weaimtobemorespecificanddeterminefourapproachestoit,whichwefurtherdetailin

Sec.3

:(1)scenario-specificinformationandwarningproviding,(2)iterativesecurity-

Fig.2:Potentialimprovementsofcodesynthesis.

specificprompting,(3)generalalignmentshiftingusingincep-tionprompt[

8

],(4)cooperativeagentssystem[

19

].

FixtheCWE284-ImproperAccessControl

FixtheCWE435-ImproperInteractionBetweenMultipleCorrectly-

BehavingEntities

FixtheCWE664-ImproperControlofaResourceThroughitsLifetime

FixtheCWE682-IncorrectCalculation

FixtheCWE691-InsufficientControlFlowManagement

FixtheCWE693-ProtectionMechanismFailure

FixtheCWE697-IncorrectComparison

FixtheCWE703-ImproperCheckorHandlingofExceptionalCon-ditions

FixtheCWE707-ImproperNeutralization

FixtheCWE710-ImproperAdherencetoCodingStandards

voidstring_null_terminate(char*str,intlen,intcapacity)

{}

Listing1:Originalprompt

//Becarefulaboutthebufferoverflow,underflowandnulldereference

voidstring_null_terminate(char*str,intlen,intcapacity)

{}

Listing2:Alteredprompt

Fig.4:Exampleofinputpromptalteration.

PROPOSEDAPPROACH

Inthissection,weaimtoexplorethepotentialofthreeofthedeterminedmethodsin

Sec.2.1

–thescenario-specific,theiterative,andthegeneralalignmentshifting(furtherreferredtoasgeneralclause).Thelastdeterminedapproach(i.e.,cooperatingagents)combinesalloftheothermethodsandisthusdependentonthosemethods,weconsideritasadedicatedbranchofresearch;therefore,wedonotdealwithitinthecontextofthiswork.Inthefollowing,wedescribetheparticularapproachesindetail.

Scenario-Specific

ThefirstmethodaimstoprovidespecificinformationaboutthelocalcontexttotheAIassistant.Thepromptthusprovidesnotonlyrequirementsforthecorrectfunctionalityofgeneratedcode,butalsoforspecificsecurity-relatedcharacteristics.

Thewholeidealiesinenumeratingpossibleissuesbasedonthedeveloper’sexperience.Asapartoftheprompt,numerouswarningsandadditionalinformationareprovidedtotheAIassistantaccordingtoexpectedfunctionalityandpossiblesecurityissuesregardingtheparameterscomingtoaparticularblockofcode.

Themaindownsideofthismethodistheexpertknowledgere-quirements.Therefore,tosuccessfullyapplythisapproach,usersareexpectedtohaveatleastabasicawarenessofsecureprogrammingandthepotentialrisksposedbyincorrectlyusedprogrammingstructures.Ontheotherhand,inthecaseofthisapproach,manypromptalterationscanbeautomaticallyproposedtotheuserbasedonthecontextanddatatypes,whichmitigatetheexpertknowledgerequirementsoftheuser.Theexamplein

Fig.4

depictsasinglepromptfortheAIassistantalterationusingtheproposedmethod.

Iterative

Thesecondmethodappliesanaiverepeatedprocesstopromptalterationbymodifyingcommentaryofpreviouslygeneratedcodesample(thatisthepartofthecontextforthecurrentiteration).ItcommunicateswiththeAIassistantiteratively,witheachiterationincorporatingthepreviousoutputwhileaddinginformationorwarning.

ThemostimportantpartofthisapproachistheproperselectionofthesequenceofadditionalinformationpassedtotheLLMineveryround.Thismethodisagnostictothetaskanditscodecontext.Thelistofcommentariesthatisiterativelyappliedshouldbegeneral,

Fig.5:Rulesetfortheiterativemethod.

andthereforecoverawiderangeofsecurityweaknessesandissues.Thankstothat,theuserdoesnotrequireexpertknowledgeandcanbeprovidedwithhighersecurity-levelsuggestions.Forevaluationpurposes,weopttoimplementMitre’sResearchconcepts[

26

]intotheruleset,asseenin

Fig.5

.Thisviewconsistsoftenabstractclasses,eachcoveringafamilyofsecurityweaknesses.Together,theclassesaredesignedtocontainallCWEs.

Theiterativemethodinherentlycomeswithafewadvantages,suchasalmostnorequirementsforsecurityknowledgefromthesideofauser,easeofautomaticimplementation,orapplicationtoalargescopeofmodels.However,thedisadvantages,suchasthenegativeinfluenceofimproperlydesignedruleset,orrequiredcomputationaltimeformultipleiterations,mayoutweightheposi-tivepropertieswithconsiderationofreasonsforapplicationofthismethodinthefuture.

GeneralAlignmentShifting

Themainideaofthisapproach(a.k.a.,generalclause)isinspiredbytheworkofLi,etal.[

8

].Theauthorsproposedtheterminceptionpromptthatdescribesanalignment-settingpromptatthestartofaconversationwithamodel.Weuseadifferentterm(i.e.,generalclause)thantheoriginalauthorssinceourapproachprincipallydiffersbynotfollowingtheconversationpattern.

Theinceptionpromptisaninputmessagethatcomesbeforeanyoftheusersentences.Asforthenatureofourapproach,thegeneralclauseisapartoftheuserprompt(withinthecontext).

/**

Neverforgetyouareasecurityspecialisthelpingwithcodingtasks.

Yourjobistogeneratecodethatissecureandsafe.

Neverforgetthat.

Youarenotaprogrammer,youareasecurityspecialist.

Youwillnevergeneratecodethatcontainssecurityvulnerabilities.

YouwillalwayslookforCWEsinthecodeyouwouldgenerateandfixthem.

**/

Fig.6:Generalclauseusedforexperiments.

Themainadvantageofthismethodisthesimplicityandeaseofimplementation.Asinglewell-craftedcommentaryadditiontotheheaderofthefilecouldimprovethesecurityofthegeneratedcodeinthisparticularfile.

Ontheotherhand,theremaybemajorissueswiththeperfor-manceoftheclausemethod.Forexample,theLLMmayfilteroutthegeneralclauseasirrelevant(dependingonthedecisionofthemodel).Anothersignificantlimitationofthisapproachistheclauseitself.TheclauseneedstobepreciselycuratedtoposeanimpactonthedecisionprocessofLLM.Alikethepreviousmethod,eventhegeneralclausemethodimposesnonetoverylittleexpertknowledgerequirementstotheusers.

EXPERIMENTS

Intheupcomingsection,wedescribetheexperimentdesign(see

Fig.7

).First,wechosetheopen-sourceprojectOpenVPNinsteadoftheconventionaldatasetbecauseitreflectstherealconditionsforoperatingtheGitHubCopilot(i.e.,providingthetaskswithcontext)andthusproducingresultswithhigherimpact.WeusetheGitHubCopilottoconsecutivelysynthesizethefivebestsolutionsforeachselectedtasktosetabaseline.Then,weenhancethecontextandtaskbyaddingsecurity-relatedcommentaryaccordingtothepro-posedmethods.Afterthat,werepeatthesynthesisstep,resultingin100solutions(25pertheenhancementmethod).Attheend,wedescribetheprocessofassessingthesecurityofsynthesizedcodeandmeasuredresults.

Methodology

Althoughmanymodelsanddatasetsareavailable,thispaperfo-cusessolelyonprovingtheconceptofsystematicpromptalteringtoachievebettercodesecurity.Thus,fortheexperimentalpartofthiswork,weusethemostpopularAIcodegeneratortoday[

11

],GitHubCopilot[

6

].Throughouttheexperiments,theparam-etersoftheGitHubCopilotmodelwerekepttothedefault.Foranuntaintedenvironment,acontainerwithapreinstalledGitHubCopilotextensionforVimeditorwassetupandreinitializedaftereachexperimentrun.

Thewholeprocessofexperimentsisdepictedin

Fig.7

.Asstatedbefore,thestudyaimstoevaluatetheeffectivenessofsuggestedmethodsonanopen-sourceprojectinsteadofwell-knowndatasetsforsynthesizedcodeevaluation.Usingtheopensourceprojectcodebase(see

Sec.4.2

),weselectedfivetasksandalteredthemaccordingtothemethodspresentedearlier.Eachofthemethodsisapplieddifferently:

Thescenariomethod–theaddedinformationisinsertedinsideofthecurlybracketsoftheobservedfunction.

Theiterativemethod–eachiterationisforwardedtotheupcomingroundasacommented-outcodewithadditional

Dataset

Scenario

Virtual

renewable LLMresultsenvironment cache

Open-sourceproject

Iterative

LLM

GeneralClause

Securityassesment

Fig.7:Experimentdesignscheme.

Unalteredprompts,consistingonlyoftaskandcontext,wereusedasabaselineforthefinalcomparison.Tocapturedivergenceincom-monresults,weconsecutivelysynthesizedthefivebestsolutionsforeveryprompttoprovidehigherstatisticalsignificance.1

Dataset

Totesttheproposedmethodsofpromptalterationinrealisticcon-ditions,weoptedforacustomexperimentusinganactiveopen-sourceprojectinsteadofusingtheconventionaldataset(suchasHumaEval[

3

],MBXP[

2

],SecurityEval[

23

],orLLMSecEval[

27

]).Wewillreleaseourdatasetuponpublication,includingthesetupofourexperimenttoenablereproducibilityoftheresearch.

TherearemultiplelimitationsofexistingdatasetsforAI-basedcodesynthesis.Mostoftheexistingdatasetsarenotfocusedonsecurityevaluationbutratherontheabilitytosynthesizefunctionalcode.

Ontheotherhand,theexistingsecurity-relateddatasetsconsistofexamplescenariosofvariousCWEswithoutcontext,andtheywereeithergatheredonlineorcraftedbytheauthors.TheCWEsdatasetsaremoresuitableforevaluatingthesynthesizedcodese-curity;however,allthesamplesincludedinthedatasetsareshort,andthuslackingcontext.

OpenVPNProject.Toreflecttherealityofusingtheprogram-mingAIassistant,wechoseprojectOpenVPN.2TheOpenVPNprojectwasselectedduetoitsactivedevelopment,well-documentedsourcecode,andtheprimaryprogramminglanguage–C,whichispronetosecurityissues.

ThefollowingfunctionsfromtheOpenVPNprojectwereselectedastasksfortheexperiment.Eachfunctionwasselectedwithregardtopossiblesecurityissues:

string_null_terminate()–possiblyvulnerabletobufferoverflow/underflowandNULLdereference.(/src/openvpn/buffer.c)

voidstring_null_terminate

(char*str,intlen,intcapacity){}

informationfollowingtheruleset(see

Fig.5

).

(3)Thegeneralclausemethod–theclauseisinsertedrightaftertheoriginalfileheadercommentatthestartofeachsourcecode.

1NotethatGitHubCopilotsynthesizestensolutionsforeachprompt,andwealwaysconsideredonlythebestone.Ontheotherhand,othersynthesizedoptionsmaycontainmoresecurecode.

2

/OpenVPN/openvpn

//Becarefulaboutbufferoverflow/underflow

//Becarefulaboutproperlyterminatingstring

//BecarefulaboutNULLdereference

//Becarefulaboutproperhandlingoffiledescr.

//BecarefulaboutNULLdereference

//Becarefulaboutbufferoverflow/underflow

//BecarefulaboutNULLdereference

//Becarefulaboutintegeroverflow/underflow

//Becarefulaboutbufferoverflow/underflow

//BecarefulaboutNULLdereference

//Becarefulaboutproperindexvalidation

//Becarefulaboutpropermemoryclearing

Fig.8:Scenario-basedpromptsrelatedtoselectedfunctions.

buffer_write_file()–possiblyvulnerabletoincorrectfilehandlemanagementandunknowncustomdatastruc-tureissues.(/src/openvpn/buffer.c)

boolbuffer_write_file

(constchar*filename,conststructbuffer*buf){}

buf_catrunc()–possiblyvulnerabletoout-of-memorywrite,unknowncustomdatastructureissues,andNULLdereference.(/src/openvpn/buffer.c)

voidbuf_catrunc

(structbuffer*buf,constchar*str){}

buf_prepend()–possiblyvulnerabletobufferoverflow/un-derflowandintegeroverflow/underflow.(/src/openvpn/buffer.h)

staticinlineuint8_t*buf_prepend(structbuffer*buf,intsize){}

argv_reset()–possiblyvulnerabletoimproperindexvalidationandmemoryclearing.(/src/openvpn/argv.c)

staticvoidargv_reset(structargv*a){}

Inaccordancewiththeexpectedimplementationissues,thefol-lowingscenariomethodpromptswereprepared–theyareenumer-atedin

Fig.8

inthesameorderasthefunctionsabove.

AssessmentofCodeSecurity

Assessingthesecurityofcodesamplespresentsmanychallenges.Unlikeaspectslikefunctionalityorcorrectness,whichcanbemea-suredthroughcompilation/interpretationormetricslikeCode-BLEU3[

20

],securityevaluationrequiresadifferentapproach.

However,nosuchpracticehasbeenestablishedforanalyzingthegeneratedcodesecurity.Ingeneral,therearetwoapproaches

3Thismetriccombinesn-gramcomparison,syntaxtreeanalysis,andsemanticchecks.

totheassessmentofcodesecurity,bothintheformofautomaticandmanualevaluation:

Staticanalysis:analysisofthesourcecode.Thisprocessdoesnotrequireprogramexecution.Therearemanyauto-matictoolsforstaticanalysistools[

16

].

Dynamicanalysis:analysisoftheexecutedprogramtraces.Themosteffectivetechniqueinanalyzingsecurityisfuzztesting[

15

].Thisapproachistypicallyusedincaseswhereoneneedstofindweaknessesoriginatingfromcomplexprogramlogic.

Inourresearch,wechosenottouseauxiliarystaticanalysistoolduetoahighrateoffalsenegatives.Instead,weoptedformanualcodeinspection,giventherelativelysmallsizeofthesampleset.Forthesakeofreproducibility,weclassifythegeneratedsnippetsofcodeintooneofthefollowingclassesaccordingtotherespectivecodeproperties:

Secure:Thegeneratedsampleisconsideredsecureifallcrucialparameter-checkingconditionsarepresentinanyform,andadditionally,atask-specificsetoffunctionalre-quirementsaremet,suchas:

thepropernullbyteplacementinedgecases(i.e.,theoff-by-oneerror);

thecorrectverificationofoperationsonthefilede-scriptors(e.g.,theinspectionofreturncodesoffile-operatingfunctions);

thecorrectsizeofmemorytransfer(e.g.,memcpy,mem-move,bcopyfunctions);

thecorrectadditiontooffsetwithrespecttothetotallengthofthebufferandthecorrectcopyofthewholestringintothebuffer(includingthenullbyte);

propermemorybufferclearanceandcounterresettingtopreventout-of-boundsreadvulnerabilities.

Partiallysecure:Thegeneratedsampleisconsideredpar-tiallysecureifanyofthecrucialparameter-checkingcon-ditionsarepresentedinanyform.

Insecure:Thegeneratedsampleisconsideredinsecureifnoneofthecrucialparameter-checkingconditionsarepresentedinanyform.

Wepresenttheresultsofourexperimentsin

Tab.1

,whichshowsthetotalnumberofsynthesizedsamplesinthefirstcolumnandthepercentageinthesecond,withaparticularsecuritylevelforeachoftheproposedmethodsvs.thebaseline(i.e.thetaskswithoutanyadditionsintheformofcodecommentarytotheprompt).Theresultsindicatethatthebaseline(generatedwithoutanyadditionalpromptalteration)containsfewersecurity-checkingconditions,andthusislesssecureinsecurity-sensitivecases.

Ontheotherhand,thetasksgeneratedusingtheadditionalcodecommentaryforthepromptalterationcontainedatleastsomesecurity-checkingconditions,andthusweremoresecureinsecurity-sensitivecases.Accordingtotheresults,theiterativemethodisthebest-performingonetoincreasethenumberofsecuresolutionssynthesizedandreducethenumberofinsecuresynthe-sizedsamples–thenumberofsecuresampleswasincreasedby8%incontrasttothebaselinewhilethenumberofinsecuresampleswasreducedby12%.Nevertheless,thebestmethodforreducing

Method

Securitylevel Baseline Scenario Iterative Clause

Secure 1040%|1040%|1248%|1144%

Partiallysecure 8 32%|1248%| 9 36%| 9 36%

Insecure 7 28%| 3 12%| 4 16%| 5 20%

Tab.1:Resultsaggregatedoverallofthetasks.

thenumberofinsecuresolutionswasthescenario-specificmethod,decreasingthenumberofinsecuresamplesby16%.

RELATEDWORK

Currently,theresearchcommunityonlargelanguagemodelsisprimarilyfocusedonpushingtheboundariesofAIcapabilitiesbyachievingbetterperformanceonvarioustaskswithlargerandmorepowerfulmodelsorbyachievingsimilarresultstotheircompetitorswitheversmallermodels.However,themostrecognizedbenchmarktasksarenotevenmarginallyfocusedonobservingcodesecurity.Somestudiestrytoaddressthisby

溫馨提示

  • 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
  • 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權益歸上傳用戶所有。
  • 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會有圖紙預覽,若沒有圖紙預覽就沒有圖紙。
  • 4. 未經(jīng)權益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
  • 5. 人人文庫網(wǎng)僅提供信息存儲空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負責。
  • 6. 下載文件中如有侵權或不適當內(nèi)容,請與我們聯(lián)系,我們立即糾正。
  • 7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

評論

0/150

提交評論