Introduction to Geometric Learning in Python with Geomstats
Nina Miolane, Nicolas Guigui, Hadi Zaatiti, Christian Shewmake, Hatem Hajri, Daniel Brooks, Alice Le Brigant, Johan Mathe, Benjamin Hou, Yann Thanwerdas, et al.
To cite this version:
Nina Miolane, Nicolas Guigui, Hadi Zaatiti, Christian Shewmake, Hatem Hajri, et al. Introduction to Geometric Learning in Python with Geomstats. SciPy 2020 - 19th Python in Science Conference, Jul 2020, Austin, Texas, United States. pp. 48-57, 10.25080/Majora-342d178e-007. hal-02908006
HAL Id: hal-02908006
https://inria.hal.science/hal-02908006
Submitted on 28 Jul 2020
PROC. OF THE 19th PYTHON IN SCIENCE CONF. (SCIPY 2020)
Introduction to Geometric Learning in Python with Geomstats
Nina Miolane‡*, Nicolas Guigui§, Hadi Zaatiti, Christian Shewmake, Hatem Hajri, Daniel Brooks, Alice Le Brigant, Johan Mathe, Benjamin Hou, Yann Thanwerdas, Stefan Heyder, Olivier Peltre, Niklas Koep, Yann Cabanes, Thomas Gerald, Paul Chauchat, Bernhard Kainz, Claire Donnat, Susan Holmes, Xavier Pennec
https://youtu.be/Ju-Wsd84uG0
Abstract: There is a growing interest in leveraging differential geometry in the machine learning community. Yet, the adoption of the associated geometric computations has been inhibited by the lack of a reference implementation. Such an implementation should typically allow its users: (i) to get intuition on concepts from differential geometry through a hands-on approach, often not provided by traditional textbooks; and (ii) to run geometric machine learning algorithms seamlessly, without delving into the mathematical details. To address this gap, we present the open-source Python package geomstats and introduce hands-on tutorials for differential geometry and geometric machine learning algorithms - Geometric Learning - that rely on it. Code and documentation: /geomstats/geomstats and geomstats.ai.
Index Terms: differential geometry, statistics, manifold, machine learning
Introduction
Data on manifolds arise naturally in different fields. Hyperspheres model directional data in molecular and protein biology [KH05] and some aspects of 3D shapes [JDM12], [HVS+16]. Density estimation on hyperbolic spaces arises to model electrical impedances [HKKM10], networks [AS14], or reflection coefficients extracted from a radar signal [CBA15]. Symmetric Positive Definite (SPD) matrices are used to characterize data from Diffusion Tensor Imaging (DTI) [PFA06], [YZLM12] and functional Magnetic Resonance Imaging (fMRI) [STK05]. These manifolds are curved, differentiable generalizations of vector spaces. Learning from data on manifolds thus requires techniques from the mathematical discipline of differential geometry. As a result, there is a growing interest in leveraging differential geometry in the machine learning community, supported by the fields of Geometric Learning and Geometric Deep Learning [BBL+17].
Despite this need, the adoption of differential geometric computations has been inhibited by the lack of a reference implementation. Projects implementing code for geometric tools are often custom-built for specific problems and are not easily reused. Some Python packages do exist, but they mainly focus on optimization (Pymanopt [TKW16], Geoopt [BG18], [Koc19], McTorch [MJK+18]), are dedicated to a single manifold (PyRiemann [Bar15], PyQuaternion [Wyn14], PyGeometry [Cen12]), or lack unit-tests and continuous integration (TheanoGeometry [KS17]). An open-source, low-level implementation of differential geometry and associated learning algorithms for manifold-valued data is thus thoroughly welcome.
* Corresponding author: nmiolane@
‡ Stanford University
§ Université Côte d'Azur, Inria
Copyright © 2020 Nina Miolane et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Geomstats is an open-source Python package built for machine learning with data on non-linear manifolds [MGLB+]: a field called Geometric Learning. The library provides object-oriented and extensively unit-tested implementations of essential manifolds, operations, and learning methods with support for different execution backends - namely NumPy, PyTorch, and TensorFlow. This paper illustrates the use of geomstats through hands-on introductory tutorials of Geometric Learning. These tutorials enable users: (i) to build intuition for differential geometry through a hands-on approach, often not provided by traditional textbooks; and (ii) to run geometric machine learning algorithms seamlessly without delving into the lower-level computational or mathematical details. We emphasize that the tutorials are not meant to replace theoretical expositions of differential geometry and geometric learning [Pos01], [PSF19]. Rather, they will complement them with an intuitive, didactic, and engineering-oriented approach.
Presentation of Geomstats
The package geomstats is organized into two main modules: geometry and learning. The module geometry implements low-level differential geometry with an object-oriented paradigm and two main parent classes: Manifold and RiemannianMetric. Standard manifolds like the Hypersphere or the Hyperbolic space are classes that inherit from Manifold. At the time of writing, there are over 15 manifolds implemented in geomstats. The class RiemannianMetric provides computations related to Riemannian geometry on such manifolds, such as the inner product of two tangent vectors at a base point, the geodesic distance between two points, the Exponential and Logarithm maps at a base point, and many others.
The module learning implements statistics and machine learning algorithms for data on manifolds. The code is object-oriented and classes inherit from scikit-learn base classes and mixins such as BaseEstimator, ClassifierMixin, or RegressorMixin. This module provides implementations of Fréchet mean estimators, K-means, and principal component analysis (PCA) designed for manifold data. The algorithms can be applied seamlessly to the different manifolds implemented in the library.
The code follows international standards for readability and ease of collaboration, is vectorized for batch computations, undergoes unit-testing with continuous integration, and incorporates both TensorFlow and PyTorch backends to allow for GPU acceleration. The package comes with a visualization module that enables users to visualize and further develop an intuition for differential geometry. In addition, the datasets module provides instructive toy datasets on manifolds. The repositories examples and notebooks provide convenient starting points to get familiar with geomstats.
First Steps
To begin, we need to install geomstats. We follow the installation procedure described in the first steps of the online documentation. Next, in the command line, we choose the backend of interest: NumPy, PyTorch or TensorFlow. Then, we open the iPython notebook and import the backend together with the visualization module. In the command line:
export GEOMSTATS_BACKEND=numpy
then, in the notebook:
import geomstats.backend as gs
import geomstats.visualization as visualization
visualization.tutorial_matplotlib()
INFO: Using numpy backend
Modules related to matplotlib and logging should be imported during setup too. More details on setup can be found on the documentation website: geomstats.ai. All standard NumPy functions should be called using the gs. prefix - e.g. gs.exp, gs.log - in order to automatically use the backend of interest.
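The backend choice above is driven by an environment variable, after which all numerical functions are reached through a single prefix. The general pattern can be sketched as follows. This is only an illustration and not how geomstats is implemented: the variable name DEMO_BACKEND, the registry BACKENDS, and the function load_backend are hypothetical, and the standard-library modules math and cmath stand in for real numerical backends so that the sketch runs without extra packages.

```python
import importlib
import os

# Hypothetical registry: stand-in "backends" taken from the standard library.
BACKENDS = {"real": "math", "complex": "cmath"}


def load_backend(env_var="DEMO_BACKEND", default="real"):
    """Return the module selected by an environment variable."""
    name = os.environ.get(env_var, default)
    if name not in BACKENDS:
        raise ValueError(f"Unknown backend: {name!r}")
    return importlib.import_module(BACKENDS[name])


# Select a backend, then call functions through one prefix,
# in the spirit of gs.exp and gs.log.
os.environ["DEMO_BACKEND"] = "real"
backend = load_backend()
exp_of_zero = backend.exp(0.0)

# Switching the variable switches the implementation behind the same call.
os.environ["DEMO_BACKEND"] = "complex"
backend = load_backend()
exp_of_zero_complex = backend.exp(0.0)
```

The calling code is identical in both cases; only the environment variable decides which module answers the call.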
Tutorial: Statistics and Geometric Statistics
This tutorial illustrates how Geometric Statistics and Learning differ from traditional Statistics. Statistical theory is usually defined for data belonging to vector spaces, which are linear spaces. For example, we know how to compute the mean of a set of numbers or of multidimensional arrays.
Now consider a non-linear space: a manifold. A manifold M of dimension m is a space that is possibly curved but that looks like an m-dimensional vector space in a small neighborhood of every point. A sphere, like the earth, is a good example of a manifold. What happens when we apply statistical theory defined for linear vector spaces to data that does not naturally belong to a linear space? For example, what happens if we want to perform statistics on the coordinates of world cities lying on the earth's surface: a sphere? Let us compute the mean of two data points on the sphere using the traditional definition of the mean.
from geomstats.geometry.hypersphere import Hypersphere
n_samples = 2
sphere = Hypersphere(dim=2)
points_in_manifold = sphere.random_uniform(
    n_samples=n_samples)
# Traditional (linear) mean: coordinate-wise average of the points.
linear_mean = gs.sum(
    points_in_manifold, axis=0) / n_samples
Fig. 1: Left: Linear mean of two points on the sphere. Right: Fréchet mean of two points on the sphere. The linear mean does not belong to the sphere, while the Fréchet mean does. This illustrates how linear statistics can be generalized to data on manifolds, such as points on the sphere.
The result is shown in Figure 1 (left). What happened? The mean of two points on a manifold (the sphere) is not on the manifold. In our example, the mean of these cities is not on the earth's surface. This leads to errors in statistical computations. The line sphere.belongs(linear_mean) returns False. For this reason, researchers aim to build a theory of statistics that is - by construction - compatible with any structure with which we equip the manifold. This theory is called Geometric Statistics, and the associated learning algorithms: Geometric Learning.
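The failure of the linear mean can be checked by hand. The sketch below uses only the standard library and two hypothetical points chosen for illustration (they are not taken from the cities data): it averages two unit vectors and tests whether the result still has norm 1, which is the membership condition for the unit sphere.

```python
import math

# Two points on the unit sphere (hypothetical values chosen for illustration).
point_a = (1.0, 0.0, 0.0)
point_b = (0.0, 1.0, 0.0)

# Traditional (linear) mean: coordinate-wise average.
linear_mean = tuple((a + b) / 2 for a, b in zip(point_a, point_b))

# A point belongs to the unit sphere only if its Euclidean norm equals 1.
mean_norm = math.sqrt(sum(x * x for x in linear_mean))
belongs_to_sphere = math.isclose(mean_norm, 1.0)

print(linear_mean, mean_norm, belongs_to_sphere)
```

The averaged point has norm sqrt(0.5), so it lies strictly inside the sphere rather than on its surface.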
In this specific example of mean computation, Geometric Statistics provides a generalization of the definition of “mean” to manifolds: the Fréchet mean.
from geomstats.learning.frechet_mean import FrechetMean
estimator = FrechetMean(metric=sphere.metric)
estimator.fit(points_in_manifold)
frechet_mean = estimator.estimate_
Notice in this code snippet that geomstats provides classes and methods whose API will be instantly familiar to users of the widely-adopted scikit-learn. We plot the result in Figure 1 (right). Observe that the Fréchet mean now belongs to the surface of the sphere!
Beyond the computation of the mean, geomstats provides statistics and learning algorithms on manifolds that leverage their specific geometric structure. Such algorithms rely on elementary operations that are introduced in the next tutorial.
Tutorial: Elementary Operations for Data on Manifolds
The previous tutorial showed why we need to generalize traditional statistics for data on manifolds. This tutorial shows how to perform the elementary operations that allow us to “translate” learning algorithms from linear spaces to manifolds.
We import data that lie on a manifold: the world cities dataset, which contains coordinates of cities on the earth's surface. We visualize it in Figure 2.
import geomstats.datasets.utils as data_utils
data, names = data_utils.load_cities()
Fig. 2: Subset of the world cities dataset, available in geomstats with the function load_cities from the module datasets.utils. Cities' coordinates are data on the sphere, which is an example of a manifold. (Cities labeled in the plot: Paris, Moscow, Istanbul, Beijing, Manila.)
How can we compute with data that lie on such a manifold? The elementary operations on a vector space are addition and subtraction. In a vector space (in fact seen as an affine space), we can add a vector to a point and subtract two points to get a vector. Can we generalize these operations in order to compute on manifolds?
For points on a manifold, such as the sphere, the same operations are not permitted. Indeed, adding a vector to a point will not give a point that belongs to the manifold: in Figure 3, adding the black tangent vector to the blue point gives a point that is outside the surface of the sphere. So, we need to generalize to manifolds the operations of addition and subtraction.
On manifolds, the exponential map is the operation that generalizes the addition of a vector to a point. The exponential map takes the following inputs: a point and a tangent vector to the manifold at that point. These are shown in Figure 3 using the blue point and its tangent vector, respectively. The exponential map returns the point on the manifold that is reached by “shooting” with the tangent vector from the point. “Shooting” means following a “geodesic” on the manifold, which is the dotted path in Figure 3. A geodesic, roughly, is the analog of a straight line for general manifolds - the path whose length, or energy, is minimal between two points, where the notions of length and energy are defined by the Riemannian metric. This code snippet shows how to compute the exponential map and the geodesic with geomstats.
from geomstats.geometry.hypersphere import Hypersphere
sphere = Hypersphere(dim=2)
initial_point = paris = data[19]
vector = gs.array([1, 0, 0.8])
# Project the ambient vector onto the tangent space at the initial point.
tangent_vector = sphere.to_tangent(
    vector, base_point=initial_point)
# Shoot from the initial point along the tangent vector.
end_point = sphere.metric.exp(
    tangent_vector, base_point=initial_point)
geodesic = sphere.metric.geodesic(
    initial_point=initial_point,
    initial_tangent_vec=tangent_vector)
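For the unit sphere with its usual round metric, the exponential map has a closed form: exp_p(v) = cos(|v|) p + sin(|v|) v / |v|. The sketch below implements this formula with the standard library only, to make the "shooting" picture concrete. The helper names dot, to_tangent, and sphere_exp are hypothetical and are not the geomstats implementation; the input vector is likewise an arbitrary illustrative value.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def to_tangent(vector, base_point):
    """Project a vector onto the tangent plane of the unit sphere at base_point."""
    coef = dot(vector, base_point)
    return tuple(v - coef * p for v, p in zip(vector, base_point))


def sphere_exp(tangent_vec, base_point):
    """Shoot from base_point along tangent_vec on the unit sphere."""
    norm = math.sqrt(dot(tangent_vec, tangent_vec))
    if norm < 1e-12:
        # A zero tangent vector leaves the point where it is.
        return tuple(base_point)
    return tuple(
        math.cos(norm) * p + math.sin(norm) * v / norm
        for p, v in zip(base_point, tangent_vec)
    )


north_pole = (0.0, 0.0, 1.0)
# Hypothetical ambient vector; its component along north_pole is removed.
tangent_vec = to_tangent((math.pi / 2, 0.0, 0.8), north_pole)
end_point = sphere_exp(tangent_vec, north_pole)
end_norm = math.sqrt(dot(end_point, end_point))
```

Shooting from the north pole with a tangent vector of length pi/2 travels a quarter of a great circle and lands on the equator, and the end point still has norm 1.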
Fig. 3: Elementary operations on manifolds illustrated on the sphere. The exponential map at the initial point (blue point) shoots the black tangent vector along the geodesic, and gives the end point (orange point). Conversely, the logarithm map at the initial point (blue point) takes the end point (orange point) as input, and outputs the black tangent vector. The geodesic between the blue point and the orange point represents the path of shortest length between the two points.
Similarly, on manifolds, the logarithm map is the operation that generalizes the subtraction of two points on vector spaces. The logarithm map takes two points on the manifold as inputs and returns the tangent vector required to “shoot” from one point to the other. At any point, it is the inverse of the exponential map. In Figure 3, the logarithm of the orange point at the blue point returns the tangent vector in black. This code snippet shows how to compute the logarithm map with geomstats.
log = sphere.metric.log(
    point=end_point, base_point=initial_point)
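On the unit sphere with the round metric, the logarithm map also has a closed form: the direction is the part of the target point orthogonal to the base point, and the length is the angle between the two points. The sketch below, using only the standard library, implements it and checks that it undoes the exponential map. The helper names are hypothetical and this is not the geomstats code.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sphere_exp(tangent_vec, base_point):
    """Shoot from base_point along tangent_vec on the unit sphere."""
    norm = math.sqrt(dot(tangent_vec, tangent_vec))
    if norm < 1e-12:
        return tuple(base_point)
    return tuple(
        math.cos(norm) * p + math.sin(norm) * v / norm
        for p, v in zip(base_point, tangent_vec)
    )


def sphere_log(point, base_point):
    """Tangent vector at base_point that shoots to point on the unit sphere."""
    cos_angle = max(-1.0, min(1.0, dot(point, base_point)))
    # Component of point orthogonal to base_point gives the direction.
    direction = tuple(q - cos_angle * p for q, p in zip(point, base_point))
    norm = math.sqrt(dot(direction, direction))
    if norm < 1e-12:
        # Same point (or antipodal points, where the logarithm is not unique).
        return tuple(0.0 for _ in base_point)
    angle = math.acos(cos_angle)
    return tuple(angle * d / norm for d in direction)


base_point = (0.0, 0.0, 1.0)
target = (1.0, 0.0, 0.0)
log_vec = sphere_log(target, base_point)
round_trip = sphere_exp(log_vec, base_point)
```

Here log_vec has length pi/2, the angle between the two points, and shooting with it from base_point returns the target up to rounding error.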
We emphasize that the exponential and logarithm maps depend on the “Riemannian metric” chosen for a given manifold: observe in the code snippets that they are not methods of the sphere object, but rather of its metric attribute. The Riemannian metric defines the notion of exponential, logarithm, geodesic and distance between points on the manifold. We could have chosen a different metric on the sphere that would have changed the distance between the points: with a different metric, the “sphere” could, for example, look like an ellipsoid.
Using the exponential and logarithm maps instead of linear addition and subtraction, many learning algorithms can be generalized to manifolds. We illustrated the use of the exponential and logarithm maps on the sphere only; yet, geomstats provides their implementation for over 15 different manifolds in its geometry module with support for a variety of Riemannian metrics. Consequently, geomstats also implements learning algorithms on manifolds, taking into account their specific geometric structure by relying on the operations we just introduced. The next tutorials show more involved examples of such geometric learning algorithms.
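As one example of such a generalization, the mean can be rewritten with the two maps: replace "subtract the current estimate from each point, average, and add the result back" by "take the log of each point at the current estimate, average the tangent vectors, and shoot with the exp". The sketch below applies this fixed-point iteration on the unit sphere using only the standard library. It is one common scheme, written here under the assumption of the round metric; the helper names are hypothetical and the FrechetMean estimator in geomstats may proceed differently.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sphere_exp(tangent_vec, base_point):
    """Generalized addition: shoot from base_point along tangent_vec."""
    norm = math.sqrt(dot(tangent_vec, tangent_vec))
    if norm < 1e-12:
        return tuple(base_point)
    return tuple(
        math.cos(norm) * p + math.sin(norm) * v / norm
        for p, v in zip(base_point, tangent_vec)
    )


def sphere_log(point, base_point):
    """Generalized subtraction: tangent vector from base_point to point."""
    cos_angle = max(-1.0, min(1.0, dot(point, base_point)))
    direction = tuple(q - cos_angle * p for q, p in zip(point, base_point))
    norm = math.sqrt(dot(direction, direction))
    if norm < 1e-12:
        return tuple(0.0 for _ in base_point)
    angle = math.acos(cos_angle)
    return tuple(angle * d / norm for d in direction)


def frechet_mean(points, max_iter=100, tol=1e-12):
    """Move the estimate along the average of the log maps until it is stationary."""
    mean = tuple(points[0])
    for _ in range(max_iter):
        logs = [sphere_log(p, mean) for p in points]
        step = tuple(sum(coords) / len(points) for coords in zip(*logs))
        if math.sqrt(dot(step, step)) < tol:
            break
        mean = sphere_exp(step, mean)
    return mean


points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
mean = frechet_mean(points)
mean_norm = math.sqrt(dot(mean, mean))
```

For these two points the iteration stops at the midpoint of the great-circle arc joining them, which has norm 1 and therefore stays on the sphere, unlike the linear average.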
Tutorial: Classification of SPD Matrices
Tutorial context and description
We demonstrate that any standard machine learning algorithm can be applied to data on manifolds while respecting their geometry. In the previous tutorials, we saw that linear operations (mean, linear weighting, addition and subtraction) are not defined on manifolds. However, each point on a manifold has an associated tangent space which is a vector space. As such, in the tangent space, these operations are well defined! Therefore, we can use the logarithm map (see Figure 3 from the previous tutorial) to go from points on manifolds to vectors in the tangent space at a reference point. This first strategy enables the use of traditional learning algorithms on manifolds.
A second strategy can be designed for learning algorithms, such as K-Nearest Neighbors classification, that rely only on distances or dissimilarity metrics. In this case, we can compute the pairwise distances between the data points on the manifold, using the method metric.dist, and feed them to the chosen algorithm.
Both strategies can be applied to any manifold-valued data. In this tutorial, we consider symmetric positive definite (SPD) matrices from brain connectomics data and perform logistic regression and K-Nearest Neighbors classification.
SPD matrices in the literature
Before diving into the tutorial, let us recall a few applications of SPD matrices in the machine learning literature. SPD matrices are ubiquitous across many fields [CS16], either as input of or output to a given problem. In DTI for instance, voxels are represented by "diffusion tensors" which are 3x3 SPD matrices representing ellipsoids in their structure. These ellipsoids spatially characterize the diffusion of water molecules in various tissues. Each DTI thus consists of a field of SPD matrices, where each point in space corresponds to an SPD matrix. These matrices then serve as inputs to regression models. In [YZLM12] for example, the authors use an intrinsic local polynomial regression to compare fiber tracts between HIV subjects and a control group. Similarly, in fMRI, it is possible to extract connectivity graphs from time series of patients' resting-state images [WZD+13]. The regularized graph Laplacians of these graphs form a dataset of SPD matrices. This provides a compact summary of brain connectivity patterns which is useful for assessing neurological responses to a variety of stimuli, such as drugs or patient's activities.
More generally speaking, covariance matrices are also SPD matrices which appear in many settings. Covariance clustering can be used for various applications such as sound compression in acoustic models of automatic speech recognition (ASR) systems [SMA10] or for material classification [FHP15], among others. Covariance descriptors are also popular image or video descriptors [HHLS16].
Lastly, SPD matrices have found applications in deep learning. The authors of [GWB+19] show that an aggregation of learned deep convolutional features into an SPD matrix creates a robust representation of images which outperforms state-of-the-art methods for visual classification.
Manifold of SPD matrices
Let us recall the mathematical definition of the manifold of SPD matrices. The manifold of SPD matrices in n dimensions is embedded in the General Linear group of invertible matrices and defined as:
$$SPD = \left\{ S \in \mathbb{R}^{n \times n} : S^T = S, \ \forall z \in \mathbb{R}^n, z \neq 0, \ z^T S z > 0 \right\}.$$
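The two conditions in this definition can be tested numerically. The sketch below, using only the standard library, checks symmetry directly and checks positive definiteness by attempting a Cholesky factorization, which succeeds with strictly positive pivots exactly when a symmetric matrix is positive definite. The function name is_spd and the tolerance are hypothetical choices; this is not the belongs method of geomstats.

```python
import math


def is_spd(s, tol=1e-12):
    """Return True if the square matrix s (list of lists) is SPD."""
    n = len(s)
    # Condition 1: S^T = S.
    if any(abs(s[i][j] - s[j][i]) > tol for i in range(n) for j in range(n)):
        return False
    # Condition 2: z^T S z > 0 for all nonzero z, tested via Cholesky pivots.
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = s[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if acc <= tol:
                    return False
                lower[i][i] = math.sqrt(acc)
            else:
                lower[i][j] = acc / lower[j][j]
    return True


a = [[2.0, 0.0], [0.0, 1.0]]
b = [[1.0, 0.0], [0.0, 2.0]]
a_plus_b = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
a_minus_b = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

checks = {
    "a": is_spd(a),
    "a + b": is_spd(a_plus_b),
    "a - b": is_spd(a_minus_b),
    "indefinite": is_spd([[1.0, 2.0], [2.0, 1.0]]),
    "not symmetric": is_spd([[1.0, 0.0], [1.0, 1.0]]),
}
```

The sum of the two SPD matrices passes the check while their difference does not, which shows that the set of SPD matrices is not a vector space.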
Fig. 4: Simulated dataset of SPD matrices in 2 dimensions. We observe 3 classes of SPD matrices, illustrated with the colors red, green, and blue. The centroid of each class is represented by an ellipse of larger width.
The class SPDMatrices inherits from the class EmbeddedManifold and has an embedding_manifold attribute which stores an object of the class GeneralLinear. SPD matrices in 2 dimensions can be visualized as ellipses with principal axes given by the eigenvectors of the SPD matrix, and the length of each axis proportional to the square-root of the corresponding eigenvalue. This is implemented in the visualization module of geomstats. We generate a toy data-set and plot it in Figure 4 with the following code snippet.
import geomstats.datasets.sample_sdp_2d as sampler
n_samples = 100
dataset_generator = sampler.DatasetSPD2D(
    n_samples, n_features=2, n_classes=3)
ellipsis = visualization.Ellipsis2D()
# `data` and `labels` are assumed to have been generated with dataset_generator.
for i, x in enumerate(data):
    y = sampler.get_label_at_index(i, labels)
    ellipsis.draw(
        x, color=ellipsis.colors[y], alpha=.1)
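The ellipse representation of a 2x2 SPD matrix can be worked out in closed form. The sketch below, using only the standard library, computes the two eigenvalues and the orientation of the leading eigenvector, then takes each semi-axis length equal to the square root of the corresponding eigenvalue. The proportionality constant of 1 and the function name ellipse_from_spd_2d are assumptions made for illustration; the drawing conventions of the geomstats visualization module may differ.

```python
import math


def ellipse_from_spd_2d(s):
    """Return (major_axis, minor_axis, angle) for a 2x2 SPD matrix.

    The angle (in radians) is the orientation of the eigenvector
    associated with the largest eigenvalue.
    """
    a, b, c = s[0][0], s[0][1], s[1][1]
    half_trace = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    eig_max = half_trace + radius
    eig_min = half_trace - radius
    angle = 0.5 * math.atan2(2 * b, a - c)
    return math.sqrt(eig_max), math.sqrt(eig_min), angle


axis_aligned = ellipse_from_spd_2d([[4.0, 0.0], [0.0, 1.0]])
tilted = ellipse_from_spd_2d([[2.0, 1.0], [1.0, 2.0]])
```

The diagonal matrix gives an axis-aligned ellipse with semi-axes 2 and 1, while the second matrix, with eigenvalues 3 and 1, gives an ellipse tilted by 45 degrees.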
Figure 4 shows a dataset of SPD matrices in 2 dimensions organized into 3 classes. This visualization helps in developing an intuition on the connectomes dataset that is used in the upcoming tutorial, where we will classify SPD matrices in 28 dimensions into 2 classes.
Classifying brain connectomes in Geomstats
We now delve into the tutorial in order to illustrate the use of traditional learning algorithms on the tangent spaces of manifolds implemented in geomstats. We use brain connectome data from the MLSP 2014 Schizophrenia Challenge. The connectomes are correlation matrices extracted from the time-series of resting-state fMRIs of 86 patients at 28 brain regions of interest: they are points on the manifold of SPD matrices in n = 28 dimensions. Our goal is to use the connectomes to classify patients into two classes: schizophrenic and control. First we load the connectomes and display two of them as heatmaps in Figure 5.
import geomstats.datasets.utils as data_utils
data, patient_ids, labels = data_utils.load_connectomes()
Multiple metrics can be used to compute on the manifold of SPD matrices [DKZ09]. As mentioned in the previous tutorial, different metrics define different geodesics, exponential and logarithm maps and therefore different algorithms on a given manifold. Here, we import two of the most commonly used metrics on the SPD matrices, the log-Euclidean metric and the affine-invariant metric [PFA06], but we highlight that geomstats contains many more. We also check that our connectome data indeed belongs to the manifold of SPD matrices:
import geomstats.geometry.spd_matrices as spd
manifold = spd.SPDMatrices(n=28)
le_metric = spd.SPDMetricLogEuclidean(n=28)
ai_metric = spd.SPDMetricAffine(n=28)
logging.info(gs.all(manifold.belongs(data)))
INFO: True
Fig. 5: Subset of the connectomes dataset, available in geomstats with the function load_connectomes from the module datasets.utils. Connectomes are correlation matrices of 28 time-series extracted from fMRI data: they are elements of the manifold of SPD matrices in 28 dimensions. Left: connectome of a schizophrenic subject. Right: connectome of a healthy control. (Panel titles: Schizophrenic, Healthy; color bar: Correlations, from -0.5 to 1.)
Great! Now, although the sum of two SPD matrices is an SPD matrix, their difference or their linear combination with non-positive weights is not necessarily one. Therefore we need to work in a tangent space of the SPD manifold to perform simple machine learning that relies on linear operations. The preprocessing module with its ToTangentSpace class allows us to do exactly this.
from geomstats.learning.preprocessing import ToTangentSpace
ToTangentSpace has a simple purpose: it computes the Fréchet Mean of the dataset, and takes the logarithm map of each data point from the mean. This results in a dataset of tangent vectors at the mean. In the case of the SPD manifold, these are simply symmetric matrices. ToTangentSpace then squeezes each symmetric matrix into a 1d-vector of size dim = 28 * (28 + 1) / 2, and outputs an array of shape [n_connectomes, dim], which can be fed to your favorite scikit-learn algorithm.
We emphasize that ToTangentSpace computes the mean of the input data, and thus should be used in a pipeline (as e.g. scikit-learn's StandardScaler) to avoid leaking information from the test set at train time.
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
pipeline = make_pipeline(
And with the affine-invariant metric, replacing le_metric by ai_metric in the above snippet:
INFO: 0.71
We observe that the result depends on the metric. The Riemannian metric indeed defines the notion of the logarithm map, which is used to compute the Fréchet Mean and the tangent vectors corresponding to the input data points. Thus, changing the metric changes the result. Furthermore, some metrics may be more suitable than others for different applications. Indeed, we find published results that show how useful geometry can be with data on the SPD manifold (e.g. [WAZF18], [NDV+14]).
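The flattening step attributed to ToTangentSpace above, from a symmetric n x n tangent matrix to a vector of size n(n+1)/2, can be sketched with the standard library. A symmetric matrix is determined by its upper triangle, so keeping those entries loses no information. The function names tangent_dim and sym_to_vector are hypothetical, and the entry ordering (and any rescaling of off-diagonal entries) used by geomstats may differ from this row-by-row choice.

```python
def tangent_dim(n):
    """Number of independent entries of a symmetric n x n matrix."""
    return n * (n + 1) // 2


def sym_to_vector(sym):
    """Flatten the upper triangle of a symmetric matrix, row by row."""
    n = len(sym)
    return [sym[i][j] for i in range(n) for j in range(i, n)]


# Small worked example: a 3x3 symmetric matrix has 6 independent entries.
sym = [
    [1, 2, 3],
    [2, 4, 5],
    [3, 5, 6],
]
vector = sym_to_vector(sym)

# For the connectomes, n = 28 regions give vectors of length 406.
connectome_dim = tangent_dim(28)
```

With 28 regions of interest, each tangent matrix is thus summarized by 406 numbers, which is the feature dimension seen by the downstream classifier.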
We saw how to use the representation of points on the manifold as tangent vectors at a reference point to fit any machine learning algorithm, and we compared the effect of different metrics on the manifold of SPD matrices. Another class of machine learning algorithms can be used very easily on manifolds with geomstats: those relying on dissimilarity matrices. We can compute the matrix of pairwise Riemannian distances, using the dist method of the Riemannian metric object. In the following code-snippet, we use ai_metric.dist and pass the corresponding matrix pairwise_dist of pairwise distances to scikit-learn's K-Nearest-Neighbors (KNN) classification algorithm:
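from sklearn.neighbors import KNeighborsClassifier
# pairwise_dist is the matrix of pairwise distances computed with ai_metric.dist.
classifier = KNeighborsClassifier(
    metric='precomputed')
result = cross_validate(
    classifier, pairwise_dist, labels)
logging.info(result['test_score'].mean())
INFO: 0.72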
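The distance-based strategy can be illustrated end to end on a toy problem. The sketch below is restricted to 2x2 SPD matrices so that the affine-invariant distance can be computed in closed form with the standard library: the squared distance is the sum of squared logarithms of the eigenvalues of A^{-1} B, which for 2x2 matrices are the roots of a quadratic. It then runs a leave-one-out 1-nearest-neighbor classifier on the precomputed distance matrix. The data, labels, and function names are hypothetical, and neither geomstats nor scikit-learn is called.

```python
import math


def affine_invariant_dist_2d(a, b):
    """Affine-invariant distance between two 2x2 SPD matrices.

    The eigenvalues of A^{-1} B are the roots of det(B - lam * A) = 0.
    """
    det_a = a[0][0] * a[1][1] - a[0][1] ** 2
    det_b = b[0][0] * b[1][1] - b[0][1] ** 2
    mixed = a[0][0] * b[1][1] + a[1][1] * b[0][0] - 2 * a[0][1] * b[0][1]
    disc = math.sqrt(max(mixed ** 2 - 4 * det_a * det_b, 0.0))
    lam_1 = (mixed + disc) / (2 * det_a)
    lam_2 = det_b / (det_a * lam_1)  # product of the roots is det_b / det_a
    return math.sqrt(math.log(lam_1) ** 2 + math.log(lam_2) ** 2)


def leave_one_out_1nn(dist_matrix, labels):
    """Accuracy of 1-nearest-neighbor using only precomputed distances."""
    correct = 0
    for i, row in enumerate(dist_matrix):
        others = [j for j in range(len(row)) if j != i]
        nearest = min(others, key=lambda j: row[j])
        correct += labels[nearest] == labels[i]
    return correct / len(labels)


# Hypothetical toy dataset: two well-separated groups of SPD matrices.
data = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.2, 0.1], [0.1, 0.9]],
    [[0.9, -0.1], [-0.1, 1.1]],
    [[5.0, 0.0], [0.0, 5.0]],
    [[6.0, 0.5], [0.5, 4.5]],
    [[4.5, -0.5], [-0.5, 5.5]],
]
labels = [0, 0, 0, 1, 1, 1]

pairwise_dist = [[affine_invariant_dist_2d(x, y) for y in data] for x in data]
accuracy = leave_one_out_1nn(pairwise_dist, labels)
```

The classifier never sees the matrices themselves, only their pairwise distances, which is what makes this strategy applicable to any manifold equipped with a distance.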
This tutorial showed how to leverage geomstats to use standard learning algorithms for data on a manifold. In the next tutorial, we see a more complicated situation: the data points are not provided by default as elements of a manifold. We will need to use the low-level geomstats operations to design a method that embeds the dataset in the manifold of interest. Only then, we can use a learning algorithm.
Tutorial: Learning Graph Representations with Hyperbolic Spaces
Tutorial context and description
This tutorial demonstrates how to make use of the low-level geometric operations in geomstats to implement a method that embeds graph data into the hyperbolic space. Thanks to the discovery of hyperbolic embeddings, learning on Graph-Structured Data (GSD) has seen major achievements in recent years. It had been speculated for years that hyperbolic spaces may better represent GSD than Euclidean spaces [Gro87] [KPK+10] [BPK10] [ASM13]. These speculations have recently been shown