Information Geometry PDF

Pages: 411
Release year: 2017
File size: 2.616 MB
Language: English

Preview Information Geometry

Ergebnisse der Mathematik und ihrer Grenzgebiete, 3. Folge, Volume 64
A Series of Modern Surveys in Mathematics

Editorial Board
L. Ambrosio, Pisa
V. Baladi, Paris
G.-M. Greuel, Kaiserslautern
M. Gromov, Bures-sur-Yvette
G. Huisken, Tübingen
J. Jost, Leipzig
J. Kollár, Princeton
G. Laumon, Orsay
U. Tillmann, Oxford
J. Tits, Paris
D.B. Zagier, Bonn

For further volumes: www.springer.com/series/728

Nihat Ay · Jürgen Jost · Hông Vân Lê · Lorenz Schwachhöfer

Information Geometry

Nihat Ay
Information Theory of Cognitive Systems
MPI for Mathematics in the Sciences
Leipzig, Germany
and
Santa Fe Institute, Santa Fe, NM, USA

Jürgen Jost
Geometric Methods and Complex Systems
MPI for Mathematics in the Sciences
Leipzig, Germany
and
Santa Fe Institute, Santa Fe, NM, USA

Hông Vân Lê
Mathematical Institute of ASCR
Czech Academy of Sciences
Praha 1, Czech Republic

Lorenz Schwachhöfer
Department of Mathematics
TU Dortmund University
Dortmund, Germany

ISSN 0071-1136
ISSN 2197-5655 (electronic)
Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge / A Series of Modern Surveys in Mathematics
ISBN 978-3-319-56477-7
ISBN 978-3-319-56478-4 (eBook)
DOI 10.1007/978-3-319-56478-4
Library of Congress Control Number: 2017951855
Mathematics Subject Classification: 60A10, 62B05, 62B10, 62G05, 53B21, 53B05, 46B20, 94A15, 94A17, 94B27

© Springer International Publishing AG 2017

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

Information geometry is the differential geometric treatment of statistical models. It thereby provides the mathematical foundation of statistics. Information geometry therefore is of interest both for its beautiful mathematical structure and for the insight it provides into statistics and its applications. Information geometry currently is a very active field. For instance, Springer will soon launch a new topical journal “Information Geometry”. We therefore think that the time is appropriate for a monograph on information geometry that develops the underlying mathematical theory in full generality and rigor, that explores the connections to other mathematical disciplines, and that proves abstract and general versions of the classical results of statistics, like the Cramér–Rao inequality or Chentsov’s theorem. These, then, are the purposes of the present book, and we hope that it will become the standard reference for the field.

Parametric statistics as introduced by R. Fisher considers parametrized families of probability measures on some finite or infinite sample space Ω. Typically, one wishes to identify a parameter so that the resulting probability measure best fits the observation among the measures in the family. This naturally leads to quantitative questions, in particular, how sensitively the measures in the family depend on the parameter. For this, a geometric perspective is expedient. There is a natural metric, the Fisher metric introduced by Rao, on the space of probability measures on Ω. This metric is simply the projective or spherical metric obtained when one considers a probability measure as a non-negative measure with a scaling factor to render its total mass equal to unity. The Fisher metric thus is a Riemannian metric that induces a corresponding structure on parametrized families of probability measures as above. Furthermore, moving from one reference measure to another yields an affine structure as discovered by S.I. Amari and N.N. Chentsov. The investigation of these metric and affine structures is therefore called information geometry. Information-theoretical quantities like relative entropies (Kullback–Leibler divergences) then find a natural geometric interpretation.

Information geometry thus provides a way of understanding information-theoretic quantities, statistical models, and corresponding statistical inference methods in geometric terms. In particular, the Fisher metric and the Amari–Chentsov structure are characterized by their invariance under sufficient statistics. Several geometric formalisms have been identified as powerful tools to this end and emphasize respective geometric aspects of probability theory. In this book, we move beyond the applications in statistics and develop both a functional analytic and a geometric theory that are of mathematical interest in their own right. In particular, the theory of dually affine structures turns out to be an analogue of Kähler geometry in a real as opposed to a complex setting.

Also, as the concept of Shannon information can be related to the entropy concepts of Boltzmann and Gibbs, there is also a natural connection between information geometry and statistical mechanics. Finally, information geometry can also be used as a foundation of important parts of mathematical biology, like the theory of replicator equations and mathematical population genetics.
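As an illustration of the Fisher metric and the relative entropy mentioned above, the following is a minimal sketch for a finite sample space with three points. It is not taken from the book: the function names, the example vectors, and the normalization g_p(u, v) = Σ u_i v_i / p_i (with the corresponding embedding p ↦ 2√p into a sphere of radius 2) are assumptions chosen for the example. Under these assumptions it checks numerically that the Fisher metric agrees with the Euclidean inner product on the sphere, and that the Kullback–Leibler divergence between nearby points is approximately half the squared Fisher length of the displacement.

```python
import math

def fisher_metric(p, u, v):
    """Fisher metric at p on the open simplex: sum_i u_i * v_i / p_i (assumed normalization)."""
    return sum(ui * vi / pi for pi, ui, vi in zip(p, u, v))

def sphere_embedding(p):
    """Map p to 2*sqrt(p); the image lies on the sphere of radius 2."""
    return [2.0 * math.sqrt(pi) for pi in p]

def pushforward(p, u):
    """Differential of the map p -> 2*sqrt(p) applied to the tangent vector u."""
    return [ui / math.sqrt(pi) for pi, ui in zip(p, u)]

def kl_divergence(p, q):
    """Relative entropy (Kullback-Leibler divergence) of p with respect to q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A point of the open simplex and two tangent vectors (components sum to zero).
p = [0.5, 0.3, 0.2]
u = [0.1, -0.04, -0.06]
v = [-0.02, 0.05, -0.03]

# The Fisher metric equals the Euclidean inner product of the pushed-forward vectors.
g_uv = fisher_metric(p, u, v)
euclid_uv = sum(a * b for a, b in zip(pushforward(p, u), pushforward(p, v)))

# The embedded point has squared Euclidean norm 4 * sum(p) = 4.
radius_sq = sum(x * x for x in sphere_embedding(p))

# Second-order behaviour of the relative entropy along the direction u.
t = 1e-3
q = [pi + t * ui for pi, ui in zip(p, u)]
kl = kl_divergence(p, q)
quad = 0.5 * t * t * fisher_metric(p, u, u)

print(g_uv, euclid_uv)
print(radius_sq)
print(kl, quad)
```

The two values on the first and on the last printed line should agree closely; the remaining gap in the last line is the third-order term of the expansion, which shrinks as t decreases.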
Sample spaces could be finite, but more often than not, they are infinite, for instance, subsets of some (finite- or even infinite-dimensional) Euclidean space. The spaces of measures on such spaces therefore are infinite-dimensional Banach spaces. Consequently, the differential geometric approach needs to be supplemented by functional analytic considerations. One of the purposes of this book therefore is to provide a general framework that integrates the differential geometry into the functional analysis.

Acknowledgements

We would like to thank Shun-ichi Amari for many fruitful discussions. This work was mainly carried out at the Max Planck Institute for Mathematics in the Sciences in Leipzig. It has also been supported by the BSI at RIKEN in Tokyo, the ASSMS, GCU in Lahore-Pakistan, the VNU for Sciences in Hanoi, the Mathematical Institute of the Academy of Sciences of the Czech Republic in Prague, and the Santa Fe Institute. We are grateful for the excellent working conditions and financial support of these institutions during extended visits of some of us. In particular, we should like to thank Antje Vandenberg for her outstanding logistic support.

The research of J.J. leading to his book contribution has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no. 267087. The research of H.V.L. is supported by RVO: 67985840.

Contents

1 Introduction
  1.1 A Brief Synopsis
  1.2 An Informal Description
    1.2.1 The Fisher Metric and the Amari–Chentsov Structure for Finite Sample Spaces
    1.2.2 Infinite Sample Spaces and Functional Analysis
    1.2.3 Parametric Statistics
    1.2.4 Exponential and Mixture Families from the Perspective of Differential Geometry
    1.2.5 Information Geometry and Information Theory
  1.3 Historical Remarks
  1.4 Organization of this Book

2 Finite Information Geometry
  2.1 Manifolds of Finite Measures
  2.2 The Fisher Metric
  2.3 Gradient Fields
  2.4 The m- and e-Connections
  2.5 The Amari–Chentsov Tensor and the α-Connections
    2.5.1 The Amari–Chentsov Tensor
    2.5.2 The α-Connections
  2.6 Congruent Families of Tensors
  2.7 Divergences
    2.7.1 Gradient-Based Approach
    2.7.2 The Relative Entropy
    2.7.3 The α-Divergence
    2.7.4 The f-Divergence
    2.7.5 The q-Generalization of the Relative Entropy
  2.8 Exponential Families
    2.8.1 Exponential Families as Affine Spaces
    2.8.2 Implicit Description of Exponential Families
    2.8.3 Information Projections
  2.9 Hierarchical and Graphical Models
    2.9.1 Interaction Spaces
    2.9.2 Hierarchical Models
    2.9.3 Graphical Models

3 Parametrized Measure Models
  3.1 The Space of Probability Measures and the Fisher Metric
  3.2 Parametrized Measure Models
    3.2.1 The Structure of the Space of Measures
    3.2.2 Tangent Fibration of Subsets of Banach Manifolds
    3.2.3 Powers of Measures
    3.2.4 Parametrized Measure Models and k-Integrability
    3.2.5 Canonical n-Tensors of an n-Integrable Model
    3.2.6 Signed Parametrized Measure Models
  3.3 The Pistone–Sempi Structure
    3.3.1 e-Convergence
    3.3.2 Orlicz Spaces
    3.3.3 Exponential Tangent Spaces

4 The Intrinsic Geometry of Statistical Models
  4.1 Extrinsic Versus Intrinsic Geometric Structures
  4.2 Connections and the Amari–Chentsov Structure
  4.3 The Duality Between Exponential and Mixture Families
  4.4 Canonical Divergences
    4.4.1 Dual Structures via Divergences
    4.4.2 A General Canonical Divergence
    4.4.3 Recovering the Canonical Divergence of a Dually Flat Structure
    4.4.4 Consistency with the Underlying Dualistic Structure
  4.5 Statistical Manifolds and Statistical Models
    4.5.1 Statistical Manifolds and Isostatistical Immersions
    4.5.2 Monotone Invariants of Statistical Manifolds
    4.5.3 Immersion of Compact Statistical Manifolds into Linear Statistical Manifolds
    4.5.4 Proof of the Existence of Isostatistical Immersions
    4.5.5 Existence of Statistical Embeddings

5 Information Geometry and Statistics
  5.1 Congruent Embeddings and Sufficient Statistics
    5.1.1 Statistics and Congruent Embeddings
    5.1.2 Markov Kernels and Congruent Markov Embeddings
    5.1.3 Fisher–Neyman Sufficient Statistics
    5.1.4 Information Loss and Monotonicity
    5.1.5 Chentsov’s Theorem and Its Generalization
  5.2 Estimators and the Cramér–Rao Inequality
    5.2.1 Estimators and Their Bias, Mean Square Error, Variance
    5.2.2 A General Cramér–Rao Inequality
    5.2.3 Classical Cramér–Rao Inequalities
    5.2.4 Efficient Estimators and Consistent Estimators

6 Fields of Application of Information Geometry
  6.1 Complexity of Composite Systems
    6.1.1 A Geometric Approach to Complexity
    6.1.2 The Information Distance from Hierarchical Models
    6.1.3 The Weighted Information Distance
    6.1.4 Complexity of Stochastic Processes
  6.2 Evolutionary Dynamics
    6.2.1 Natural Selection and Replicator Equations
    6.2.2 Continuous Time Limits
    6.2.3 Population Genetics
  6.3 Monte Carlo Methods
    6.3.1 Langevin Monte Carlo
    6.3.2 Hamiltonian Monte Carlo
  6.4 Infinite-Dimensional Gibbs Families

Appendix A Measure Theory
Appendix B Riemannian Geometry
Appendix C Banach Manifolds

References
Index
Nomenclature
