Research at the University of Copenhagen - University of Copenhagen


Geometric Methods in Probabilistic Modelling

Publication: Book/anthology/thesis/report, Ph.D. thesis, Research


  • Anton Mallasto
During the past decade, machine learning has established itself as the foundation of artificial intelligence, viewing learning as a statistical task that can be well quantified.

An increasingly popular toolkit in machine learning is provided by geometry, which we consider in this thesis in two main categories: (i) the study of the geometry induced by machine learning models; (ii) the design of machine learning models that respect specified geometric properties of data.

First, most machine learning approaches quantify learning objectives by minimizing a loss function between a model and given data. The model can be deterministic or probabilistic, outputting point predictions or stochastic predictions, respectively. In contrast to deterministic models, probabilistic modelling allows us to carry out uncertainty quantification, which helps assess whether a prediction is trustworthy or not. The loss function often describes some kind of geometric similarity between the model and the data, and thus in the case of probabilistic modelling requires studying the geometry of probability measures, which in this thesis is taken to be the optimal transport geometry.
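As a small illustration of what a geometric loss between probability measures can look like, the sketch below computes the 2-Wasserstein (optimal transport) distance between two one-dimensional Gaussians, for which a closed form is known: W_2^2 = (m_a - m_b)^2 + (s_a - s_b)^2. The function name and the choice of Gaussians are assumptions made for this example only and are not taken from the thesis.

```python
import math


def w2_gaussian_1d(mean_a, std_a, mean_b, std_b):
    """2-Wasserstein distance between N(mean_a, std_a^2) and N(mean_b, std_b^2).

    Uses the closed form for one-dimensional Gaussians:
    W_2^2 = (mean_a - mean_b)^2 + (std_a - std_b)^2.
    """
    if std_a < 0 or std_b < 0:
        raise ValueError("standard deviations must be non-negative")
    return math.sqrt((mean_a - mean_b) ** 2 + (std_a - std_b) ** 2)


# Hypothetical example: a probabilistic prediction versus a target distribution.
prediction = (0.0, 1.0)  # (mean, standard deviation)
target = (3.0, 5.0)
loss = w2_gaussian_1d(*prediction, *target)
```

The distance depends on both the means and the spreads, so two predictions with the same mean but different uncertainty are still told apart.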

Second, all data has structure. Sometimes, the data structure is not very restrictive, e.g., when the data lives in a vector space, in which case any appropriate vector could a priori be a data point. However, depending on the application, the data structure could be more restrictive. For example, if we are interested in location data on Earth, then all possible data points have to lie approximately on a sphere. More generally, the data might naturally live on some low-dimensional surface. This restricts us to models that take the known geometry into account, which can be enforced through the machinery of Riemannian geometry.
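The sphere example can be made concrete with a minimal sketch, assuming the unit sphere as a stand-in for the globe. It compares the geodesic (great-circle) distance with the straight-line Euclidean distance, and shows that the Euclidean average of two points on the sphere no longer lies on the sphere. All function names are hypothetical and chosen for this illustration.

```python
import math


def to_unit_vector(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a point on the unit sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def geodesic_distance(p, q):
    """Great-circle distance on the unit sphere (angle between p and q)."""
    dot = sum(a * b for a, b in zip(p, q))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.acos(max(-1.0, min(1.0, dot)))


def euclidean_distance(p, q):
    """Straight-line distance through the interior of the sphere."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


north_pole = to_unit_vector(90.0, 0.0)
equator_point = to_unit_vector(0.0, 0.0)

# The Euclidean midpoint of two points on the sphere lies inside the sphere.
midpoint = tuple((a + b) / 2 for a, b in zip(north_pole, equator_point))
midpoint_norm = math.sqrt(sum(c * c for c in midpoint))
```

For these two points the geodesic distance is a quarter of a great circle (pi/2), while the Euclidean distance is the shorter chord (sqrt(2)), and the midpoint has norm below 1, so a model that ignores the geometry can produce values that are not valid locations.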

Supervisors
Original language: English
Status: Published - 20 Jan. 2020


ID: 233743398