High-Dimensional Analysis of f-divergence Distributionally Regularized M-estimation

CESCON, RICCARDO
2021/2022

Abstract

In recent years, Distributionally Robust Optimization (DRO) has risen to become one of the most widely used tools for robust estimation, because it enjoys desirable properties such as good out-of-sample performance and well-understood regularization effects. The estimator obtained within this framework minimizes the expected loss under the worst-case distribution among those that are close, in an f-divergence sense, to the empirical distribution, which relies only on historical data. In this thesis we propose an approach, regularized in the space of distributions, to compute the M-estimator of an unknown parameter from noisy linear measurements. The ultimate goal is to characterize the estimation error, which is in general a challenging but very important task. Our analysis is carried out in the high-dimensional regime, in which the number of measurements and the number of parameters both go to infinity while their ratio stays fixed; this ratio encodes the under- or over-parametrization of the problem. Our contribution can be summarized as follows. First, we briefly introduce the tools used in the thesis: the Convex Gaussian Min-max Theorem (CGMT), which, under the assumption of isotropic Gaussian features, allows the estimation error to be recovered by solving a deterministic program in a few scalar variables; and f-divergences, a family of measures that quantify the discrepancy between probability measures. Building on these results, we formulate the Distributionally Regularized Estimation problem and derive a dual reformulation of it. We then discuss the main challenges encountered when applying the CGMT to this problem, and we point out that the choice of the regularization parameter λ is crucial in the high-dimensional regime for obtaining a final problem that still encodes the robustness parameters. Finally, we show how the norm of the estimator's error can be recovered from a simple deterministic min-max problem.
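To make the idea of regularizing in the space of distributions more concrete, the following is a minimal sketch for one particular f-divergence, the Kullback-Leibler (KL) divergence, where the penalized worst-case loss has a well-known closed form: sup over Q of E_Q[loss] − λ·KL(Q‖P_n) equals λ·log(mean(exp(loss/λ))), with P_n the empirical (uniform) distribution. The choice of KL, the function names and the sample losses are assumptions made for illustration only; this is not the formulation or the dual derived in the thesis.

```python
import math


def kl_regularized_worst_case(losses, lam):
    """Value of sup_Q E_Q[loss] - lam * KL(Q || P_n), P_n uniform on the samples.

    Uses the closed form lam * log(mean(exp(loss / lam))), computed with a
    max-shift (log-sum-exp) for numerical stability.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    m = max(losses)
    s = sum(math.exp((l - m) / lam) for l in losses)
    return m + lam * math.log(s / len(losses))


def worst_case_weights(losses, lam):
    """Worst-case distribution Q*: weights proportional to exp(loss / lam)."""
    m = max(losses)
    w = [math.exp((l - m) / lam) for l in losses]
    total = sum(w)
    return [x / total for x in w]


# Hypothetical per-sample losses, used only to exercise the functions.
losses = [0.5, 1.0, 3.0]
lam = 0.7
value = kl_regularized_worst_case(losses, lam)
q = worst_case_weights(losses, lam)
```

In this sketch, a large λ penalizes deviations from the empirical distribution heavily, so the value approaches the plain average loss; a small λ lets the adversary concentrate mass on the largest losses, so the value approaches the maximum loss.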
2021
f-divergence
optimization
high-dimensions
Files in this item:
File: Cescon_Riccardo.pdf (embargoed until 13/06/2024)
Size: 606.11 kB
Format: Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/40463