High-Dimensional Analysis of f-divergence Distributionally Regularized M-estimation

In recent years, Distributionally Robust Optimization (DRO) has risen to become one of the most widely used tools for robust estimation, owing to attractive properties such as good out-of-sample performance and well-understood regularization effects. Within this framework, the estimator is computed by minimizing the expected loss under the worst-case distribution among those that are close, in an f-divergence sense, to the empirical distribution, which relies only on historical data. In this thesis we propose an approach, regularized in the space of distributions, to compute the M-estimator of an unknown parameter from noisy linear measurements. The ultimate goal is to characterize the estimation error, which is in general a challenging yet very important task. Our analysis is carried out in the high-dimensional regime, in which both the number of measurements and the number of parameters go to infinity while their ratio stays fixed; this ratio encodes the under- or over-parametrization of the problem. Our contribution can be summarized as follows. First, we briefly introduce the tools used in the thesis. These are the Convex Gaussian Min-max Theorem (CGMT), which, under the assumption of isotropic Gaussian features, allows the estimation error to be recovered by solving a deterministic program in a few scalar variables, and f-divergences, a family of distance-like functionals used to quantify the discrepancy between probability measures. Building on these results, we formulate the Distributionally Regularized Estimation problem and derive a dual reformulation of it. We then discuss the main challenges encountered when applying the CGMT to this problem, and we point out that the choice of the regularization parameter λ is crucial in the high-dimensional regime in order to obtain a final problem that still encodes the robustness parameters. Finally, we show how the norm of the estimator's error can be recovered from a simple deterministic min-max problem.
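
To make the notions of f-divergence and of regularization in the space of distributions more concrete, the following minimal Python sketch may help. It is not taken from the thesis: it assumes a discrete setting, a uniform empirical distribution, and the Kullback-Leibler divergence (generator f(t) = t log t) as one particular f-divergence, chosen only because the worst-case problem then has a well-known closed-form dual. The function names and the sample losses are hypothetical.

```python
import numpy as np


def f_divergence(q, p, f):
    """D_f(Q || P) = sum_i p_i * f(q_i / p_i) for discrete Q, P with all p_i > 0."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.sum(p * f(q / p)))


def kl_generator(t):
    """Generator f(t) = t * log(t) of the KL divergence, with f(0) = 0."""
    t = np.asarray(t, dtype=float)
    safe_t = np.where(t > 0, t, 1.0)  # avoids log(0); contributes 0 where t == 0
    return safe_t * np.log(safe_t) * (t > 0)


def kl_regularized_worst_case(losses, lam):
    """Solve sup_Q { E_Q[loss] - lam * KL(Q || P_n) } with P_n uniform on the samples.

    Assumption: KL is used as the f-divergence. In that case the supremum equals
    lam * log(mean(exp(loss / lam))) and is attained at q_i proportional to
    exp(loss_i / lam).
    """
    losses = np.asarray(losses, dtype=float)
    z = losses / lam
    z_max = z.max()                      # shift for numerical stability
    w = np.exp(z - z_max)
    value = lam * (z_max + np.log(w.mean()))
    q_star = w / w.sum()                 # worst-case reweighting of the samples
    return float(value), q_star


# Hypothetical per-sample losses, e.g. squared residuals of a linear model.
sample_losses = np.array([0.1, 0.5, 2.0, 0.3])
lam = 0.7                                # hypothetical regularization parameter
dual_value, q_star = kl_regularized_worst_case(sample_losses, lam)

# Evaluate the primal objective at the claimed maximizer and compare.
p_n = np.full(sample_losses.size, 1.0 / sample_losses.size)
primal_value = q_star @ sample_losses - lam * f_divergence(q_star, p_n, kl_generator)
print(dual_value, primal_value)
```

In this sketch a larger λ penalizes departures from the empirical distribution more heavily, so the worst-case value moves toward the plain empirical mean of the losses, while a smaller λ concentrates the worst-case weights on the largest losses.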

CESCON, RICCARDO

2021/2022

Faculty/Department: Dipartimento di Ingegneria dell'Informazione - DEI
Degree programme: CONTROL SYSTEMS ENGINEERING, Master's degree (Laurea Magistrale, D.M. 270/2004)
Academic year: 2021
English title: High-Dimensional Analysis of f-divergence Distributionally Regularized M-estimation
Keywords: f-divergence, optimization, high-dimensions
Supervisor: FERRANTE, AUGUSTO
Appears in types: Master's degree theses (Lauree magistrali)

Files in this item:

File	Size	Format
Cescon_Riccardo.pdf (Open Access from 14/06/2024)	606.11 kB	Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/40463