Extracting Cosmological Information from Large-Scale Structure Morphology using Deep Learning

The Large-Scale Structure (LSS) of the Universe describes the distribution of galaxies and matter on scales much larger than individual galaxies or clusters. This cosmic web forms an intricate network of voids, sheets, filaments, and knots. Its morphology encodes key information about structure formation and cosmological parameters. Classifying these structures with high precision and accuracy is therefore essential for modern cosmological analysis. In this work, we present a reproducible pipeline based on 3D Deep Learning to classify LSS into these four classes.
We used N-body simulations from the Quijote Simulations suite. In particular, we analyzed a cubic volume with a side length of 1 h⁻¹ Gpc at redshift z = 0. Using the Piecewise Cubic Spline (PCS) mass assignment method, we distributed the particles from the raw snapshots onto a 128³ voxel grid. From this grid, we computed the velocity, density, and overdensity fields required by the subsequent steps.
To train our supervised model, we generated ground-truth labels using the Hoffman method, which derives the classification from the eigenvalues of the shear tensor.
We implemented a 3D U-Net architecture, which is widely used for image segmentation, to capture the multi-scale spatial correlations of the cosmic web. Before training the model, we applied data augmentation with random 90-degree rotations and axis flips. The U-Net begins with 64 initial filters and descends through four downsampling blocks to a bottleneck of 1024 channels, followed by four corresponding upsampling blocks. Because the data are heavily class-imbalanced, with voids occupying the majority of the volume, we optimized the model with a hybrid loss function combining the standard cross-entropy and the Dice loss. Moreover, we applied a weighting scheme based on the effective number of samples per class to avoid biasing the model toward the majority classes.
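The class weighting based on the effective number of samples can be illustrated with a short sketch. It assumes the common formulation in which the effective number of a class with n samples is (1 - β^n) / (1 - β) and the weight is its inverse. The value of β, the voxel counts, and the function name `class_balanced_weights` are hypothetical and are not taken from the thesis.

```python
def class_balanced_weights(counts, beta=0.999999):
    """Return per-class weights proportional to 1 / effective number of samples.

    Assumed formulation: E_n = (1 - beta**n) / (1 - beta).
    The weights are rescaled so that they sum to the number of classes.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    if any(n <= 0 for n in counts):
        raise ValueError("every class needs at least one sample")
    effective = [(1.0 - beta ** n) / (1.0 - beta) for n in counts]
    raw = [1.0 / e for e in effective]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


# Hypothetical voxel counts for a 128^3 grid, ordered as
# void, sheet, filament, knot (they sum to 128**3 = 2_097_152).
counts = [1_200_000, 600_000, 250_000, 47_152]
weights = class_balanced_weights(counts)

for name, w in zip(["void", "sheet", "filament", "knot"], weights):
    print(f"{name:9s} weight = {w:.4f}")
```

With β = 0 every class receives the same weight, while β close to 1 approaches inverse-frequency weighting, so β controls how strongly the rare classes are up-weighted in the loss.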
During inference on the test set, the U-Net achieved a mean Dice coefficient of 0.788, with per-class scores ranging from 0.931 for voids to 0.633 for knots. Despite the strong class imbalance, these scores indicate that the class-balancing strategy is effective in maintaining high segmentation fidelity across all scales. The main strength of this work is the physical consistency of the network’s predictions: the Volume and Mass Filling Fractions (VFF, MFF) computed from the U-Net predictions agree with those of the ground-truth Hoffman classification to within 10⁻³, which shows that the model successfully captured the underlying density-morphology relation. This confirms the 3D U-Net as a robust, physically consistent tool for high-resolution cosmic web segmentation.
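The two kinds of metric quoted above can be sketched in a few lines. This is a minimal illustration and not the thesis code: it assumes labels are flattened integer sequences (0 = void, 1 = sheet, 2 = filament, 3 = knot), that all voxels have equal volume, and that the density value of a voxel is proportional to its mass. The toy data and the function names are hypothetical.

```python
def dice_per_class(pred, truth, n_classes=4):
    """Dice coefficient 2|P ∩ T| / (|P| + |T|) for each class label."""
    scores = []
    for c in range(n_classes):
        n_pred = sum(1 for p in pred if p == c)
        n_true = sum(1 for t in truth if t == c)
        overlap = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        total = n_pred + n_true
        scores.append(2.0 * overlap / total if total else float("nan"))
    return scores


def filling_fractions(labels, density, n_classes=4):
    """Return (VFF, MFF): the fraction of volume and of mass in each class."""
    volume = [0.0] * n_classes
    mass = [0.0] * n_classes
    for lab, rho in zip(labels, density):
        volume[lab] += 1.0   # equal-volume voxels
        mass[lab] += rho     # density stands in for voxel mass
    total_mass = sum(mass)
    vff = [v / len(labels) for v in volume]
    mff = [m / total_mass for m in mass]
    return vff, mff


# Toy example with eight voxels (hypothetical values).
truth = [0, 0, 0, 0, 1, 1, 2, 3]
pred = [0, 0, 0, 1, 1, 1, 2, 3]
density = [0.2, 0.3, 0.1, 0.4, 1.0, 1.5, 3.0, 9.5]

dice = dice_per_class(pred, truth)
vff_true, mff_true = filling_fractions(truth, density)
vff_pred, mff_pred = filling_fractions(pred, density)
print("Dice per class:", dice)
print("max |VFF difference|:", max(abs(a - b) for a, b in zip(vff_true, vff_pred)))
print("max |MFF difference|:", max(abs(a - b) for a, b in zip(mff_true, mff_pred)))
```

Comparing the filling fractions of the prediction with those of the ground truth checks global, physically meaningful statistics, which complements the voxel-by-voxel overlap measured by the Dice coefficient.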

BOCCANERA, EUGENIA

2025/2026

Record

DC record

	Faculty/Department
	
				Dipartimento di Fisica e Astronomia "Galileo Galilei" - DFA
			
	Degree programme
	
				PHYSICS OF DATA Master's degree (Laurea Magistrale, D.M. 270/2004)
			
	Academic year
	
				2025
			
	English title
	
				Extracting Cosmological Information from Large-Scale Structure Morphology using Deep Learning
			
	Keywords
	
				Cosmic Web
Deep Learning
3D U-Net
Semantic Segmentation
			
	Supervisor
	
				LIGUORI, MICHELE
			
	Co-supervisor
	
				SEMENZATO, FEDERICO
			
	Appears in collections:
	
				Master's theses

Files in this item:

File	Size	Format
Boccanera_Eugenia.pdf (open access)	4.6 MB	Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 License.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/107349