Radiomics and machine learning methods for 2-year overall survival prediction in non-small cell lung cancer patients

Braghetto, Anna
2021/2022

Abstract

In recent years, the application of radiomics to lung cancer has shown encouraging results in predicting histological outcomes, survival times, disease staging, and more. In this thesis, radiomics and deep learning applications are compared by analyzing their performance in predicting 2-year overall survival (OS) in patients affected by non-small cell lung cancer (NSCLC). The dataset under examination contains 417 patients with both clinical data and computed tomography (CT) examinations of the chest. Radiomics extracts handcrafted features from the three-dimensional tumor region of interest (ROI). It is the approach that best predicts 2-year overall survival, with an area under the receiver operating characteristic curve (AUC) of 0.683 on the training set and 0.652 on the test set. Concerning deep learning applications, two methods are considered: deep features and convolutional neural networks (CNNs). The first method is similar to radiomics, but replaces handcrafted features with deep features extracted from the two-dimensional slices that make up the three-dimensional tumor ROI. In particular, two main classes of deep features are considered: the latent variables returned by a convolutional autoencoder (CAE) and the inner features learned by a pre-trained CNN. The latent variables returned by the CAE yield an AUC of 0.692 on the training set and 0.631 on the test set. The second method classifies the CT images directly by means of CNNs. CNNs perform better than deep features, reaching an AUC of 0.692 on the training set and 0.644 on the test set. For CNNs, the impact of using generative adversarial networks (GANs) to increase the dataset size is also investigated. This analysis results in poorly defined images, where the synthesis of the bones is incompatible with the actual structure of the tumor mass.
In general, deep learning applications perform worse than radiomics, with both a lower test AUC and a larger generalization gap between training and test sets. The main issue encountered in their training is the limited number of patients, which is responsible for overfitting in the CNNs, inaccurate reconstructions in the CAE, and poor synthetic images from the GANs. This limitation makes it necessary to reduce model complexity by analyzing the tumor masses in two dimensions, in contrast with the three-dimensional study performed by radiomics. However, the two-dimensional restriction gives an incomplete description of the tumor masses, reducing the predictive capability of the deep learning applications. In summary, our analysis, spanning more than 7000 combinations, shows that with the current dataset it is only possible to match the performance of previous works. This detailed survey suggests that we have reached the state of the art in terms of analysis and that more data are needed to improve the predictions.
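
The comparison drawn in the abstract can be checked with a few lines of arithmetic. The Python sketch below uses only the AUC values reported above. The dictionary layout and the names `reported_auc` and `generalization_gap` are illustrative assumptions, not part of the thesis, and the generalization gap is taken here simply as training AUC minus test AUC.

```python
# AUC values as reported in the abstract: (training, test).
reported_auc = {
    "radiomics": (0.683, 0.652),
    "CAE deep features": (0.692, 0.631),
    "CNN": (0.692, 0.644),
}


def generalization_gap(train_auc, test_auc):
    """Training AUC minus test AUC, rounded to three decimals.

    This definition of the gap is an assumption made for illustration.
    """
    return round(train_auc - test_auc, 3)


# Gap for each approach, and the approach with the highest test AUC.
gaps = {name: generalization_gap(*auc) for name, auc in reported_auc.items()}
best_test = max(reported_auc, key=lambda name: reported_auc[name][1])

for name, (train_auc, test_auc) in reported_auc.items():
    print(f"{name}: train={train_auc:.3f} test={test_auc:.3f} gap={gaps[name]:.3f}")
print("highest test AUC:", best_test)
```

Under this definition the gaps are 0.031 for radiomics, 0.061 for the CAE deep features, and 0.048 for the CNN, so radiomics has both the highest test AUC and the smallest gap among the three reported results.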
2021-04
106
Radiomic analysis, Deep neural networks, Lung cancer, Overall survival, Classification
Files in this item:
File: Tesi_Braghetto_Anna.pdf (open access)
Size: 10.14 MB
Format: Adobe PDF

The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/21201