DeTT: Data Efficient Temporal Transformers for Surgical Phase Recognition

KULAZHENKOV, IVAN

2022/2023

Abstract

Laparoscopic cholecystectomy is a minimally invasive surgical procedure to remove the gallbladder. It is performed for several reasons, most commonly to treat the pain and risk of infection associated with gallstones. One of the most common complications during this procedure is bile duct injury, which can have long-term consequences for the patient's health. Such procedural injuries can be prevented, or at the very least minimised, by adhering to a uniform, standardised protocol that divides the surgery into operational phases. Much progress has been made in manual workflow organisation during the operation, and the recent advent of robotic surgery provides a wealth of surgical data captured throughout the procedure. Given these datasets, we can leverage Artificial Intelligence and Machine Learning to propose an automated approach to surgical phase recognition. This thesis explores the feasibility of using Deep Learning techniques to capture discriminating features in the data and to automatically segment a complete cholecystectomy procedure into its constituent phases. To train and evaluate our architectures, we use a publicly available dataset of surgical videos and their respective telemetry signals. The outcome of this thesis is therefore not just an evaluation of model architectures but also an understanding of the optimal approach to data ingestion, be it telemetry, video or a combination of the two. We investigate current state-of-the-art approaches and propose a new architecture for this problem. The new approach is built around the DeiT Transformer model, augmented by a long short-term memory (LSTM) network. The output of the transformer and LSTM is then passed to a hidden Markov model with the goal of refining the output of the classifier. We refer to this approach as the Data Efficient Temporal Transformer (DeTT) model and propose it as our final architecture for surgical workflow recognition in the cholecystectomy procedure.
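To illustrate the last stage of the pipeline described above, the following is a minimal sketch of how a hidden Markov model can refine per-frame classifier outputs by Viterbi decoding. It is not the thesis implementation: the function name `viterbi_smooth`, the three-phase toy example, the "sticky" transition matrix and the uniform prior are all assumptions made for illustration, and the classifier probabilities are used directly as emission scores, which is a common approximation rather than a detail confirmed by the abstract.

```python
import math


def viterbi_smooth(frame_probs, transition, prior, eps=1e-12):
    """Most likely phase sequence given per-frame class probabilities.

    frame_probs: list of per-frame probability lists (hypothetical classifier output)
    transition:  transition[i][j] = assumed P(phase j at t | phase i at t-1)
    prior:       assumed initial phase distribution
    """
    n_phases = len(prior)
    # Log-score of the best path ending in each phase at the first frame.
    score = [math.log(prior[j] + eps) + math.log(frame_probs[0][j] + eps)
             for j in range(n_phases)]
    backptrs = []
    for probs in frame_probs[1:]:
        new_score, ptr = [], []
        for j in range(n_phases):
            # Best previous phase i for arriving in phase j at this frame.
            best_i = max(range(n_phases),
                         key=lambda i: score[i] + math.log(transition[i][j] + eps))
            ptr.append(best_i)
            new_score.append(score[best_i]
                             + math.log(transition[best_i][j] + eps)
                             + math.log(probs[j] + eps))
        score = new_score
        backptrs.append(ptr)
    # Backtrack from the best final phase.
    path = [max(range(n_phases), key=lambda j: score[j])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Toy example (made-up numbers): phase 0, a one-frame flicker, then phase 1.
frame_probs = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],  # isolated misclassification
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.8, 0.1],
]
# "Sticky" transitions: phases tend to persist from one frame to the next.
transition = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
]
prior = [1 / 3, 1 / 3, 1 / 3]

raw = [max(range(3), key=lambda j: p[j]) for p in frame_probs]
smoothed = viterbi_smooth(frame_probs, transition, prior)
print(raw)       # [0, 0, 1, 0, 1, 1, 1]
print(smoothed)  # [0, 0, 0, 0, 1, 1, 1]
```

In this toy case the frame-wise argmax jumps to phase 1 for a single frame, while the decoded sequence stays in phase 0 until the sustained change, because the transition matrix penalises rapid phase switches.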

Record

	Faculty/Department
	
				Dipartimento di Matematica "Tullio Levi-Civita" - DM
			
	Degree programme
	
				DATA SCIENCE Master's Degree (Laurea Magistrale, D.M. 270/2004)
			
	Academic year
	
				2022
			
	English title
	
				DeTT: Data Efficient Temporal Transformers for Surgical Phase Recognition
			
	Keywords
	
				Deep Learning
				Surgical Data Science
				Transformers
			
	Supervisor
	
				TESTOLIN, ALBERTO
			
	Appears in collections:
	
				Master's theses

Files in this item:

File	Size	Format
Data_Science_MsC_Thesis____IK.pdf (restricted access)	6.25 MB	Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/50207