DeTT: Data Efficient Temporal Transformers for Surgical Phase Recognition

KULAZHENKOV, IVAN
2022/2023

Abstract

Laparoscopic cholecystectomy is a minimally invasive surgical procedure to remove the gallbladder. It is performed for several reasons, most commonly to treat the pain and risk of infection associated with gallstones. One of the most common complications of the procedure is bile duct injury, which can have long-term consequences for the patient’s health. Such injuries can be prevented, or at the very least minimised, by adhering to a uniform, standardised protocol that divides the surgery into operational phases. Much progress has been made in manual workflow organisation during the operation, and the recent advent of robotic surgery provides a wealth of surgical data captured throughout the procedure. Given these datasets, we can leverage Artificial Intelligence and Machine Learning to propose an automated approach to surgical phase recognition. This thesis will explore the feasibility of using Deep Learning techniques to capture discriminating features in the data and automatically segment a complete cholecystectomy procedure into its constituent phases. To train and evaluate our architectures, we will use a publicly available dataset of surgical videos and their respective telemetry signals. The outcome of this thesis is therefore not just an evaluation of models from an architectural perspective but also an understanding of the optimal approach to data ingestion, be it telemetry, video or a combination of the two. We will investigate current state-of-the-art approaches whilst proposing a new architecture for this problem. The new approach will be built around the DeiT Transformer model augmented by a long short-term memory (LSTM) network. The output of the transformer and LSTM will then be passed to a hidden Markov model with the goal of refining the output of the classifier. We refer to this approach as the Data Efficient Temporal Transformer (DeTT) model and propose it as our final architecture for surgical workflow recognition in the cholecystectomy procedure.
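To make the final step of the pipeline concrete, the sketch below shows one common way a hidden Markov model can smooth per-frame phase predictions: Viterbi decoding with a "sticky" transition matrix. It is only an illustration under stated assumptions, not the method used in the thesis. The function name `viterbi_smooth`, the `stay_prob` parameter, the uniform initial distribution, and the use of classifier probabilities as stand-ins for emission likelihoods are all hypothetical choices made for this example.

```python
import math


def viterbi_smooth(frame_probs, stay_prob=0.95):
    """Return the most likely phase sequence for a list of per-frame probabilities.

    frame_probs: list of lists, one probability per phase for each frame
                 (a stand-in for the classifier output; assumed, not from the thesis).
    stay_prob:   assumed probability of remaining in the same phase between
                 consecutive frames; the remainder is split over the other phases.
    """
    n_phases = len(frame_probs[0])
    eps = 1e-12  # avoids log(0) for zero probabilities
    switch_prob = (1.0 - stay_prob) / max(n_phases - 1, 1)
    log_stay = math.log(stay_prob + eps)
    log_switch = math.log(switch_prob + eps)

    def log_trans(i, j):
        # Log-probability of moving from phase i to phase j.
        return log_stay if i == j else log_switch

    # Initialise with the first frame (uniform prior over phases, so it is omitted).
    scores = [math.log(p + eps) for p in frame_probs[0]]
    backpointers = []

    # Forward pass: keep the best predecessor for every phase at every frame.
    for probs in frame_probs[1:]:
        new_scores, pointers = [], []
        for j in range(n_phases):
            best_i = max(range(n_phases), key=lambda i: scores[i] + log_trans(i, j))
            new_scores.append(scores[best_i] + log_trans(best_i, j) + math.log(probs[j] + eps))
            pointers.append(best_i)
        scores = new_scores
        backpointers.append(pointers)

    # Backward pass: follow the stored pointers from the best final phase.
    state = max(range(n_phases), key=lambda j: scores[j])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Called on a sequence in which a single frame briefly favours the wrong phase, the decoder tends to suppress that flicker, because switching phases twice costs more than accepting one less likely frame. A larger `stay_prob` gives stronger smoothing at the risk of delaying genuine phase transitions.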
2022
Deep Learning
Surgical Data Science
Transformers
Files in this item:
File: Data_Science_MsC_Thesis____IK.pdf (restricted access)
Size: 6.25 MB
Format: Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/50207