Model Discovery using Dynamic Mode Decomposition
MARANGON, TERESA
2023/2024
Abstract
This thesis explores different approaches to video processing. We first look at Robust Principal Component Analysis (RPCA), a method that is highly competitive with state-of-the-art computer vision procedures for video processing. RPCA separates the data matrix into two matrices, found by minimizing a combination of two norms: the nuclear norm and the l_1-norm. The first matrix is low-rank and contains the information related to the background; the second is sparse and represents the foreground of the video. The minimization problem also involves a parameter λ, which the user can tune to adjust the precision of the reconstruction. The developers of the method suggest a specific value of λ that depends on the dimensions of the data matrix; in the thesis we also ran the method with other values, some lower and some higher than the suggested one, to see how the reconstruction changes. Lowering the parameter produced some black pixels in the background, while increasing it caused the foreground objects to leave a shadow-like trace on the background. We explored RPCA because it inspired the application of the Dynamic Mode Decomposition (DMD) to video processing. The DMD, which originated in the fluid dynamics community and is widely used for data-driven analysis, is interesting not only for its physical relevance but also for its applicability in many different fields. The focus of this thesis is the use of the DMD for video processing, specifically the background/foreground separation required for surveillance videos. The DMD is a linear algebra method that allows us to study a system without knowing the equations that describe its dynamics.
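The RPCA decomposition summarized above can be sketched in a few lines. What follows is a minimal sketch, not the thesis code. It assumes the principal component pursuit formulation, min ||L||_* + λ||S||_1 subject to L + S = M, solved with a basic augmented-Lagrangian (ADMM-style) iteration. The function names (`rpca`, `shrink`, `svd_shrink`), the default λ = 1/sqrt(max(m, n)) (a commonly cited choice, assumed here to be the kind of dimension-dependent value the abstract refers to), and the penalty parameter `mu` are assumptions made for illustration.

```python
import numpy as np

def shrink(A, tau):
    """Soft-thresholding: proximal operator of tau * (l1-norm)."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def svd_shrink(A, tau):
    """Singular value thresholding: proximal operator of tau * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Split M into a low-rank part L (background) and a sparse part S (foreground)."""
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    if lam is None:
        # Assumed default: weight tied to the data-matrix dimensions.
        lam = 1.0 / np.sqrt(max(m, n))
    if mu is None:
        # Assumed heuristic for the augmented-Lagrangian penalty.
        mu = m * n / (4.0 * np.abs(M).sum())
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # Lagrange multipliers for the constraint L + S = M
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        L = svd_shrink(M - S + Y / mu, 1.0 / mu)   # nuclear-norm step
        S = shrink(M - L + Y / mu, lam / mu)       # l1-norm step
        residual = M - L - S
        Y = Y + mu * residual                      # multiplier update
        if np.linalg.norm(residual) <= tol * norm_M:
            break
    return L, S
```

With the columns of `M` holding vectorised video frames, calling `rpca(M, lam=...)` with values below or above the default is one way to set up the kind of parameter experiment described above.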
The method computes the Singular Value Decomposition (SVD) of a data matrix X, truncates the SVD to a rank chosen by the user or given by an estimate of the optimal rank, and then reconstructs the dynamics. Like RPCA, when applied to video processing the DMD separates the data into two matrices, one low-rank and one sparse. The two methods differ in how these matrices are computed: RPCA solves a minimization problem, whereas the DMD computes an SVD from which the DMD eigenvalues and eigenvectors (modes) are obtained. From the magnitude of the eigenvalues, written in a form that resembles a Fourier series, the method detects which modes belong to the background and which to the foreground. The modes whose Fourier-like frequency has magnitude close to zero (equivalently, whose discrete-time eigenvalue is close to one) are assigned to the low-rank matrix, while the others, which represent the faster-changing part of the dynamics, are assigned to the foreground model. More precisely, the foreground model is obtained by subtracting the background from the original video. There are many variants of the DMD, as shown in the introduction of the thesis, but here we focus only on compressed sensing DMD and compressive DMD. Both work with measurement matrices that compress the so-called full-state data in order to reduce the computational cost of the DMD. It is known that most natural signals, such as images or audio, become sparse with only a small loss of information when expressed in a suitable basis, for example the Fourier basis or a wavelet basis.
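The SVD-based background/foreground split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation. It uses the standard exact-DMD construction on pairs of consecutive frames and does not cover the compressed variants. The function name `dmd_separate`, the frequency threshold `eps`, the time step `dt`, and the choice to keep only the real part of the background are all hypothetical.

```python
import numpy as np

def dmd_separate(frames, r, dt=1.0, eps=1e-2):
    """Split a video into background and foreground with a rank-r DMD.

    frames: array of shape (n_pixels, n_frames), one vectorised frame per column.
    Returns (background, foreground, omega), where omega holds the
    continuous-time (Fourier-like) frequencies of the DMD modes.
    """
    X = np.asarray(frames, dtype=float)
    X1, X2 = X[:, :-1], X[:, 1:]                 # snapshot pairs x_k -> x_{k+1}

    # Truncated SVD of the first snapshot matrix.
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r, :]

    # Reduced linear operator and its eigendecomposition.
    B = X2 @ Vh.conj().T / s                     # X2 V Sigma^{-1}
    A_tilde = U.conj().T @ B
    evals, W = np.linalg.eig(A_tilde)
    Phi = B @ W                                  # exact DMD modes

    # Discrete eigenvalues -> continuous-time frequencies.
    omega = np.log(evals.astype(complex)) / dt

    # Mode amplitudes fitted to the first frame.
    b = np.linalg.lstsq(Phi, X[:, 0].astype(complex), rcond=None)[0]

    # Time evolution exp(omega_j * t_k) of every mode.
    t = dt * np.arange(X.shape[1])
    dynamics = np.exp(np.outer(omega, t))

    # Background: modes whose frequency magnitude is close to zero.
    bg = np.abs(omega) < eps
    background = ((Phi[:, bg] * b[bg]) @ dynamics[bg, :]).real

    # Foreground: what remains after subtracting the background.
    foreground = X - background
    return background, foreground, omega
```

On a synthetic video made of a static image plus an oscillating component, the single mode with near-zero frequency reproduces the static image, and the remainder ends up in `foreground`.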
This is remarkable, not only because the data can be transformed into their sparse counterpart without great effort, but also because, when we only have access to partial data (as in ocean or atmospheric sampling), we can treat the data as already expressed in a sparse basis and reconstruct the full-state data as we would with the complete set of samples. The thesis concludes with an example application of compressed DMD to a video from SegTrackV2.

| File | Size | Format | |
|---|---|---|---|
| Model_discovery_using_Dynamic_Mode_Decomposition___Tesi_Triennale___Marangon_Teresa_consegna.pdf (open access) | 2.03 MB | Adobe PDF | View/Open |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/80258