Time Series Event Classification with Machine Learning
ALIJA, VULNET
2021/2022
Abstract
Time series of current versus time are generated by nanopore-based sensing instruments that measure analytes. The training dataset contains three classes: "no event" when no analytes are detected, "event A" when analytes of type A are detected, and "event B" when analytes of type B are detected. The unseen time series datasets are unlabeled, but the expected ratio of the classes in each is known. The unlabeled time series are classified into the three classes using machine learning. The measurements are not time-dependent, so the time dimension is removed, which results in a univariate time series that is further split into overlapping sequences using sliding windows. The data is not normalized, as normalization causes the classifiers to be biased toward one class. Four classifiers are trained on the windows and compared: fully connected neural networks, random forest, logistic regression, and long short-term memory. Logistic regression with a window size of 0.1 seconds and balanced weights gives the best results of the four. The predicted ratios for the three unlabeled datasets are 2.4:1, 0.8:1, and 0.5:1 against expected ratios of 3:1, 3:1, and 1:1, respectively. The other classifiers require further experimentation with hyperparameter tuning to produce more satisfactory results.
File: Alija_Vulnet.pdf (restricted access), 2.67 MB, Adobe PDF
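The windowing, class weighting, and ratio steps described in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the sampling rate, the window step, the toy data, and all function names are hypothetical, the weight formula is one common "balanced" heuristic (n_samples / (n_classes * class_count)) and may differ from what the thesis used, and the ratio is assumed to compare windows predicted as event A with windows predicted as event B.

```python
from collections import Counter


def sliding_windows(series, window_size, step):
    """Split a univariate series into overlapping windows of fixed length."""
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    return [
        series[start:start + window_size]
        for start in range(0, len(series) - window_size + 1, step)
    ]


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get larger weights, so a classifier trained with these
    weights is penalized more for missing them.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def event_ratio(predictions, a="event A", b="event B"):
    """Ratio of windows predicted as class `a` to windows predicted as class `b`."""
    counts = Counter(predictions)
    if counts[b] == 0:
        raise ValueError("no windows predicted as " + b)
    return counts[a] / counts[b]


# Hypothetical acquisition settings; the abstract gives only the 0.1 s window.
SAMPLING_RATE_HZ = 100
WINDOW_SECONDS = 0.1
window_size = int(round(SAMPLING_RATE_HZ * WINDOW_SECONDS))  # 10 samples

# Toy current trace with the time column already dropped (values are made up).
current = [0.1 * i for i in range(20)]
windows = sliding_windows(current, window_size, step=5)  # 50% overlap (assumed)

# Toy window labels showing how an imbalanced training set is reweighted.
train_labels = ["no event"] * 6 + ["event A"] * 3 + ["event B"] * 1
weights = balanced_class_weights(train_labels)

# Toy predictions for an unlabeled dataset.
predicted = ["event A", "event A", "event A", "event B", "no event"]
ratio = event_ratio(predicted)

print(len(windows), weights, ratio)
```

In this sketch the rare "event B" class receives the largest weight, which is the effect balanced weighting is meant to have on an imbalanced training set. The predicted ratio can then be compared with the expected ratio for each unlabeled dataset.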
https://hdl.handle.net/20.500.12608/42440