Machine Learning for Ad-hoc Functional Association of Microbes: An Anaerobic Digestion Case Study
MOYO, CLAUDIA NOMAGUGU
2024/2025
Abstract
Linking microbial species to their specific functions is essential for understanding ecological processes across different environments. Machine learning has emerged as a powerful technique for functional association because it can leverage information from reference databases. However, its effectiveness depends on database accuracy and completeness: gaps in the databases bias predictions toward well-studied functions and make collecting training data particularly challenging. In this project, we extended a previously developed machine learning tool for microbial functional classification from gene annotations to include less-explored functions. While this expansion approach can be applied to any desired phenotype, we focused on functions relevant to anaerobic digestion, a crucial process for the biogeochemical carbon cycle and for the functioning of the gut microbiome. Through extensive literature mining, we extracted information on nine key functions and enhanced the tool by developing and training optimized, function-specific models. The expanded tool performed well on anaerobic digestion-related functions, offering robust predictions in seven out of nine classes. Our results demonstrate the broad applicability of machine learning techniques in microbial functional association, ultimately contributing to more comprehensive insights into microbial communities and their roles in complex ecological systems.

File | Access | Size | Format |
---|---|---|---|
Moyo_ClaudiaNomagugu.pdf | restricted access | 1.81 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 License.
https://hdl.handle.net/20.500.12608/83173