Machine Learning for Ad-hoc Functional Association of Microbes: An Anaerobic Digestion Case Study

MOYO, CLAUDIA NOMAGUGU
2024/2025

Abstract

Linking microbial species to their specific functions is essential for understanding ecological processes across environments. Machine learning has emerged as a powerful technique for functional association because it can leverage information from reference databases. However, its effectiveness depends on database accuracy and completeness: gaps and errors can bias predictions toward well-studied functions and make collecting training data for other functions particularly challenging. In this project, we extended a previously developed machine learning tool for microbial functional classification from gene annotations so that it covers less-explored functions. Although this expansion approach can be applied to any desired phenotype, we focused on functions relevant to anaerobic digestion, a crucial process for the biogeochemical carbon cycle and for the functioning of the gut microbiome. Through extensive literature mining, we extracted information on nine key functions and extended the tool by developing and training function-specific optimized models. The expanded tool gave robust predictions for seven of the nine anaerobic digestion-related classes. Our results demonstrate the broad applicability of machine learning to microbial functional association and contribute to more comprehensive insights into microbial communities and their roles in complex ecological systems.
2024
Bioinformatics
Machine learning
Metagenomics
Files in this item:
File: Moyo_ClaudiaNomagugu.pdf (restricted access)
Size: 1.81 MB
Format: Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/83173