Ad-hoc Biomedical Information Retrieval for Global Pandemics: A Study of Methods Based on the TREC-COVID test collection
VIRGINIO, GIACOMO
2021/2022
Abstract
The TREC-COVID Challenge aims to build search engines that effectively and efficiently retrieve biomedical information, which is being produced at an unprecedented rate. This work focuses on retrieval effectiveness. The search engine is based on Elasticsearch. A range of information retrieval techniques is tested to identify those that improve performance. Effectiveness is measured with three evaluation measures: P@20, MAP, and BPref. The techniques that improve the search are custom analyzers, filters, relevance feedback, and reciprocal rank fusion. Other tested techniques, which yield negligible improvement, are field boosting, bigrams, and the distance feature. Finally, the results are compared with those obtained by other participants in the Challenge.

File | Access | Size | Format |
---|---|---|---|
Virginio_Giacomo.pdf | open access | 508.93 kB | Adobe PDF |
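Reciprocal rank fusion, one of the techniques the abstract reports as beneficial, merges several ranked result lists by summing a reciprocal-rank score for each document across the lists. The following is a minimal sketch under stated assumptions, not the thesis's implementation: the function name, the document IDs, and the smoothing constant `k = 60` (a commonly used default) are illustrative choices that do not come from this record.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    with ranks starting at 1. k = 60 is an assumed default, not a value
    taken from the thesis.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical runs, e.g. from two differently configured queries.
run_a = ["d1", "d2", "d3"]
run_b = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([run_a, run_b])
```

Documents ranked highly in more than one list accumulate the largest scores, so the fused ranking rewards agreement between runs without requiring their raw scores to be comparable.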
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/11348