Cross-Provider POI Categories Mapping with an Embedding Approach

MARRAS, ERICA
2023/2024

Abstract

Points of Interest (POIs) are essential components of maps: they frequently serve as navigation destinations and are a main reason users turn to mapping services. Categorizing POIs is both delicate and crucial because the category, together with the POI's location and name, uniquely identifies each point. In TomTom's Orbis map quality analyses, data obtained from different mapping providers can only be compared if their differing categorization systems are aligned accurately and efficiently. This thesis develops a robust and consistent framework for mapping categories across multiple providers, replacing the current manual, category-by-category approach. The framework is designed to be adaptable, so that new categories or additional providers can be accommodated and the approach generalizes well. The proposed methodology relies on embeddings, vector representations that capture semantic meaning, derived from pre-trained SBERT (Sentence-BERT) models. Several pre-processing approaches for the categories are examined, including prompting techniques that enrich their contextual meaning. A vector database of category embeddings is then constructed, analyzed, and evaluated. This evaluation highlights the strengths and weaknesses of the categorization framework and points to practical strategies for a more refined alignment of categories.
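The matching step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the example category labels, and the `toy_embed` encoder are all assumptions made for illustration. `toy_embed` is a simple bag-of-words stand-in so that the example runs without downloading a model; in the setting of the abstract it would be replaced by a pre-trained SBERT encoder.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in encoder: bag-of-words counts, NOT an SBERT model."""
    return dict(Counter(re.findall(r"[a-z]+", text.lower())))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def map_categories(
    source: List[str],
    target: List[str],
    embed: Callable[[str], Vector] = toy_embed,
) -> Dict[str, Tuple[str, float]]:
    """Map each source category to its most similar target category."""
    # Embed the target taxonomy once; this plays the role of the vector database.
    target_vecs = [(name, embed(name)) for name in target]
    mapping = {}
    for name in source:
        vec = embed(name)
        best_name, best_vec = max(target_vecs, key=lambda t: cosine(vec, t[1]))
        mapping[name] = (best_name, cosine(vec, best_vec))
    return mapping


# Illustrative category labels (assumed, not taken from any provider).
provider_a = ["Italian restaurant", "petrol station", "car park"]
provider_b = ["restaurant", "gas station", "parking garage", "car rental"]
mapping = map_categories(provider_a, provider_b)
for src, (dst, score) in mapping.items():
    print(f"{src!r} -> {dst!r} (cosine {score:.2f})")
```

With this purely lexical stand-in, "car park" is matched to "car rental" rather than "parking garage" because the two labels share the word "car". Mismatches of this kind are the motivation for using semantic embeddings instead of word overlap.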
2023
Embeddings
Categories mapping
NLP
Files in this item:
File: Marras_Erica.pdf (restricted access)
Size: 4.94 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/80895