Triangle counting is one of the major graph problems with several applications in social network analysis, anomaly detection, link recommendation, query optimization, pattern classification, and other implementations in data mining and big data fields. One of the most popular computational models considered is edge streaming. However, as the size of the graph grows and involves millions of edges and nodes, several challenges arise, such as memory space limitations and long running time, and therefore sampling techniques are needed to estimate the number of triangles appearing in our graph. In this work we are going to implement a sampling algorithm using Waiting Room to store most recent edges, and making use of a Machine Learning oracle model that provides predictions about the heaviness of edges to output more accurate estimates, while still maintaining rigorous guarantees on the approximation of our result.

Triangle counting is one of the major graph problems with several applications in social network analysis, anomaly detection, link recommendation, query optimization, pattern classification, and other implementations in data mining and big data fields. One of the most popular computational models considered is edge streaming. However, as the size of the graph grows and involves millions of edges and nodes, several challenges arise, such as memory space limitations and long running time, and therefore sampling techniques are needed to estimate the number of triangles appearing in our graph. In this work we are going to implement a sampling algorithm using Waiting Room to store most recent edges, and making use of a Machine Learning oracle model that provides predictions about the heaviness of edges to output more accurate estimates, while still maintaining rigorous guarantees on the approximation of our result.

Counting Triangles in Graph Streams using Machine Learning Predictions

BOLDRIN, CRISTIAN
2022/2023

Abstract

Triangle counting is one of the major graph problems with several applications in social network analysis, anomaly detection, link recommendation, query optimization, pattern classification, and other implementations in data mining and big data fields. One of the most popular computational models considered is edge streaming. However, as the size of the graph grows and involves millions of edges and nodes, several challenges arise, such as memory space limitations and long running time, and therefore sampling techniques are needed to estimate the number of triangles appearing in our graph. In this work we are going to implement a sampling algorithm using Waiting Room to store most recent edges, and making use of a Machine Learning oracle model that provides predictions about the heaviness of edges to output more accurate estimates, while still maintaining rigorous guarantees on the approximation of our result.
2022
Counting Triangles in Graph Streams using Machine Learning Predictions
Triangle counting is one of the major graph problems with several applications in social network analysis, anomaly detection, link recommendation, query optimization, pattern classification, and other implementations in data mining and big data fields. One of the most popular computational models considered is edge streaming. However, as the size of the graph grows and involves millions of edges and nodes, several challenges arise, such as memory space limitations and long running time, and therefore sampling techniques are needed to estimate the number of triangles appearing in our graph. In this work we are going to implement a sampling algorithm using Waiting Room to store most recent edges, and making use of a Machine Learning oracle model that provides predictions about the heaviness of edges to output more accurate estimates, while still maintaining rigorous guarantees on the approximation of our result.
Data Mining
Graphs
Big Data
Algorithms
Machine Learning
File in questo prodotto:
File Dimensione Formato  
Boldrin_Cristian.pdf

embargo fino al 08/03/2025

Dimensione 5.2 MB
Formato Adobe PDF
5.2 MB Adobe PDF

The text of this website © Università degli studi di Padova. Full Text are published under a non-exclusive license. Metadata are under a CC0 License

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.12608/50905