Leveraging Recursive Neural Networks on Dependency Trees for Online-Toxicity Detection on Twitter

PENZO, NICOLO'
2021/2022

Abstract

Current social dynamics are closely tied to what happens on social media. Opinions, emotions, and the way people perceive the world around them are strongly influenced by what they see or read on social platforms. Phenomena such as fake news, hate speech, propaganda, and racial and gender bias belong to this field. They are considered among the most significant threats to social stability and among the most effective means of influencing people. Researchers from different areas of Computer Science have done much work on them, in particular in Natural Language Processing, which focuses on textual information (articles, posts, comments, etc.), and in Network Analysis, which focuses on graph structures and node activity (detection of malicious spreaders, polarization, etc.). In this thesis, we first outline the main problems in this area of research, commonly known as Computational Social Science, and provide the theoretical basis of the most widely used tools. We then turn to the detection of toxic messages on Twitter at the level of the single tweet, comparing different Deep Learning models, including some novel solutions that we propose, to answer the following question: can natural language syntax be useful for this task? Unlike in, for instance, Sentiment Analysis, high performance has not yet been achieved on this task, mainly because the models typically used tend to focus on the words occurring in a sentence rather than on the meaning of the sentence itself. Our idea starts from the assumption that exploiting syntactic information can help overcome this obstacle. Finally, we report the results of our experiments with possible interpretations, offer scientific and ethical reflections, and argue for why research should invest effort in this topic and which future scenarios deserve attention.
2021
Dependency Tree
Toxic Language
Recursive NN
Files in this item:
Penzo_Nicolò.pdf (open access, 1.89 MB, Adobe PDF)

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/31568