Evaluation of basic mathematical abilities of neural networks
HOU, KUINAN
2022/2023
Abstract
In human cognition, when advanced mathematical abilities reach a certain level, basic numerical skills, such as number sense and elementary calculation, are typically well-developed. In this thesis we investigate whether state-of-the-art artificial neural network models exhibit a similar trend. Indeed, much research has pointed out that large-scale language models (such as ChatGPT) possess exceptional high-level mathematical abilities, but their elementary numeracy skills have often been overlooked. This dissertation focuses on the foundational mathematical abilities of GPT-3.5 (from which ChatGPT was developed), its newest version, GPT-4, and six other multimodal deep learning models. Taking into account the unique characteristics of the different neural network models, standardized tests and self-developed tasks were employed to explore the mathematical abilities of these eight models. The findings indicate that GPT-3.5 and GPT-4 do exhibit complex mathematical competencies, though their basic numeracy skills are not always fully developed (especially in GPT-3.5). In contrast, the six multimodal models still need to improve their numerosity perception and number sense before they can unlock more advanced mathematical abilities.

Hou Kuinan_thesis_pdfa.pdf (open access, 17.48 MB, Adobe PDF)
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/52790