Evaluation of basic mathematical abilities of neural networks
HOU, KUINAN
2022/2023
Abstract
In human cognition, once advanced mathematical abilities reach a certain level, basic numerical skills such as number sense and elementary calculation are typically well developed. In this thesis we investigate whether state-of-the-art artificial neural network models exhibit a similar trend. Many studies have reported that large-scale language models (such as ChatGPT) possess exceptional high-level mathematical abilities, but their elementary numeracy skills have often been overlooked. This dissertation focuses on the foundational mathematical abilities of GPT-3.5 (from which ChatGPT was developed), its newest version GPT-4, and six other multi-modal deep learning models. Taking into account the unique characteristics of the different neural network models, standardized tests and self-developed tasks were employed to explore the mathematical abilities of these eight models. The findings indicate that GPT-3.5 and GPT-4 do exhibit complex mathematical competencies, though their basic numeracy skills are not always fully developed (especially in GPT-3.5). In contrast, the six multi-modal models still need to improve their numerosity perception and number sense before they can unlock more advanced mathematical abilities.

File | Size | Format |
---|---|---|
Hou Kuinan_thesis_pdfa.pdf (open access) | 17.48 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/52790