Comparing primate’s ventral visual stream and the state-of-the-art deep convolutional neural networks for core object recognition

LADU, CHARLES WANI VICTOR
2022/2023

Abstract

Our ability to recognize and categorize objects in our surroundings is a critical component of cognition. Despite enormous variation in each object's appearance (due to changes in position, pose, scale, and illumination, and to the presence of visual clutter), primates are thought to distinguish objects quickly and easily from among tens of thousands of possibilities. The primate ventral visual stream is believed to support this view-invariant object recognition by untangling object identity manifolds. Convolutional Neural Networks (CNNs), inspired by the primate visual system, have also shown remarkable performance in object recognition tasks. This review explores and compares the mechanisms of object recognition in the primate ventral visual stream and in state-of-the-art deep CNNs. The research questions address the extent to which CNNs have approached human-level object recognition and how their performance compares with that of the primate ventral visual stream. The objectives are to provide an overview of the literature on the ventral visual stream and CNNs, to compare their mechanisms, and to identify their strengths and limitations for core object recognition. The review presents the structure of the ventral visual stream, its visual representations, and the process of untangling object manifolds, and then covers the architecture of CNNs. A comparison of the two visual systems shows that deep CNNs perform remarkably well in certain aspects of object recognition but remain limited in replicating the complexities of the primate visual system. Further research is needed to bridge the gap between computational models and the intricate neural mechanisms underlying human object recognition.
2022
visual system
DNNs
object recognition
IT cortex

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/47241