PRE-ATTENTIVE PROCESSING

Pre-attentive processing is the initial, automatic, and rapid detection of salient objects in the environment without focused attention. It typically refers to the information captured by the human eye in a 'single glance', which lasts roughly 200-250 milliseconds. Various neurological studies report that features such as hue, curvature, size, orientation, length, intensity, motion, and depth of field are detected by the human eye within this short time. These features help humans perform visual tasks such as target detection, boundary detection, and counting and estimation. Humans have evolved to discern patterns in the environment and use them to their advantage. ‘It represents a deep instinct, a drive, a need to impose order on the world so as to make it usable and survivable’ (Brigitte Jordan, 2013). In this paper, we first investigate the neurological attributes that contribute to the detection of pre-attentive features and then turn to the theories that explain why humans perceive their surroundings as patterns. Finally, we evaluate design aspects of the YouTube website, an American online social media platform used mainly for streaming and sharing videos.

Neurological attributes responsible for pre-attentive processing

Visual scenes contain a great deal of information, only a fraction of which is picked up by the eyes for further processing. The eyes make ‘rapid, ballistic movements’ (Purves D, Augustine GJ, Fitzpatrick D, et al., 2001) known as saccades, which change the point of fixation by directing the fovea (the area of highest visual acuity) onto new stimuli. Vision depends on the information picked up during the fixation pauses between saccades (Min Zhao et al., 2012), which usually last 200-400 ms. Once picked up, the information flows from the photoreceptors through the ganglion cells to the visual cortex.

Retina to Visual Cortex

The retinal output is efficiently condensed so that the information-carrying axons can pass through the anatomical bottleneck formed by the optic nerve along the route from the eye to the visual cortex (Jonathan J. Nassi & Edward M. Callaway, 2009). The information is then projected to the lateral geniculate nucleus (LGN), located in the posterior part of the thalamus. The LGN is a six-layered structure comprising neurons with concentric receptive fields (Chris I. Baker, 2012). The midget ganglion cells form the origin of the parvocellular pathway and convey red/green color-opponent signals to the upper four parvocellular (P) layers, which comprise smaller neurons. Parasol ganglion cells form the origin of the magnocellular pathway and convey broadband, achromatic signals to the deepest two magnocellular (M) layers, which comprise larger neurons. Finally, the bistratified ganglion cells form the koniocellular pathway, which conveys blue-on/yellow-off color-opponent signals to the thin koniocellular layers intercalated between the six primary layers (Jonathan J. Nassi & Edward M. Callaway, 2009; Chris I. Baker, 2012). The smaller P cells of the LGN are involved in the analysis of form and color, whereas the larger M cells mostly participate in the detection of motion and luminance (Ehud Kaplan, 2003). Ultimately, the axons of all the LGN neurons terminate in different layers or sub-layers of the primary visual cortex in the occipital lobe.

Visual Cortex

The visual cortex is the cortical region of the brain whose primary function is to process visual information (Trevor Huff et al., 2021). ‘Most visual information from the LGN passes through V1 before being processed further in extrastriate visual cortex’ (Jonathan J. Nassi & Edward M. Callaway, 2009). Neurons in V1, the primary visual cortex, are usually associated with the detection of ‘color, direction, and orientation’ (Chris I. Baker, 2012). V1 contains regions of colour-sensitive neurons known as blobs. The areas between the blobs are referred to as interblobs and are orientation sensitive (Stewart Shipp, 1995). The receptive fields of V1 are mainly classified as simple, complex, and hypercomplex. Simple cells respond primarily to the orientation of lines and edges, and complex cells respond to movement in a specific direction (Trevor Huff et al., 2021). Hypercomplex cells, on the other hand, respond to the orientation of corners and curves. Neurons with these response patterns act as line detectors, motion detectors, and angle detectors respectively (Trevor Huff et al., 2021). V2 receives its inputs from V1. Research shows that the neurons in this region respond to changes in color, complex patterns, spatial frequency, and orientation. The information from V2 then splits into the ventral and dorsal streams, which are concerned with object recognition and with spatial tasks / visual motor skills respectively.
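
The idea that a simple cell acts as a line detector tuned to one orientation can be illustrated with a toy filter. The Python sketch below is only an illustration and not a model taken from the cited sources: the 3x3 kernels, the `response` function, and the tiny binary images are all assumptions chosen for clarity.

```python
# Toy "simple cell": an oriented kernel that responds most strongly to a line
# at its preferred orientation. Kernels and images are illustrative assumptions.

VERTICAL_KERNEL = [
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]

HORIZONTAL_KERNEL = [
    [-1, -1, -1],
    [2, 2, 2],
    [-1, -1, -1],
]


def response(patch, kernel):
    """Rectified weighted sum of a 3x3 image patch under a 3x3 kernel."""
    total = sum(patch[r][c] * kernel[r][c] for r in range(3) for c in range(3))
    return max(0, total)  # a firing rate cannot be negative


vertical_line = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
horizontal_line = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]

if __name__ == "__main__":
    # The vertically tuned unit fires for the vertical line only.
    print(response(vertical_line, VERTICAL_KERNEL))
    print(response(horizontal_line, VERTICAL_KERNEL))
```

In this sketch each kernel responds to a line at its preferred orientation and stays silent for the orthogonal one, which is the sense in which such a unit can be called an orientation-selective line detector.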

Theories of pre-attentive processing

Feature Integration Theory

Treisman & Gelade first proposed the Feature Integration Theory (FIT) of attention in 1980 (Jeremy M. Wolfe, 2020). The theory suggests that ‘features are registered early, automatically, and in parallel across the visual field, while objects are identified separately and only at a later stage, which requires focused attention’ (Treisman & Gelade, 1980). It further suggests that the ‘visual scene is initially coded along a number of separable dimensions, such as color, orientation, spatial frequency, brightness, direction of movement’ (Treisman & Gelade, 1980). Healey & Enns (2012) describe Treisman’s feature integration model of early vision, in which each feature map registers the activity of only one specific visual feature and the maps are encoded in parallel. However, the feature maps do not provide any details about location or spatial arrangement. Thus, focused attention is required to ‘glue’ the ‘initially separable features into unitary objects’ (Treisman & Gelade, 1980). Healey & Enns further explain that a target with a unique feature can be detected simply by checking the relevant feature map for activity, and since feature maps are encoded in parallel, feature detection is effectively instantaneous. Conjunction targets, however, can only be detected by searching serially through the master map, which requires focused attention. This model thus provides a general account of the pre-attentive process.
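
The contrast between feature search and conjunction search can be made concrete with a small sketch. The Python code below is a simplified illustration under our own assumptions, not an implementation from Treisman & Gelade or Healey & Enns: items are reduced to (color, orientation) pairs, the function names are hypothetical, and the returned "steps" value is only a stand-in for search effort.

```python
# Toy contrast between feature search and conjunction search.
# Items are (color, orientation) pairs; "steps" is a stand-in for search effort.


def build_feature_maps(items):
    """Register each feature value in its own map.

    The loop stands in for what the theory treats as parallel registration.
    """
    color_map = {}
    orientation_map = {}
    for index, (color, orientation) in enumerate(items):
        color_map.setdefault(color, []).append(index)
        orientation_map.setdefault(orientation, []).append(index)
    return color_map, orientation_map


def feature_search(items, color):
    """Target defined by a unique color: one check of the color map."""
    color_map, _ = build_feature_maps(items)
    present = color in color_map
    return present, 1  # a single check, whatever the display size


def conjunction_search(items, color, orientation):
    """Target defined by color AND orientation: inspect items one by one."""
    for steps, item in enumerate(items, start=1):
        if item == (color, orientation):
            return True, steps
    return False, len(items)


if __name__ == "__main__":
    popout_display = [("green", "vertical")] * 5 + [("red", "vertical")]
    conjunction_display = [
        ("green", "vertical"),
        ("red", "horizontal"),
        ("green", "vertical"),
        ("red", "horizontal"),
        ("red", "vertical"),
    ]
    print(feature_search(popout_display, "red"))
    print(conjunction_search(conjunction_display, "red", "vertical"))
```

In the first display the red target is found with a single check regardless of how many green items surround it, whereas in the second display no single feature map isolates the red vertical target, so the sketch has to inspect the items in turn.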

Grouping: Pattern Perception Principles

Multiple studies have provided evidence that perceptual grouping occurs pre-attentively and does not require attention (Pedro R. Montoro et al., 2014). Gestalt psychologists in the early twentieth century identified a set of principles, or factors, of perceptual grouping (Joseph L Brooks, 2014). The Gestalt principles describe the grouping of elements based on proximity, similarity, closure, good continuation, common region, and connectedness (Irvin Rock & Stephen Palmer, 1990). Proximity is one of the most important grouping principles and can override competing visual cues such as similarity of color or shape (Aurora Harley, 2020). Moreover, ‘phenomenological observation tells us that the strength of the proximity principle decreases with distance’ (Michael Kubovy et al., 1998). Aurora Harley also states that elements that share visual traits such as color, shape, or size are perceived to be related. Correspondingly, the ‘strength of the similarity principle decreases with dissimilarity’ (Michael Kubovy et al., 1998). Similarly, elements that appear within a closed boundary, lie on a line or curve, or seem connected by uniform visual traits are perceived to be related. Recent studies by Rosenthal & Humphreys and by Wang, Weng, & He have reported evidence of pre-attentive processing of other perceptual attributes, namely ‘contour integration’ and ‘illusory contour completion’ respectively. Rosenthal and Humphreys also found that the integration of Gabor elements resulted in unconscious learning of global contours. These studies are compelling examples of the unconscious detection of distinct elements to form patterns that differ from the sum of their parts (Pedro R. Montoro et al., 2014).

Guided Search Theory

The guided search theory by Wolfe hypothesized the construction of an activation map based on both bottom-up and top-down information during visual search (Christopher G. Healey & James T. Enns, 2012). ‘Activation map is the signal that will guide the deployment of attention. For each item in a display, the guiding activation is simply a weighted sum of the bottom-up activation and the activity in each channel (composed of the top-down activation) plus some noise’ (Jeremy M. Wolfe, 2007). ‘Bottom-up activation follows feature categorization. It measures how different an element is from its neighbors’ (Christopher G. Healey & James T. Enns, 2012). The more an item differs from its surroundings, the more it pops out, and this happens because of local contrast (Jeremy M. Wolfe, 2007).
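
The weighted-sum description of the activation map can be sketched in a few lines of Python. This is a deliberately simplified illustration and not Wolfe's Guided Search 4.0 model: the one-dimensional display, the particular `bottom_up` and `top_down` formulas, the weights, and the Gaussian noise term are all assumptions made for the example, and the feature values are arbitrary numbers.

```python
import random


def bottom_up(values, index):
    """Local contrast: mean absolute difference between an item and its neighbours."""
    neighbours = [values[j] for j in (index - 1, index + 1) if 0 <= j < len(values)]
    if not neighbours:
        return 0.0
    return sum(abs(values[index] - n) for n in neighbours) / len(neighbours)


def top_down(values, index, target_value):
    """Match to the searched-for feature value: 1.0 for a perfect match."""
    return 1.0 / (1.0 + abs(values[index] - target_value))


def activation_map(values, target_value, w_bottom_up=1.0, w_top_down=1.0,
                   noise_sd=0.0, seed=0):
    """Weighted sum of bottom-up and top-down signals plus Gaussian noise."""
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    return [
        w_bottom_up * bottom_up(values, i)
        + w_top_down * top_down(values, i, target_value)
        + rng.gauss(0.0, noise_sd)
        for i in range(len(values))
    ]


if __name__ == "__main__":
    orientations = [0, 0, 0, 90, 0, 0]  # one item differs from its neighbours
    activation = activation_map(orientations, target_value=90)
    # Attention would be deployed first to the item with the highest activation.
    print(activation.index(max(activation)))
```

With the noise switched off, the odd item receives the highest activation because it both differs most from its neighbours and matches the searched-for value. Raising `noise_sd` makes the ranking less reliable, which is one way to think about why guidance is imperfect.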

Design Review

YouTube is an online platform used for sharing and streaming videos. The image below shows the YouTube homepage, which is designed to give an overview of all the available features. Since YouTube is mainly a video streaming platform, it gives users access to a large range of videos to choose from.

YouTube Homepage

Similarity

According to the similarity principle, elements that share traits such as color, shape, or size are perceived to be related. YouTube uses the same shape (a rectangle with rounded corners) for the various categories in the subheader to strengthen their perception as a group. It also uses the same color for these shapes to indicate their connection. Black and grey fills are used for the rounded rectangles to create a sense of dissimilarity, thereby indicating a difference within the group. In addition, the use of the same thumbnail size for all videos enables the eye to pre-attentively perceive them as similar.

Proximity

On the YouTube home page, white space is used efficiently to create distinct blocks for each item. The information related to each video, such as the channel image, title, channel name, views, and time, is placed close to the corresponding thumbnail image, indicating that these elements are related. The human eye perceives these elements as a single entity because of their spatial proximity. Likewise, the information in the menu on the left is perceived as three groups because of differences in the spacing between elements. Items that appear closer to each other are perceived to belong to the same group.
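
The way spacing alone can produce three perceived groups in a vertical menu can be sketched as a simple gap-based grouping. The Python code below is an illustration only: the `group_by_proximity` function, the `max_gap` threshold, and the vertical pixel positions are hypothetical values chosen for the example and were not measured from the YouTube page.

```python
def group_by_proximity(positions, max_gap):
    """Split 1-D positions into groups wherever the gap exceeds max_gap."""
    groups = []
    for position in sorted(positions):
        if groups and position - groups[-1][-1] <= max_gap:
            groups[-1].append(position)  # close enough: same group
        else:
            groups.append([position])  # large gap: start a new group
    return groups


if __name__ == "__main__":
    # Hypothetical vertical positions (in pixels) of eight menu items.
    menu_item_y = [10, 40, 70, 130, 160, 190, 250, 280]
    for group in group_by_proximity(menu_item_y, max_gap=40):
        print(group)
```

Items separated by the small gap end up in the same group, while each larger gap starts a new one, so the eight hypothetical items fall into three groups without any reference to color or shape.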

Connectedness

YouTube uses a white background for the various options available, such as categories, account, and the menu. However, it uses a grey background for displaying video-related information. Because different background colors are used, the elements that share a background are perceived to be related.

Common Region

In the menu section, grey lines are used to create closed regions. Elements that fall within the same boundary are perceived to be related and to share similar functionality. Thus, the information displayed in the left-hand menu is perceived as three groups.

Symmetry

The different video tiles are displayed proportionally with even spacing between each tile. This creates a sense of alignment which further conveys order and clarity. These elements are perceived as a group due to the principle of symmetry. Also, the placement of the categories on the subheader of the screen creates a sense of togetherness due to the alignment of these categories.

Continuity

YouTube uses a down arrow to reveal more options in the left-hand menu, creating a sense of continuity and indicating that the elements below are related to the ones already displayed. Similarly, in the videos section, YouTube displays a small part of the thumbnail images at the bottom of the screen, indicating that the videos below are related. As a result, all of these elements are perceived as a group.

Conclusion

The findings from the neurological and psychophysical studies suggest that visual information is perceived pre-attentively along a number of distinct attributes such as color, orientation, size, luminance, and motion. Like attributes are then processed in parallel through the visual channels and mapped together, based on the principles of proximity, similarity, closure, connectedness, continuity, and common region, to form a single object.

References

Baker, C. I. (2012). Visual processing in the primate brain. Handbook of Psychology, Second Edition. https://doi.org/10.1002/9781118133880.hop203004
Brooks, J. L. (2014). Traditional and new principles of Perceptual Grouping. Oxford Handbooks Online. https://doi.org/10.1093/oxfordhb/9780199686858.013.060
Covington, B. P., & Al Khalili, Y. (2021, July 31). Neuroanatomy, nucleus lateral geniculate. In StatPearls. StatPearls Publishing. https://www.ncbi.nlm.nih.gov/books/NBK541137/
Harley, A. (2020, August 2). Proximity principle in visual design. Nielsen Norman Group. Retrieved October 31, 2021, from https://www.nngroup.com/articles/gestalt-proximity/
Healey, C. G., & Enns, J. T. (2012). Attention and visual memory in visualization and Computer Graphics. IEEE Transactions on Visualization and Computer Graphics, 18(7), 1170–1188. https://doi.org/10.1109/tvcg.2011.127
Huff, T., Mahabadi, N., & Tadi, P. (2021, July 31). Neuroanatomy, visual cortex. In StatPearls. StatPearls Publishing. https://www.ncbi.nlm.nih.gov/books/NBK482504/
Jordan, B. (2013). Advancing ethnography in corporate environments: Challenges and emerging opportunities. Left Coast Press, Inc.
Kaplan, E. (2003). The M, P and K pathways in the Primate Visual System.
Kubovy, M., Holcombe, A. O., & Wagemans, J. (1998). On the lawfulness of grouping by proximity. Cognitive Psychology, 35(1), 71–98. https://doi.org/10.1006/cogp.1997.0673
Montoro, P. R., Luna, D., & Ortells, J. J. (2014). Subliminal gestalt grouping: Evidence of perceptual grouping by proximity and similarity in absence of Conscious Perception. Consciousness and Cognition, 25, 1–8. https://doi.org/10.1016/j.concog.2014.01.004
Nassi, J. J., & Callaway, E. M. (2009). Parallel Processing Strategies of the primate visual system. Nature Reviews Neuroscience, 10(5), 360–372. https://doi.org/10.1038/nrn2619
Purves, D., Augustine, G. J., Fitzpatrick, D., et al. (Eds.). (2001). Types of eye movements and their functions. In Neuroscience (2nd ed.). Sinauer Associates. https://www.ncbi.nlm.nih.gov/books/NBK10991/
Rock, I., & Palmer, S. (1990). The Legacy of Gestalt psychology. Scientific American, 263(6), 84–90. https://doi.org/10.1038/scientificamerican1290-84
Shipp, S. (1995). Visual processing: The odd couple. Current Biology, 5(2), 116–119. https://doi.org/10.1016/s0960-9822(95)00029-7
Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12(1), 97–136. https://doi.org/10.1016/0010-0285(80)90005-5
Wolfe, J. M. (2007). Guided Search 4.0: Current progress with a model of visual search. In W. Gray (Ed.), Integrated Models of Cognitive Systems (pp. 99-119). New York: Oxford.
Wolfe, J. M. (2020). Forty Years after feature integration theory: An introduction to the special issue in honor of the contributions of Anne Treisman. Attention, Perception, & Psychophysics, 82(1), 1–6. https://doi.org/10.3758/s13414-019-01966-3
Zhao, M., Gersch, T. M., Schnitzer, B. S., Dosher, B. A., & Kowler, E. (2012). Eye movements and attention: The role of pre-saccadic shifts of attention in perception, memory and the control of saccades. Vision Research, 74, 40–60. https://doi.org/10.1016/j.visres.2012.06.017