APCV 2024: ASIA PACIFIC CONFERENCE ON VISION 2024
PROGRAM FOR THURSDAY, JULY 11TH


08:45-09:45 Session 10: Keynote 3: Ning Qian
08:45
Receptive-field remapping and space representation across saccades

ABSTRACT. The nature and function of perisaccadic receptive-field (RF) remapping have been controversial. We used a delayed saccade task to reduce previous confounds and examined the remapping time course in areas LIP and FEF. In the delay period, the RF shift direction turned from the initial fixation to the saccade target. In the perisaccadic period, RFs first shifted toward the target (convergent remapping) but around the time of saccade onset/offset, the shifts became predominantly toward the post-saccadic RF locations (forward remapping). Thus, unlike forward remapping that depends on the corollary discharge (CD) of the saccade command, convergent remapping appeared to follow attention from the initial fixation to the target. We modelled the data with attention-modulated and CD-gated connections, and showed that both sets of connections emerged automatically in neural networks trained to update stimulus retinal locations across saccades. The model also explained the translational component of perisaccadic perceptual mislocalization of flashed stimuli. Our work thus integrates seemingly contradictory findings into a computational mechanism for transsaccadic space representation. (Joint work with Mingsha Zhang and Mickey Goldberg's labs.)

10:00-11:30 Session 11A: Talks - Imagery and consciousness
10:00
Perceived duration of motion aftereffects is longer for individuals with more vivid visual imagery
PRESENTER: Alan L. F. Lee

ABSTRACT. Visual imagery is the ability to generate mental images in one’s “mind’s eye”. By studying how visual imagery ability is related to certain aspects of visual perception, we can better understand the processes underlying visual imagery. One such aspect that has been overlooked in the literature is the motion aftereffect (MAE).

In the present study, we compared the perceived MAE duration in individuals with high and low levels of self-reported vividness in visual imagery. Using an online questionnaire (VVIQ2, Marks, 1995), we obtained VVIQ scores from ~160 valid participants, from which we selected 36 participants: 17 in the high-VVIQ group (M = 122) and 19 in the low-VVIQ group (M = 94.2). On each trial, participants viewed, as the adapting stimulus, a random-dot kinematogram (RDK) (200 black and white dots; speed = 1.6 deg/s) for 50 seconds. The adapting RDK was presented to either the left or right eye only, with the order randomized across trials. After a one-second blank screen, we presented the last frame of the adapting RDK for 25 seconds as the test, either to the adapted eye or the non-adapted eye. During the test, participants pressed one of two arrow keys to indicate the perceived MAE direction. They were told to hold the key down for as long as the MAE was perceptible.

We found that perceived MAE duration was significantly longer in the high-VVIQ group (M = 9.86s, SD = 6.74s) than in the low-VVIQ group (M = 4.45s, SD = 4.12s; p = .008, Cohen's d = 0.97). The group difference was weaker, though still significant, when the MAE was tested in the non-adapted eye (Cohen's d = 0.82) than when tested in the adapted eye (Cohen's d = 1.06). Our results suggest that motion perception is closely related to visual imagery ability.
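
As a quick check, the reported effect size can be reproduced from the group statistics above with the standard pooled-SD formula for Cohen's d (a minimal sketch; it assumes the authors used this definition):

```python
import math

# Group statistics reported above (high- vs. low-VVIQ, adapted eye)
m1, sd1, n1 = 9.86, 6.74, 17   # high-VVIQ group
m2, sd2, n2 = 4.45, 4.12, 19   # low-VVIQ group

# Pooled standard deviation, then Cohen's d
sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
d = (m1 - m2) / sp
print(f"Cohen's d = {d:.2f}")  # ~0.98; matches the reported 0.97 up to rounding
```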

10:15
Neural Correlates of Unconscious Prior Experience Utilization in Disambiguating Ambiguous Stimuli
PRESENTER: Po-Jang Hsieh

ABSTRACT. Disambiguation, also known as one-shot learning, is crucial for human evolution, as it allows adaptation to ambiguous threats with limited exposure. Previous studies have shown how prior information affects the conscious recognition of ambiguous stimuli, but it remains unclear whether prior experience can be automatically or unconsciously applied to disambiguate stimuli. This research used fMRI to monitor brain activity during the Mooney Images Paradigm under discontinuous flash suppression. The findings replicated effects observed under conscious viewing and revealed neural disambiguation even in unconscious conditions, particularly in occipital and temporal regions such as V1, V2, FG, IT, and MT. This suggests that the brain automatically applies prior experience even without conscious perception, indicating a disparity between neural and conscious recognition.

10:30
Dissociable effects of attentional modulation and perceptual suppression in V1
PRESENTER: Chen Liu

ABSTRACT. Although attentional modulation in the primary visual cortex (V1) is well established, the effect of perceptual suppression in V1 remains controversial. To address this issue, and to better understand the relationship between attention and consciousness, we employed 0.8-mm isotropic resolution CBV and BOLD fMRI at 7 Tesla to investigate the effects of attentional modulation and perceptual suppression in human V1.

A 2 x 2 factorial design was used to control attention and awareness independently. Red Mondrian masks and green gratings were presented in alternating frames at adjacent but non-overlapping locations, either in the same eye (visible) or in different eyes (invisible). Subjects either attended to and reported the visibility of the green grating (attended), or performed a letter detection task at fixation (unattended). Stimuli were presented in 30-s blocks, alternating with 18-s fixation periods. An attentional cue was presented 2 s before each stimulus block.

In all four conditions, fMRI time courses showed a transient response followed by a sustained plateau. The effects of attentional modulation and perceptual suppression on the transient response showed a double dissociation: the effects of perceptual suppression were identical between attended and unattended conditions, while the effects of attentional modulation were identical between visible and invisible conditions. These findings demonstrate the independent effects of attentional modulation and perceptual suppression on V1 activity, which may reconcile the discrepancies in previous studies (Watanabe 2011, Science; Yuval-Greenberg 2013, J Neurosci).

10:45
Predicting the modality and intensity of imagined sensations from EEG measures of oscillatory brain activity
PRESENTER: Derek Arnold

ABSTRACT. Most people can experience imagined images and sounds. There are, however, large individual differences, with some people reporting that they cannot have imagined audio or visual sensations (aphants), while others report unusually intense imagined sensations (hyper-phantasics). These individual differences have been linked to activity in sensory brain regions, driven by feedback. We therefore predicted that imagined sensations should be associated with distinct frequencies of oscillatory brain activity, as these can provide a signature of interactions within and across regions of the brain. We have found that we can decode the modality of imagined sensations with a high success rate (~75%), live while people participate in an experiment, and provide neurofeedback on this to motivate participants. Moreover, replicating many past studies, we have found that the act of engaging in audio or visual imagery is linked to reductions in oscillatory power, with prominent peaks in the alpha band (8–12 Hz). This, however, did not predict individual differences in the subjective intensity of these imagined sensations. For audio imagery, these were instead predicted by reductions in the theta (6–9 Hz) and gamma (33–38 Hz) bands, and by increases in the beta (15–17 Hz) band. Visual imagery intensity was predicted by reductions in the lower (14–16 Hz) and upper (29–32 Hz) beta bands, and by an increase in mid-beta (24–26 Hz) activity. Our data suggest there is sufficient ground truth to the subjective reports people use to describe their imagined sensations, such that these are linked to the power of distinct rhythms of brain activity.
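
The abstract does not describe the decoding pipeline; a minimal sketch of one common approach (band-limited power features fed to a linear classifier) is given below. The synthetic data, sampling rate, and classifier settings are illustrative; only the band edges are taken from the abstract:

```python
import numpy as np
from scipy.signal import welch
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
fs = 250                                          # sampling rate (Hz), illustrative
epochs = rng.standard_normal((200, 32, fs * 2))   # trials x channels x samples
labels = rng.integers(0, 2, 200)                  # 0 = audio imagery, 1 = visual imagery

bands = {"theta": (6, 9), "alpha": (8, 12), "beta": (15, 30), "gamma": (33, 38)}

def band_power_features(x):
    """Log mean PSD per channel within each frequency band."""
    f, psd = welch(x, fs=fs, nperseg=fs)          # psd: channels x frequencies
    feats = [np.log(psd[:, (f >= lo) & (f <= hi)].mean(axis=1))
             for lo, hi in bands.values()]
    return np.concatenate(feats)

X = np.array([band_power_features(ep) for ep in epochs])
clf = SVC(kernel="linear")
print(cross_val_score(clf, X, labels, cv=5).mean())  # ~chance on random data
```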

11:00
The mystery of continuous flash suppression: A two-photon calcium imaging study in macaque V1

ABSTRACT. Continuous flash suppression (CFS) has been widely used to study subconscious visual processing: the perception of a high-contrast stimulus presented to one eye is suppressed by flashing Mondrian noise presented to the other eye. It remains elusive whether and how the responses of V1 neurons, most of which receive binocular inputs, are affected by CFS. To address this issue, we employed two-photon calcium imaging to record the responses of thousands of superficial-layer V1 neurons to a target under CFS in two awake, fixating macaques. The target was a circular-windowed square-wave grating (d=1°, SF=3/6 cpd, contrast=0.45, drifting speed=4°/s). The flashing stimulus was a circular Mondrian noise pattern (d=1.89°, contrast=0.50, TF=10 Hz). The stimuli were presented for 1000 ms with 1500-ms intervals. The square-wave grating at various orientations was first presented alone to either eye to identify orientation-tuned V1 neurons and estimate their respective ocular dominance indices (ODI). Then the grating target and the flashing noise were presented dichoptically to measure neuronal responses under CFS. The results show that the flashing noise almost completely wiped out the orientation responses of neurons preferring the noise eye or both eyes, in that the population orientation tuning functions were suppressed, with unmeasurable or very wide bandwidths. The flashing noise also significantly suppressed the orientation responses of neurons preferring the grating eye, but to a lesser degree, and the tuning bandwidths were still measurable, increasing from 11–13° to 19–21° (half-width at half-height). The neuronal responses under CFS can be fitted with a gain control model, in which the flashing noise produces ODI-dependent and partially orientation-tuned intra- and inter-ocular inhibition. Consequently, the high-contrast stimulus evidence may not be rendered subconscious by flashing noise, as many studies have assumed. Instead, it is severely compromised under CFS and likely unable to enter consciousness.
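
The abstract does not give the model's equations; one plausible divisive-gain-control form consistent with the description (the ODI-dependent weight and the untuned fraction below are illustrative assumptions, not the authors' parameterization) is:

```latex
R(\theta) = \frac{c_g\, f(\theta)}
                 {\sigma + c_g + w(\mathrm{ODI})\,\big[\alpha + (1-\alpha)\, g(\theta)\big]\, c_n}
```

Here f(θ) is the neuron's orientation tuning, c_g and c_n are the grating and noise contrasts, w(ODI) scales the inhibition by ocular dominance, g(θ) is the tuned part of the suppression, and α sets how orientation-untuned the suppression is.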

11:15
Veridical and consciously perceived information interact in guiding behavior
PRESENTER: Marjan Persuh

ABSTRACT. A compelling proposition suggests that our visual systems distinguish between information for perception and information for action, yet consensus remains elusive despite numerous studies using visual illusions in which object properties diverge between the veridical and the perceived. In response priming, evidence hints that only the physical attributes of prime stimuli govern motor responses. Across three experiments, we investigated the interplay of physical and consciously perceived locations in response priming, leveraging the well-known flash-lag illusion, in which a briefly flashed disk and moving bars at the same physical location are perceived as displaced from one another. Participants rapidly responded to the target disk's location, presented above or below static bars. In the first experiment, we held the physical location of the prime disk constant so that the disk and the moving bars appeared at the same spot. Responses to the target disk consistently showed a bias influenced by the prime disk, indicating that the illusory perception of the prime location primed rapid motor responses. In the second experiment, we inverted the physical and perceived locations of the prime. After estimating the illusion size for each participant, we presented the prime disk either above or below the moving bars, aligning the perceived location with the moving bars. Motor responses were moderated by the physical location of the disk, revealing that the visuomotor system utilized the veridical prime location. In the third experiment, we juxtaposed physical and perceived locations, situating them on opposite sides of the moving bars. Under this arrangement, motor responses remained unaffected by primes. Our experiments underscore that visuomotor systems integrate both sources of information, veridical and consciously perceived location, to guide behavior.

10:00-11:30 Session 11B: Symposium - The impact of recent technologies on studies of multisensory integration

Multisensory integration is one of the key functions for obtaining stable visual and non-visual perception in daily life. However, comprehensively understanding how our brain integrates different types of modal information remains a challenging problem. How does our visual system extract meaningful visual information from retinal images and integrate it with information from other sensory modalities? Recent technologies, such as virtual reality (VR) and augmented reality (AR), can provide scalable, interactive, and immersive environments for testing the effects of external stimulation on our subjective experiences. What do these technologies bring to our research? We invite world-leading scientists in human perception and performance to discuss the psychological, physiological, and computational foundations of multisensory integration, and methodologies that provide insight into how non-visual sensory information enhances our visual experiences of the world.

Organizers: Hiroaki Kiyokawa (Saitama University) and Juno Kim (University of New South Wales)

10:00
Forward and backward steps in virtual reality affect facial expression recognition

ABSTRACT. Using a head-mounted display (HMD) to present visual stimuli reduces the physical constraints on participants during an experiment. This makes it easier to verify the effects of body movements on facial expression recognition, which have been challenging to measure in the past. Here, we show that steps forward or backward (approach-avoidance behaviors) influence facial expression recognition in virtual reality. A 3D face model with varying facial expressions, ranging from happy to angry, was presented in a virtual space. Participants wore an HMD and were asked to perform one of the following four actions in front of the face stimulus: the participant 1) approaches (steps toward) or 2) avoids (steps away from) the 3D face stimulus, or 3) is approached or 4) is avoided by the 3D stimulus. Immediately after the action, the participants judged whether the facial expression was happy or angry. As a result, participants perceived the face as angrier when they avoided the model than when the model avoided them. This suggests that avoidance enhanced the recognition of anger, the reverse of the causal relationship "anger promotes avoidance" indicated by previous studies. Our findings imply that observers' body movements modulate the recognition of facial expressions, helping them interact with the outside world effectively.

10:30
Multimodal information for virtual walking

ABSTRACT. Walking is a fundamental human activity that contributes to physical and mental health. Walking, however, requires moving the legs and arms, which excludes people who cannot do so. It would be very useful and valuable if we could provide a walking experience for people with physical disabilities. We are therefore developing a virtual reality system that provides an illusory walking experience for a sitting or lying person without physical movement. The system consists of a head-mounted display and four vibrators. The vibrators applied vibrations to the feet to simulate the feet landing on the ground. Seated observers experienced a virtual walking illusion through a combination of optic flow and foot vibrations that were rhythmic and synchronized with a virtual walker. The virtual walking experience was enhanced by observing a walking avatar from a first-person perspective and, in mirrors within the virtual environment, from a third-person perspective. Even for supine observers, the combination of optic flow and foot vibrations elicited a virtual walking experience. Scene-congruent vibration patterns enhanced ground material perception and the virtual walking experience. We then further developed the system with real-time synthesis of a walking avatar and its shadow on a 360-degree video with foot vibrations. The shadow of the avatar induces an illusory presence of its body. In the future, we expect to provide an immersive experience for any recorded or live-streaming medium with a virtual embodiment.

11:00
Can we measure sensory conflict during virtual reality? And if we can, then what can we do with this information?

ABSTRACT. Over the last decade, considerable efforts have been made to reduce the cybersickness produced by head-mounted display (HMD) based virtual reality (VR). Unfortunately, this sickness continues to hinder the widespread uptake of this revolutionary technology. Our approach to tackling this critically important problem is rather different to that adopted by most other researchers. Rather than trialling one of the many interventions proposed to reduce this sickness, our focus has instead been on identifying the specific patterns of multisensory stimulation that are actually responsible for inducing it. Our studies therefore attempt to objectively estimate the sensory conflict produced by different VR applications and head-movement tasks (note: these conditions are often deliberately designed to make people sick). In this talk, I will show how differences between the HMD user's virtual and physical head pose (DVP) can be used to predict both the onset and severity of their experiences of cybersickness. Having identified the specific patterns of DVP that reliably precede the imminent onset of cybersickness, we are now focusing on using this information to develop new ways of mitigating and preventing this sickness.
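
As an illustration only (the talk does not specify the computation), one simple way to quantify DVP per rendered frame is the angular difference between the virtual and physical head orientations, e.g. represented as quaternions:

```python
import numpy as np

def quat_conj(q):
    """Conjugate of quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    return np.array([w, -x, -y, -z])

def quat_mul(a, b):
    """Hamilton product of quaternions a and b."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw*bw - ax*bx - ay*by - az*bz,
        aw*bx + ax*bw + ay*bz - az*by,
        aw*by - ax*bz + ay*bw + az*bx,
        aw*bz + ax*by - ay*bx + az*bw,
    ])

def dvp_angle(q_virtual, q_physical):
    """Angle (radians) between virtual and physical head pose."""
    q_rel = quat_mul(q_virtual, quat_conj(q_physical))
    return 2.0 * np.arccos(np.clip(abs(q_rel[0]), 0.0, 1.0))
```

A per-frame time series of such angles could then be related to the onset and severity of sickness reports.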

11:30-12:30 Session 12A: Posters - Color, orientation, and form (AM presenters)
[1] Brain compensates for ambiguity in extreme-peripheral colors

ABSTRACT. Sensory processing in the extreme periphery of the visual field (>60 deg of visual angle) should be of central interest to those interested in neural sensory flexibility, where the brain compensates for ambiguous or noisy sensory signals by recruiting more reliable signals (the "brain compensation hypothesis"). Our past studies demonstrated that, in the extreme periphery: 1) sounds either improve or degrade visual detection performance, depending on the conditions (Suegami, et al. ECVP '18, JoV. '19, APCV '19), 2) central color signals capture the ambiguous color (Shehata, et al., Shirai, et al., JoV. '19), and 3) (invisible) actions capture the location/motion of an ambiguous visual flicker (JoV, '24). Here, we report new findings focusing mostly on color x location confusions. We presented a static lattice of colors (patches lined up next to each other) at various sizes and eccentricities (all the way to the extreme periphery), asking participants to report the colors and their relative locations. A staircase was employed to measure the critical threshold size that induced color/location errors at each eccentricity (see the sketch below). We observed several novel illusions/effects, with large individual differences: 1) flashing of color(s); 2) failure to report color(s); 3) mixtures of the presented colors; 4) lattice orientation (H/V) reported wrongly; 5) relative locations of color patches mixed up; 6) as eccentricity increased, these mistakes occurred abruptly and were then consistently observed out to the extreme periphery; 7) the illusions/errors tended to be consistent within a participant but varied greatly across participants. The critical size for the errors measured by the staircase showed a slope (larger sizes at more extreme periphery) that was qualitatively consistent with the known cortical magnification factors. Still noteworthy are the highly nonlinear nature of the illusion/error onset, and the fact that the illusions involved vivid color experiences reported with high confidence.
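
The staircase rule is not specified in the abstract; a generic 1-up/2-down version on patch size might look like the following (`trial_has_error` is a hypothetical callback that runs one lattice trial at the given size and reports whether a color/location error occurred):

```python
def run_staircase(trial_has_error, start_size=4.0, step=0.25, n_reversals=8):
    """1-up/2-down staircase on patch size (deg): shrink the lattice after two
    consecutive error-free trials, grow it after any error trial.
    Converges near the 70.7%-correct size threshold."""
    size = start_size
    correct_streak = 0
    last_dir, reversals = 0, []
    while len(reversals) < n_reversals:
        if trial_has_error(size):            # participant mislocated a color
            size += step                     # make the lattice easier (bigger)
            correct_streak = 0
            new_dir = +1
        else:
            correct_streak += 1
            if correct_streak < 2:
                continue
            size = max(step, size - step)    # make it harder (smaller)
            correct_streak = 0
            new_dir = -1
        if last_dir and new_dir != last_dir: # direction change = reversal
            reversals.append(size)
        last_dir = new_dir
    return sum(reversals) / len(reversals)   # threshold estimate
```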

[3] Does vertical size disparity affect binocular correspondence?

ABSTRACT. Binocular disparities are processed at different scales following the coarse-to-fine interaction. Interactions between global and local stereopsis have been investigated, but it remains unclear how disparity processing at the global scale affects disparity detection, or binocular correspondence, at the most basic level. We focused on a specific condition in which a global uniform vertical size disparity was introduced between the left and right eye's images, and measured the corresponding retinal points (the pairs of locations on the two retinas that have no perceptual positional difference), which should be processed at the local scale. Two monocular test dots flashed to each eye sequentially were used, and a minimum-apparent-motion criterion was applied to determine the location of corresponding points. Observers were asked to adjust the separation of the two test dots horizontally and vertically until no apparent motion was perceived, while a rectangular random-dot stereogram (RDS) with vertical size disparity was presented in a large visual field. The test dots were presented at 24 different visual locations under three conditions of vertical disparity, in which the vertical size ratio (VSR) of the RDS patterns was 1.06, 1, or 0.943. The horizontal size ratio of the RDS patterns was adjusted correspondingly to ensure that no surface slant was perceived. Results showed that the locations of corresponding points in the left eye shifted away from the horizontal meridian when the vertical size of the left eye's image was larger (VSR > 1). The vertical separation between the corresponding points seems to vary in the same direction as the global vertical size disparity. However, corresponding points seemed not to shift when the right eye's image was larger (VSR < 1). Although an asymmetry between the left and right fields was observed, our results suggest that vertical size disparity of global patterns may affect binocular correspondence at the local stage.

[5] Influence of lighting distribution and direction on the impression of Japanese pottery

ABSTRACT. Lighting conditions affect the surface appearance of objects, and can thereby influence material and shape recognition and the impression of the objects. How the appearance of crafts and artwork is influenced by illumination is especially critical. In this study, we aim to clarify how lighting distribution and direction affect the impression of crafts, and to identify the underlying photometric factors. In the experiment, three lighting beam angles were used: 8 degrees (narrow), 16 degrees (medium), and 29 degrees (wide). We prepared black and red Raku tea bowls for evaluation. Each bowl was illuminated from three angles: 30 degrees in front, directly above, and 30 degrees behind the bowl. Five observers evaluated twelve items for each of the nine conditions. As a result, the beam angle had little effect on the impression ratings. As the illumination angle moved from front to back, ratings of glossiness, flamboyance, and brightness decreased, while those of darkness and depth increased. To investigate the cause of this result, we measured the luminance of the bowl surfaces and found a correlation between the luminance characteristics and the impression ratings. This suggests that the luminance distribution was related to the impression evaluation of the crafts. However, an additional experiment in which the illuminance conditions were changed extensively produced almost no differences in impression ratings. Thus, overall brightness alone does not appear to determine impressions; more complex factors are involved.

[7] Differential Changes of Isoluminance and Cone Contrast Sensitivity in Adult Human Amblyopia

ABSTRACT. To investigate differential changes in color perception in adults with amblyopia who do not have color blindness, we conducted two psychophysical experiments. In Experiment 1, we used the minimum-motion method to measure the monocular isoluminance of adults with amblyopia and normally sighted adults at 0.28, 0.43, 0.88, 1.75 and 3 cycles/degree. Isoluminance reflects the relative contributions of the L- and M-cones to luminance and is usually defined as the L/M cone ratio in cone contrast space. In Experiment 2, we assessed binocular and monocular contrast thresholds for achromatic and isolated L-, M- and S-cone stimuli at 0.5, 1 and 2 cycles/degree. We evaluated the binocular summation ratio (BSR), defined as the ratio of binocular contrast sensitivity to monocular (better eye) contrast sensitivity, in both the normal and amblyopic groups (see the formula below). All participants passed a color screening test before the psychophysical tests. Our results showed that the isoluminant point was similar in both eyes of adults with amblyopia but was significantly larger in the amblyopic eye compared to the nondominant eye of the normal group. This demonstrates that the isoluminant point is shifted towards red in adults with amblyopia compared to normally sighted individuals. Additionally, the contrast thresholds of the amblyopic eye were higher than those of the fellow eye in amblyopes and the nondominant eye in normal individuals under the isolated L-cone condition. This suggests that L-cone contrast sensitivity may be impaired in the amblyopic eye. Notably, these changes became more pronounced with increasing spatial frequency. Moreover, the BSR results indicated that the interocular difference in the amblyopic group was larger than in the control group under the isolated M- and S-cone conditions.
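
For reference, the binocular summation ratio as defined above is simply:

```latex
\mathrm{BSR} = \frac{CS_{\text{binocular}}}{CS_{\text{monocular, better eye}}}
```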

[9] The impact of temporal attitudes on physical and affective material impressions – an analysis using EEG

ABSTRACT. When humans look at objects, they receive both physical material impressions (e.g., glossiness) and affective material impressions (e.g., aesthetics). Traditional studies in affective engineering often assume that affective impressions follow the perception of physical material features, yet no direct evidence supports this assumption in the human visual system. To explore this, we analyzed the temporal patterns of decoding accuracy for physical and affective material impressions from EEG. We expected that the influence of observers' temporal expectations (whether they need to assess impressions rapidly or not) on decoding accuracy would differ between physical and affective material impressions; we anticipated a lesser effect on physical impressions if their processing precedes affective impression processing. The stimuli were object images of various materials collected from different online datasets. In each trial, an image was displayed to the observer, who then evaluated the intensity (strong or weak) of the physical (glossiness) or affective (aesthetics) material impression while EEG was recorded. After the experiments, the observers' responses were predicted using a linear support vector machine based on EEG within different time windows. We conducted two types of sessions with different stimulus durations: one with only 1000-ms exposures and another with mixed 100-ms and 1000-ms exposures. The observers' expectations about stimulus duration likely differed between these sessions and influenced judgment strategies. Results indicated that decoding accuracy for physical material impressions was barely affected by expectations, whereas affective impressions were significantly influenced: decoding accuracy improved, and the peak accuracy window shifted to later times, in the 1000-ms-only session. This is in line with the hypothesis that the processing of physical material impressions might precede that of affective impressions, at least for glossiness and aesthetics.
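
A schematic of this kind of time-windowed decoding analysis (a sketch on synthetic data; the window length, step, features, and classifier settings are illustrative, not taken from the study):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
fs = 500                                   # sampling rate (Hz), illustrative
eeg = rng.standard_normal((120, 64, fs))   # trials x channels x 1 s of samples
ratings = rng.integers(0, 2, 120)          # strong vs. weak impression

win, hop = int(0.1 * fs), int(0.05 * fs)   # 100-ms windows, 50-ms steps
accuracy = []
for start in range(0, eeg.shape[2] - win + 1, hop):
    X = eeg[:, :, start:start + win].mean(axis=2)   # mean amplitude per channel
    acc = cross_val_score(SVC(kernel="linear"), X, ratings, cv=5).mean()
    accuracy.append((start / fs, acc))     # (window onset in s, decoding accuracy)
```

Plotting `accuracy` over window onset gives the kind of time course in which a peak-accuracy shift between sessions can be read off.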

11:30-12:30 Session 12B: Posters - Social and cultural influences on perception (AM presenters)
[11] Friendliness and Hostility in Action: Encoding/Decoding Principles and Cultural Influence

ABSTRACT. In an unfamiliar culture, when verbal communication is ineffective, body movements help us tell friends from foes. However, how actions express and convey intention, and how culture influences this fundamental and critical capacity, remain to be clarified. We approached these questions by recruiting professional performers (42 Japanese, 41 Taiwanese) to our lab. We instructed them to interact with an imaginary friendly alien (scenario 1) or an imaginary hostile alien (scenario 2) for 10 seconds. Because the extraterrestrials neither understood human culture nor spoke human languages, performers needed to communicate solely through body language. Their movements, recorded by a motion capture system (Vicon), were converted to dynamic point-light animations. An additional 53 observers (24 Japanese and 29 Taiwanese) viewed these animations on an online experiment platform and reported the perceived friendliness/hostility on a scale from 0 (friendly) to 100 (hostile). Subjective reports indicated that similar cues were used to express (by performers) and detect (by observers) intent: from motion-related cues such as kinematics (e.g., slow = friendly, fast = hostile) and posture (open = friendly, closed = hostile) to abstract cues such as intention (e.g., aggression, greeting gestures) and imagined context (e.g., perceived emotion, daily action). Intensity ratings showed that perceived hostility was significantly higher for hostile animations than for friendly animations, and a positive correlation of the intensity ratings between JP and TW raters (95.42%, p < 0.001 for JP animations; 94.89%, p < 0.001 for TW animations) indicated high perceptual consistency across the two cultures. Interestingly, Japanese observers perceived higher hostility than Taiwanese observers when Japanese performers interacted with an antagonistic alien, highlighting a culture-specific sensitivity toward negative expression. To summarize, our report provides the first account of the encoding and decoding principles of friendliness-hostility body expression. Our demonstration of cultural modulations encourages future research in this direction.

[13] Culture Matters: Performance and Perception of Human-like Body Motion between Taiwan and Japan
PRESENTER: Xiaoyue Yang

ABSTRACT. Humans are sensitive to conspecific movements and might have a distinct way of perceiving whether a motion is human-like. Because cultural norms heavily modulate our non-verbal communication, we investigated whether there is any cultural impact on expressing and detecting human-like body movements. 43 Japanese and 41 Taiwanese professional performers were instructed to use whole-body movements to demonstrate that they were real humans (instead of AI or machines) while their motion was recorded by a motion capture system (Vicon) in our laboratory. The recordings were processed into 57-point dynamic point-light displays (PLDs) to remove factors unrelated to motion (e.g., face, background). An additional 99 observers (50 Taiwanese and 49 Japanese) viewed all PLDs and judged whether each PLD was made by a real human or by AI. In interviews, Taiwanese performers most frequently reported using kinematic cues to convince viewers of their human identity (e.g., smooth, continuous, flexible, soft movements, as opposed to the rigid, repetitive movements of non-human agents). Japanese performers utilized more contextual cues (e.g., festival dances, children's games, shared human experiences) in their motion. An objective assessment of the PLDs via Motion Energy Analysis (MEA) also revealed interesting cultural characteristics: Taiwanese PLDs contained significantly higher motion energy than Japanese PLDs (p < .001). For Taiwanese PLDs, the most human-like PLDs peaked at 1-2 Hz and the most AI-like PLDs peaked at 0-0.25 Hz; Japanese PLDs contained consistent energy across all frequency bands. For humanness perception, Taiwanese observers were significantly more likely to report "real human" than Japanese observers (p < .041), regardless of the origin of the PLDs (JP/TW), suggesting a more dominant role of the observers' cultural background than the performers'. Our study provides the first report of culture-specific body expressions used to differentiate real human motions from AI-generated movements, and highlights how environmental factors modulate non-verbal communication from both senders and receivers.
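
The exact MEA implementation is not described; a minimal frame-differencing version that yields an energy spectrum like the one reported (with peaks per frequency band) could look like this (array names and frame rate are illustrative):

```python
import numpy as np

def motion_energy_spectrum(frames, fps=30):
    """frames: (n_frames, height, width) grayscale PLD video.
    Returns (freqs, power): the spectrum of the frame-difference energy."""
    energy = np.abs(np.diff(frames, axis=0)).sum(axis=(1, 2))  # per-frame change
    energy = energy - energy.mean()                            # remove DC offset
    power = np.abs(np.fft.rfft(energy)) ** 2
    freqs = np.fft.rfftfreq(energy.size, d=1.0 / fps)
    return freqs, power

# Energy in, e.g., the 1-2 Hz band where human-like Taiwanese PLDs peaked:
# freqs, power = motion_energy_spectrum(frames)
# band_energy = power[(freqs >= 1) & (freqs <= 2)].sum()
```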

[15] Exploring Mental Health Self-stigma, Self-identification, and Person Perception in a Subclinical Population

ABSTRACT. Objectives: Individuals with high levels of internalized mental health stigma are more likely to have poorer health outcomes and engage in discriminatory behaviors against others with mental ill health (MIH). However, self-identifying as having MIH can buffer against the negative impacts of self-stigma. This study investigates the influence of mental health self-stigma and self-identification on person perception in a subclinical population in Singapore.

Method: Participants (N=83) rated 36 person images paired with descriptions of MIH symptoms or non-MIH behaviors on trustworthiness and competence. They then completed measures of self-stigma, self-identification, anxiety, and depression. For robustness, faces and descriptions were randomly paired for each participant, with an equal number of faces per race and sex presented.

Results: When person images were paired with MIH symptoms, they were rated significantly less competent and trustworthy than when paired with non-MIH behaviors. Participants with higher self-stigma scores rated stimuli significantly lower on competence but not trustworthiness. A novel two-way interaction between MIH symptom labels and self-identification on trustworthiness ratings was observed: the decrease in trustworthiness ratings for person images paired with MIH labels was shallower for participants with higher self-identification.

Conclusion: The findings support the persistence of mental health stigma and suggest internalisations of stigma on competence-relevant domains. The results also highlight the benefits of positive self-identification on reducing stigmatizing behaviors, shedding new light on the potential positive effects of peer support in individuals with subclinical MIH experiences. This study contributes to the limited research on self-stigma and person perception in subclinical populations and underscores the importance of addressing mental health stigma and fostering positive self-identification.

[17] Does premenstrual syndrome affect emotion recognition?

ABSTRACT. Premenstrual syndrome (PMS) is characterised by recurring physical and affective symptoms that arise during the luteal phase of a woman's menstrual cycle. Mood disturbances are often reported in PMS, and we examined whether these might have repercussions on social cognition, specifically the ability to interpret facial expressions. Forty-one participants were grouped as individuals with (N = 23) or without (N = 18) PMS, based on scores calculated from a daily record of menstrual symptoms over two consecutive menstrual cycles. In a subsequent menstrual cycle, all participants completed questionnaires measuring several dimensions of mood (e.g., anger, depression), once during their follicular phase and once during their luteal phase. At the same two instances, they viewed a series of facial expressions intended to convey one of several emotions (happy, angry, disgusted, or fearful) at varying intensities (subtle to extreme), and classified them with a keypress. Severities of negative affect (p = .022), anger (p = .005), depression (p = .032), and total mood disturbance (p = .030) were generally higher during the luteal phase than the follicular phase, and this increase was comparable between the two participant groups. As for expression classification, all participants found it difficult to accurately recognise subtle expressions of disgust (p ≤ .004) during the luteal phase, relative to the follicular phase. Further, individuals with PMS exhibited some unique negative biases: they classified neutral faces (p = .003) and subtle intensities of anger (p ≤ .013) as angry more often during the luteal phase than in the follicular phase. These biases were not observed in individuals without PMS. Our findings suggest that classification of subtle facial expressions is generally unstable across the menstrual cycle. More importantly, PMS may introduce distinct biases in emotion classification that do not appear to be contingent on mood disturbances.

[19] Parity and gender influences in mental jigsaw puzzles: A secondary eye-tracking study

ABSTRACT. This secondary analysis, derived from a preprint (https://doi.org/10.31219/osf.io/vkctj) and preregistered (https://osf.io/5yux6 and https://osf.io/r2j7a), investigates the influence of parity and gender on mental jigsaw puzzles (FT, fitting task) and traditional object mental rotation tasks (MT, matching task). While previous research noted behavioral gender and parity differences in MT, FT remains underexplored. Thirty college students (14 female, 16 male; balanced for gender-specific analysis) engaged with three-dimensional objects, either for FT (male fixed at 0° with female) or MT (male fixed at 0° with male), analogous to electrical connectors. Parity was adjusted by mirroring male objects at 0° (congruent, incongruent). Participants, responding via foot pedals, were tested across six rotational angles (0°, 60°, 120°, 180°, 240°, 300°) on both rotation sides (left, right), with their eye movements tracked over 576 trials. Using both parametric and non-parametric repeated measures ANOVA, distinct behavioral patterns were observed: in FT, incongruent conditions exhibited significantly shorter reaction times (RTs) at 240° and longer RTs at 0°, 60°, and 300° on the left side, and 300° on the right side (all p < .05). MT, under incongruent conditions after collapsing rotation sides, showed increased RTs at 60° and 300° (all p < .01). Error ratios were biased higher under congruent conditions for both tasks, particularly at mid-range angles (120°–240°; all p < .001), extending to 300° in FT on the left side. Longer fixation times were noted under congruent conditions in FT, and more fixations per saccade in MT under incongruent conditions (both p < .05). Despite extensive analysis, no significant gender differences were observed across these metrics in the congruent condition (all p > .05), suggesting minimal impact of gender in these tasks. These findings imply nuanced similarities and differences in behavioral parity trends and cognitive strategies across tasks, emphasizing the need for further investigation.

11:30-12:30 Session 12C: Posters - Perceptual learning, adaptation, and training (AM presenters)
[21] Differences between the visual behaviors of experts and beginners in diagnosing citrus Huanglongbing (HLB)

ABSTRACT. Citrus greening disease, also known as Huanglongbing (HLB), causes leaves to gradually turn yellow and leads to plant death. Although deep learning-based HLB diagnostic systems have been proposed, their diagnostic accuracy for a whole tree remains low. To improve diagnostic accuracy, this study focused on the eye movements of experts when diagnosing HLB. We conducted an eye-tracking experiment to clarify the differences between the visual behaviors of experts and beginners. During the experiment, participants wore an eye-tracking device and viewed screen-projected images of citrus trees to distinguish diseased from healthy ones. We analyzed the results to compare the diagnostic abilities of experts and beginners, identifying the visual behaviors crucial for accurate diagnosis. The findings revealed that experts exhibited shorter observation distances and times, and fewer saccades and fixations, than beginners. This indicates that experts focus their observations on areas crucial for disease determination, including tree branches, rather than solely on leaves. A diagnostic system for detecting HLB should therefore primarily target green leaves and branches. Next, we will analyze gaze data from only those experts with particularly high diagnostic accuracy. In addition, we will construct a saliency-map model based on the differences between the visual behaviors of experts and beginners, and evaluate the validity of the model.

[23] Neural correlates of musical notation reading in experts and novices

ABSTRACT. Musical notation reading is a sophisticated and highly practiced form of visual expertise that has been relatively understudied. To what extent does the specialization of such a relatively new visual skill impact cortical systems engaged by existing expertise, such as face and word processing? To examine the functional mechanisms implicated in reading musical notation, experts (n=10) with extensive music reading experience (M=9 years, SD=3) and novices (n=11) with no music reading experience completed a test of visual fluency in musical notation (Wong et al., 2021), followed by an fMRI session. The visual fluency test confirmed that experts read musical notation faster than novices. The fMRI session involved a one-back task with six musical-notation conditions, each consisting of four notes: 1) regular or 2) varied rhythms; 3) removal of line junctions among notes; 4) highly unfamiliar rhythm configurations; or replacement of the notes with 5) letters or 6) symbols. An independent localizer was used to identify the left visual word form area (VWFA) and the right fusiform face area (FFA) to test for any involvement of these areas in musical notation reading expertise. We found that in the VWFA, novices showed significantly stronger response amplitudes than experts to musical notation, but also to non-musical categories such as faces. In contrast, no significant group differences were found in the FFA. Additionally, the localizer runs revealed robust parietal activations for musical notation relative to other visual categories in both experts and novices. These findings suggest that both ventral and dorsal activations should be considered to understand and distinguish the neural correlates of musical notation reading that are relevant to expertise or stimulus characteristics.

[25] Neural adaptation to visual background orientation: a wide-view fMRI study

ABSTRACT. The sense of verticality is crucial for accurate spatial orientation and appropriate body control. In addition to the vestibular and proprioceptive systems, vision plays a substantial role in estimating subjective verticality. For instance, a retinally vertical probe is perceived as tilted in the direction opposite to the orientation of the visual background (the rod-and-frame illusion). Although this visual reliance of verticality processing has been investigated psychophysically, its neural mechanism remains largely unknown. In this study, we explored the neural representation of the subjective visual vertical using fMRI adaptation combined with a custom wide-view stimulation system. To induce a robust effect of the visual background on subjective verticality, participants were presented with a large square frame (approximately 60 x 60 degrees) as a background, while a rod was presented in the center as a probe. The frame was oriented -20 degrees (tilted left), 0 degrees (upright), or +20 degrees (tilted right), biasing participants' subjective visual vertical. The rod was presented at different orientations, each adjusted individually so as to be perceived equally across participants, based on psychophysical measurements conducted prior to the fMRI experiment. We tested whether the response to the rod adapted through repeated presentation of the rod's orientation in terms of either the physical or the subjective vertical. Although the rod-and-frame illusion was induced substantially, we found no brain regions showing significant neural adaptation to the rod orientation for either the physical or the subjective vertical. On the other hand, neural adaptation to the frame orientation was identified in multiple cortical regions, including a bilateral pair in the posterior part of the inferior temporal sulcus, likely corresponding to hMT+, and in the caudal part of the intraparietal sulcus. These results might suggest an involvement of visual motion processing and visuomotor coordination in the estimation of the subjective visual vertical.

[27] The dual impact of action video gaming on useful field of view

ABSTRACT. Action video games have been linked to improvements in visual attention, including the useful field of view (UFOV). However, mixed findings suggest that excessive time spent playing video games may foster gaming addiction, which worsens visual attention and inhibitory control. We therefore hypothesised that gaming addiction mediates the relationship between gaming expertise and enhancement of the attentional visual field. Fifty-seven participants (non-gamers, casual players, and competitive players of the popular first-person shooter Valorant) were recruited to examine this mediating relationship. The 21-item Game Addiction Scale (Lemmens et al., 2006) was used to measure their risk of gaming addiction, while a UFOV task (Ball et al., 1988) was used to quantify participants' attentional visual fields. Using the PROCESS macro (Hayes, 2022), these data were analysed via simple mediation analysis with a multi-categorical independent variable, gaming expertise (non-gamers, casual players, competitive players). Results indicate a significantly positive direct effect of expertise on the attentional field that exceeds usual gaming demands and implies an improved ability to overcome foveal load. Furthermore, a significantly positive effect of expertise on the risk of gaming addiction was observed, yet gaming addiction did not emerge as a significant mediator of the relationship between expertise and attentional field. These findings highlight a positive association between moderate action video gameplay and cognitive benefits, and further emphasise the importance of identifying protective factors against addiction and exploring potential mediating variables that may influence this relationship.
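
For readers unfamiliar with the analysis, the core of a simple mediation test is the indirect effect a*b with a bootstrap confidence interval. A bare-bones numpy version is sketched below (a generic sketch with a single continuous predictor, not the PROCESS macro itself, which additionally handles the multi-categorical coding used here):

```python
import numpy as np

def simple_mediation(x, m, y, n_boot=5000, seed=0):
    """Indirect effect a*b of x -> m -> y, with a percentile bootstrap CI.
    a: slope of m on x; b: slope of y on m, controlling for x."""
    rng = np.random.default_rng(seed)

    def ab(idx):
        X1 = np.column_stack([np.ones(idx.size), x[idx]])
        a = np.linalg.lstsq(X1, m[idx], rcond=None)[0][1]        # m ~ x
        X2 = np.column_stack([np.ones(idx.size), x[idx], m[idx]])
        b = np.linalg.lstsq(X2, y[idx], rcond=None)[0][2]        # y ~ x + m
        return a * b

    n = x.size
    boots = np.array([ab(rng.integers(0, n, n)) for _ in range(n_boot)])
    return ab(np.arange(n)), np.percentile(boots, [2.5, 97.5])

# Usage (dummy arrays): indirect, ci = simple_mediation(expertise, addiction, ufov)
# Mediation is supported when the bootstrap CI for a*b excludes zero.
```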

[29] Object-contingent duration adaptation depends on the object-specific representation of duration

ABSTRACT. Past experience plays a crucial role in shaping our perception of time, and our brain can process temporal information for multiple objects simultaneously. Studies have demonstrated that observers can concurrently adapt to two durations across the visual and auditory modalities. Notably, this duration aftereffect is contingent on auditory frequency but not on visual orientation. We hypothesized that duration adaptation depends on the duration representation of the specific carrier, and that the lack of concurrent adaptation for visual stimuli arose because the durations were not well bound to their carriers. In the current study, we applied a concurrent adaptation paradigm coupled with a duration oddball task. Participants were required to report the duration oddball while being repetitively exposed to two distinct visual stimuli (a fish or a balloon), each with a fixed duration (160 or 700 ms), during the adaptation phase. The oddball task facilitated effective binding of the durations to their visual carriers. In the subsequent test phase, participants reproduced the perceived duration of each visual stimulus, presented for between 300 and 460 ms. The results showed a significant duration aftereffect contingent on the visual stimuli, suggesting that participants concurrently adapted to two different durations linked to the distinct visual carriers. This indicates that object-contingent duration adaptation depends on an object-specific representation of duration, which likely occurs in the later stages of the visual processing hierarchy.

[31] Visual Stimulation Program for Children with Severe and Profound Visual Impairment

ABSTRACT. Purpose: The objective of this study was to develop a visual stimulation program and evaluate its effectiveness in children with severe and profound visual impairment attributable to disorders of the anterior or posterior visual pathways. Methods: The children had severe to profound visual impairment alongside other developmental problems. Comprehensive ophthalmologic evaluations were conducted. Visual function and functional vision were assessed using the Near Detection Scale (NDS), the Visual Function Battery for Children with Special Needs (VFB-CSN), and the Functional Visual Questionnaire (FVQ). The visual stimulation program employed the Tobii Dynavox PCEye 5, Gaze Viewer, and Microsoft PowerPoint, with adjustments in stimulus size, contrast, color, pattern, orientation, and motion tailored to each child's visual ability and response to stimulation. The interventions were administered weekly, consisting of one hour of visual stimulation and health education, for a total of ten sessions. The primary outcome measures were changes in VFB-CSN, NDS, and FVQ scores. Results: The cohort initially comprised 25 children with special needs, 21 of whom completed the intervention. The average age of the 21 children (11 males, 10 females) was 39.5 months (range: 8 - 82 months, SD = 23.8). The intervention produced significant enhancements in visual function and functional vision among children with severe and profound visual impairment concomitant with multiple developmental disorders. Improvements were notable in visual reflexes, visual acuity, contrast sensitivity, ocular motor function, and overall scores. The visual field and visual attention subcategories showed borderline effects. Functional vision, evaluated with the FVQ, also improved. Conclusion: The results indicate that the visual stimulation program may have the potential to improve visual function and functional visual performance in children with severe and profound visual impairment.

13:00-14:00 Session 14A: Posters - Color, orientation, and form (PM presenters)
[2] The limit of spatial frequency for pooling local orientation signals

ABSTRACT. This study used a mirror stereoscope to investigate the spatial frequency difference that determines whether a set of uneven local orientations is integrated as a coherent surface texture or segregated. A donut-shaped texture composed of 40 Gabor patches surrounding a fixation point was created. Twenty-nine participants performed a mean-orientation discrimination task on the patches in the parafovea while the spatial frequency and number of the patches were manipulated. The spatial frequencies of the patches within the texture were divided into three types, each differing by 1.5 octaves. Because the contrast of each type was set to twice each participant's absolute threshold, the visibility of all patches in the parafovea was presumed uniform. One type was always vertical, while the other two were tilted. The true mean orientation of the latter group was 15 degrees clockwise or counterclockwise from vertical, while the angles of individual patches were drawn from a normal distribution with an SD of 1, 4, 8, 16, or 32 degrees. The number of tilted patches in the texture varied from 4 to 40, and the remaining placeholders were filled with vertical patches. Additionally, viewed through the mirror stereoscope, the tilted patches appeared slightly raised or sunken relative to a reference plane containing the fixation point and the vertical patches. Participants were required to integrate the bumpy local orientations to discriminate the mean orientation of all patches against vertical. Results showed that the discrimination threshold decreased in proportion to the number of tilted patches when they spanned a 1.5-octave difference in spatial frequency. However, discrimination was nearly impossible, regardless of the number of tilted patches, when they spanned a 3-octave difference. This suggests that the local orientation pooling processes underlying the perception of surface textures have a spatial frequency limit of approximately 1.5 octaves.

[4] Population receptive field estimation in human visual cortex under wide-view stimulation

ABSTRACT. Human early visual cortex contains topographic representations of the visual field. Such a representation can be estimated using the population receptive field (pRF) model. Here, we investigate the pRF properties of human visual cortex in response to retinotopic stimuli presented in both central and peripheral vision. The retinotopic stimuli consisted of rotating wedges and contracting concentric rings moving across the visual field. The observers viewed the stimuli through a wide-view binocular visual stimulation system in the MRI scanner, which allowed the visual presentation to cover around 90 degrees of the visual field. We applied a 2D Gaussian pRF model to the BOLD activations of voxels in areas V1 and V2, defined by probabilistic FreeSurfer labels for each observer. Our results show that the estimated pRF centers exhibit a topographic organization in both V1 and V2 that covers not only central but also peripheral vision. Additionally, in the left hemisphere most voxels have a pRF center located in the right visual field, while in the right hemisphere the majority of pRF centers are in the left visual field. However, we were not able to identify a consistent pattern in the estimated pRF sizes. This may be because voxels in the far periphery contain neurons with different visual field selectivities. Further optimization of the experimental stimuli and protocol may be necessary to address this issue. Nevertheless, our study demonstrates the feasibility of applying the pRF model to estimate retinotopic mapping in far peripheral vision.
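
The 2D Gaussian pRF model referred to here follows the standard formulation (e.g., Dumoulin & Wandell, 2008); a compact sketch of the forward prediction for one voxel is shown below (the aperture array, coordinate grids, and HRF are placeholders to be supplied):

```python
import numpy as np

def prf_prediction(apertures, xs, ys, x0, y0, sigma, hrf):
    """apertures: (n_timepoints, n_y, n_x) binary stimulus masks.
    xs, ys: visual-field coordinates (deg) of the pixel grid (1D arrays).
    Returns the predicted BOLD time course for a pRF at (x0, y0) with size sigma."""
    X, Y = np.meshgrid(xs, ys)
    g = np.exp(-((X - x0) ** 2 + (Y - y0) ** 2) / (2 * sigma ** 2))  # 2D Gaussian pRF
    neural = apertures.reshape(apertures.shape[0], -1) @ g.ravel()   # stimulus-pRF overlap
    bold = np.convolve(neural, hrf)[: neural.size]                   # hemodynamic blur
    return bold

# Fitting then amounts to a grid or optimization search over (x0, y0, sigma)
# maximizing the agreement between `bold` and each voxel's measured time course.
```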

[6] Is fluorescence perception mediated by a mechanism producing the colour appearance of the surface colour mode and aperture colour mode?

ABSTRACT. It is well known that the surface colour mode and the aperture colour mode produce different colour appearances. These modes essentially depend on the colour appearance of the target and the surrounding stimulus configuration. Psychological studies have reported that the colour appearance in these two modes is influenced by the luminance and/or brightness of the target. However, the colour appearance in fluorescence perception seems to be related to both modes. Our question, therefore, is how colour appearance in fluorescence perception is affected by the mechanisms of these modes. Furthermore, we investigate whether there is an independent mode, with its own mechanism, for fluorescence perception. We measured the probability of perceiving each of these modes when presenting stimuli that varied in spatial frequency content and contrast. The probability of the surface colour mode increased, and that of the aperture colour mode decreased, as the spatial frequency band covered higher frequencies. However, neither changed much as a function of contrast. Interestingly, the sum of the two probabilities was greater than 100% around a certain spatial frequency band. This suggests that a mechanism other than these two modes may be responsible for the colour appearance in fluorescence perception.

[8] The effect of presentation time and luminance gradient on the glare effect

ABSTRACT. A gradient from black to white creates a glow percept, resulting in a glare effect in which the stimulus appears brighter than its physical luminance. Since the presence of lowlights and white highlights is important for glare perception, the gradient from white to black could also be a factor in the glare effect. It is also known that brightness contrast, in which the center appears brighter with a black surround, is significantly stronger when the surround is presented briefly. The effect of presentation time has not been investigated for the glare effect, and it has been unclear how it relates to the polarity of the luminance gradient. This study measured perceived luminance when luminance gradients were briefly presented on a black or white background. Stimulus patterns were either glares (brightening toward the center) or halos (darkening toward the center). Presentation durations ranged from 0.02 to 0.53 s in six steps. Stimuli were presented briefly, extinguished for 0.5 s, and flashed repeatedly. The background while the stimulus was extinguished was black or white. Participants adjusted the luminance to match the brightness of the reference stimulus. Results showed that the effect size for the combination of background condition and presentation time exceeded that for stimulus pattern and presentation time. The response curve over presentation time differed depending on the background, exhibiting a peak on black backgrounds and a decay on white backgrounds. The main effect of stimulus pattern was larger for halo conditions at presentation times of 0.03 and 0.07 s, regardless of the background condition. The results indicate that the brightness change due to a luminance gradient depends more on the background condition than on the stimulus pattern, and that the luminance polarity of the gradient may be important for brightness perception at short presentation times.

[10] Estimating metamer mismatch between CIE-1931 color matching functions and CIE-2006 cone fundamentals

ABSTRACT. Color is represented by tristimulus values; however, their definition varies depending on the adopted model. The CIE 1931 XYZ model has been widely employed in industrial color management despite known discrepancies from average observer perceptions. Conversely, cone fundamentals (also known as CIE 2006) are preferred in vision science for their accuracy and clear physiological origin. Since both sets of color matching functions represent human color vision, they should be mutually linearly convertible; however, the conversion is known to introduce errors. Despite this acknowledged inconsistency, its impact on chromaticity has not been thoroughly explored. Quantifying the magnitude of this error is crucial for conducting comparative studies and meta-analyses of the existing literature. In this study, we focused on metamer mismatch, wherein spectra of lights that match in color under one model (CIE 1931) may appear different under another model (cone fundamentals). We quantified the extent of metamer mismatch using binary rectangular spectra, mixtures of three random Gaussian spectra, and Munsell color chips. The rectangular spectra represent theoretical limits of metamer mismatch, the Gaussian spectra simulate color distributions likely found in nature, and the Munsell chips serve as representative colors. Our findings indicate that the observed distribution of metamer mismatch is broadly spread along the S-cone axis, with a diameter of 0.07 on the xy chromaticity diagram when the sample was neutral (value 5, luminance ~22%). This diameter increased as the stimulus intensity decreased. The metamer mismatch volumes generated by Gaussian spectra were 15% and 50% smaller than the theoretical limits for the S and L/M axes, respectively. Munsell color chips were positioned within these volumes. The observed size of metamer mismatch corresponds to a discernible color difference and is larger than the measurement accuracy of a conventional colorimeter, suggesting caution when comparing studies based on different color matching functions.
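
A minimal Python sketch of the underlying computation, illustrative rather than the authors' code; CMF tables sampled at common wavelengths are assumed to be loaded elsewhere.

```python
# Quantify how two lights that match under one set of colour matching
# functions (CMFs) can differ under another; CMF arrays are assumed inputs.
import numpy as np

def tristimulus(spectrum, cmfs):
    """Integrate a spectral power distribution against CMFs (same sampling).
    spectrum: (n_wavelengths,); cmfs: (n_wavelengths, 3)."""
    return spectrum @ cmfs

def xy_chromaticity(t):
    return t[:2] / t.sum()

def metamer_mismatch(spec_a, spec_b, cmfs_1931, cmfs_2006):
    """Given two spectra that are metamers under cmfs_1931, return their
    chromaticity difference under cmfs_2006."""
    assert np.allclose(tristimulus(spec_a, cmfs_1931),
                       tristimulus(spec_b, cmfs_1931), atol=1e-6)
    xa = xy_chromaticity(tristimulus(spec_a, cmfs_2006))
    xb = xy_chromaticity(tristimulus(spec_b, cmfs_2006))
    return np.linalg.norm(xa - xb)
```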

[12] Effects of front and rear sounds on visual parvocellular and magnocellular processing

ABSTRACT. Multisensory studies have shown that sounds can facilitate early visual perception, but it remains unclear how auditory inputs affect visual parvocellular and magnocellular processing. Based on previous findings that sounds near a visual target increase sensitivity to high spatial frequency information, sounds presented frontally may facilitate visual perception mediated by the parvocellular system. On the other hand, given the critical role of the auditory system in alerting us to unseen threats, sounds outside the visual field (e.g., behind the body) may facilitate the magnocellular system, which is sensitive to transient information and involved in action. Thus, this study examined the effects of sounds presented in front of and behind the body on an orientation detection task with Gabor patches, where the relative contribution of the parvo- and magnocellular pathways was manipulated by high and low spatial frequencies. A task-irrelevant white noise burst was presented simultaneously with the visual stimulus, from loudspeakers located either in front of or behind the participants. Preliminary results showed that the increase in orientation detection performance due to simultaneously presented sounds was greater for Gabors with high, compared to low, spatial frequency. Moreover, frontal sounds specifically increased performance for high spatial frequency Gabors, whereas rear sounds had no reliable effect on performance. The selective facilitation by frontal sounds cannot be explained by channel-independent mechanisms such as internal signal enhancement due to multisensory integration of simultaneous audiovisual inputs. Rather, it is plausible that the sustained channel of vision, which is known to improve at attended locations, was enhanced by a sound-induced attentional boost in a spatially consistent manner.

13:00-14:00 Session 14B: Posters - Social and cultural influences on perception (PM presenters)
[14] Cultural Variance in the Emotion Perception of Body Actions by Asian Performers

ABSTRACT. While many studies have examined cultural variance in the emotion perception of facial expressions, there is relatively little research on cultural variance in emotion perception conveyed through body actions. Previous studies on facial expressions suggested an in-group advantage, wherein people perform better at recognizing emotions expressed by members of their own cultural group. However, these studies mainly focused on emotion recognition, neglecting other important dimensions, including arousal and valence. Building on the hypothesized in-group advantage for emotion perception of body actions and incorporating additional dimensions of emotion, the present study investigated the perception of seven emotions (i.e., joy, sadness, anger, fear, disgust, surprise, and contempt) expressed by Asian performers across four dimensions (emotion recognition, confidence in recognition, arousal, and valence). Participants (Asian: N = 41; non-Asian: N = 26) completed an online experiment in which they watched 70 motion videos and reported their emotion perceptions. Results revealed that Asian participants had higher accuracy and confidence in recognizing sadness, anger, and surprise. Beyond recognition differences, Asian observers reported higher perceived arousal for joy, sadness, disgust, and contempt, and higher perceived negativity for sadness, anger, and contempt, indicating cultural variance across multiple emotional dimensions. While significant relationships were found between cultural contact and the perception of certain emotions, no significant relationship was found between individualist tendency and emotion perception, emphasizing the role of cultural exposure rather than attitudes towards self and community. This study contributes to cross-cultural research on emotion perception, calling for further investigation into potential variance in the underlying neural mechanisms.

[16] Prolonged Visual Perceptual Changes Induced by Short-term Dyadic Training: The Influential Roles of Confidence and Autistic Traits in Social Learning

ABSTRACT. As social creatures, we are naturally swayed by the opinions of others, which largely shape our attitudes and preferences. However, whether social influence can directly impact our visual perceptual experience remains debated. We designed a two-phase dyadic training paradigm where participants first made a visual categorization judgment and then were informed of an alleged social partner’s choice on the same stimulus. Results demonstrated that social influence significantly modified participants’ subsequent visual categorizations, even when they had been well-trained prior to the dyadic training. This effect persisted for an extended period of up to six weeks. Diffusion model analysis revealed that this effect stemmed from perceptual processing more than mere response bias, and its strength was inversely related to the participants’ confidence and autistic-like tendencies. These findings offer compelling evidence that our perceptual experiences are deeply influenced by social factors, with individual confidence and personality traits playing significant roles.
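
The drift-rate versus starting-point distinction invoked by the diffusion model analysis can be illustrated with a toy simulation; all parameter values below are illustrative, not the authors' fits.

```python
# Toy drift-diffusion simulation contrasting the two accounts the abstract
# separates: a drift-rate change (perceptual processing) vs a starting-point
# change (response bias). Parameters are illustrative.
import numpy as np

def simulate_ddm(drift, start, n_trials=2000, bound=1.0, dt=0.001,
                 noise=1.0, seed=0):
    """Return the proportion of upper-bound choices."""
    rng = np.random.default_rng(seed)
    upper = 0
    for _ in range(n_trials):
        x = start
        while -bound < x < bound:
            x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        upper += x >= bound
    return upper / n_trials

print(simulate_ddm(drift=0.5, start=0.0))   # baseline
print(simulate_ddm(drift=1.0, start=0.0))   # "perceptual" drift shift
print(simulate_ddm(drift=0.5, start=0.2))   # "response bias" start shift
```

Both manipulations shift choice proportions; it is the joint fit to choices and full response-time distributions that lets the two accounts be told apart.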

[18] Different cognitive mechanisms underlie absolute and relative evaluation of images

ABSTRACT. In a life filled with huge amounts of digital data, we often need to choose a few pictures for display from among hundreds. There are two possible methods for evaluating images. One is to evaluate preference for each image, one after another (absolute evaluation). The other is to compare two images and decide which is preferable (relative evaluation). Both evaluations should rest on the same mental process if preference is uniquely determined. Interestingly, however, some studies have shown that the two methods yield different evaluations. In this study, we investigated what causes the difference between the two types of evaluation. To probe the processes underlying the difference, we conducted behavioral experiments in which we recorded participants' facial expressions and EEG signals while they performed absolute and relative image evaluation tasks. The experiment showed that preference scores from the absolute and relative evaluations were not correlated, and that the relative differences in absolute-evaluation scores for certain pairs of images were sometimes opposite to the scores from relative evaluation. These results suggest that distinct cognitive mechanisms underlie relative and absolute evaluations. Next, we trained a machine learning model to predict absolute and relative evaluation results from facial expressions and EEG signals. For both types of evaluation, when predicting a participant's evaluations, facial/EEG features of the same participant successfully predicted evaluations. Furthermore, for absolute evaluations, features from other participants improved prediction performance, suggesting that there are common facial/EEG features across participants. In contrast, for relative evaluations, features from other participants did not improve predictive performance, suggesting that facial/EEG signals related to relative evaluations differ greatly across participants. In summary, both behavioral data analysis and machine learning analysis indicate that absolute and relative evaluations rely on different mechanisms.
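
A sketch of the within- versus across-participant prediction contrast described above, assuming hypothetical facial/EEG feature matrices; the authors' feature extraction and classifier are not specified here.

```python
# Within- vs across-participant decoding sketch with hypothetical features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def within_participant_score(X, y):
    """Cross-validated accuracy using one participant's own features."""
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X, y, cv=5).mean()

def across_participant_score(X_others, y_others, X_held_out, y_held_out):
    """Train on other participants' trials, test on the held-out one."""
    clf = LogisticRegression(max_iter=1000).fit(X_others, y_others)
    return clf.score(X_held_out, y_held_out)
```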

[20] Empowering Attires: The Role of Clothing in Countering Stereotypes

ABSTRACT. As economic inequality grows, understanding how status cues shape social perception becomes increasingly crucial. The pervasive stereotype of incompetence continues to profoundly shape the way women and minorities navigate social interactions, with studies showing the strategic use of language to convey competence. Likewise, clothing conveys competence through subtle economic status cues. However, the role of attire in countering stereotypes remains underexplored. This study thus investigates how attire is employed across gender and race to project competence and mitigate negative stereotypes.

In two studies, participants (Study 1: 20 Black and 20 White men, 20 Black and 20 White women; Study 2: 50 Black and 50 White men, 50 Black and 50 White women, all residents of the U.S.) read 20 scenarios describing social situations. Half demanded competence presentation (competence-relevant situations; e.g., presenting in an exhibition), whereas the other half involved lower stakes (competence-irrelevant situations; e.g., gathering with friends). For each situation, participants chose an outfit from five options randomly drawn from a gender-specific pool of ~75 clothing images. These clothes and situations were rated for professionalism by two separate groups of independent raters. To ensure robustness, separate clothing images were used for each study, and Study 2's protocol and predictions were preregistered.

Initial results revealed that competence-relevant situations prompted more professional attire choices. Black and female participants chose more professional attire in competence-irrelevant situations. Black women opted for more professional clothes in competence-relevant situations than White women, while White men opted for more casual attire than Black men in competence-irrelevant situations. Additional studies are examining the motivations behind these clothing choices.

By shedding light on how different groups strategically use attire to navigate their social landscape, this research underscores clothing's potential to counter negative stereotypes and contributes to our understanding of the subtle yet powerful ways in which such stereotypes are resisted.

[22] Trustworthiness Judgement in Short Videos is Influenced by Speakers’ Facial Emotion and Attire
PRESENTER: Zihao Zhao

ABSTRACT. Previous research has found that attire and emotions influence social attributes of face images, such as trustworthiness and competence. However, their effects on the trustworthiness of short videos on social media platforms such as TikTok remain unclear. The current study explored how the speaker's facial emotions and clothing influence trustworthiness judgements in short videos.

Thirty-two participants (mean age = 32.16, 17 female) viewed 192 short videos (4 clothes * 3 emotions * 2 display modes * 2 speaker genders * 4 news contents) in random order and, after each video, rated the trustworthiness of the speaker and of the content on a 0 (lowest trust) to 100 (highest trust) scale. Open-source computer vision algorithms were used to transform static real-person images into realistic videos of speaking individuals. Repeated measures analysis of variance showed that speaker and content trustworthiness ratings were significantly influenced by emotion (F(1.57, 48.74) = 17.42, p < 0.001; F(1.57, 48.63) = 6.78, p = 0.005, respectively) and clothing (F(1.80, 55.71) = 8.48, p = 0.001; F(1.90, 58.99) = 4.97, p = 0.011, respectively). Pairwise comparisons with Bonferroni correction found that angry speakers were rated as less trustworthy than happy (t = 4.11, p = 0.001) or neutral speakers (t = 5.69, p < 0.001). Angry speakers' content was also less trusted than neutral speakers' (t = 3.49, p = 0.004). Furthermore, speakers in doctor uniforms were trusted more than those in casual clothes (t = 4.24, p = 0.001). Content from speakers in doctor uniforms also received numerically higher trustworthiness ratings than content from speakers in casual clothes, but this difference was not significant (t = 2.59, p = 0.086).

The current study found that both emotion and attire influence our judgements of trust toward the speaker and content of short videos. It provides insights into the mechanisms underlying trust judgements and fake news detection.

13:00-14:00 Session 14C: Posters - Perceptual learning, adaptation, and training (PM presenters)
[24] Exploring the effects of training variability on multitasking training gains and long-term retention using online MATB

ABSTRACT. Effective multitasking relies on focused attention and visual coordination, and while training can significantly improve multitasking performance, different training strategies can affect the rate of improvement. This study explores the effects of training variability on multitasking training performance gains and their long-term retention across different training conditions. It utilizes the Multi-Attribute Task Battery (MATB), a naturalistic multitasking paradigm in which participants simultaneously manage four different visual and audio tasks in a flight deck context, with our modified web-accessible platform allowing for remote online training. Participants were randomised into four training groups of varying task intensity and variety, and completed five consecutive daily sessions of online MATB training, followed by a follow-up session five months later to evaluate long-term training retention. Results show a similar trend of training improvement for all groups, as well as good long-term retention. Consistent with the literature, varied training groups generally yielded greater overall training gains than fixed training groups, suggesting that varied training strategies may be more effective than fixed ones. Future work will refine the experimental design to further explore the mechanisms and strategies underlying these effects.

[26] The Effect of Features of Objects and Temporal Order Context on Implicit Learning of Spatial Bias

ABSTRACT. The location probability cueing effect refers to the phenomenon whereby spatial attention is implicitly biased toward a specific location, leading to enhanced visual search efficiency when a target frequently appears at that location. This study aimed to investigate whether spatial bias can be implicitly learned when combined with temporal order context and object feature context. To this end, two visual search tasks were presented sequentially, and the location where the target frequently appeared varied depending on the order of the tasks and the features of the target. In both the pre- and post-search tasks, the frequently appearing object features and their corresponding locations were predetermined. Reaction times were then compared under conditions where these high-probability contexts were either valid or invalid. The results showed that reaction times were faster when only one context was valid compared to when both contexts were valid. Between the two contexts, reaction times were faster when the temporal order context was valid than when the feature context was valid. These findings suggest that, during learning, the temporal dimension is more likely than other attributes to be prioritized and associated with the spatial dimension. In visual search, when only one context cue is available, the searcher can implicitly learn spatial bias using that context. However, when multiple contexts must be learned, the cognitive cost required may exceed the benefits to visual search performance, causing a potential bottleneck effect.

[28] Perceptual learning and contour integration in the primate TEO

ABSTRACT. Objective: Previous studies have shown that grouping of contour segments into a unified whole involves bidirectional processes among the early- and intermediate-level visual areas V1, V2 and V4, and that training significantly enhances contour representation in V1. However, it is unknown whether TEO, a higher-tier visual area, is also engaged in the grouping process and whether training can also modify the grouping process in TEO. The current study addressed these two questions for a better understanding of the coordination among hierarchically organized visual areas. Methods: Monkeys were trained to detect contours embedded in a complex background, and TEO neuronal activity was recorded with implanted microelectrode arrays during training. Results: (1) TEO neurons did encode global contours, and the grouping process improved with training. The learning-induced changes in TEO were correlated with the improved behavioral performance. (2) Two encoding modes were mediated by two distinct groups of TEO neurons: one group showed increased firing rates with increasing contour saliency, whereas the other group showed the opposite effect; both groups were insensitive to the orientation of the global contour. (3) After training the animal in the contour detection task, a binary classifier trained to distinguish between neural responses in the presence versus absence of an isolated contour was able to partially decode the responses to the same contours embedded within the background. This suggests that training results in the convergence of neural representations in TEO regardless of the cluttered background. Conclusion: TEO neurons extract visual contour information by suppressing the background elements and enhancing the contour components, and these two complementary processes were enhanced with training. Our results, combined with previous studies in V1, V2 and V4, suggest that the contour integration processes in TEO and earlier visual areas are closely coordinated.

[30] The Impact of 40Hz Light Flicker on Monocular Deprivation Plasticity in Human Adults

ABSTRACT. Animal studies have found that 40 Hz light flicker can induce various changes in the visual system and shows promising results in regulating neural plasticity in mouse models. Based on this, our study aimed to explore the effects of 40 Hz light flicker on visual plasticity in human adults and to elucidate its potential mechanisms. In particular, we used short-term monocular deprivation-induced plasticity, a well-established form of visual plasticity in human adults, as an indicator to probe the effects of 40 Hz flicker on visual plasticity. In Experiment 1, we used the binocular rivalry paradigm to assess the effects of short-term 40 Hz flicker on binocular perception, its impact on short-term monocular deprivation plasticity, and its modulatory effects under different intervention forms (strip light and glasses light). Measurements were taken before the intervention and at 0, 3, 6, 9, and 30 minutes post-intervention. In Experiment 2, we administered flicker at different frequencies (20 Hz, 40 Hz, 80 Hz) to normal adults using the binocular rivalry paradigm to explore the frequency specificity of the 40 Hz effect on short-term monocular deprivation plasticity. Additionally, we used both binocular rivalry and binocular combination paradigms to investigate the task specificity of 40 Hz flicker. In Experiment 3, we employed steady-state visual evoked potentials (SSVEP) to explore the responses of the primary visual cortex before and after monocular deprivation under the 40 Hz flicker intervention. Our results showed that one hour of 40 Hz light flicker did not alter binocular perception in adults, but it stabilized the visual system, such that significant short-term monocular deprivation plasticity could no longer be induced in subjects previously exposed to one hour of 40 Hz flicker. This regulatory effect remained consistent across different intervention methods. Similar outcomes were observed in the SSVEP tests and the behavioral assessments. These findings highlight the potential of 40 Hz flicker as a tool for modulating neural plasticity in human adults.

[32] Exposure-based Learning Improved Orientation Discrimination Under Visual Crowding

ABSTRACT. Visual crowding refers to the impairment of object recognition in the presence of adjacent objects. Perceptual learning reduces crowding in peripheral orientation discrimination, and the learning shows specificity to trained locations. Here we manipulated attention to crowded stimuli to separate the impacts of top-down attention and bottom-up exposure on learning with crowded orientations and its transfer to other locations. Observers reported the target orientation (a circular sinusoidal grating centered at 8° eccentricity, 36°/126°) flanked by two gratings with randomized orientations in pre-test and post-test; the orientation discrimination threshold was tracked by a staircase. Four groups of observers underwent five sessions of training or exposure. Results: (1) Baseline group (N=8): Training improved crowded orientation discrimination, and the reduction of crowding was specific to the trained location. (2) Active-exposure group (N=8): Crowded contrast discrimination training enabled complete learning transfer to crowded orientation discrimination, and the transfer was specific to the exposed location. (3) Passive-exposure group (N=8): Observers responded to a central RSVP task while passively exposed to peripheral crowded gratings. Crowded orientation discrimination was substantially improved, such that continued training produced no further gains, and the reduction of crowding was evident at unexposed locations. (4) Subliminal-exposure group (N=8): A continuous-flash-suppression technique was used to render the exposed crowded gratings subliminal while observers performed a foveal dot-color task. Crowded orientation discrimination was mostly improved, and the improvements partially transferred to unexposed locations. (5) A control group (N=9) ruled out the possibility that the improvements were due to a test-retest effect. The results demonstrate the capacity of the visual system to learn to reduce crowding through repeated exposure to crowded stimuli, providing a form of plasticity complementary to practice-based, attention-driven learning. Releasing spatial attention from crowded stimuli might decrease the location specificity of crowding learning. These findings shed new light on the mechanisms of crowding and learning.

14:00-15:15 Session 15A: Talks - Visual illusions and related phenomena
14:00
Rhythmic TMS over human right parietal cortex strengthens visual size illusions
PRESENTER: Lihong Chen

ABSTRACT. Rhythmic brain activity has been proposed to structure visual processing. Here we investigated the causal contribution of parietal beta oscillations to context-dependent visual size perception, as indexed by the classic Ebbinghaus and Ponzo illusions. On each trial, rhythmic TMS was applied over the left or right superior parietal lobule in a train of five pulses at beta frequency (20 Hz). Immediately after the last pulse of the stimulation train, participants were presented with the illusory configuration and performed a size-matching task. The results revealed that right parietal stimulation significantly increased the magnitudes of both size illusions relative to control vertex stimulation, whereas the illusion effects were unaffected by left parietal stimulation. The findings provide clear evidence for the functional relevance of beta oscillations to the implementation of cognitive performance, supporting a causal contribution of parietal cortex to the processing of visual size illusions, with a right-hemispheric dominance.

14:15
Verification of Hermann grid illusion using machine learning
PRESENTER: Yuto Suzuki

ABSTRACT. Humans experience various types of optical illusions, and multiple approaches are being used to explain their mechanisms; machine learning could be one promising method. In this study, we attempted to clarify the mechanism of the Hermann grid illusion by reproducing the illusion using machine learning. In the evaluation experiment, observers rated the strength of the illusion for 568 Hermann grid illusion images with different grid thicknesses, numbers of intersections, angles, and brightness contrasts on a seven-point scale. We then trained a CNN model on each image and its rated illusion strength, calculated the predicted illusion strength for test images, and determined the prediction accuracy. In addition to the conventional model, we created models incorporating an ON-center receptive-field structure, which is thought to underlie the illusion mechanism, an orientation-selective structure, and a structure combining both. We compared the prediction accuracies of these four types of models in a machine-learning experiment. The evaluation experiment confirmed that the Hermann grid illusion exhibits orientation selectivity. In the machine-learning experiment, the models with visual-system structures achieved more stable prediction accuracy than the conventional model, supporting their validity. The results also suggested that combining the visual-system structures may further increase accuracy. Together, the evaluation and machine-learning experiments suggest that the Hermann grid illusion may involve a mechanism using ON-center receptive fields and orientation selectivity. Further model improvements are needed to fully clarify the mechanism of the Hermann grid illusion.
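
One plausible way to incorporate an ON-center receptive-field structure is to prepend a fixed difference-of-Gaussians convolution to the CNN; this Python sketch is illustrative, with kernel size and sigmas as assumptions rather than the authors' architecture.

```python
# Fixed ON-center (difference-of-Gaussians) front end for a CNN.
import numpy as np
import torch
import torch.nn as nn

def dog_kernel(size=15, sigma_c=1.0, sigma_s=3.0):
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = lambda s: np.exp(-(xx**2 + yy**2) / (2 * s**2)) / (2 * np.pi * s**2)
    k = g(sigma_c) - g(sigma_s)              # ON center, OFF surround
    return torch.tensor(k, dtype=torch.float32)[None, None]

class OnCenterFrontEnd(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 1, kernel_size=15, padding=7, bias=False)
        self.conv.weight.data = dog_kernel()
        self.conv.weight.requires_grad = False   # fixed, not learned
    def forward(self, x):
        return torch.relu(self.conv(x))          # rectified ON responses
```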

14:30
Classical orthonormal polynomials as activation functions for implicit neural representations to preserve high-frequency sharp features

ABSTRACT. Neural networks with ReLU activation functions, although popular choices in many machine learning applications, are strongly biased towards reconstructing low-frequency signals. Higher-frequency representations are essential to manifest sharp features in images and 3D shapes. The current strategy for enhancing high-frequency representations in neural networks is to use sinusoidal activations with increasing frequency. However, such activations introduce periodicity into the network, since sinusoidal functions are periodic, and also produce "ringing artifacts" in image and 3D shape reconstructions due to the Gibbs overshooting phenomenon. Noting that sinusoidal functions with increasing frequencies are only one example of a more general class of functions, the complete orthogonal systems, which can approximate any arbitrary function, the authors explored other "classical" orthogonal systems, namely the Legendre, Hermite, Laguerre and Tschebyscheff polynomials, as activation functions to address the above issues, and compared them against sinusoidal functions and non-orthogonal systems such as power series. In this study, the authors demonstrate how these functions can be used as a neural network layer, compare their convergence rates with different optimizers and increasing polynomial degrees, and assess their accuracy in various applications, including solving differential equations, image reconstruction, 3D shape reconstruction, and learning implicit functions.
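
A minimal sketch of one such layer: a Legendre-polynomial activation computed with the Bonnet recurrence. The degree, initialization, and the tanh squashing of inputs into [-1, 1] are assumptions, not necessarily the paper's design.

```python
# Learnable Legendre-polynomial activation via the Bonnet recurrence.
import torch
import torch.nn as nn

class LegendreActivation(nn.Module):
    def __init__(self, degree=8):
        super().__init__()
        self.degree = degree
        self.coeffs = nn.Parameter(torch.randn(degree + 1) * 0.1)

    def forward(self, x):
        x = torch.tanh(x)                 # map inputs into [-1, 1]
        p_prev, p_curr = torch.ones_like(x), x
        out = self.coeffs[0] * p_prev + self.coeffs[1] * p_curr
        for n in range(1, self.degree):
            # Bonnet recurrence: (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}
            p_next = ((2 * n + 1) * x * p_curr - n * p_prev) / (n + 1)
            out = out + self.coeffs[n + 1] * p_next
            p_prev, p_curr = p_curr, p_next
        return out
```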

14:45
The inhibition of return and the eye fixation patterns for perceiving bistable figures
PRESENTER: Chien-Chung Chen

ABSTRACT. Bistable figures can generate two different percepts and cause observers' perception to reverse spontaneously. Some evidence indicates that visual attention plays an important role in object perception because it helps us selectively focus on certain features within a figure. According to the saliency model proposed by Itti and Koch (2000), visual attention is attracted to locations of high saliency. After attention has stayed at the same location for a while, the local saliency is suppressed, making visual attention shift to a new location; this is called inhibition of return (IOR). Based on this, we assumed that a bistable figure contains several features, and that those features imply different interpretations of the figure. IOR can make our attention shift between regions containing those different features, and thus cause percept reversals. We used an eye-tracker to record observers' eye movements while they viewed the duck/rabbit figure and the Necker cube, and also recorded their percept reversals. In Exp. 1, we found that different fixation patterns appeared under different percepts, and that fixation shifts across regions occurred before percept reversals. This supports the idea that what we perceive depends on where we look. In Exp. 2, we examined the influence of inward bias on the duck/rabbit figure and found that it had a significant effect on the first percept, but this effect diminished over time. In Exp. 3, we added a mask to the attended region to remove the local saliency. This manipulation increased both the number of percept reversals and the number of fixation shifts across regions. That is, a change in local saliency can cause a fixation shift and thus reverse our perception.

15:00
Interoception affects the moving rubber hand illusion
PRESENTER: Hiroshi Ashida

ABSTRACT. The rubber hand illusion (RHI) refers to the phenomenon in which a hand-like object is felt to be one's own hand after synchronous visuo-tactile stimulation of one's own hand and the object. We have suggested that emotional states could affect the RHI, and speculated that interoception might mediate this link (Kaneno & Ashida, 2023). Tsakiris et al. (2011) reported that people with lower interoceptive sensitivity are more susceptible to the RHI, but this remains controversial, with many studies failing to replicate it. In this study, we examined the relationship between interoception and a variant of the RHI induced by active movement of the participant's finger (the "moving RHI", Kalckert & Ehrsson, 2012). The participant's index finger was linked to the rubber-hand finger so that the participant's invisible finger movement was reflected in visible movement of the rubber finger. Similar but asynchronous finger motion was produced by the experimenter as a control. The RHI was quantified as the difference between questionnaire scores under the synchronous and asynchronous conditions. We measured interoception in two ways: interoceptive accuracy (IA) with the conventional heartbeat counting task, and interoceptive sensitivity (IS) with questionnaires on broader aspects of interoception. We found that the moving RHI was stronger for those with higher interoceptive sensitivity, a pattern that was not clear in the classic RHI. This result is apparently opposite to that of Tsakiris et al. (2011), but it is consistent with the finding of Ma et al. (2023) of a stronger out-of-body experience in a virtual avatar for the higher-IA group when active walking was involved. Participants' own movement is considered crucial for linking interoceptive and exteroceptive information in the sense of body ownership. Our results also suggest the need for multiple interoception measures, as one measure alone may not always be reliable.

14:00-15:00 Session 15B: Talks - Working memory: behavior and models
14:00
The Effect of Dynamic Visuospatial Working Memory on Motor Control
PRESENTER: Garry Kong

ABSTRACT. Working memory is thought of as a key pillar of human cognition, supposedly because it acts as a foundation upon which other cognitive abilities can build. Despite this, there is very little definitive proof that any aspect of working memory enables another cognitive function. Here, we demonstrate that visuospatial working memory is bidirectionally linked to fine motor control: fine motor control is impaired when visuospatial working memory is loaded, and performing a fine motor control task impairs visuospatial working memory. We used a dual-task paradigm in which participants viewed a memory stimulus, then moved their finger from one side of a touchscreen monitor to a dot on the other side. On some trials, once the participant began their finger movement, the destination dot was translated vertically, and the participant had to adjust their movement to land on the new location. Once their finger reached the destination, they were asked to recall the memory stimulus. The memory stimulus was either a moving trail of dots (dynamic) or three colored dots (static). When the memory stimulus was dynamic, the time required to adjust to the change in destination was increased compared to when there was either no memory load or a static one. Furthermore, not only did the possibility of needing to adjust the planned finger motion decrease memory accuracy, but actually correcting the movement impaired memory even more. For the static memory stimulus, the time required to adjust to the change in destination was increased compared to no load, but memory accuracy was not affected either by the possibility of needing to adjust the motion or by actually correcting it. We conclude that there is a bidirectional relationship between dynamic visuospatial working memory and fine motor control.

14:15
Second responses in a visual working memory experiment

ABSTRACT. Visual working memory (VWM) allows detailed visual information to be stored on a short time scale. Despite intensive research, there is no consensus on the nature of representations in VWM. According to slot models, memory representation is constrained by a highly limited number of discrete memory units in which element information is stored. This leads to the prediction that memory performance should have high-threshold, all-or-none characteristics. On the other hand, detection-theory-based resource models assume that VWM can store more elements, but precision is limited by noise. On this view, VWM performance should have low-threshold, degrees-of-certainty characteristics. In this study, a two-response n-alternative forced-choice technique, developed originally in psychophysics, was applied to a VWM change detection task. Observers were shown n Gabor elements (1.5 cpd) in randomized orientations for 200 milliseconds. After a blank retention interval (1500 milliseconds), one of the elements changed in orientation. The observer's task was to indicate the changed element. In addition to the first, main response, observers were allowed to make a second choice (the second-best guess). The number of elements was varied (3, 4, 6 and 8). The slot and resource models make different predictions for the accuracy of the second response when the first response is incorrect. According to the slot model, incorrect responses occur when the changed item was not stored in VWM, so second-response performance should be at chance level. The resource model holds that incorrect responses are caused by noise, so second responses should be above chance level. The results show that second-response performance was significantly above chance for all set sizes. However, it was also below what is predicted by a simple, independent Gaussian noise-limited resource model. Nonetheless, more elaborate resource models could explain the results.
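
The diverging second-response predictions can be checked with a toy Monte Carlo simulation; the parameters are illustrative and the models are deliberately minimal, not the authors' formulations.

```python
# Second-guess accuracy, conditional on an incorrect first response, under a
# minimal slot model vs a minimal noisy resource model. Item 0 is the one
# that actually changed; chance for the second guess is 1/(n_items - 1).
import numpy as np
rng = np.random.default_rng(1)

def slot_model(n_items=6, k_slots=3, n_trials=20000):
    correct2 = []
    for _ in range(n_trials):
        if rng.random() < k_slots / n_items:
            continue                        # item stored: first response correct
        if rng.integers(n_items) == 0:
            continue                        # lucky first guess: also correct
        # second guess is uniform over the n_items - 1 remaining alternatives
        correct2.append(rng.integers(n_items - 1) == 0)
    return np.mean(correct2)

def resource_model(n_items=6, d_prime=1.5, n_trials=20000):
    correct2 = []
    for _ in range(n_trials):
        evidence = rng.standard_normal(n_items)
        evidence[0] += d_prime              # changed item carries a signal
        order = np.argsort(evidence)[::-1]  # responses ranked by evidence
        if order[0] == 0:
            continue                        # first response correct
        correct2.append(order[1] == 0)
    return np.mean(correct2)

print(slot_model(), resource_model(), 1 / 5)  # slot ~ chance; resource above
```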

14:30
Decoding the Enigma: Benchmarking Humans and AIs on the Many Facets of Working Memory
PRESENTER: Mengmi Zhang

ABSTRACT. Working memory (WM), a fundamental cognitive process facilitating the temporary storage, integration, manipulation, and retrieval of information, plays a vital role in reasoning and decision-making tasks. Robust benchmark datasets that capture the multifaceted nature of WM are crucial for the effective development and evaluation of AI WM models. Here, we introduce a comprehensive Working Memory (WorM) benchmark dataset for this purpose. WorM comprises 10 tasks and a total of 1 million trials, assessing 4 functionalities, 3 domains, and 11 behavioral and neural characteristics of WM. We jointly trained and tested state-of-the-art recurrent neural networks and transformers on all these tasks. We also include human behavioral benchmarks as an upper bound for comparison. Our results suggest that AI models replicate some characteristics of WM in the brain, most notably primacy and recency effects, and neural clusters and correlates specialized for different domains and functionalities of WM. Our experiments also reveal limitations in existing models' ability to approximate human behavior. This dataset serves as a valuable resource for the cognitive psychology, neuroscience, and AI communities, offering a standardized framework to compare and enhance WM models, investigate WM's neural underpinnings, and develop WM models with human-like capabilities. Our source code and data are available at: https://github.com/ZhangLab-DeepNeuroCogLab/WorM

14:45
Visual Working Memory Load Impairs Detection Sensitivity: A Re-entry Load Account
PRESENTER: Chi Zhang

ABSTRACT. Recent studies have shown that an increased load on visual working memory (VWM) impairs visual detection, indicating that VWM load can influence visual perception. In this study, we investigated whether it is the VWM load itself (the VWM load account) or the volume of re-entry signals generated by VWM representations (the re-entry load account) that impacts visual detection. The re-entry load account posits that the volume of re-entry (feedback) signals, rather than the VWM load itself, modulates detection sensitivity. To explore this question, participants performed a visual search task while simultaneously detecting a peripheral meaningless shape during the maintenance phase of a VWM task, in which they were required to maintain four discs in VWM. A critical aspect of the study was that, in half of the trials, these four discs could form two subjective contours, a configuration in which re-entry signals are particularly strong. The VWM load was expected to be significantly lower in the with-contour condition than in the without-contour condition; conversely, the re-entry load was expected to be significantly higher in the with-contour condition. The VWM load account therefore predicts that detection of the peripheral shape would be less successful in the without-contour condition than in the with-contour condition, whereas the re-entry load account predicts the opposite. Across three experiments, we consistently found evidence supporting the re-entry load account. This suggests that VWM's influence on visual perception is mediated by re-entry signals, rather than being a direct result of the representational load alone.

15:15-16:30 Session 16: Talks - Working memory: neural mechanisms
15:15
Neural mechanisms of feature binding in working memory
PRESENTER: Yang Cao

ABSTRACT. Working memory (WM) is acknowledged as a system capable of manipulating stored information for upcoming goals, albeit with a limited capacity. Binding various features into a unitary entity in WM is crucial for enhancing its capacity to effectively support ongoing cognitive tasks. However, the neural mechanisms governing binding in WM remain unsettled. To gain a comprehensive understanding of the neural mechanism underlying feature binding in WM, we employed a change detection task with color-location conjunctions as stimuli, together with functional magnetic resonance imaging (fMRI). Participants were asked to memorize two types of information: the bindings of color and location (the binding condition), or either the color or the location information alone (the either-memory condition). The neural activity corresponding to the different conditions was modeled through graph-based network analysis, enabling us to construct functional brain networks and conduct a comprehensive whole-brain analysis of the neural activity involved in feature binding. The results identified a collaborative network that operates through a central workspace encompassing the somatomotor area (SMA), insula, and prefrontal cortex (PFC), underpinning the effective processing of bindings. Within these regions, we observed increased local efficiency and stronger connections during binding. Notably, connections within this workspace significantly correlated with condition classification (binding vs. memorizing features separately) and with behavioral performance. Among these regions, the SMA, characterized by a shorter intrinsic timescale, responded more rapidly to visual input, carried rich temporal information with more connections, and potentially served as the starting point of the binding process. These results highlight a dedicated workspace with sufficient and valid internal connections, facilitating successful binding through collaborative regional interactions.
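
A sketch of the kind of graph metric mentioned above: local efficiency computed with networkx from a thresholded functional connectivity matrix. The threshold choice is an illustrative assumption.

```python
# Local efficiency of a functional brain network (threshold is illustrative).
import numpy as np
import networkx as nx

def local_efficiency_from_fc(fc, threshold=0.3):
    """fc: (n_regions, n_regions) functional connectivity matrix."""
    adj = (np.abs(fc) > threshold).astype(int)
    np.fill_diagonal(adj, 0)
    G = nx.from_numpy_array(adj)
    return nx.local_efficiency(G)
```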

15:30
Neural Substrates of Working Memory Maintenance
PRESENTER: Sirui Chen

ABSTRACT. Working memory (WM) serves as a crucial yet limited memory buffer, temporarily storing and flexibly manipulating information in real time. One hypothesis suggests that items are retained in memory through sustained rhythmic activity, particularly involving alpha (8-12 Hz) and theta (4-7 Hz) oscillations. However, using multivariate methods, researchers have successfully tracked memory content based on the topographic distribution of alpha, but not theta, activity, leaving the role of theta oscillations in memory maintenance an enigma. To investigate this, we measured oscillatory brain activity while participants completed a classic WM task that involved memorizing the location of a target shape. They also completed a search task, which served to establish the period of encoding/localizing the target. This setup allowed us to determine when the maintenance period began in the WM task, as it shared the same encoding/localizing period with the search task. We employed an inverted encoding model (IEM) to track the neuronal selectivity corresponding to the attended/memorized location, and examined phase distributions to explore the synchronization of oscillations during memory maintenance. Considering recent evidence linking saccadic eye movements to WM, we also incorporated horizontal electrooculogram (EOG) data into our analysis to gain a more comprehensive understanding of the mechanisms involved in WM maintenance. The results showed that IEM decoding over alpha oscillations tracks memory maintenance and is correlated with theta phase, implying that theta oscillations may serve to control the information maintained in WM. We also found sustained horizontal eye movements during memory maintenance that were independent of neural oscillations. Notably, these identified neural correlates of WM maintenance differentiated high from low performance, underscoring their roles in successful memory maintenance. Overall, we provide evidence for distinct functions of alpha and theta within WM, as well as for the participation of the oculomotor system in WM maintenance.
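
A compact sketch of the inverted encoding model step; the channel basis, channel count, and data layout are assumptions rather than the authors' exact pipeline.

```python
# Inverted encoding model (IEM): estimate channel weights on training
# trials, then invert them on test trials to reconstruct channel responses.
import numpy as np

def make_basis(n_channels=8, n_points=360):
    """Raised-cosine tuning curves tiling 360 deg of location space."""
    centers = np.linspace(0, 360, n_channels, endpoint=False)
    theta = np.arange(n_points)
    d = np.deg2rad(theta[None, :] - centers[:, None])
    return np.maximum(np.cos(d / 2), 0) ** 7        # (channels, points)

def iem(train_data, train_loc, test_data, basis):
    """train_data: (trials, electrodes); train_loc: location in degrees."""
    C_train = basis[:, train_loc.astype(int)].T     # (trials, channels)
    # forward model: data ~= C @ W, so W = pinv(C) @ data
    W = np.linalg.pinv(C_train) @ train_data        # (channels, electrodes)
    return test_data @ np.linalg.pinv(W)            # (trials, channels)
```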

15:45
Linking behavioral and neural estimates of trial-by-trial working memory information content
PRESENTER: Ying Zhou

ABSTRACT. How is working memory (WM) information represented in the brain? Neural and computational models have used data aggregated over hundreds of trials to argue for different perspectives on how population neural activity encodes individual memories. The two main perspectives are information-rich representations, as in probabilistic coding models (a probability distribution over the whole feature space), and information-sparse representations, as in high-threshold models (a precise feature value) or drift models (a value with a confidence interval unrelated to the direction of drift). The use of aggregate data represents a key inferential bottleneck that critically limits the ability to adjudicate between different formats of individual memory coding in WM. This study used a powerful method to link behavioral and neural estimates of WM representation on individual trials. We asked participants (n = 12) to memorize a motion direction over a brief delay. After the delay, instead of making a single report about the memorized direction, they indicated their memory by placing six "bets", yielding a distribution over the 360° direction space that reflected their probabilistic memory representation on individual trials. Additionally, we used a Bayesian decoder to estimate the posterior over the memorized direction given the fMRI signal during memory maintenance on individual trials. Comparing the shapes of the behavioral and neural estimates on individual trials, we found significant correspondences in their means and widths and, critically, a significant correspondence in their asymmetry. These correspondences were found across the visual hierarchy, with meaningful WM representations in occipital, parietal and frontal regions. These results indicate that (1) individual WM representations are complex probability distributions containing more information than can be deduced from aggregate data; and (2) neural WM representations carry rich and complex information, with meaningful asymmetry information influencing behavior.
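
One way the shape of a trial's six-bet distribution might be summarized for comparison with a decoded posterior is sketched below; the circular mean and spread use standard functions, while the asymmetry statistic is a simple stand-in, not the authors' measure.

```python
# Summarize a circular sample by mean, spread, and a crude asymmetry index.
import numpy as np
from scipy.stats import circmean, circstd

def summarize(angles_deg):
    a = np.deg2rad(np.asarray(angles_deg))
    mu, sd = circmean(a), circstd(a)
    dev = np.angle(np.exp(1j * (a - mu)))   # signed deviations, wrapped
    skew = np.mean(dev ** 3) / (np.mean(dev ** 2) ** 1.5 + 1e-12)
    return np.rad2deg(mu), np.rad2deg(sd), skew

bets = [100, 110, 115, 120, 150, 160]       # six bets on one trial (degrees)
print(summarize(bets))                      # compare with posterior samples
```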

16:00
Unveiling the neural dynamic of the interaction between working memory and long-term memory
PRESENTER: Zhehao Huang

ABSTRACT. Recent research has challenged the conventional understanding of fixed working memory (WM) capacity by demonstrating an increase in WM capacity for real-world objects with prolonged encoding time. This suggests a potential interaction between WM and long-term memory (LTM), in which LTM aids WM storage, extending its capacity over time. However, direct evidence supporting this interaction has been lacking. To explore this phenomenon, we conducted a study measuring WM capacity at different encoding times while recording intracranial electroencephalogram (iEEG) signals. Our behavioral results showed that prolonged encoding time correlates positively with enhanced performance. Expanding upon these observations, our iEEG results unveiled nuanced changes in neural oscillatory patterns. Specifically, we observed a concomitant increase in the duration of excitatory high-frequency (60-140 Hz) and inhibitory low-frequency (8-30 Hz) signals as encoding time lengthened. Further dissecting the neural dynamics during the encoding period, we discovered intriguing patterns of temporal representation synchronization and phase-frequency coupling. Correct trials exhibited heightened temporal representation synchronization and stronger phase-frequency coupling compared to incorrect trials, with these differences becoming more pronounced at longer encoding durations. We calculated the Granger causality between the low- and high-frequency signals, revealing that the high-frequency signal predicted the low-frequency signal. Notably, focusing on the hippocampus (a brain region associated with the LTM system), we observed that only contacts in this region showed activity specifically linked to behavioral performance. Overall, our findings indicate that prolonged encoding time induces systematic neural activity linked to high-frequency signals, occurring primarily in the hippocampus, which subsequently enhances WM capacity.
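
A sketch of the high-to-low-frequency Granger test using statsmodels, assuming the band-limited power envelopes are available as time series; this is illustrative, not the authors' pipeline.

```python
# Does the high-frequency envelope Granger-cause the low-frequency one?
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

def granger_high_to_low(low_env, high_env, maxlag=10):
    """Column order matters: the test asks whether the second column
    Granger-causes the first."""
    data = np.column_stack([low_env, high_env])
    res = grangercausalitytests(data, maxlag=maxlag, verbose=False)
    # p-value of the ssr-based F test at each lag
    return {lag: r[0]["ssr_ftest"][1] for lag, r in res.items()}
```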

16:15
Different states of hippocampus during the formation of new memories
PRESENTER: Yuanyuan Zhang

ABSTRACT. Memory plays an important role in supporting various cognitive processes. Although the hippocampus (HPC) has long been deemed a core brain region for the formation of new memories, direct neuronal evidence supporting its involvement in memory formation is still lacking. To investigate how the HPC operates during memory formation, the present study analyzed intracranial electroencephalogram (iEEG) recordings obtained from eighteen neurosurgical patients engaged in a simple working memory task, in which participants had to memorize two fixed orientations (e.g., 45° and 135°) across all trials, involving the repetition of memorizing these two orientations over time (i.e., memory rehearsal). Behaviourally, memory rehearsal of the two fixed orientations significantly improved memory performance. With a multivariate approach, we successfully decoded orientation memory from theta (4-8 Hz) power in the HPC, middle temporal gyrus (MTG) and prefrontal cortex (PFC), but not in the inferior parietal cortex (IP). In the first section (first 90 trials), orientation memory was represented in the neocortex (MTG and PFC) before being detected in the HPC; this pattern reversed in the second section (last 90 trials), where orientation was initially detected in the HPC before being represented in the neocortex. This suggests that the HPC encoded information from the neocortex before forming a long-term store, after which the HPC transitioned to aiding memory retrieval. Moreover, we detected more ripple activity when the HPC became involved in memory retrieval in the second section, and hippocampal gamma coupled to ongoing theta in the first section, suggesting a stable link between gamma and theta oscillations during the formation of new memories. Altogether, these findings provide compelling neuronal evidence for the involvement of the HPC in memory formation and for how it operates.

15:30-16:30 Session 17: Talks - Aesthetics and philosophy of vision
15:30
The Significance of Complexity in the Appreciation of Abstract Artworks and Music
PRESENTER: Rongrong Chen

ABSTRACT. The theory of Taste Typicality suggests that individuals' typical aesthetic tastes exhibit a consistent pattern across different modalities, serving as a crucial factor in understanding the diverse aesthetic experiences among the general population (Chen et al., 2022). Building upon prior research on the role of visual complexity in shaping individuals' visual preferences, here we aim to further investigate the impact of complexity in shaping collective aesthetic preferences across both the visual and auditory domains. To evaluate visual aesthetic appreciation, we instructed 28 undergraduate students (16 males and 12 females) to instinctively select their preferred painting from a pair of Ely Raman's abstract artworks presented simultaneously for a brief duration of 500 ms. A higher selection rate for paintings with higher image complexity would suggest a preference for complexity in visual aesthetics. To evaluate auditory aesthetic appreciation, participants were exposed to Western tonal music for a mere five seconds before engaging in a simple Go/no-go choice reaction task. A longer delay in reaction time on the Go/no-go task was interpreted as a signal of heightened engagement with the preceding music. Our findings revealed a notable inclination towards complexity, as evidenced by a selection rate significantly above chance, particularly for paintings with higher entropy in their image statistics (one-sample t-test: t = 2.49, p = 0.019). Participants also displayed a preference for more intricate musical compositions, as evidenced by a significant delay in reaction time on the Go/no-go task compared to simpler music (p < 0.001). Moreover, individuals who preferred images with a greater range of color and brightness variation also tended to prefer complex music (r = 0.39, p = 0.039). These findings hold promising implications for enhancing various applications by integrating personalized color and music choices that align with users' preferences for complexity.
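
The entropy-based complexity measure implied above might be computed as the Shannon entropy of an image's intensity histogram; this is a plausible reading, not necessarily the authors' exact statistic.

```python
# Shannon entropy of a grayscale image's intensity histogram, in bits.
import numpy as np

def image_entropy(img, bins=256):
    hist, _ = np.histogram(img, bins=bins,
                           range=(float(img.min()), float(img.max())))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())
```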

15:45
Chthulucene visions: the contemporary obsession with seeing, and the denial of the tangible

ABSTRACT. This paper investigates the notion of vision in the age of the Chthulucene, the era that begins with our awareness of environmental crisis, following Donna Haraway's teachings (Haraway, 2016). I inquire into the contemporary reliance on the sense of sight, fostered by the massive adoption of digital devices. I observe how the imposed dependency on digital devices and online connection for socio-economic purposes has drastically rerouted the meaning of sensorial experience (Flusser, 2000). I observe that the apparent freedom in the use of these digital gadgets is a baleful trap, with potential consequences including a reduced scope and nuance of the lived sensorium (Classen, 2012); a reduced sense of emplacement (Howes, 2005); the ties of the current hyper-digital obsession to accelerated reality (Stiegler, 2014); bio-technologies of bodily control and monitoring (Haraway, 1990); and the obsession with communication technology and systems of regulation of human behaviour (Foucault, 1995). Recording first-hand experimental investigations into the perception of various species of time (mechanical, integral, etc.), of space (in terms of human perception), and of the human body (Merleau-Ponty, 2005), this experiment relies on feedback from the author's own body to bring the processing of sense-phenomena into focus. One example of this is an experimental design the researcher calls soundography, in which space is probed and mapped through sound (Pellegrini, 2022). This examination documents an attempt to regain what the body may have lost in the massive adoption of today's technical and digital prosthetics. The researcher's experience over the course of the investigation suggests this methodology could be extended across a broad range of sensory input and applied pedagogically or therapeutically in other fields. Is there an alternative pathway for digital technology and devices to become a positive addition to our freedom of choice, rather than a grid of pre-established options (Zielinski, 2006)?

16:00
A Study on the Relationship between Poster Image Design Techniques and Topics
PRESENTER: Yung-En Chou

ABSTRACT. In Taiwan's design education, many departments encourage students to participate in international design competitions to accumulate design experience, and poster works are considered one of the highest forms of artistic design. The use of poster imagery affects people's understanding of the posters. In the Taiwan International Student Design Competition (TISDC), the topic varies each year, so this study explores the relationship between images and topics. Using content analysis, this study analyzes the image designs of the gold-medal poster works in the visual design category of the TISDC from 2008 to 2023, categorizing them according to Peirce's semiotic typology of signs (icon, index, symbol), which can be regarded as applications of perception and attention and as concrete expressions of people's thoughts and actions. The results show that symbol is the most frequently used type, at 59%, while icon accounts for 27%. Additionally, we find that the images are clearly related to the topics. Through the analysis, we categorized the poster topics into four categories: "Innovation," "Social," "Cultural," and "Environmental." In the "Innovation" topic, all three types of imagery (icon, index, and symbol) were applied. In the "Cultural" topic, icon and symbol were employed. For both the "Environmental" and "Social" topics, only symbol was utilized. Therefore, it is recommended that designers consider the topic when designing poster imagery.

16:15
Modeling color preference based on the distance from the memory colors in a color space
PRESENTER: Songyang Liao

ABSTRACT. The "mere exposure effect" describes how unreinforced exposure increases positive affect toward a novel stimulus. It has been shown to account for preferences for a wide range of stimuli, yet the possibility that it also shapes color preference has rarely been examined. We hypothesized a relationship between color preference and the memory colors of merely exposed objects: colors closer (more similar) to the memory colors in a perceptually uniform color space would be preferred, and vice versa. To test this hypothesis, we first predicted the abstract memory colors of frequently encountered fruits and vegetables and then examined the relationship between color preference and memory color. Our findings partially supported the hypothesis, revealing that mere exposure induced a preference for red, yellow, and green, whereas no such effect was observed for purple and blue. Subsequently, a multiple regression model was developed from this memory-preference relationship, using a color's location in CIELAB color space and its distance to the memory colors of the fruits and vegetables, together with a constant k. This model explained 64% of the variance in our data, and all variables made a significant contribution. We propose that the differential preferences for red, yellow, and green compared with blue and purple could be attributed to ecological adaptations over the course of primate evolution: colors signaling ripeness or nutritional richness, or colors made familiar by prior exposure, are more likely to attract individuals because of their associated benefits. Conversely, blue and purple, which primates did not rely upon, may reflect distinct color-preference strategies.
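As a rough sketch of the model structure described above (assumed variable names and weights; the actual memory colors, regression coefficients, and constant k are the authors'), preference can be modelled as a linear function of a color's CIELAB coordinates and its CIE76 ΔE distance to the nearest memory color:

```python
import numpy as np

# Hypothetical memory colors of familiar fruits/vegetables in CIELAB (L*, a*, b*).
# Values are illustrative, not those measured in the study.
memory_colors = np.array([
    [55.0,  60.0,  45.0],   # e.g., a ripe red fruit
    [80.0,  -5.0,  75.0],   # e.g., a yellow fruit
    [50.0, -45.0,  40.0],   # e.g., a green vegetable
])

def delta_e(lab, refs):
    """Euclidean CIELAB distance (CIE76 Delta E) to the nearest reference color."""
    return np.min(np.linalg.norm(refs - lab, axis=1))

def predicted_preference(lab, weights, k):
    """Linear model: preference ~ w . (L*, a*, b*, Delta E to nearest memory color) + k."""
    features = np.append(lab, delta_e(lab, memory_colors))
    return float(features @ weights) + k

# Illustrative weights: the negative weight on Delta E encodes the hypothesis
# that colors nearer the memory colors are preferred.
w = np.array([0.01, 0.005, 0.005, -0.05])
k = 0.5
print(predicted_preference(np.array([60.0, 50.0, 40.0]), w, k))
```

In a fitted version, the weights and k would come from the multiple regression over the observed preference ratings rather than being set by hand.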

17:00-18:00 Session 18: Keynote 4: Li Zhaoping

Talk delivered via Zoom.

Chair:
17:00
VBC: the V1 Saliency Hypothesis, the Attentional Bottleneck, and the Central-Peripheral Dichotomy

ABSTRACT. The V1 Saliency Hypothesis (V1SH) holds that neural responses in the primary visual cortex (V1) to visual inputs form a bottom-up saliency map of the visual field. V1SH has received convergent experimental support: e.g., V1 activity evoked at a visual location correlates with faster saccades to that location in monkeys (Yan, Zhaoping, Li 2018), and human gaze is strongly attracted to a location whose input has a unique eye of origin, which V1 responses would single out even though it is not perceptually distinctive (Zhaoping 2008). Since the saliency map guides visual attention to center the attentional spotlight on the fovea, V1SH motivates the idea that the attentional bottleneck, which limits the extent of deeper processing of visual information, starts already at V1's output to downstream areas along the visual pathway. Together, V1SH and the bottleneck motivate the central-peripheral dichotomy (CPD) theory, which hypothesizes distinct roles for central and peripheral vision that should be supported by different algorithms and neural architectures (Zhaoping 2019): (1) peripheral vision is mainly for looking (guiding gaze/attentional shifts), whereas central vision is mainly for seeing (recognition); (2) top-down feedback from downstream to upstream regions along the visual pathway should mainly target central vision, aiding seeing by querying upstream areas (e.g., V1) for more information. I will review recent evidence from neural, fMRI, and psychophysical data in support of this V1SH-Bottleneck-CPD (VBC) framework, and will highlight psychophysical findings from experiments that test two predictions of the framework: (1) a novel reversed-depth illusion that is visible only, or more strongly, in peripheral vision; and (2) that this illusion nevertheless becomes visible in central vision when top-down feedback is compromised by backward masking.
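For readers unfamiliar with the V1SH read-out, a caricature may help: in Zhaoping's formulation, the saliency of a location is the maximum V1 response there across feature-tuned populations (a MAX rule), and gaze is drawn to the peak of the resulting map. The toy sketch below illustrates only that read-out, not V1 circuitry; the response arrays are random placeholders.

```python
import numpy as np

# Toy V1 responses: shape (n_features, H, W), one map per feature-tuned
# population (e.g., orientations, eye of origin). Values are arbitrary.
rng = np.random.default_rng(0)
responses = rng.random((4, 16, 16))
responses[2, 8, 8] = 3.0  # one location drives one feature channel strongly

# MAX rule: saliency at each location is the maximum response there across
# all feature channels, regardless of which feature produced it.
saliency = responses.max(axis=0)

# The attentional spotlight (and the next saccade) is drawn to the peak.
peak = np.unravel_index(saliency.argmax(), saliency.shape)
print("most salient location:", peak)
```

The MAX over channels, rather than a sum, is what lets a single distinctive feature (such as a unique eye of origin) capture gaze even when it is not perceptually distinctive.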