Expectations about dynamic visual objects facilitates early sensory processing of congruent sounds

Published: 13 August 2021| Version 4 | DOI: 10.17632/k3j772tmwk.4

Description

In everyday life, the perception of a moving object can lead to the expectation of an object’s sound, yet little is known about how visual expectations influence early auditory processing. We examined how dynamic visual input – an object moving continuously across the visual field – influences early auditory processing of a sound that is either congruent with the object’s motion, and thus likely perceived as being part of the visual object, or incongruent with the object’s motion. In Experiment 1, EEG activity was recorded from 29 adults who passively viewed a ball that appeared either on the far left or right boundary of a display and continuously traversed along the horizontal midline to make contact and elicit a bounce sound off the opposite boundary. Our main analysis focused on the auditory-evoked event-related potential. For audio-visual (AV) trials, a knocking sound accompanied the visual input the moment the ball made contact with the opposite boundary (AV-synchronous), or the sound occurred shortly before contact (AV-asynchronous). We also included audio-only and visual-only trials. For Experiment 1, AV-synchronous sounds elicited an earlier and attenuated auditory response relative to AV-asynchronous or audio-only events. Experiment 2 was conducted to examine the roles of expectancy and multisensory integration in influencing this early auditory response. In addition to the audio-only, AV-synchronous, and AV-asynchronous conditions, 19 adults were shown a ball that became visually occluded prior to reaching the boundary of the display, but elicited an expected knocking sound at the point of occluded collision. Here, the auditory response during the AV-occluded condition resembled the AV-synchronous condition, suggesting that expectations induced by a moving object can influence early auditory processing. 
Broadly, the results suggest that dynamic visual stimuli can help generate expectations about the timing of auditory events, which in turn facilitate the processing of auditory information that matches these expectations.
Each version of this online data repository reflects revisions made to the manuscript after peer review. The attached EEG/ERP data (.set/.fdt files) can be found in the "Segmented ERP Data" folder. They were processed in MATLAB with EEGLAB/ERPLAB software, using the scripts in the "EEG ERP Scripts (MATLAB)" folder and the specifications outlined under "Steps to reproduce". N1 and P2 peak amplitude and latency statistics and other relevant data sets are in the "Data Sets (Long Format)" folder. All statistical analyses were conducted in R, and the scripts used for these analyses can be found in the "R Scripts" folder. The E-prime scripts and the stimuli used for each experiment can be found in the "Experiment Stimuli and Scripts" folder. Raw EEG data sets are also available for download via a separate online data repository (osf.io/d245g/).

Steps to reproduce

Movie stimuli were borrowed from the labs of Dr. Dima Amso and Dr. David Lewkowicz. The stimuli were made using Adobe After Effects. A full description of these stimuli can be found in Werchan et al., 2018 (see link below for reference).
Continuous EEG was recorded via a 128-channel HydroCel Geodesic Sensor Net (Electrical Geodesics, Inc.; EGI). Impedances were kept below 50 kOhms in all electrodes, and the raw EEG data were referenced online to the vertex (Cz) and digitized at 500 Hz. EEG data were amplified according to the default settings of an EGI internal amplifier (model type: Net Amps 300).
All data were processed off-line using MATLAB and EEGLAB/ERPLAB software (Delorme & Makeig, 2004; Lopez-Calderon & Luck, 2014). The raw EEG data were first digitally filtered using a 0.05 to 50 Hz bandpass (Butterworth) filter and a 60 Hz notch filter. Data were then manually inspected for individual channels that were bad for at least 50% of the recording, as well as for electromyographic (EMG) and other movement artifacts. EEG data with evidence of egregious EMG or other movement artifacts were rejected from the analysis. Data from bad channels were replaced using a spherical spline interpolation algorithm. The cleaned EEG data were then submitted to an independent component analysis (ICA), and components reflecting eye artifacts (eye blinks and saccades) were removed from the data set. We also noticed ICA components that resembled high-frequency harmonics and opted to remove these components.
The EEG data were then segmented into 1000 ms epochs (-200 to 800 ms relative to stimulus onset) and baseline corrected using the mean voltage during the 200 ms pre-stimulus period. ERPs were time-locked to the onset of the sound in all conditions except the visual-only condition, in which the ERPs were time-locked to the exact moment the ball touched the boundary. Each segmented data set was again manually inspected for excessive artifacts.
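The segmentation and baseline-correction step can be illustrated with a minimal Python sketch. This is not the repository's MATLAB/EEGLAB code; the function name `epoch_and_baseline`, its parameters, and the toy signal are hypothetical. The sketch assumes a single channel stored as a list of voltages sampled at 500 Hz, so a -200 to 800 ms epoch spans 500 samples, of which the first 100 form the baseline.

```python
def epoch_and_baseline(signal, onset_sample, srate_hz=500, tmin_ms=-200, tmax_ms=800):
    """Cut one epoch around a stimulus onset and subtract the pre-stimulus mean.

    signal       -- voltages for one channel (hypothetical input format)
    onset_sample -- sample index of the time-locking event (e.g. sound onset)
    """
    n_pre = -tmin_ms * srate_hz // 1000   # samples before onset (100 at 500 Hz)
    n_post = tmax_ms * srate_hz // 1000   # samples from onset onward (400 at 500 Hz)
    start = onset_sample - n_pre
    stop = onset_sample + n_post
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    epoch = signal[start:stop]
    # Baseline correction: remove the mean voltage of the pre-stimulus period.
    baseline = sum(epoch[:n_pre]) / n_pre
    return [v - baseline for v in epoch]


# Toy usage: a flat 2.0 signal with one deflection 100 ms after onset.
toy_signal = [2.0] * 1000
toy_signal[350] = 12.0
toy_epoch = epoch_and_baseline(toy_signal, onset_sample=300)
print(len(toy_epoch))  # 500 samples = 1000 ms at 500 Hz
```

In the visual-only condition the same function would be called with the sample at which the ball touched the boundary as `onset_sample`, since no sound onset is available there.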
Once artifact rejection was completed, the EEG data were filtered again, this time using a 30 Hz lowpass (Butterworth) filter, and then re-referenced to an average reference. Averaged ERPs were then obtained for each participant by averaging all available epochs for each condition. The N1 was operationalized as the most negative peak (amplitude and latency) occurring 100-200 ms after sound onset. The P2 was operationalized as the most positive peak (amplitude and latency) occurring 200-300 ms after sound onset. Both the time windows and the region of interest were selected based on our hypotheses about the timing of each ERP component (Stekelenburg & Vroomen, 2007; Vroomen & Stekelenburg, 2010) and on visual inspection of the grand-averaged ERP across all participants and conditions. A six-channel fronto-central auditory region was constructed to evaluate differences between the sensory conditions.
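The N1 and P2 measurements described above can be sketched as follows. This is an illustrative Python example, not the ERPLAB routine used for the published analyses. The helper `peak_in_window` is hypothetical and simply takes the most extreme sample in a latency window, whereas ERPLAB's peak measures offer further options (such as local-peak criteria) that are not modelled here. The window bounds are treated as inclusive, which is also an assumption.

```python
def peak_in_window(erp, times_ms, window_ms, polarity):
    """Return (amplitude, latency_ms) of the extreme value inside a latency window.

    erp       -- averaged voltages for one participant, condition, and region
    times_ms  -- latency of each sample relative to sound onset
    window_ms -- (start, end) of the search window, inclusive
    polarity  -- -1 for a negative component (N1), +1 for a positive one (P2)
    """
    lo, hi = window_ms
    candidates = [(v, t) for v, t in zip(erp, times_ms) if lo <= t <= hi]
    if not candidates:
        raise ValueError("no samples fall inside the window")
    pick = min if polarity < 0 else max
    amplitude, latency = pick(candidates, key=lambda pair: pair[0])
    return amplitude, latency


# Toy usage: a 500 Hz epoch from -200 to 798 ms with two artificial deflections.
times = [-200 + 2 * i for i in range(500)]
erp = [0.0] * 500
erp[times.index(150)] = -4.0   # N1-like negativity
erp[times.index(240)] = 3.0    # P2-like positivity
n1 = peak_in_window(erp, times, (100, 200), polarity=-1)
p2 = peak_in_window(erp, times, (200, 300), polarity=+1)
print(n1, p2)  # (-4.0, 150) (3.0, 240)
```

In practice `erp` would be the participant's averaged waveform over the six-channel fronto-central region, so that one amplitude and one latency value per component, condition, and participant enter the statistical analysis.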