Spectral energy (color-coded or as a family of curves; for the latter, each trace represents the amplitude spectrum for one phase bin) as a function of original envelope phase and sound frequency. For original speech snippets (A), spectral energy is strongly biased toward a certain phase of the original envelope (phase 0, i.e., the peak). This spectral “imbalance” can trivially entrain even the earliest levels of auditory processing (e.g., the cochlea). We corrected for this bias by injecting specifically constructed noise that counterbalanced the spectral differences between envelope phases. Results are shown in B, suggesting that we were able to compensate for these spectral inequalities. Reproduced from Zoefel and VanRullen (2015).
... Phase entrainment of neural oscillations, the brain's adjustment to rhythmic stimulation, is a central component in recent theories of speech comprehension: the alignment between brain oscillations and speech sound improves speech intelligibility. However, phase entrainment to everyday speech sound could also be explained by oscillations passively following the low-level periodicities (e.g., in sound amplitude and spectral content) of auditory stimulation—and not by an adjustment to the speech rhythm per se. Recently, using novel speech/noise mixture stimuli, we have shown that behavioral performance can entrain to speech sound even when high-level features (including phonetic information) are not accompanied by fluctuations in sound amplitude and spectral content. In the present study, we report that neural phase entrainment might underlie our behavioral findings. We observed phase-locking between electroencephalogram (EEG) and speech sound in response not only to original (unprocessed) speech but also to our constructed “high-level” speech/noise mixture stimuli. Phase entrainment to original speech and speech/noise sound did not differ in the degree of entrainment, but rather in the actual phase difference between EEG signal and sound. Phase entrainment was not abolished when speech/noise stimuli were presented in reverse (which disrupts semantic processing), indicating that acoustic (rather than linguistic) high-level features play a major role in the observed neural entrainment. Our results provide further evidence for phase entrainment as a potential mechanism underlying speech processing and segmentation, and for the involvement of high-level processes in the adjustment to the rhythm of speech.... Top panels: Cross-correlation between original signal envelope and EEG signal (both unfiltered) for all channels (black lines) and the standard deviation across channels (blue line). 
Only the original condition shows a peak in standard deviation at ~110 ms (the time lag between speech and EEG), indicating entrainment to low-level cues. A later peak (~190 ms) can be seen in all conditions, and an earlier, slightly weaker peak (~50 ms) is most evident in the original and constructed reversed conditions. Both peaks indicate entrainment to acoustic high-level cues. The insets show the topographical distribution of cross-correlation at the two most pronounced peaks in the original condition (110 ms and 190 ms). Bottom panels: Significance values of the time–frequency transform of the cross-correlation functions (averaged across channels). Note that the 110-ms (low-level) component of cross-correlation involves significant correlations at higher frequencies, including the gamma range, whereas the other (high-level) components entail correlations restricted to the theta range. The FDR-corrected significance threshold (alpha < 0.05) is shown as a red line on the colorbar.
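The lag-by-lag cross-correlation measure described above can be sketched as follows. This is a minimal illustration assuming NumPy, not the authors' analysis code; the function name `xcorr_lags` and the synthetic 110-ms delay are ours:

```python
import numpy as np

def xcorr_lags(env, eeg, fs, max_lag_s=0.3):
    """Cross-correlate a stimulus envelope with each EEG channel.

    env: (n_samples,) speech amplitude envelope
    eeg: (n_channels, n_samples) EEG traces
    Returns lags in seconds and an (n_channels, n_lags) correlation matrix.
    """
    max_lag = int(max_lag_s * fs)
    lags = np.arange(-max_lag, max_lag + 1)
    env = (env - env.mean()) / env.std()
    out = np.empty((eeg.shape[0], lags.size))
    for c, ch in enumerate(eeg):
        ch = (ch - ch.mean()) / ch.std()
        for i, lag in enumerate(lags):
            if lag >= 0:
                a, b = env[:env.size - lag], ch[lag:]
            else:
                a, b = env[-lag:], ch[:ch.size + lag]
            out[c, i] = (a * b).mean()
    return lags / fs, out

# Demo: three EEG channels that follow the envelope with a 110-ms delay.
fs = 250
rng = np.random.default_rng(0)
env = np.convolve(rng.standard_normal(2500), np.ones(25) / 25, mode="same")
delay = int(0.110 * fs)
eeg = np.stack([np.roll(env, delay) + 0.05 * rng.standard_normal(env.size)
                for _ in range(3)])
lags, cc = xcorr_lags(env, eeg, fs)
spread = cc.std(axis=0)                        # std across channels
peak_lag = lags[np.argmax(cc.mean(axis=0))]    # ~0.11 s
```

The peak of the averaged cross-correlation recovers the imposed speech-to-EEG lag, analogous to the ~110-ms component in the original condition.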
Contributors: Michael Hauck, Susanne Metzner, Fiona Rohlffs, Jürgen Lorenz, Andreas K. Engel
Pain-induced oscillations after laser stimulation. The plots to the left show time-frequency representations of the pain-induced responses averaged across the central (A), temporal (B), and parietal (C) clusters of sensors. Note that different frequencies are displayed in each plot. Responses are computed as percentage changes in signal amplitude relative to the baseline (500ms before laser onset). Panels to the right show the topographic distribution of the response components. Three oscillatory response components were observed after laser stimulation. An increase in the delta band (maximum of 3Hz at approximately 300ms poststimulus) was most prominent at the central and parietal sensors. This component was accompanied by an increase in gamma power (maximum of 75Hz at approximately 350ms) with maximal strength at central sensors. Furthermore, a sustained decrease in the alpha and beta bands (maximum of 28Hz at approximately 1100ms) appeared at the bilateral temporal sensors.
... Poststimulus time courses of grand mean responses within the gamma (A), beta (B), and delta (C) bands during different music conditions, expressed as percent change in signal power relative to the prestimulus baseline (−500 to 0ms). Significantly lower gamma-band activity (70 to 80Hz) appeared during self-composed healing music compared with self-composed pain music (A). Gamma-band activity did not differ between the no-sound and preferred-music conditions. No significant effects were observed in the beta band (24 to 34Hz) (B). Delta power (2 to 6Hz) was significantly lower during preferred music compared with the no-sound condition around 300ms, whereas no differences between self-composed pain and healing music were observed in this frequency band (C).
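The baseline normalization used in these plots reduces to a one-line formula; here is a hedged sketch (NumPy assumed; `percent_change` is our illustrative name, not the authors' code):

```python
import numpy as np

def percent_change(power, times, baseline=(-0.5, 0.0)):
    """Express band power as percent change relative to the mean power
    in a prestimulus baseline window (here -500 to 0 ms)."""
    mask = (times >= baseline[0]) & (times < baseline[1])
    base = power[..., mask].mean(axis=-1, keepdims=True)
    return 100.0 * (power - base) / base

# Demo: band power that doubles at stimulus onset -> +100% poststimulus.
times = np.arange(-0.5, 1.5, 0.01)
power = np.where(times < 0, 1.0, 2.0)
pc = percent_change(power, times)
```

The same function applies unchanged to a (channels × times) power array because the baseline mean is taken over the last axis.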
Contributors: Benjamin D. Charlton, Roland Frey, Allan J. McKinnon, Guido Fritsch, W. Tecumseh Fitch, David Reby
During the breeding season, male koalas produce ‘bellow’ vocalisations that are characterised by a continuous series of inhalation and exhalation sections, and an extremely low fundamental frequency (the main acoustic correlate of perceived pitch). Remarkably, the fundamental frequency (F0) of bellow inhalation sections averages 27.1 Hz (range: 9.8–61.5 Hz), which is 20 times lower than would be expected for an animal weighing 8 kg and more typical of an animal the size of an elephant (Supplemental figure S1A). Here, we demonstrate that koalas use a novel vocal organ to produce their unusually low-pitched mating calls.... A video sequence showing the VVFs oscillating periodically at frequencies ranging from 10–45 Hz. The placement of the video camera just below the larynx allowed us to visualize the VVFs through the space between the arytenoid cartilages and visually document their role in sound production. The F0 for each phonation event is displayed in the top left-hand corner of the screen. A total of eight phonation events are shown in which F0 increases from 10 to 45 Hz in 5 Hz increments.
Contributors: Gerold Baier, Thomas Hermann, Ulrich Stephani
Waveform (top) and spectrogram (bottom) of one audio channel of Sound 3: background and absence seizure in patient 2, channels T4 (left) and F4 (right) in stereo panning starting from second 20 in Fig. 3, top. Oscillator frequency is 200Hz for channel T4, and 300Hz for channel F4.
... Sound 3: Background and absence seizure in patient 2, channels T4 (left) and F4 (right) in stereo panning starting from second 20 in Fig. 3. Oscillator frequency is 200Hz for channel T4, and 300Hz for channel F4. Waveform and spectrogram in Fig. 4, top and bottom.
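A minimal sketch of this kind of two-channel EEG sonification, assuming NumPy: each channel amplitude-modulates one sine oscillator, panned hard left/right as in Sound 3 (T4 at 200 Hz left, F4 at 300 Hz right). This is our illustration, not the authors' synthesis code, and the rectified-envelope mapping is a simplification:

```python
import numpy as np

def sonify_two_channels(ch_left, ch_right, eeg_fs, audio_fs=8000,
                        f_left=200.0, f_right=300.0):
    """Each EEG channel amplitude-modulates one sine oscillator; the two
    oscillators are panned hard left/right."""
    n_audio = int(ch_left.size * audio_fs / eeg_fs)
    t = np.arange(n_audio) / audio_fs

    def envelope(ch):
        # rectify, normalize, and upsample the EEG trace to audio rate
        env = np.abs(ch) / (np.abs(ch).max() + 1e-12)
        src = np.arange(ch.size) / eeg_fs
        return np.interp(t, src, env)

    left = envelope(ch_left) * np.sin(2 * np.pi * f_left * t)
    right = envelope(ch_right) * np.sin(2 * np.pi * f_right * t)
    return np.stack([left, right], axis=1)  # (n_samples, 2) stereo buffer

# Demo: 1 s of a synthetic 3 Hz spike-wave-like trace on both channels.
eeg_fs = 250
x = np.sin(2 * np.pi * 3 * np.arange(250) / eeg_fs)
stereo = sonify_two_channels(x, -x, eeg_fs)
```

The returned buffer can be written to a stereo WAV file with any audio library; the distinct carrier frequencies keep the two channels audibly separable even without panning.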
Contributors: Marianne Latinus, Phil McAleer, Patricia E.G. Bestelmeyer, Pascal Belin
Distance-to-Mean in Voice Space
(A) Stimuli from Experiment 1 (32 natural voices per gender) are represented as individual points in the three-dimensional space defined by their average log(f0), log(FD), and HNR, Z scored by gender (resulting in overlapping male and female stimulus clouds). Red discs represent female voices; blue discs represent male voices. The prototypical voices generated by averaging together all same-gender stimuli are located on top of the stimulus cloud (triangles) owing to their high HNR value. Distance-to-mean = √(d_f0² + d_HNR² + d_FD²).
(B) Voice averaging in Experiment 1. Spectrograms of example voice stimuli (top row) represent male speakers uttering the syllable “had.” Black circles indicate manually identified time-frequency landmarks put in correspondence across stimuli during averaging, corresponding to the frequencies of the first three formants at onset of phonation (left side), at onset of formant transition, and at offset of phonation (right side). A prototypical voice (bottom) is generated by morphing together stimuli from 32 different speakers. Note the smooth texture caused by averaging, resulting in high HNR values.
(C) Histograms of distance-to-mean distributions for the voice stimuli of Experiment 1 (gray) and Experiment 2 (black); both distributions peak at intermediate values of distance-to-mean.
(D) Scatterplot of distance-to-mean versus distinctiveness ratings (Z scored) for the 126 stimuli of Experiment 1. Distance-to-mean explains over half of the variance in distinctiveness ratings (R2 = 0.53): voices with greater distance-to-mean are judged to be more distinctive. See also Figure S2 for correlation coefficients in other spaces.
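The distance-to-mean measure in (A) is a Euclidean distance in a Z-scored feature space; a sketch follows (NumPy assumed; the function name and the synthetic feature ranges are ours, not values from the stimuli):

```python
import numpy as np

def distance_to_mean(f0, fd, hnr):
    """Each voice's Euclidean distance to the same-gender mean in the
    space of log(f0), log(FD), and HNR, Z scored per feature (after
    Z scoring, the mean voice sits at the origin)."""
    feats = np.column_stack([np.log(f0), np.log(fd), hnr])
    z = (feats - feats.mean(axis=0)) / feats.std(axis=0)
    return np.sqrt((z ** 2).sum(axis=1))  # sqrt(d_f0^2 + d_FD^2 + d_HNR^2)

# Demo: 32 synthetic "male voices" (feature ranges are only illustrative).
rng = np.random.default_rng(1)
d = distance_to_mean(rng.uniform(90, 140, 32),    # mean f0 (Hz)
                     rng.uniform(800, 1200, 32),  # formant dispersion (Hz)
                     rng.uniform(5, 25, 32))      # HNR (dB)
```

Z scoring per feature, as in the figure, makes the three acoustically heterogeneous dimensions (Hz, Hz, dB) commensurable before the distance is taken.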
... Acoustical Dimensions of Voices
(A) During voice production, the vocal folds in the larynx oscillate periodically, generating a buzzing sound with a fundamental frequency (f0) and a highly harmonic structure. Acoustical filtering by the vocal tract airways—nasal cavity (a) and mouth cavity (b)—above the larynx modifies this buzzing sound, resulting in regions of enhanced energy in the spectrum called formants.
(B) Spectrogram of the syllable “had” spoken by an adult female speaker. Color scale indicates power (dB). Note the vertical stripes corresponding to the harmonics (integer multiples of f0) and the bands corresponding to the formants (F1–F3).
(C) Stimulus power spectrum.
(D and E) Stimulus amplitude waveform. See also Figure S1 and Table S1 for more information on the acoustical parameters measured in the different studies.
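A toy source-filter synthesis can make panel (A) concrete (our sketch, NumPy assumed; the formant frequencies are illustrative textbook values for an /ae/-like vowel, not measurements from the stimuli):

```python
import numpy as np

def source_filter_vowel(f0=220.0, formants=(850.0, 1220.0, 2810.0),
                        dur=0.5, fs=16000):
    """Glottal 'buzz' = harmonics at integer multiples of f0 with a 1/k
    roll-off; the vocal tract boosts energy near the formants (modeled
    here as Gaussian spectral peaks of 120 Hz width)."""
    t = np.arange(int(dur * fs)) / fs
    wave = np.zeros_like(t)
    n_harm = int((fs / 2) // f0)
    for k in range(1, n_harm + 1):
        fk = k * f0
        gain = sum(np.exp(-0.5 * ((fk - F) / 120.0) ** 2) for F in formants)
        wave += gain * np.sin(2 * np.pi * fk * t) / k
    return wave / np.abs(wave).max()

vowel = source_filter_vowel()
```

A spectrogram of `vowel` shows exactly the structure described in (B): harmonic stripes at integer multiples of f0, with energy concentrated in bands around the formant frequencies.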
... Dur, duration (ms); f0, fundamental frequency (Hz); Std(f0), SD of f0 (Hz); Int, intonation measured as the difference in f0 between the first and last voiced frames (Hz); F1–F4, frequencies of the first four formants (Hz); FD, formant dispersion (Hz); H1–H2, power difference between the first two harmonics (dB); HNR, harmonic-to-noise ratio (dB); Jitter, 100 times jitter value (%); alpha, alpha ratio; Loud, loudness (dB).
Contributors: Masaki Ieda, Ji-Dong Fu, Paul Delgado-Olguin, Vasanth Vedantham, Yohei Hayashi, Benoit G. Bruneau, Deepak Srivastava
Induced Cardiomyocytes Exhibit Spontaneous Ca2+ Flux, Electrical Activity, and Beating
(A and B) Cardiac fibroblast (CF)-derived iCMs showed spontaneous Ca2+ oscillation with varying frequency (A), similar to neonatal cardiomyocytes (B). Rhod-3 intensity traces are shown.
(C) Tail-tip dermal fibroblast (TTF)-derived iCMs showed spontaneous Ca2+ oscillation with lower frequency. The Rhod-3 intensity trace is shown.
(D) Spontaneous Ca2+ waves observed in CF-derived α-MHC-GFP+ iCMs (white dots) or neonatal cardiomyocytes (arrows), imaged with Rhod-3 at Ca2+ max and min, are shown. Fluorescent images correspond to Movie S1.
(E) Spontaneous Ca2+ oscillation observed in TTF-derived α-MHC-GFP+ iCMs, imaged with Rhod-3 at Ca2+ max and min, is shown. Fluorescent images correspond to Movie S2.
(F) Spontaneously contracting iCMs exhibited electrical activity, measured with single-cell extracellular electrodes. Neonatal cardiomyocytes showed similar electrical activity.
(G) Intracellular electrical recording of CF-derived iCMs cultured for 10 weeks displayed action potentials that resembled those of adult mouse ventricular cardiomyocytes. Representative data are shown in each panel (n = 10 in A–F, n = 4 in G). See also Figure S5 and Movies S1, S2, S3 and S4.
... Movie S2. Spontaneous Ca2+ Oscillations Were Observed in the TTF-Derived iCMs, Related to Figure 6E... Movie S1. Spontaneous Ca2+ Oscillations Were Observed in the CF-Derived iCMs, Related to Figure 6D
Contributors: Christopher I. Petkov, Mitchell L. Sutter
Conditions for eliciting auditory perceptual restoration and analogous visual phenomena. (A) On the left is a drawing of a cat illustrating an intact sensory representation. In the middle, we simulate sensory degradation whose source is unclear. Finally, on the right, the blank areas are filled in, identifying the source of the sensory degradation. With this information, the visual system can segregate the objects and restore a more global perceptual representation of the ‘cat’. (B) Schematized spectrograms (time-frequency plots) of different auditory stimuli and how they relate to masking and perceptual restoration (continuity). Note how the ‘gapped’ monkey vocalization with interrupting noise in the gap (the condition that evokes fill-in) is similar to the visual picket fence, except that the missing segment occurs in time.
... Neural correlates of auditory continuity in human auditory cortex. (A) fMRI group results rendered on an unfolded average cortex representation (adapted from Riecke et al., 2007). Left: The middle superior-temporal gyrus and Heschl’s gyrus (HG) in the right auditory cortex (AC) appear to play a role in the masking of gaps in tones and in the hearing of continuity illusions of these tones, respectively. Hemodynamic activity in these regions was related to the masking strength (dark blue) or the actual illusion strength (light green). Right: The illusion-related region (green outline) on HG (dotted line) was situated in a medial portion of a mirror-symmetric map of frequency sensitivity, suggesting that the underlying primary auditory cortex processes are involved in perceptual restoration. (B) Distributed EEG source-modeling group results obtained with the same behavioral paradigm, rendered on a folded average cortex representation (adapted from Riecke et al., 2009). Left: Theta oscillations in central portions of the right AC appear to be relevant for hearing continuity illusions (see shading near cross). Right: Theta power in these regions (see cross on the left) was suppressed during and around the masked gap interval when listeners reported continuity illusions, compared to true discontinuity of the same ambiguous stimulus.
Contributors: Eric Vatikiotis-Bateson, Adriano Vilela Barbosa, Catherine T. Best
Auto-correlation of tongue dorsum (TR) for talker S1 saying cop–top (Trial 4). Two primary bands of high correlation are shown for zero offset and the offset corresponding to the repetition period (approximately 700ms). Emergence of the high correlation band at twice the frequency (white oval) indicates persistent co-production of the /k/ and /t/ for about 10s.
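The band of high correlation at the repetition period is exactly what a plain autocorrelation of a repeating movement trace shows; a minimal sketch (NumPy assumed; the 0.7-s sinusoid stands in for the articulator trace):

```python
import numpy as np

def autocorr(x):
    """Normalized autocorrelation at non-negative lags."""
    x = x - x.mean()
    full = np.correlate(x, x, mode="full")
    ac = full[x.size - 1:]
    return ac / ac[0]

# Demo: a trace repeating every 0.7 s, like the repetition period above.
fs = 100
t = np.arange(0, 10, 1 / fs)
trace = np.sin(2 * np.pi * t / 0.7)
ac = autocorr(trace)
# the first peak away from zero lag sits at the repetition period
lag = np.argmax(ac[int(0.3 * fs):int(1.0 * fs)]) + int(0.3 * fs)
period = lag / fs   # ~0.7 s
```

The peak at zero lag and the peak at the repetition period correspond to the two primary bands of high correlation in the figure.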
... Instantaneous correlation results for two identical, but phase-shifted sine waves (1st panel) and two non-linearly related sine waves (3rd panel). In both cases, correlation, ρ(t), oscillates between −1 and +1 (2nd and 4th panels).
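The oscillation of ρ(t) between −1 and +1 for phase-shifted sinusoids can be reproduced with a sliding-window Pearson correlation (our simplified stand-in for the instantaneous-correlation measure used in the figure; NumPy assumed):

```python
import numpy as np

def instantaneous_correlation(x, y, win):
    """Sliding-window Pearson correlation rho(t) over a centered window
    of `win` samples; NaN where the window does not fit."""
    half = win // 2
    rho = np.full(x.size, np.nan)
    for i in range(half, x.size - half):
        rho[i] = np.corrcoef(x[i - half:i + half + 1],
                             y[i - half:i + half + 1])[0, 1]
    return rho

# Demo: two identical 2 Hz sine waves, one phase-shifted by 90 degrees.
fs = 200
t = np.arange(0, 4, 1 / fs)
x = np.sin(2 * np.pi * 2 * t)
y = np.sin(2 * np.pi * 2 * t + np.pi / 2)
rho = instantaneous_correlation(x, y, win=11)
valid = rho[~np.isnan(rho)]   # rho(t) sweeps between about -1 and +1
```

Because the window is short relative to the period, the two signals look locally linear with a sign that flips along the cycle, so ρ(t) itself oscillates between the extremes, as in the 2nd and 4th panels.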