A solution to the mystery of the sub-harmonic series via a linear model of the cochlea
In this paper, we study a simple linear model of the cochlea as a set of vibrating strings. We hypothesize that the information sent to the auditory cortex is the energy stored in the strings, and we consider all oscillation modes of the strings. We show the emergence of the sub-harmonic series, whose existence was hypothesized in the 16th century to explain the consonance of the minor chord. We additionally show how the nonlinearity of the energy can be used to study the emergence of the combination tone (Tartini's third sound), shedding new light on this long-debated subject.
💡 Research Summary
The paper presents a novel explanation for two long‑standing musical‑acoustic phenomena, the sub‑harmonic (undertone) series and the combination tone (Tartini's third sound), by modeling the cochlea as a set of non‑interacting, linearly damped strings whose stored mechanical energy is assumed to be the signal transmitted to the auditory cortex.
First, the authors describe the cochlear basilar membrane as a continuum of tensioned strings whose natural frequencies vary continuously from about 20 kHz at the base to about 20 Hz at the apex. Each string obeys a linear damped wave equation, and the model deliberately ignores traveling‑wave dynamics, focusing instead on the steady‑state response to periodic inputs. The central hypothesis, denoted (P), is that for each string at position x, the quantity E(x,t), the sum of the kinetic and potential energy of that string, is the neural information sent to the cortex. Because energy scales with the square of the displacement, E is inherently quadratic in the acoustic amplitude, which introduces a non‑linear element even though the underlying mechanical system is linear.
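The quadratic scaling of the energy can be checked on a toy case. The sketch below is not the authors' model: it replaces one string by a single damped, driven oscillator (one mode only) and uses the time‑averaged energy rather than E(x,t) itself. The function name, the damping value, and the unit mass are assumptions chosen for illustration.

```python
import math

def steady_state_energy(force_amp, f_drive, f_nat, damping=50.0, mass=1.0):
    """Time-averaged kinetic + potential energy of one damped, driven oscillator.

    Hypothetical single-mode stand-in for one string, obeying
        m u'' + m*damping*u' + m*w0^2*u = force_amp * cos(w t).
    All parameter values are illustrative, not taken from the paper.
    """
    w = 2.0 * math.pi * f_drive   # forcing angular frequency
    w0 = 2.0 * math.pi * f_nat    # natural angular frequency
    # Steady-state displacement amplitude of the linear oscillator.
    amp = (force_amp / mass) / math.sqrt((w0**2 - w**2) ** 2 + (damping * w) ** 2)
    kinetic = 0.25 * mass * (w * amp) ** 2     # time average of (1/2) m u'^2
    potential = 0.25 * mass * (w0 * amp) ** 2  # time average of (1/2) k u^2, k = m w0^2
    return kinetic + potential

# The displacement is linear in the force, but the energy is quadratic:
ratio = steady_state_energy(2.0, 440.0, 440.0) / steady_state_energy(1.0, 440.0, 440.0)
print(ratio)  # 4.0: doubling the force quadruples the energy
```

The displacement amplitude doubles when the force doubles, which is the linear part; the factor of four appears only in the energy read‑out.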
When a pure sinusoid of frequency F excites the system, the energy distribution E(x,t) settles into a quasi‑static profile with a dominant peak at the string whose fundamental frequency matches F. However, each string with fundamental frequency F₀ also supports higher vibration modes at 2 F₀, 3 F₀, and so on. A string whose fundamental is F/n therefore resonates with the forcing through its n‑th mode and stores energy, even though its fundamental lies well below F. Consequently, the energy map exhibits secondary peaks at the strings tuned to these fractional frequencies (F/2, F/3, …), which correspond precisely to the sub‑harmonic series. Symmetry analysis shows that, under the simplest parameter choices, only the odd sub‑harmonics (F/3, F/5, …) survive, providing a physical basis for Zarlino's 16th‑century proposal that the undertone series underlies the perception of minor chords.
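The mode‑matching argument can be illustrated with a minimal sketch. It assumes each string has fixed ends and is forced uniformly along its length, so that the projection of the force onto mode n is 2/(nπ) for odd n and zero for even n; this is one simple way to obtain the odd‑only pattern and may differ from the authors' setup. The function names, the damping value, and the number of modes are hypothetical.

```python
import math

def string_energy(f0, f_drive, n_modes=5, damping=30.0):
    """Time-averaged energy of a string with fundamental f0, forced at f_drive.

    Assumption: uniform forcing of a fixed-end string, so mode n is driven with
    weight 2/(n*pi) for odd n and 0 for even n. Values are illustrative only.
    """
    w = 2.0 * math.pi * f_drive
    total = 0.0
    for n in range(1, n_modes + 1):
        coupling = 2.0 / (n * math.pi) if n % 2 == 1 else 0.0
        wn = 2.0 * math.pi * n * f0  # angular frequency of mode n
        amp = coupling / math.sqrt((wn**2 - w**2) ** 2 + (damping * w) ** 2)
        total += 0.25 * (w**2 + wn**2) * amp**2  # kinetic + potential, time-averaged
    return total

def energy_peaks(f_drive, f_lo, f_hi, steps=4000):
    """Fundamentals (Hz) on a uniform grid where the energy has a local maximum."""
    grid = [f_lo + (f_hi - f_lo) * i / steps for i in range(steps + 1)]
    energy = [string_energy(f0, f_drive) for f0 in grid]
    return [grid[i] for i in range(1, steps)
            if energy[i] > energy[i - 1] and energy[i] > energy[i + 1]]

# Forcing at 900 Hz: peaks are expected near 900, 300 and 180 Hz (F, F/3, F/5),
# and none near 450 Hz (F/2), because even modes are not driven in this sketch.
print(energy_peaks(900.0, 150.0, 1000.0))
```

With a different forcing profile the even modes would be driven too, and peaks at F/2, F/4, … would reappear; the odd‑only result depends on the symmetry assumption.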
The model also accounts for the emergence of combination tones. Because E is quadratic, a superposition of two sinusoids u₁ and u₂ with frequencies F₁ and F₂ produces squared terms u₁² and u₂², which contribute components at 2 F₁ and 2 F₂, and a cross‑term 2 u₁ u₂, which contributes components at F₂ − F₁ and F₁ + F₂. The difference tone F₂ − F₁ is especially prominent, matching the historically observed Tartini's third sound, even though the underlying mechanical equations remain linear. The authors note that these energy‑derived tones could feed back into the fluid of the cochlea, but such feedback is not explicitly modeled.
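The frequency content of a squared two‑tone signal can be verified directly. The sketch below is only a stand‑in for the energy read‑out: it squares the sum of two unit‑amplitude cosines and measures the amplitude at selected frequencies with a single‑frequency discrete Fourier sum. The tone frequencies (400 and 500 Hz), the sample rate, and the function names are assumptions chosen so that every component completes a whole number of periods.

```python
import cmath
import math

def energy_like_signal(f1, f2, fs, n):
    """Samples of (u1 + u2)^2 for two unit-amplitude cosines at f1 and f2 (Hz)."""
    out = []
    for k in range(n):
        t = k / fs
        u = math.cos(2.0 * math.pi * f1 * t) + math.cos(2.0 * math.pi * f2 * t)
        out.append(u * u)  # quadratic read-out of a linear superposition
    return out

def amplitude_at(signal, freq, fs):
    """Amplitude of the cosine component at freq, via a direct Fourier sum."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(signal))
    scale = 1.0 if freq == 0 else 2.0  # DC is not split into +/- frequencies
    return scale * abs(acc) / n

fs, n = 8000, 8000  # one second of samples (illustrative values)
sig = energy_like_signal(400.0, 500.0, fs, n)
for f in (100.0, 400.0, 500.0, 800.0, 900.0, 1000.0):
    print(f, round(amplitude_at(sig, f, fs), 6))
```

Expanding the square gives 1 + ½cos(2·2πF₁t) + ½cos(2·2πF₂t) + cos(2π(F₂ − F₁)t) + cos(2π(F₁ + F₂)t), so the difference and sum tones carry amplitude 1, the doubled tones carry amplitude ½, and nothing remains at the original frequencies F₁ and F₂.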
Numerical simulations using two parameter sets—one taken from experimental data (Nobili et al., 2003) and a second adjusted to broaden the frequency range—confirm the analytical predictions. Simulations of pure sine inputs reveal clear energy peaks at odd sub‑harmonics, while harmonic-rich inputs produce simultaneous harmonic and sub‑harmonic peaks, reproducing the timbral richness of real musical sounds. Simulations with two simultaneous tones demonstrate the appearance of the difference tone in the energy spectrum, supporting the proposed mechanism for combination tones.
By linking the linear string model with a quadratic energy read‑out, the paper bridges a gap between purely physical models of cochlear mechanics and psychoacoustic observations that have traditionally required explicit non‑linearities. It offers a parsimonious explanation for why minor chords feel “dark” (due to the presence of odd sub‑harmonics) and why listeners perceive combination tones even when the ear’s mechanical response is essentially linear.
In conclusion, the study suggests that the cochlea’s intrinsic multi‑modal string dynamics, together with an energy‑based neural encoding, can generate both sub‑harmonic series and combination tones without invoking active amplification or intrinsic mechanical non‑linearities. This perspective opens new avenues for interdisciplinary research, inviting neurophysiological verification of energy‑based coding and prompting revisions of auditory models that traditionally separate linear mechanics from non‑linear perception.