The Anatomy of an Echo: Frequency Matters

Figure 1

Audio practitioners and musicians alike are intimately familiar with slapback echoes, heard when a snare drum's sound bounces off a specular rear wall or balcony face. The phenomenon is less frequently observed with kick drum sounds. This discrepancy hints at a frequency‑dependent echo threshold.

The delay at which the perception of one fused sound becomes two separate sounds is called the echo threshold.

- Litovsky et al., 1999 -

Echo (psychoacoustics): A perceptually distinct copy of the original sound;
a delayed duplicate.

- Blesser, Barry "An Interdisciplinary Synthesis of Reverberation Viewpoints," J. Audio Eng. Soc., vol. 49, pp. 867-903 -

Figure 1.1 shows two identical copies of the same broadband pulse, separated by a 10‑millisecond time offset. Each pulse contains every frequency from DC to daylight (or to Nyquist in the digital domain). At first glance, one may be inclined to think that there will be no interference between these apparently separate events in time.

But bear in mind that the 10 milliseconds of downtime between consecutive events is the same amount of time that 100 Hz takes to finish a single cycle. During the same 10 ms, 1 kHz completes 10 cycles and 10 kHz a hundred cycles, whereas 10 Hz gets only one tenth of the way through its first cycle. So surely the same time gap cannot affect all frequencies equally.
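The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `cycles_in_gap` and the constant `GAP_S` are made up for illustration, and the only relationship used is that a sinusoid of frequency f completes f × t cycles in t seconds.

```python
def cycles_in_gap(frequency_hz: float, gap_s: float) -> float:
    """Number of cycles a sinusoid of this frequency completes during the gap."""
    return frequency_hz * gap_s


GAP_S = 0.010  # the 10 ms offset between the two pulses in Figure 1.1

# Same frequencies as in the text: 10 Hz, 100 Hz, 1 kHz and 10 kHz.
for f in (10, 100, 1_000, 10_000):
    print(f"{f:>6} Hz completes {cycles_in_gap(f, GAP_S):g} cycle(s) in 10 ms")
```

The same 10 ms gap is a small fraction of a cycle at 10 Hz and a hundred full cycles at 10 kHz, which is the whole point of the paragraph above.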

If we pass our broadband pulses through a third‑octave filter bank (Figure 1.2), we get a collection of narrowband pulses (Figure 1.3), where each waveform represents a third‑octave‑wide group of frequencies. Notice that each frequency group lasts for a different amount of time. Despite the initial 10‑millisecond time gap, the lower frequency groups still overlap substantially, unlike the high frequency groups.

Reciprocity

All of this goes back to the reciprocal relationship between time and frequency:

\begin{equation}T \times f = 1\end{equation}
This relationship, in turn, governs the duration of bandwidth‑limited pulses like those shown in Figure 1.3.

A bandwidth‑limited pulse is a pulse of a wave
that has the — minimum possible duration — for a given spectral bandwidth.

- Wikipedia -

In other words, "you cannot have your cake and eat it too": a kick‑drum sound, confined to lower frequencies, cannot be as short as a snare drum sound that consists of higher‑frequency content.

So for transient signals (as opposed to steady‑state signals) higher frequencies are in danger of becoming echoic sooner than low frequencies.
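To put rough numbers on this, here is a minimal Python sketch that estimates how long a third‑octave‑wide pulse must last and compares that with the 10 ms gap. It rests on two assumptions that are mine, not measurements from the figures: a third‑octave band spans from 2^(-1/6) to 2^(1/6) times its center frequency, and the shortest possible pulse lasts about one over the bandwidth. The exact constant depends on the pulse shape and on how duration is defined, so treat the results as order‑of‑magnitude estimates. All names are illustrative.

```python
GAP_S = 0.010  # 10 ms offset between the two pulses


def third_octave_bandwidth(center_hz: float) -> float:
    """Width in Hz of a third-octave band (band edges at fc * 2**(+-1/6))."""
    return center_hz * (2 ** (1 / 6) - 2 ** (-1 / 6))


def approx_pulse_duration(center_hz: float) -> float:
    """Rough minimum duration in seconds, ASSUMING duration ~ 1 / bandwidth."""
    return 1.0 / third_octave_bandwidth(center_hz)


def overlaps(center_hz: float, gap_s: float = GAP_S) -> bool:
    """True if the estimated pulse outlasts the gap, so both copies overlap."""
    return approx_pulse_duration(center_hz) > gap_s


for fc in (31.5, 63, 125, 250, 500, 1000, 2000, 4000, 8000):
    duration_ms = approx_pulse_duration(fc) * 1000
    state = "still overlapping" if overlaps(fc) else "separated"
    print(f"{fc:>7} Hz band: ~{duration_ms:7.2f} ms -> {state}")
```

Under these assumptions, bands in the low hundreds of hertz and below outlast the 10 ms gap, while bands from roughly 500 Hz upward are short enough to come apart in time.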

The Zipper

I call this the zipper effect. Whenever there is time misalignment between, e.g., two loudspeakers, or between direct sound and an early reflection, you will notice it first at high frequencies.

For transient signals, high frequencies become unfused first and get torn apart (like Velcro), setting the stage for echoes. Low frequencies remain fused for much longer and become subject to strong tonal coloration.

The in‑between frequencies manifest "auditory roughness," associated with phenomena such as (but not limited to) "beating," once the hearing sense can no longer resolve closely grouped frequencies within a single critical band.

Proof is in The Pudding

The video below contains two duplicate tracks, each containing a broadband black pulse (whose spectrum drops by 9 dB per octave) that repeats every half‑second.

Because the ear acts like a high‑pass filter, it renders the textbook "Dirac" pulse (with white spectrum) frustratingly silent and useless.

After the fifth repetition, I progressively delay the second instance (Track 2) by as much as 200 ms.

There are two things worth noting. First, there is an audible change in pitch (because comb filters have pitch). Second, the high frequencies become split in time first (Figure 1.3). Not until much, much later, i.e., with more delay, do the low frequencies start to exhibit an echoic, heartbeat‑like pattern.
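The pitch change can be illustrated with a small Python sketch of the comb filter that results when a signal is summed with a delayed copy of itself. It assumes both tracks are summed at equal level, which gives the response 1 + exp(-j·2π·f·τ). The function names are illustrative, and tying the perceived pitch to the peak spacing 1/τ is a common rule of thumb rather than something measured in the video.

```python
import cmath
import math


def comb_gain(frequency_hz: float, delay_s: float) -> float:
    """Linear gain at one frequency for a signal plus an equal-level delayed copy."""
    return abs(1 + cmath.exp(-2j * math.pi * frequency_hz * delay_s))


def first_notch_hz(delay_s: float) -> float:
    """Lowest frequency at which the two copies cancel (half a cycle apart)."""
    return 1.0 / (2.0 * delay_s)


def peak_spacing_hz(delay_s: float) -> float:
    """Spacing between adjacent peaks (and between adjacent notches)."""
    return 1.0 / delay_s


# A few delays within the 0 to 200 ms sweep described above.
for delay_ms in (1, 5, 10, 50, 200):
    tau = delay_ms / 1000
    print(f"{delay_ms:>4} ms delay: first notch at {first_notch_hz(tau):g} Hz, "
          f"peaks every {peak_spacing_hz(tau):g} Hz")
```

As the delay grows, the peaks and notches crowd together toward lower frequencies, which is consistent with the pitch appearing to change while Track 2 is being delayed.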

Early Reflections

So whether early reflections are useful or detrimental depends greatly on the frequency content of those lower‑order reflections. They offer the potential for adding gain and spaciousness (both oftentimes desirable), but also for echoes.

Which frequencies exist in these reflections is determined by: a) the frequency‑dependent spill from various sound sources onto acoustically reflective surfaces (as opposed to the audience), and b) whether those frequencies are, in turn, absorbed upon striking said surfaces.