By Norman Varney
Audio enthusiasts are always concerned about frequency response. We see this data published in the specification sheets of audio equipment, we often see it displayed graphically in reviews, and we are often interested in the frequency response of our room, etc. This is all fine, but what we should care much more about is phase.
Our experienced brain is very forgiving when it encounters missing frequencies or intensities of recognizable sounds, and it does not know the difference when frequencies are missing from unfamiliar sounds. For example, you've probably never heard an actual explosion like those in action movies, or, like many, have never been in the presence of a live orchestral performance. When inexperienced, you don't know what you're missing. On the other hand, you hear the kick drum on "Billie Jean" over tiny speakers and recognize it as such. Your brain works hard to fill in the missing amplitude and low frequency information in order to make it believable. Our brain is not able to do such a great job of modeling for phase.
What is phase?
Phase, in relationship to audio, has to do with time. Nearly every quantity we use to describe sound has time built into its units. For example, velocity is defined in terms of length and time, such as feet per second. Frequency is measured in cycles per second (hertz, abbreviated Hz.), and wavelength is the distance a wave travels during one cycle. One cycle of a 1 kHz. tone is about 1.13 feet long and takes 1 ms to complete, while one cycle of a 100 Hz. tone is about 11.3 feet long and takes 10 ms. The standard unit of time is the second (abbreviated s). The standard clock is the Cesium-133 atom, which undergoes 9,192,631,770 oscillations per second.
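The figures quoted above can be checked with a few lines of arithmetic. This is only a minimal sketch: the function names are made up for illustration, and the speed of sound is assumed to be about 1130 feet per second, which is what the 1.13 foot figure implies.

```python
# Assumed speed of sound in air, in feet per second (roughly room temperature).
SPEED_OF_SOUND_FT_PER_S = 1130.0


def period_ms(frequency_hz):
    """Time taken by one full cycle, in milliseconds."""
    return 1000.0 / frequency_hz


def wavelength_ft(frequency_hz, speed=SPEED_OF_SOUND_FT_PER_S):
    """Distance covered by one full cycle, in feet."""
    return speed / frequency_hz


for f in (100.0, 1000.0):
    print(f"{f:g} Hz: {wavelength_ft(f):.2f} ft per cycle, {period_ms(f):g} ms per cycle")
```

Lower frequencies come out both longer and slower, which matters later when we compare delays against wavelengths.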
You might be thinking that time and frequency are just two different mathematical ways to describe the same information in different domains, and you'd be right. However, when we start talking about more than one frequency happening simultaneously, as in a recording or playback system, we have to analyze their relationship to each other in order to determine the accuracy of what we perceive. Phase is both time and frequency dependent. Phase is the term used to describe the progress of a waveform in time relative to a starting point.
What is phase error?
Phase error results when two sound waves that should rise and fall together reach their maximum and minimum values at different times. Any degree of phase shift will alter the combined signal accordingly, through constructive and destructive interference.
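To put numbers on constructive and destructive interference, here is a rough sketch for the simplest case: two tones of the same frequency and the same strength meeting at one point, where their pressures simply add. The function names are hypothetical, and the formula is the standard identity for the sum of two equal sine waves.

```python
import math


def combined_amplitude(phase_shift_deg, amplitude=1.0):
    """Peak amplitude of two equal tones summed with the given phase offset."""
    half_shift = math.radians(phase_shift_deg) / 2.0
    return 2.0 * amplitude * abs(math.cos(half_shift))


def level_change_db(phase_shift_deg):
    """Level of the sum relative to one tone alone, in dB."""
    ratio = combined_amplitude(phase_shift_deg)
    # Treat a vanishingly small sum as complete cancellation.
    return 20.0 * math.log10(ratio) if ratio > 1e-12 else float("-inf")


for shift in (0, 90, 120, 180):
    print(f"{shift:3d} degrees: amplitude x{combined_amplitude(shift):.3f}, "
          f"{level_change_db(shift):.1f} dB")
```

Under these assumptions the sum is strongest with no shift, falls back to the level of a single tone at 120 degrees, and disappears entirely at 180 degrees.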
How do we perceive relative phase distortions?
In physics, sound is only vibration, but for the human brain, sound requires processing a lot of information in order to make sense of it as a sensation and react to it. Localization is instinctively our primary concern regarding sounds. We spatially map the location using the disparity of time (below approx. 700 Hz.) and/or intensity (above approx. 700 Hz.) between our two ears. This is followed by frequency (pitch) and/or loudness, whichever wins our attention to indicate possible threat. Finally, requiring a tad more information (time), we analyze tone. We process this information a number of ways, looking for clues to discover whether the sound is friend or foe. We will look at some basic characteristics of sound as they relate to phase and human perception:
1. Amplitude. If we were to play a steady tone of say 500 Hz. on the left speaker in an anechoic (reflection-free) chamber, and then add the same tone to the right, with the same relative phase, the two waves add coherently: at a listening position equidistant from both speakers the sound pressure doubles, and the level rises by about 6 decibels (two unrelated signals of equal level would add only about 3 decibels). What if we were to delay the second tone one half cycle later in time than the first? We are still driving both speakers just as hard, however the second tone is now 180 degrees out of phase with the first, and at that same position the two tones cancel, resulting in silence. What's happening is, as the first speaker is moving forward (compressing air molecules), the second speaker is moving backward (rarefaction of air molecules). The combination leaves the air molecules at rest. Now you understand how phase error affects frequency response. Nature begins her sounds with a wave of compression. Electrons, however, flow without regard to our human perception. You are just beginning to see how important phase is to accurate audio.
2. Spatiality. The most drastic phase distortion, and the easiest for most people to recognize, is the confused sound heard when the polarity of one stereo channel is reversed. Rather than organized in space, sounds are difficult to localize and seem disoriented, thin and hollow. Both the soundstage (the apparent physical size of the presentation) and the image (the events that take place within the soundstage) are in chaos when the two channels are 180 degrees out of phase with each other. This is an unnatural phenomenon that you feel, and your brain works hard to make sense of it for comfort, but to no avail. Less dramatic degrees of phase shift will affect spatial cues and cause sounds to be incorrectly located or wander about in apparent location and size. Spatiality cues typically occur within the first few milliseconds of the signal's introduction.
3. Timbre. Timbre is the subjective tonality or "character" of sound. It has nothing to do with pitch or loudness per se. When hearing a flute and a violin each playing the A above middle C (440 Hz.), it is the differences in their unique attack, envelopment of harmonics (partials) and decays that tell them apart. This is due not only to the way an instrument is played (plucked, struck, blown, rubbed, etc.), but also to its harmonic make-up (most musical instruments possess up to twenty overtones above the fundamental), and its resonance make-up (the body of the instrument amplifies or dampens certain frequencies). Good timbre is synonymous with good fidelity, whether you are talking about a musical instrument or a hi-fi system. A cheap violin does not have the rich resonances found in a quality one, and a cheap stereo system probably won't distinguish between steel and nylon strings on a guitar, let alone the difference between Ernie Ball and D'Addario strings. It's the intensity of the overtones, at various points in time, that makes these distinctions. Plomp (1970) summarized: a) Phase has maximum effect on timbre when alternate harmonics differ by 90 degrees. b) The effect of phase on timbre appears to be independent of the sound level and the spectrum. Timbre recognition occurs in about the first 20-50 ms of introduction.
(a) The waveform of an attack transient. (b) Amplitudes of the first five harmonics of the attack transient of a 110 Hz. diapason organ pipe. (From Keeler, 1972). Notice the second harmonic develops slightly faster than the others, including the fundamental. In other woodwinds, the fundamental usually leads.
Timbre is altered when phase is shifted. Phase distortion to the original signal confuses our brain. It is interesting to note the experiments by Berger in 1963 where he removed the first and last half seconds from the sounds of 10 different band instruments and asked 30 band students to identify them. Amid the confusion, only the oboe was correctly identified by more than one third of the group, eleven identified the alto saxophone as a French horn, and 25 thought the tenor saxophone was a clarinet!
Experiment
Though the following exercise does not follow "real world" situations, it does a great job of allowing the reader to understand and experience what happens when phase shifts alter timbre.
While holding your hand flat with your palm facing you, say shhhhhhhhhhhhh while slowly bringing it up to your mouth. Notice how the timbre of the sound changes. You are hearing the original sound conflicting with the sound reflected off your hand. As you move your hand closer to your mouth, different frequencies (predominantly around 1 kHz. - 16 kHz.) are passing through one another in opposite directions, and depending on the interval in time, or the point in space at which you happen to observe the sound, it will appear different (brighter or darker, louder or softer) at certain frequencies.
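For the curious, the hand experiment can be roughed out numerically. The sketch below is a simplification, not a model of what your ears actually receive: it assumes one reflection as strong as the direct sound, a speed of sound of about 1130 feet per second, and a made-up function name. It lists the frequencies at which the reflection arrives half a cycle late and therefore cancels the most.

```python
# Assumed speed of sound in air, in feet per second.
SPEED_OF_SOUND_FT_PER_S = 1130.0


def notch_frequencies(extra_path_in, max_hz=16000.0):
    """Frequencies (Hz) where a single reflection arrives half a cycle late.

    extra_path_in is how much farther the reflected sound travels than the
    direct sound, in inches.
    """
    if extra_path_in <= 0:
        raise ValueError("extra_path_in must be positive")
    delay_s = (extra_path_in / 12.0) / SPEED_OF_SOUND_FT_PER_S
    notches = []
    k = 0
    while True:
        # Cancellation happens at odd multiples of half a cycle of delay.
        frequency = (2 * k + 1) / (2.0 * delay_s)
        if frequency > max_hz:
            break
        notches.append(frequency)
        k += 1
    return notches


for extra in (6.0, 3.0, 1.0):
    first = notch_frequencies(extra)[0]
    print(f"{extra:g} in extra path: first notch near {first:.0f} Hz")
```

As the extra path shrinks, the notches slide upward in frequency, which is one way to picture the timbre sweeping as the hand approaches.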
What causes phase distortion?
There are two types of relative phase distortions that typically occur during the recording and playback process: electrical and acoustical. And there are two causes of phase distortion: delays and repeats.
1. Electrical. Any and all types of audio electronics will add a time delay to an applied signal, from microphones, to cables, to loudspeakers and all processors in between. Each electronic device in the signal path introduces some capacitance (energy stored in an electric field) and inductance (energy stored in a magnetic field) to the moving electrons. This stored energy takes time to build up and release, and the resulting delay generally differs from one signal frequency to the next. If the time delay is constant at all frequencies between the input and the output of the device, it is said to be phase linear or phase coherent.
2. Acoustical. Acoustical interference occurs when room reflections cause constructive (additive) and destructive (subtractive) phase errors, as can less-than-precise speaker/listener alignment, and multi-microphone leakage. As the direct signal combines at our ears with the delayed signal(s) of itself, we experience distortion.
a. With room reflections, and stationary listening, our brain can adapt with some "spectral compensation" to the room, especially in the higher frequencies. However, reflections that are within 15 dB of the level of the direct sound will definitely cause audible phase anomalies.
b. Ideally, each speaker voice coil should be the same distance from the listening position so that the signals from each arrive together. When they are not aligned, the relative signal arrival times are different, causing change to the sound, and to the polar response (directivity) of the speaker. Note that the farther off-axis a listener is, the greater the time incoherency. Note also that good designs take into account cross-over network phase and delays, and that even a 5 µs change can be audible.
c. Two microphones, each in a different location, but both picking up similar information can cause tonality errors. For example, a snare top head and bottom head mic both picking up the hi-hat, or the bottom head mic picking up the direct sound with reflected sound from the floor.
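Both kinds of delay described above can be turned into rough numbers. The sketch below is illustrative only: the function names are invented, the speed of sound is assumed to be about 1130 feet per second, and the electrical example is a hypothetical first-order RC low-pass filter (1 kilohm, 10 nF) standing in for "some capacitance" in a signal path, not a model of any particular product.

```python
import math

# Assumed speed of sound in air, in feet per second.
SPEED_OF_SOUND_FT_PER_S = 1130.0


def phase_shift_deg(frequency_hz, delay_s):
    """Phase lag, in degrees, that a fixed time delay causes at one frequency."""
    return 360.0 * frequency_hz * delay_s


def path_delay_s(distance_mismatch_in):
    """Arrival-time difference caused by a distance mismatch given in inches."""
    return (distance_mismatch_in / 12.0) / SPEED_OF_SOUND_FT_PER_S


def rc_lowpass_delay_s(frequency_hz, r_ohms, c_farads):
    """Phase delay of a first-order RC low-pass filter at one frequency."""
    phase_rad = math.atan(2.0 * math.pi * frequency_hz * r_ohms * c_farads)
    return phase_rad / (2.0 * math.pi * frequency_hz)


# Acoustical: a 1 inch mismatch is one fixed delay, so phase grows with frequency.
delay = path_delay_s(1.0)
for f in (100.0, 1000.0, 6780.0):
    print(f"{f:g} Hz: {delay * 1e6:.1f} us late, {phase_shift_deg(f, delay):.1f} degrees")

# Electrical: the hypothetical filter delays different frequencies by different amounts.
for f in (100.0, 1000.0, 10000.0, 20000.0):
    print(f"{f:g} Hz: delayed {rc_lowpass_delay_s(f, 1000.0, 10e-9) * 1e6:.2f} us")
```

The distance mismatch behaves like a phase-linear device, since every frequency is late by the same amount. The filter does not: its delay shrinks as frequency rises, so the parts of a complex sound no longer line up as they did at the input.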
What can you do to reduce phase errors?
There are several things one can do, even if you don't have sophisticated test equipment or knowledge:
1. Train your ears. Listen to unamplified music for reference. Enjoy the richness of harmonic content, the spatial imaging, the attack, envelopment and decay of individual sounds.
2. Confirm that all amplifier/speaker channels are the same polarity.
3. Confirm that all speaker drivers are the correct polarity. Briefly touching a 9 volt battery across the speaker cable leads (battery positive to speaker positive) should push all drivers forward in nearly all speaker designs.
4. Investigate interconnects, speaker cables and speakers that boast about energy efficiency, time/phase alignment, etc.
5. Do your best to locate the speaker/listening position for smoothest room mode response in the room.
6. Confirm that each speaker is the same distance from the listening position.
7. Treat first order reflection points in the room with absorption or diffusion. This can be done with the "mirror trick". Treat the locations with at least a 2' area to cover frequencies down to about 500 Hz.
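The "mirror trick" in the last item has a simple geometric equivalent: imagine a mirror image of the speaker behind the wall, and the reflection point is where a straight line from that image to the listener crosses the wall. Here is a minimal sketch for one side wall, with a made-up function name, made-up room positions, and an assumed speed of sound of about 1130 feet per second.

```python
import math

# Assumed speed of sound in air, in feet per second.
SPEED_OF_SOUND_FT_PER_S = 1130.0


def side_wall_reflection(speaker, listener):
    """Locate the first reflection off a side wall placed at x = 0.

    speaker and listener are (distance_from_wall_ft, position_along_wall_ft).
    Returns (reflection_point_along_wall_ft, extra_delay_ms).
    """
    sx, sy = speaker
    lx, ly = listener
    # The mirror image of the speaker sits at (-sx, sy); the line from it to
    # the listener crosses the wall at the reflection point.
    point = sy + (ly - sy) * sx / (sx + lx)
    reflected_path = math.hypot(sx + lx, ly - sy)
    direct_path = math.hypot(lx - sx, ly - sy)
    extra_delay_ms = (reflected_path - direct_path) / SPEED_OF_SOUND_FT_PER_S * 1000.0
    return point, extra_delay_ms


# Hypothetical layout: speaker 3 ft from the wall, listener 6 ft from the wall
# and 9 ft farther down the room.
point, delay_ms = side_wall_reflection((3.0, 0.0), (6.0, 9.0))
print(f"Treat the wall about {point:.1f} ft down from the speaker; "
      f"the reflection arrives about {delay_ms:.1f} ms after the direct sound")
```

The same idea works for the opposite wall, floor and ceiling by measuring from that surface instead. A real mirror held against the wall is still the easiest check.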
Conclusion
As with many blog topics, a book could be written about this subject. I have only scratched the surface here, and there are many sub-topics of phase effects that I did not include, such as pitch, resonance, ringing, beat frequencies, etc.
Time delay spectrometry has only been around since the late 1960s. Prior to that, we didn't have the computer processing required to analyze the relationships between time, energy and frequency. This may be why we are so concerned and familiar with frequency response. Phase distortion is the primary reason why one piece of equipment sounds different from another. It is also the primary reason why most people are denied the full potential of their audio investment and cannot enjoy the full experience created by the artist.
With regard to how perceptible relative phase is compared with frequency response, almost any piece of good audio equipment today will offer good frequency response, but most do not have good phase response. The problem for the end-user is integrating synergistic system components, setting them up properly both physically and electronically, and controlling room acoustics. Noticing minor phase errors in a typically reverberant room will be difficult because of the lack of resolution available from the room. Controlling the acoustics properly is like discovering the deep sea. You probably have no idea of all the cool stuff that is below the surface.