MP3s and the Degradation of Listening

Don’t get me wrong! I own three iPods, which I use extensively and absolutely adore for their portability and other obvious advantages. I do, of course, use them differently than most listeners. (If you are lazy or impatient, feel free to jump to the bottom of the page and read how.) Most listeners use mp3 players and mp3 files in ways that severely degrade sound quality and eventually erode the listener’s ability to tell good sound quality from bad. But more on this a little later.

Disclaimer: For the cynics amongst you, I am not sponsored by any record label trying to boost CD sales; I could actually not care less. None of the information below is product-specific. It is based on facts and is common knowledge to anyone with a basic understanding of the physics of sound, digital sound processing, hearing physiology, and auditory perception. Ignore at your own risk!

CD sound quality

First, let me address some fundamental issues regarding how CD data rates relate to sound quality.

CD quality is usually described in terms of:

  • sampling rate (44,100 samples/sec.),
  • bit depth (16 bits per sample; often loosely called bit rate), and
  • stereo presentation.

Doing some simple math, we can figure out that CD-quality sound corresponds to a data rate of ~1411 kbits/sec. (44,100 * 16 * 2 = 1,411,200 bits/sec.). Sampling rate determines the upper frequency limit that can be faithfully represented in a digital sound file (about half of the sampling rate, i.e. ~22 kHz for a CD); this limit corresponds, in general, to timbre, or sound quality. Bit depth determines the dynamic range (i.e. the difference between the softest and strongest sound) that can be faithfully represented in a digital sound file (~6 dB per bit, i.e. ~96 dB for 16 bits).
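For readers who like to check the arithmetic, here is a minimal Python sketch of the three relationships above. The function names are made up for illustration, and the dynamic-range figure uses the usual idealized rule of thumb (20*log10 of the number of quantization levels), which ignores dither and other real-world details.

```python
import math

def pcm_data_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed (PCM) data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

def upper_frequency_limit_hz(sample_rate_hz):
    """Highest frequency that can be represented: half the sampling rate."""
    return sample_rate_hz / 2

def dynamic_range_db(bits_per_sample):
    """Idealized dynamic range: roughly 6 dB per bit (an approximation)."""
    return 20 * math.log10(2 ** bits_per_sample)

# CD-quality values taken from the text above.
print(pcm_data_rate(44_100, 16, 2))        # 1411200 bits/sec.
print(upper_frequency_limit_hz(44_100))    # 22050.0 Hz
print(round(dynamic_range_db(16), 1))      # 96.3 dB
```

Plugging in 24 bits instead of 16 gives about 144 dB, which is one way to see why the extra bits matter more for processing than for plain listening.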

Given the maximum frequency and dynamic range of safe and functional human hearing (~20 kHz and ~100 dB respectively), CD-quality digital sound is very close to the best sound quality we can ever hear. There have been several valid arguments put forward advocating the need for sampling rates higher than 44,100 samples/sec. (e.g. 96,000 samples/sec.), bit depths greater than 16 bits (e.g. 24 or 32 bits), and more than two channels (e.g. various versions of surround sound). Depending on the type of sound in question (e.g. the sound’s frequency/dynamic range and spatial spread) and what you want to do with it (e.g. process/analyze it in some way or just listen to it), such increases may or may not result in a perceptible increase in sound quality. So for the vast majority of listening contexts, CD-quality sound (i.e. 1411 kbits/sec. data rate) does correspond to the best quality sound one can hear.

Compressed sound quality

Now, let’s move to compressed sound, whether in mp3, AAC (the format used by iTunes and iPods), RealAudio, or any other format.

Every sound-compression technique has two objectives:

a) to reduce a sound file’s data rate and therefore overall file size (for easier download and storage) and

b) to accomplish (a) without noticeably degrading the perceived quality of the sound.

Sound-compression algorithms of this kind basically remove bits from a digital sound file, choosing the bits to be removed so that the lost information will not be perceived by listeners as a noticeable loss in quality.

Compression algorithms base their selective removal of information from a digital file on three perceptual principles:

  1. Just noticeable difference in frequency and intensity:
    Our ears’ ability to perceive frequency and intensity differences as pitch and loudness differences respectively is not as fine-grained as the frequency and intensity resolution supported by CD-quality sound. So it is possible to selectively remove some of this information without listeners noticing its removal.
  2. Masking:
    Strong sounds at one frequency can mask soft sounds at nearby frequencies, making them inaudible. It is therefore possible to remove digital information representing soft frequency components that are closely surrounded by much stronger ones, without listeners noticing the removal, since they would not have been able to hear such soft sounds in the first place.
  3. Dependence of loudness on frequency:
    Even if different frequencies have the same intensity, they do not sound equally loud. In general, for a given intensity, middle frequencies sound louder than high frequencies, which sound louder than low frequencies. Combined with the phenomenon of masking described above, this dependence of loudness on frequency allows us to remove some soft frequency components even if they are further away from a given strong frequency, providing an additional opportunity to remove bits (information) from a digital file without listeners noticing the loss. In addition, the dynamic range of hearing is much narrower for low than for middle and high frequencies and may be adequately represented by ~10 rather than 16 bits, offering one more possibility for unnoticeable data-rate reduction.
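To give a feel for the third principle, here is a rough Python sketch of how the threshold of hearing in quiet varies with frequency. It uses a widely quoted closed-form approximation attributed to Terhardt, which is not something this article derives; treat it as an assumption that roughly describes a young listener with healthy hearing. It models only the quietest audible level, not loudness at higher levels, and real encoders rely on considerably more elaborate psychoacoustic models. The function name is hypothetical.

```python
import math

def threshold_in_quiet_db_spl(freq_hz):
    """Approximate threshold of hearing in quiet, in dB SPL.

    Assumption: Terhardt-style closed-form approximation, used here
    only to illustrate the shape of the curve.
    """
    f = freq_hz / 1000.0  # the formula works in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

# A lower threshold means the ear is more sensitive at that frequency.
for freq in (100, 1000, 3300, 10000):
    print(f"{freq:>6} Hz: {threshold_in_quiet_db_spl(freq):6.1f} dB SPL")
```

With these four test frequencies the ear comes out most sensitive in the middle of the range (around 3 kHz), less sensitive at 10 kHz, and least sensitive at 100 Hz, which matches the ordering described above.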

Different compression algorithms (e.g. mp3 or the AAC encoder used by iTunes) implement the above principles in different ways, and each company claims to have the best algorithm, achieving the most reduction in file size with the least noticeable reduction in sound quality.

Digital music downloads and the stupefaction of a generation of listeners

Regardless of which company and algorithm is the best, one thing is certain. No matter how the previously discussed principles are implemented and no matter how inventive each company’s programmers are, there is no way for the above principles to support the over 90 percent reduction of information required to go from a CD-quality file to a standard mp3. In other words, reducing data rates from CD quality (1411 kbits/sec.) to the standard downloadable-music-file quality (128 kbits/sec.) is impossible without a noticeable deterioration in sound quality.
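The "over 90 percent" figure follows directly from the two data rates. A minimal Python sketch, with a made-up function name, and counting raw bits only (which is not the same thing as perceived information):

```python
CD_KBPS = 1411.2  # 44,100 samples/sec. * 16 bits * 2 channels

def reduction_percent(target_kbps, source_kbps=CD_KBPS):
    """Percentage of the source data rate that is discarded."""
    return 100 * (1 - target_kbps / source_kbps)

for kbps in (128, 320):
    print(f"{kbps} kbits/sec.: {reduction_percent(kbps):.1f}% of the bits removed")
```

At 128 kbits/sec. roughly 91 percent of the bits are gone; even at 320 kbits/sec. the figure is still about 77 percent.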

In fact, the 139th meeting of the Acoustical Society of America devoted an entire session to the matter, with multiple acousticians and music researchers presenting their perceptual studies on the relationship between compression data rates and sound quality. Based on these and other, more recent, relevant works, it appears that data rates below ~320 kbits/sec. result in clearly noticeable deterioration of perceived sound quality for all sound files with more than minimal frequency, dynamic, and spatial-spread ranges. (E.g. listening to early Ramones at low or high data rates will not make as much of a difference as listening to, say, the Beatles’ “Sergeant Pepper” album.) Such low data rates cannot faithfully represent wide ranges of perceivable frequency, intensity, and spatial-separation changes, resulting in ‘mp3s’ that include only a small proportion of the sonic variations included in the originally recorded file.

As data rates drop, there is a gradual deterioration in

a) frequency resolution (loss of high frequencies, translated as loss of clarity),

b) dynamic range (small dynamic changes can no longer be represented in the compressed file, resulting in flatter ‘volume’ song profiles), and

c) spatial spread (loss of cross-channel differences, resulting in either exaggeration or loss of stereo separation).

When this degradation of sound quality is combined with the fact that most young listeners get their music only online, what we end up with is a generation of listeners that is exposed to, and therefore ‘trained’ in, an impoverished listening environment. Prolonged and consistent exposure to impoverished listening environments is a recipe for cognitive deterioration in listening ability, that is, in the ability to focus attention on, and tell the difference between, fine (and, if we continue this way, even coarse) sound variations.

Such deterioration will not only affect how we listen to music but also sound perception and communication in general, since our ability to tell the difference between sound sources (i.e. who said what) and sound source locations (i.e. where did the sound come from) is intricately linked to our ability to focus attention on fine sound-quality differences.

What you should do

a) Do not listen to music exclusively in mp3 (or any other compressed) format.
Go to a live concert! Listen to a CD over a good home sound system, set of headphones, or car stereo!

b) Unless a piece of music is unavailable in any other format, do not waste your money on iTunes or any other music-download service until such services start offering data rates greater than 300 kbits/sec.

c) When you load CDs on your iPod or other device, select an uncompressed format (e.g. .wav or .aif). If you don’t have the hard disk space on your player to do this, convert at the highest available data rate (currently 320 kbits/sec. on iTunes).

d) Finally, get a good pair of headphones for your mp3 player! The headsets given out with iPods and most mp3 players are of such bad quality that they essentially create a tight bottleneck to the quality of your digital files and players. The response of these headphones has been designed to match the low quality of popular iTunes or other mp3 files (128 kbits/sec). Mp3-player manufacturers do this for two wise (for them) reasons:

i) poor-quality headsets are cheap to produce and good enough to reproduce the poor-quality mp3 files you are fed, and

ii) poor quality headsets prevent you from creating/requesting music files at higher data rates because when listening over such headphones you cannot even tell the difference between good and bad sound quality.
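If you are weighing step (c) against the storage on your player, the trade-off is easy to estimate. The Python sketch below is only a back-of-the-envelope calculation: the function names and the 30 GB capacity are hypothetical, it uses decimal units (1 GB = 1,000,000,000 bytes), and it ignores file-format overhead and whatever space the player reserves for itself.

```python
def megabytes_per_minute(kbps):
    """Approximate storage used by one minute of audio at a given data rate."""
    return kbps * 1000 * 60 / 8 / 1_000_000

def hours_of_music(capacity_gb, kbps):
    """Approximate hours of audio that fit in capacity_gb (decimal GB)."""
    bytes_per_hour = kbps * 1000 / 8 * 3600
    return capacity_gb * 1_000_000_000 / bytes_per_hour

CAPACITY_GB = 30  # hypothetical player size, chosen only for illustration

for label, kbps in (("uncompressed", 1411.2), ("320 kbits/sec.", 320), ("128 kbits/sec.", 128)):
    print(f"{label:>15}: {megabytes_per_minute(kbps):5.2f} MB/min, "
          f"{hours_of_music(CAPACITY_GB, kbps):6.1f} hours on {CAPACITY_GB} GB")
```

On these assumptions, uncompressed audio costs about 10.6 MB per minute, so such a player would hold roughly 47 hours of it, versus a little over 200 hours at 320 kbits/sec.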

Well, what can I say? Wake up and listen to the music!

3 thoughts on “MP3s and the Degradation of Listening”

  1. totally agree with u..

    since i ve gotten into high end audio..128 kbps sounds very dull.

    for my home all my CDs are ripped into FLAC, ALAC and for iPod its mainly 320 kbps.

    one more thing..many people think Bose is high end audio..but its a very colored and undetailed sound when compared to Sennheiser.

    so guys ..feed your ears good audio.

  2. Great article. It should be noted that when I post songs from my studio to digital download sites via Tune Core, they require that I send them files that are at least 320kbps. Lastly, one reason why less-than-CD quality can sound “okay” to common listeners is that they often listen over headphones (ear pieces). Thus, the fidelity of mp3 appears, to them, as acceptable.

  3. The sound-chain audio quality of an iPod or other music player does not accurately reproduce higher bit rates. While you might notice bad sound from 64, 96 or 128 Kbps encodings, anything from 196 Kbps and up (including the 256 kbps retail “hifi” encodings) is perfectly matched to the iPod and other MP3 players. Going to lossless WAV or even 320 kbps is absolutely a waste of both space and time, especially given the quality of the headphones, which tend to be the biggest problem.

    The Acoustical Society of America report you mentioned deals with the highest end reproduction equipment available under perfect listening conditions. Under those conditions, SOME people might notice a difference in sound quality at rates under 320 Kbps. But nobody will notice a difference when using lower rates with portable players. The music quality surpasses the capability of the equipment to reproduce.
