Recently a friend asked why the tritone (the diminished fifth interval) sounds so sour to the ear. Here are some thoughts.

First, it is helpful to reflect on just what it is that makes music sound musical to the human ear. The 8 notes in the common scale (the one from “Sound of Music,” the familiar “doe rae mee fah sow lah tee doe”) have frequencies that are very nearly whole number ratios of the root frequency. For example, the fifth (“doe” played simultaneously with “sow”) is such a strong interval because 2^(7/12) is almost exactly 3/2, and the human ear has an affinity for such whole number ratios. The fourth, 2^(5/12), is very close to 4/3, and the major third, 2^(4/12), is reasonably close to 5/4. It’s the auditory equivalent of the beautiful symmetry of flowers and leaves and cat whiskers.
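To see how close these equal-tempered intervals come to the simple fractions, here is a minimal Python sketch. The function names are my own, and the deviation is expressed in cents (hundredths of a half step), which is just one convenient way to measure the gap.

```python
import math
from fractions import Fraction

# Semitone counts and the simple ratios quoted in the text.
INTERVALS = {
    "major third": (4, Fraction(5, 4)),
    "fourth": (5, Fraction(4, 3)),
    "fifth": (7, Fraction(3, 2)),
}

def equal_tempered_ratio(semitones):
    """Frequency ratio of an interval spanning the given number of half steps."""
    return 2 ** (semitones / 12)

def cents_off(semitones, simple_ratio):
    """How far the equal-tempered interval sits from the simple ratio, in cents."""
    return 1200 * math.log2(equal_tempered_ratio(semitones) / float(simple_ratio))

for name, (semitones, simple_ratio) in INTERVALS.items():
    print(f"{name:12s} 2^({semitones}/12) = {equal_tempered_ratio(semitones):.4f}"
          f"  vs {simple_ratio} = {float(simple_ratio):.4f}"
          f"  ({cents_off(semitones, simple_ratio):+.2f} cents)")
```

The fifth and fourth land within about two cents of their fractions, while the major third is noticeably further off.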

The question of why the tritone in particular (and, it turns out, its relative the augmented fifth) should sound so dissonant is very interesting. The whole number ratio of the tritone (7/5) is less favorable (higher integers are needed to form the fraction) than that of the fifth (3/2), fourth (4/3), major third (5/4), or minor third (6/5), but it is comparable to the minor sixth (8/5) and the second (9/8), and actually somewhat more favorable than the major seventh (15/8) and the minor seventh (16/9). But I noticed one feature that makes the tritone (diminished fifth), and the augmented fifth for that matter, unique: an implied dissonance with the fifth, that is, a conflict with the strongest interval in the scale.
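One rough way to make "favorable" concrete is to rank the ratios by how large their integers are. The sketch below uses numerator plus denominator as the score; that measure is my own assumption for illustration, not a standard definition of consonance.

```python
from fractions import Fraction

# Ratios as quoted in the text.
RATIOS = {
    "fifth": Fraction(3, 2),
    "fourth": Fraction(4, 3),
    "major third": Fraction(5, 4),
    "minor third": Fraction(6, 5),
    "tritone": Fraction(7, 5),
    "sixth": Fraction(8, 5),  # 8/5 as quoted; strictly this is the minor sixth
    "second": Fraction(9, 8),
    "major seventh": Fraction(15, 8),
    "minor seventh": Fraction(16, 9),
}

def complexity(ratio):
    """Hypothetical score: smaller integers give a lower (more favorable) value."""
    return ratio.numerator + ratio.denominator

ranked = sorted(RATIOS, key=lambda name: complexity(RATIOS[name]))
for name in ranked:
    print(f"{name:14s} {str(RATIOS[name]):>5s}  score {complexity(RATIOS[name])}")
```

By this score the tritone lands in the middle of the pack, which is why the ratio alone does not seem to explain its sourness.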

A Fourier transform of any note played by a piano or other instrument (guitar, violin, etc.) will reveal a distribution of overtones. That is, the note will sound primarily at the “root” frequency (the frequency of the note played), but it will also include overtones at whole number multiples of that frequency. For piano especially (but not exclusively), the third harmonic is very prominent, so you can see where this is going. If I play a note, its third harmonic (not a separately played note, but part of the rich overtone structure) implicitly lines up with the fifth, one octave higher. If I then play either a diminished fifth or an augmented fifth, it clashes with that implied fifth, because the two are only a half step apart: a ratio of 2^(1/12), about 1.06 (roughly 18/17). That is not a broad enough separation to form a clean interval like the second; the result is blurry and disorienting.
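To illustrate what the Fourier transform picks out, here is a small sketch that builds a synthetic tone and recovers its harmonics with a plain discrete Fourier transform. The harmonic amplitudes are made up for illustration (they are not measured from a piano), and the sample rate is chosen as a convenience so that each harmonic falls exactly on one frequency bin.

```python
import cmath
import math

F0 = 261.626  # root frequency in Hz (middle C)
# Hypothetical harmonic amplitudes, chosen only to show a strong third harmonic.
AMPLITUDES = {1: 1.0, 2: 0.3, 3: 0.6, 4: 0.2}

SAMPLES_PER_PERIOD = 32
PERIODS = 8
N_SAMPLES = SAMPLES_PER_PERIOD * PERIODS
SAMPLE_RATE = F0 * SAMPLES_PER_PERIOD  # makes harmonic k land on bin k * PERIODS

def synth(n):
    """Sample n of a tone made of sine waves at whole number multiples of F0."""
    t = n / SAMPLE_RATE
    return sum(a * math.sin(2 * math.pi * k * F0 * t) for k, a in AMPLITUDES.items())

signal = [synth(n) for n in range(N_SAMPLES)]

def dft_magnitude(samples, bin_index):
    """Amplitude of one DFT bin, scaled so a unit sine wave reads as 1.0."""
    total = sum(x * cmath.exp(-2j * math.pi * bin_index * n / len(samples))
                for n, x in enumerate(samples))
    return 2 * abs(total) / len(samples)

# Keep only the bins with noticeable energy, keyed by frequency in Hz.
peaks = {}
for b in range(1, N_SAMPLES // 2):
    magnitude = dft_magnitude(signal, b)
    if magnitude > 0.05:
        peaks[b * SAMPLE_RATE / N_SAMPLES] = magnitude

for freq, magnitude in peaks.items():
    print(f"{freq:8.2f} Hz  amplitude {magnitude:.2f}")
```

The only bins with energy are the root and its whole number multiples, with the amplitudes that were put in.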

In thinking about this, I realized that this contrasts with even the most pensive intervals, like the major seventh and the minor seventh. The major seventh is located a major third up from the fifth, and the minor seventh a minor third up from the fifth. Even the sixth is a second (a whole step) up from the fifth, and that appears to be why none of these three have quite the dissonance of the tritone or of the augmented fifth. In simplest terms, the tritone and the augmented fifth have a jarring “sour” quality for precisely the same reason that any half step (e.g., C and C# together) sounds jarring.
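The distances from the fifth are easy to tabulate by counting half steps. This is only a bookkeeping sketch; the dictionary and function names are my own.

```python
# Half steps above the root for each interval discussed here.
SEMITONES = {
    "tritone": 6,
    "fifth": 7,
    "augmented fifth": 8,
    "sixth": 9,
    "minor seventh": 10,
    "major seventh": 11,
}

GAP_NAMES = {1: "half step", 2: "whole step (second)", 3: "minor third", 4: "major third"}

def gap_from_fifth(name):
    """Number of half steps between the named interval and the fifth."""
    return abs(SEMITONES[name] - SEMITONES["fifth"])

for name in SEMITONES:
    if name != "fifth":
        gap = gap_from_fifth(name)
        print(f"{name:16s} is a {GAP_NAMES[gap]} from the fifth")
```

Only the tritone and the augmented fifth sit a single half step away from the fifth.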

To put it another way, the tritone produces a disturbing dissonance with the third harmonic, which corresponds to an octave multiple of the fifth. To fill in some details, suppose you have a root note of frequency N (for example middle “C,” aka C4, 261.626 Hz). In the piano, the third harmonic (3N) is very strong (it’s an interplay between the string vibrations and the wooden sound board, lots of wonderful physics involved), so there is a rich harmonic at 3N. The fifth (in this example the G4 above that C) sits at 2^(7/12) times N, approximately 3/2 N, so the third harmonic aligns almost exactly with the “G” an octave above it; in other words, the very prominent third harmonic aligns closely with twice the frequency of the also-strong fifth (that is, with G5). When you play a C4 by itself, you don’t hear the G5 explicitly, but it’s there in the mix of overtones, richly so. And when you add the tritone note, in this example F#4, its second harmonic (F#5) competes directly with the 3N harmonic that aligns with the fifth (G5), and the result is a dissonant quality.
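Putting numbers on this example, the sketch below computes the frequencies involved, assuming standard equal temperament built up from the 261.626 Hz middle C. The helper names are my own.

```python
import math

C4 = 261.626  # root frequency N in Hz

def equal_tempered(root_hz, semitones):
    """Frequency of the note the given number of half steps above the root."""
    return root_hz * 2 ** (semitones / 12)

def cents(f_high, f_low):
    """Size of the gap between two frequencies, in cents (100 per half step)."""
    return 1200 * math.log2(f_high / f_low)

third_harmonic = 3 * C4                   # 3N, present in the C4 itself
g5 = equal_tempered(C4, 7 + 12)           # the fifth, one octave up
f_sharp_5 = 2 * equal_tempered(C4, 6)     # second harmonic of the tritone F#4

print(f"3N              = {third_harmonic:.2f} Hz")
print(f"G5              = {g5:.2f} Hz ({cents(third_harmonic, g5):+.1f} cents from 3N)")
print(f"F#5 (2 x F#4)   = {f_sharp_5:.2f} Hz ({cents(third_harmonic, f_sharp_5):.1f} cents below 3N)")
print(f"3N minus F#5    = {third_harmonic - f_sharp_5:.1f} Hz")
```

The third harmonic of C4 sits within a couple of cents of G5, while the tritone's second harmonic lands about one half step below it.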

It’s also interesting to consider that this explanation does not rest upon an arbitrary combination of multiples of overtones and multiples of intervals. The next strongest interval is the fourth, frequency ratio 4/3. Looking at what it would take to create a similar alignment, one would have to go up to the third harmonic of the fourth (3 times 4/3 N) to match the 4N harmonic of the root, and that will not be very discernible, because the amplitudes in the Fourier transform drop off significantly at the higher multiples of N. This is only accentuated by the characteristic distribution of the piano, where 3N has a much higher amplitude than 4N.
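As a quick check on which harmonics can line up, here is a sketch that finds the lowest harmonic of the root that coincides with a harmonic of the interval note, treating each interval as its exact simple ratio (an idealization, since equal temperament is only close to these fractions). The function name is my own.

```python
from fractions import Fraction

def first_shared_harmonic(ratio):
    """Lowest (root harmonic, interval harmonic) pair with the same frequency.

    For an interval at ratio p/q times N (in lowest terms), harmonic p of the
    root equals harmonic q of the interval note: p * N == q * (p/q) * N.
    """
    ratio = Fraction(ratio)
    return ratio.numerator, ratio.denominator

for name, ratio in [("fifth", Fraction(3, 2)), ("fourth", Fraction(4, 3))]:
    root_harmonic, interval_harmonic = first_shared_harmonic(ratio)
    print(f"{name}: harmonic {root_harmonic} of the root meets "
          f"harmonic {interval_harmonic} of the {name}")
```

For the fifth the meeting point is the root's third harmonic; for the fourth it is pushed up to the root's fourth harmonic, where the amplitude is weaker.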

In all, very interesting stuff to ponder!