This chapter is about how Western musical tradition treats pitch, and why. Since pitch is primarily heard (by most people) in terms of ratios of frequencies, it is natural to use a logarithmic scale to assign pitches (which are subjective) to (objective) frequencies. But one has to pick a scale, that is, a ratio that corresponds to one unit of interval. This ratio in the West is the twelfth root of two, approximately equal to 1.059. That particular number turns out to be such a good choice of interval to measure pitches by, that it came to rule over a millennium of Western art music. Although we won't be concerned with all the historical details, this chapter will try to explain what's so special about the twelfth root of two as a unit of pitch.
The punch line is that this particular logarithmic scale turns out to have a surprisingly high number of sweet-sounding intervals in it. To develop this idea we first have to figure out what makes some intervals sound sweeter than others; this is pretty well explained by what is known as the Helmholtz theory of consonance and dissonance (section ). Then we will investigate the actual intervals that arise in the Western scale (section ). Finally we'll consider some of the consequences of the way pitch is organized in Western music and consider some alternative ways to organize pitches.
It's a commonplace that some intervals sound sweet and some sound sour, like this:
SOUND EXAMPLE 1: a musical fifth (usually considered sweet sounding)
SOUND EXAMPLE 2: a tritone (sour by comparison).
To call these sweet and sour is a rather clumsy metaphor. In musical language, we refer to a sweet-sounding interval as consonant and a sour-sounding one as dissonant, terms that can be taken to mean "going together" and "not going together". (Even this more neutral-sounding terminology carries an implicit value judgment that should not be accepted unquestioningly.) It turns out that the two intervals above have a physical difference that correlates with people's judgments of consonance and dissonance (as measured by psychoacousticians in experiments). This correlation is the basis of what we know today as the theory of consonance and dissonance.
Although the theory of consonance and dissonance is usually associated with Hermann von Helmholtz (1821-1894), many of its ideas and concepts date back further, even to ancient Greece; and the theory was much elaborated upon (and argued with) over the century after Helmholtz published his contributions. The theory seems to have finally been brought to a definitive form in Plomp and Levelt's very readable and persuasive 1965 paper on the subject.
In the theory, we consider two complex periodic tones, that is, tones that may be written as a sum of sinusoids with frequencies in the ratios 1:2:3:...; in other words, tones all of whose partials are tuned to multiples of a fundamental frequency. Here is what happens when the two fundamental frequencies are chosen, for instance, as 100 and 150 Hz.:
(To make the picture easy to see, all the harmonics of the 100-Hz. tone are given the same power, and so are all the harmonics of the 150-Hz. tone; but the theory doesn't rely on that fact. Also, the double peaks at 300 and 600 Hz. are in fact single sinusoids; they're drawn this way for clarity.) Here, on the other hand, is the situation when the fundamental frequencies are 100 and 140 Hz.:
These pictures roughly correspond to the two sound examples above. The first one is consonant and the second one dissonant. The Helmholtz theory explains the consonance of the first example and the dissonance of the second by the absence or presence of awkward pairs of sinusoids (in the second example there are two: 280 and 300 Hz., and 400 and 420 Hz.). These pairs are far enough apart to be perceived separately but close enough to interfere with each other by vibrating in heavily overlapping regions of the cochlea (Section 3.4).
Plomp and Levelt go so far as to posit the consonance and dissonance of two sinusoids as a function of their separation in critical bands, thus:
Under this rule, the two pairs of sinusoids in the dissonant example above are almost as dissonant as possible (20 Hz. being close to 1/4 of a 100-Hz. critical band). The wider separations in the first example are about 1/2 of a critical band and contribute much less dissonance.
It's unavoidable that multiples of two fundamental frequencies would give rise to close neighbors here and there. The special reason the closely placed harmonics that occur in the first example didn't contribute to dissonance is that they landed right on top of each other. For this to happen, the ratio between the two pitches must be an integer ratio. For instance, for the third harmonic of one tone to coincide with the second harmonic of another, the fundamentals must be in a 2:3 ratio.
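As a rough check on this picture, the following sketch lists the partials of two harmonic tones and sorts them into partials that coincide exactly and pairs that are close without coinciding. The function name `partial_pairs`, the 600 Hz. ceiling, and the 30 Hz. closeness threshold are assumptions chosen for illustration; none of them is fixed by the theory.

```python
def partial_pairs(f1, f2, max_freq=600, threshold=30):
    """Return (coincident, close) partials of two harmonic tones.

    f1, f2: fundamental frequencies in Hz (integers, so that the
    comparisons are exact). max_freq and threshold are illustrative
    assumptions, not values prescribed by the theory.
    """
    partials1 = range(f1, max_freq + 1, f1)
    partials2 = range(f2, max_freq + 1, f2)
    # Partials that land exactly on top of each other.
    coincident = [a for a in partials1 if a in partials2]
    # Pairs that are near neighbors but do not coincide.
    close = [(a, b) for a in partials1 for b in partials2
             if 0 < abs(a - b) <= threshold]
    return coincident, close

print(partial_pairs(100, 150))  # ([300, 600], [])
print(partial_pairs(100, 140))  # ([], [(300, 280), (400, 420)])
```

With these settings the 100/150 pair shows only coinciding partials, while the 100/140 pair shows the two 20-Hz. neighbors mentioned above.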
Here are the definitions of some intervals given by integer ratios between one and two (that is, within an octave), arranged from the most consonant to the most dissonant. The names are what they are for music-theoretical reasons too abstruse to explain here:
RATIO   NAME
1:1     unison
2:1     octave
3:2     fifth
4:3     fourth
5:4     major third
5:3     major sixth
8:5     minor sixth
6:5     minor third
In many situations it's a good, practical move to choose, out of the set of all possible musical pitches, a reasonably small set of pitches, called a scale, to which you would restrict yourself when writing music. One reason for this might be that instruments, such as pianos or fretted guitars, are often designed to play a discrete set of pitches out of the whole continuum. (But if we consider that vocal music predates the development of keyboard and fretted instruments, this may cease to seem a compelling reason). Another consideration might be that you would want to be able to write music down. It would be impractical to write all the pitches as numerical frequencies, so in practice (in the West as well as elsewhere) musical traditions have settled on sets of pitches, typically between 5 and 21 in an octave, out of which a working musical context might use 5 to 7 at a time. For example, the Western scale has 12 pitches per octave, and one often chooses a musical key which implies a choice of 7 out of the 12.
Now suppose we wanted to divide the octave into $N$ equal intervals to make up a musical scale. (Using equal-sized intervals sounds like a good choice; it's like using a ruler whose marks are spaced regularly along it. You could reasonably request, for instance, that the interval you hear when you play the first and third notes on the scale should be the same interval you get from the second to the fourth, or the third to the fifth, and so on.) If we call the interval between two successive pitches in the scale $h$, then the interval between the first and third is $h^2$, and so on; the whole octave is a ratio of $h^N$. Since we know an octave is a ratio of 2:1, we get
$$h^N = 2, \qquad h = 2^{1/N}.$$
Would our scale (with $N$ equal steps per octave) have an interval that approximates a fifth (ratio of 3:2)? To answer this it's best to go back to computing things in octaves. One step is $1/N$ octave. How many octaves is the interval 3:2? From section 1.3 we get the interval in octaves:
$$\log_2(3/2) \approx 0.585.$$
The best the scale can do is the whole number of steps $k$ for which $k/N$ comes closest to this value; the difference between the two is the error of the approximation. (The same can be done for the major third, which is $\log_2(5/4) \approx 0.322$ octave.)
How bad are those errors? If a critical band is taken as an 18% increment in frequency, it is then
$$\log_2(1.18) \approx 0.24$$
octave wide.
To gauge how well a scale is doing at hosting consonant intervals, we can study how its fifths and major thirds come out (the other intervals listed above can be formed from octaves, fifths, and major thirds). Here is a table of the results. The first row gives the number of steps we divide the octave into; the other two rows give the errors, in thousandths of an octave, of the fifth and major third:
division      3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
fifth error  82  85  15  82  14  40  29  15  40   2  30  14  15  22   3
third error  11  72  78  11  36  53  11  22  42  11  14  35  11   9  28

division     18  19  20  21  22  23  24
fifth error  26   6  15  14   6  20   2
third error  11   6  22  11   4  18  11
The first column that gives even half-decent approximations for the fifth and the third has 12 steps per octave. (Column 19 looks comparable and 22 even slightly better, but it would be hard to argue that the improvement is so great as to merit adding all those extra notes. Think what a piano would have looked like.)
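The entries in the table can be recomputed with a few lines of code. The sketch below assumes that each error is the distance, in octaves, from the just interval to the nearest step of an equal division of the octave, rounded to thousandths; the function name `error_thousandths` is made up for this example.

```python
import math

def error_thousandths(ratio, steps_per_octave):
    """Error, in thousandths of an octave, of the scale step nearest `ratio`."""
    target = math.log2(ratio)  # size of the just interval, in octaves
    nearest = round(target * steps_per_octave) / steps_per_octave
    return round(abs(nearest - target) * 1000)

# One line per division: steps per octave, fifth error, major third error.
for n in range(3, 25):
    print(n, error_thousandths(3 / 2, n), error_thousandths(5 / 4, n))
```

Changing the range, or passing other ratios such as 7/4, is an easy way to explore divisions and intervals that the table leaves out.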
How good or bad is the 12-steps-per-octave scale at reproducing a major third and a fifth? Well, the approximations turn out to be four and seven half-steps:
$$2^{4/12} \approx 1.2599 \quad (\text{compared with } 5/4 = 1.25), \qquad 2^{7/12} \approx 1.4983 \quad (\text{compared with } 3/2 = 1.5).$$
The fifth is off by only about 0.1%, but the major third comes out about 0.8% sharp. Whether this is a reasonable result (i.e., whether the tempered major third should be regarded as consonant) is more a matter of taste than of measurable scientific fact. Some people complain about it, and some instruments (brass and voice, for instance) have a tradition of slightly altering a pitch here and there to make thirds sound more consonant when they appear in the music.
The other consonant intervals (minor third, fourth, major and minor sixths) may be built out of combinations of the octave, fifth, and major third. For example, to go up a fourth (a 4:3 ratio), we can go up an octave (2:1) and then down a fifth (2:3). This is approximated in the Western scale by going up 12 steps (the octave) and down 7 (the fifth), or in other words, by going up 5 steps. Similarly, the minor third (6:5) is up a fifth (3:2) and down a major third (4:5); this gives 7-4=3 steps. Here is a summary of how the intervals fare in the Western scale (with the errors now reported in steps, that is, twelfths of an octave, to make them more readable):
[Table: summary of the intervals in the Western scale, with error (in steps)]
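The step counting described above can be sketched in code. In the example below each interval is given as a recipe of octaves, fifths, and major thirds to go up (positive) or down (negative); the recipes for the sixths and the names `BLOCKS`, `RECIPES`, and `build` are assumptions made for this illustration. The error is taken as the tempered size minus the just size, in steps, so a positive value means the tempered interval is wider.

```python
import math
from fractions import Fraction

# Building blocks: (just ratio, number of steps in the 12-step scale).
BLOCKS = {
    "octave": (Fraction(2), 12),
    "fifth": (Fraction(3, 2), 7),
    "major third": (Fraction(5, 4), 4),
}

# Illustrative recipes: how many of each block to go up (+) or down (-).
RECIPES = {
    "fifth": {"fifth": 1},
    "fourth": {"octave": 1, "fifth": -1},
    "major third": {"major third": 1},
    "major sixth": {"octave": 1, "fifth": -1, "major third": 1},
    "minor sixth": {"octave": 1, "major third": -1},
    "minor third": {"fifth": 1, "major third": -1},
}

def build(recipe):
    """Return (just ratio, tempered steps, error in steps) for a recipe."""
    ratio, steps = Fraction(1), 0
    for block, count in recipe.items():
        block_ratio, block_steps = BLOCKS[block]
        ratio *= block_ratio ** count
        steps += block_steps * count
    error = steps - 12 * math.log2(ratio)  # tempered minus just, in steps
    return ratio, steps, error

for name, recipe in RECIPES.items():
    ratio, steps, error = build(recipe)
    print(f"{name:12s} {ratio}  {steps:2d} steps  error {error:+.3f}")
```

Run this way, the fifth and fourth come out within about 0.02 step of their just values, while the thirds and sixths are off by roughly 0.14 to 0.16 step.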
A quick look at a piano keyboard will suggest (and a few minutes spent studying the way pitches are represented in Western musical notation will amply confirm) that the twelve steps of the octave are not considered equal; instead, they are organized in a non-uniform way that at first seems highly unintuitive.
If someone had asked me to design the piano keyboard I would have simply put the even-numbered keys on the bottom and the odd-numbered ones on top; that way, once you learned to play a piece you could quickly transpose it to another key just by moving your hands right or left. But there is deep wisdom--learned over a thousand years or so with many spilled tears and even some blood--in the layout that we now use. With the benefit of hindsight, we can see why things are as they are in a fairly simple way, and although we shouldn't forget that there are rich historical resonances here, we can cheerfully leave them for a course in music history and confine our own study to the acoustics of the situation.
Here is a diagram showing a span of 22 steps of the Western scale, as they would appear on a piano keyboard:
All the 22 pitches shown appear as evenly spaced stripes on the right-hand side of the diagram. These are the pitches of the 12-note-per-octave Western scale. For historical reasons, and somewhat over-poetically, this is called the chromatic scale. On the left-hand side of the keyboard you see only seven out of every twelve pitches; these are labeled A through G. (The labels repeat because pitches that are separated by an octave are given the same label.) They are the piano's white keys, and they comprise the diatonic scale, so named because they are (mostly) spaced two steps apart.
The seven pitches per octave that make up the diatonic scale are called naturals (as in "G natural") to distinguish them from the other five, which are called accidentals. Accidentals are named for an adjacent natural, as in "D sharp" (the pitch between D and E) or "D flat" (between C and D).
A simple description of the diatonic scale is that it consists of three major triads. A major triad is a chord (a collection of pitches) made of three pitches separated by a major third and a minor third in turn. The diagram shows the three triads: an F triad (F, A, and C), a C triad (C, E, and G) and a G triad (G, B, and D); the notes C and G are each shared by two of them. This is a natural way to choose 7 of the 12 pitches so as to have the maximum possible number of consonant intervals between them; in addition to the 3 major and 3 minor thirds, there are five pairs of pitches a fifth apart. (The other intervals are also available by reducing an octave by each of these three.)
These seven pitches (F, A, C, E, G, B, and D) can be re-ordered to get the pitches (A, B, C, D, E, F, G) that we know as the diatonic scale. By convention this scale is often arranged in the order (C, D, E, F, G, A, B, C) (repeating the C at either end of the scale); in this form it is called the C major scale.
This scale has many wonderful properties, but perhaps the most important is that, if we shift the entire thing by a fifth, we get back almost all the same notes. To see this we'll go back to its arrangement as three major triads, as in the diagram, and shift it upward in pitch by a fifth. Because we designed it as three major triads joined end to end, the first five pitches (forming the first two triads) land on other notes in the scale. Of the other two, the D shifts up to an A (you can check this by counting up 7 steps from the middle D, landing on A). The B, however, lands on F sharp, the pitch between F and G. The new, shifted scale has the pitches (C, E, G, B, D, F sharp, and A), or, in letter order, (A, B, C, D, E, F sharp, G). Going further, we can shift the scale up or down a fifth, an arbitrary number of times, by changing one pitch in the scale for each shift. Shifting a musical scale (or a chord, or an entire piece of music) by a fixed interval is called transposition.
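Transposition is easy to try out in code if pitches are numbered by their position among the twelve chromatic steps and octaves are ignored. The sketch below is one possible way to do it; the names `NAMES`, `C_MAJOR`, and `transpose` are invented for the example, and spelling every accidental as a sharp is a simplifying assumption.

```python
# The twelve chromatic pitches, one name each (accidentals spelled as sharps).
NAMES = ["C", "C sharp", "D", "D sharp", "E", "F",
         "F sharp", "G", "G sharp", "A", "A sharp", "B"]
C_MAJOR = ["C", "D", "E", "F", "G", "A", "B"]

def transpose(pitches, steps):
    """Shift each pitch name by `steps` chromatic steps, ignoring octaves."""
    return [NAMES[(NAMES.index(p) + steps) % 12] for p in pitches]

shifted = transpose(C_MAJOR, 7)  # up a fifth is 7 steps
print(shifted)                              # ['G', 'A', 'B', 'C', 'D', 'E', 'F sharp']
print(sorted(set(shifted) - set(C_MAJOR)))  # ['F sharp']  (the pitch gained)
print(sorted(set(C_MAJOR) - set(shifted)))  # ['F']        (the pitch lost)
```

Calling `transpose(C_MAJOR, -7)` instead shows the single pitch that changes when the scale is shifted down a fifth.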
Starting again with the pitch F, we now consider what happens if we repeatedly transpose it (but just F now, not the whole scale) by a fifth. We get the pitches (F, C, G, D, A, E, B, F sharp, C sharp, G sharp, D sharp, A sharp, F)--after which, being back at F, the sequence repeats. We ended up hitting each of the twelve pitches exactly once: first all the naturals, then all the accidentals. This arrangement is called the circle of fifths, and it sends music theorists into paroxysms of joy.
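The same numbering of pitches makes the circle of fifths a one-line computation: keep adding 7 steps, wrapping around at 12, until the starting pitch comes back. In this sketch the names `NAMES` and `circle_of_fifths` are assumptions made for illustration, and accidentals are again spelled as sharps.

```python
NAMES = ["C", "C sharp", "D", "D sharp", "E", "F",
         "F sharp", "G", "G sharp", "A", "A sharp", "B"]

def circle_of_fifths(start="F"):
    """Pitches visited by going up 7 steps at a time until `start` recurs."""
    index = NAMES.index(start)
    visited = []
    while True:
        visited.append(NAMES[index])
        index = (index + 7) % 12
        if NAMES[index] == start:
            return visited

circle = circle_of_fifths()
print(circle)
print(len(set(circle)))  # 12: every pitch appears exactly once
```

Every pitch gets visited because 7 and 12 have no common factor; replacing the 7 with a step size such as 4 would cycle through only three pitches.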
The Western chromatic scale is not without its discontents, who often complain about the poor accuracy of approximating major thirds as four twelfths of an octave. We can fix that if we are willing to relax the requirement that our scale have equal steps. (In fact, we could then do anything we wanted.)
Returning to the keyboard diagram above, we could simply assign frequencies to the diatonic scale so that all the marked intervals are exactly correct. An interval that is an exact ratio of integers, such as 3:2 or 5:4, is called a just interval, and the scale we then get is called a just-intoned scale. (The word intonation is music jargon for "tuning".)
To construct the just-intoned scale we figure out the frequency for each pitch as an interval from C. So for instance, the pitch A is a minor third below C, for a relative frequency of 5/6. We then raise or lower it by octaves until it resides within an octave above the original C; in this case we have to go up an octave, giving 5/3. Continuing in this way we get the just-intoned scale in C as shown:
C    D    E    F    G    A    B
1    9/8  5/4  4/3  3/2  5/3  15/8
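This construction can be carried out with exact fractions. The sketch below assumes the scale is built from the F, C, and G major triads, each with a just major third (5:4) and fifth (3:2) above its root, and folds every pitch into the octave above C; the function names are made up for the example.

```python
from fractions import Fraction

MAJOR_THIRD, FIFTH = Fraction(5, 4), Fraction(3, 2)

def fold_into_octave(ratio):
    """Raise or lower `ratio` by octaves until 1 <= ratio < 2."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

def just_scale():
    """Just-intoned scale in C, built from the F, C, and G major triads."""
    # Root of each triad relative to C: F is a fifth below, G a fifth above.
    roots = {"F": 1 / FIFTH, "C": Fraction(1), "G": FIFTH}
    triads = {"F": ("F", "A", "C"), "C": ("C", "E", "G"), "G": ("G", "B", "D")}
    scale = {}
    for root, (first, third, fifth) in triads.items():
        base = roots[root]
        scale[first] = fold_into_octave(base)
        scale[third] = fold_into_octave(base * MAJOR_THIRD)
        scale[fifth] = fold_into_octave(base * FIFTH)
    return scale

def interval(scale, low, high):
    """Ratio from `low` up to `high`, folded into one octave."""
    return fold_into_octave(scale[high] / scale[low])

scale = just_scale()
for name in "CDEFGAB":
    print(name, scale[name])
```

The `interval` helper can then be used to inspect the ratio between any two pitches of the resulting scale, for example `interval(scale, "C", "G")`.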
This is all excellent except for two things: first, there has to be a different scale starting at each key. (Even if you put in extra pitches for the accidentals, you can't get all the intervals the same and so transposing such a scale gives you a whole new set of pitches.)
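Second, certain of the intervals aren't what you'd wish for. In particular, the interval from D to A, ideally 3:2, is a thorny 40/27. If we were to move A over to fix that interval, landing it at 27/16, and then move E along with it to keep the interval from E to A at its ideal 4:3, E would land at 81/64 instead of 5/4, spoiling the major third from C to E. The ratio between these two values of E, 81/80, is known as the syntonic comma.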
Although it's easiest to relax and let the modern Western scale rule over your music, the investigation of alternative pitch scales has been, and continues to be, an exceedingly fruitful avenue for composers including Harry Partch, Alvin Lucier, Charles Dodge, John Chowning, and Rand Steiger (and certainly many others), who have found highly individual ways of organizing sets of pitches.
1. In the Western tempered scale, if A is tuned to 440 Hz., what is the frequency of the C below it?
2. What is the frequency of the same C, under the same conditions, using the just-intoned scale in C instead of the tempered one?
3. How many half-tones, in the Western tempered scale, are there between the fundamental and the seventh partial? If the fundamental is tuned to a note on the Western scale, how far is the nearest note on the scale to the seventh partial?
4. How many distinct major thirds can be formed using the 7-note diatonic scale? (Count two of them as being "the same" if they differ by an octave.)
5. What is the frequency ratio (as an exact number) between B and the next F above it in the Western tempered scale?
6. How many half-tones is the syntonic comma (as defined in Section 4.4)?
Project: How much detuning makes an interval sound sour? This project is a test of the Helmholtz theory of consonance and dissonance. The interval we'll work on is the fourth below 440 Hz. (and later, 220 Hz.)
First, using "sinusoid" objects, make a perfect fourth using the frequencies 440 and 330. You can connect them to the same "output" object so that they have the same amplitude as each other. Now drag the 330-Hz. tone down in frequency until, to your ears, the result starts to sound "sour". By how many Hz. did you have to lower the 330-Hz. tone to make it sour? (If it never sounds sour to you at all, just report that.)
Now do the same things with pulse trains. You'll need the "pulse" object which is in version 2 of the Music 170 library (a folder named m170-function-library-v2, uploaded Oct. 15) - if you have version 1 get the new one (and change Pd's path or your working directory accordingly). When you've got it updated you can type "pulse" into a box to make a pulse generator.
Make two of them, frequencies 440 and 330, with "BW" (bandwidth) set to 2000, and connect them to an "output" object as you did with the sinusoids. Now reduce the 330-Hz. one to 329. What do you hear?
Now reduce it further until it sounds sour. How many Hz. less than 330 did you have to go? Was it further away than the tempered fourth (329.628)?
One could think that the number of Hz. you have to mis-tune an interval to get sourness might be a constant or else that it might be a constant proportion (i.e., interval). To find out, repeat the experiment for 220 Hz. and 165 Hz. Again, decrease the lower frequency (165) until you think it sounds sour. How many Hz. did it take and is it more nearly the same frequency difference or the same proportion?