

Pitch shifting

A favorite use of variable delay lines is to alter the pitch of an incoming sound using the Doppler effect. It may be desirable to alter the pitch in a variable way (either randomly or periodically, for example), or alternatively, to maintain a fixed musical interval of transposition over a length of time.

Returning to Figure 7.17, we see that, using a single variable delay line, we can maintain any desired pitch shift for a limited interval of time, but if we wish to sustain a fixed transposition we will always eventually land outside the diagonal strip of admissible delay times. In the simplest scenario, we simply vary the transposition up and down so as to remain in the strip.

Figure 7.19: Vibrato using a variable delay line. Since the pitch shift alternates between upward and downward, it is possible to maintain it without drifting outside the strip of admissible delay.
\begin{figure}\psfig{file=figs/fig07.19.ps}\end{figure}

This works, for example, if we wish to apply vibrato to a sound as shown in Figure 7.19. Here the delay function is

\begin{displaymath}
d[n] = {a_0} + a \cos(\omega n)
\end{displaymath}

where $a_0$ is the average delay, $a$ is the amplitude of variation about the average delay, and $\omega $ is an angular frequency. The momentary transposition is one minus the change in delay per sample, so it depends on the sample number $n$ and is approximately

\begin{displaymath}
t = 1 + a \omega \sin(\omega n)
\end{displaymath}

This ranges in value between $1 - a \omega$ and $1 + a \omega$.
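The range can be checked numerically. The following is a minimal sketch, assuming the transposition is taken as one minus the per-sample change in delay (the same slope argument used later in this section); the function names and the parameter values (sample rate, vibrato rate, delay depth) are illustrative assumptions, not values from the text.

```python
import math

def vibrato_delay(n, a0, a, omega):
    """Delay in samples at sample n: d[n] = a0 + a*cos(omega*n)."""
    return a0 + a * math.cos(omega * n)

def momentary_transposition(n, a0, a, omega):
    """One minus the change in delay from sample n to sample n+1."""
    step = vibrato_delay(n + 1, a0, a, omega) - vibrato_delay(n, a0, a, omega)
    return 1.0 - step

# Illustrative values (assumptions): 44100 Hz sample rate, 6 Hz vibrato,
# 10 ms average delay, 1 ms of variation about the average.
R = 44100.0
omega = 2.0 * math.pi * 6.0 / R
a0 = 0.010 * R
a = 0.001 * R

# Scan one full vibrato period and record the extremes.
period = int(R / 6.0)
ts = [momentary_transposition(n, a0, a, omega) for n in range(period)]
t_min, t_max = min(ts), max(ts)
```

With these values $a \omega$ is about 0.038, so `t_min` and `t_max` should sit very close to $1 - a\omega$ and $1 + a\omega$. Since $a_0 - a$ stays positive, the delay never leaves the admissible strip.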

Figure 7.20: Piecewise linear delay functions to maintain a constant transposition (except at the points of discontinuity). The outputs are enveloped as suggested by the bars above each point, to smooth the output at the points of discontinuity in delay time.
\begin{figure}\psfig{file=figs/fig07.20.ps}\end{figure}

Suppose, on the other hand, that we wish to maintain a constant transposition over a longer interval of time. In this case we can't maintain the transposition forever, but it is still possible to maintain it over fixed intervals of time broken by discontinuous changes, as shown in Figure 7.20.

The delay time is the output of a suitably normalized sawtooth function, and the output of the variable delay line is enveloped as suggested in the figure to avoid discontinuities.

Figure 7.21: Using a variable delay line as a pitch shifter. The sawtooth wave creates a smoothly increasing or decreasing delay time. The output of the delay line is enveloped to avoid discontinuities. Another copy of the same diagram should run 180 degrees ($\pi $ radians) out of phase with this one.
\begin{figure}\psfig{file=figs/fig07.21.ps}\end{figure}

This is accomplished as shown in Figure 7.21. The output of the sawtooth generator is used in two ways. First it is adjusted to run between the bounds ${d_0}$ and ${d_0}+w$, and this adjusted sawtooth controls the delay time, in samples. The initial delay $d_0$ should be at least enough to make the variable delay feasible; for four-point interpolation this must be at least one sample. Larger values of $d_0$ add a constant, additional delay to the output; this is usually offered as a control in a pitch shifter since it is essentially free. The quantity $w$ is sometimes called the window size.

The sawtooth output is also used to envelope the output in exactly the same way as in the enveloped wavetable sampler of Figure 2.7. The envelope is zero at the points where the sawtooth wraps around, and in between, rises smoothly to a maximum value of 1 (for unit gain).

If the frequency of the sawtooth wave is $f$ (in cycles per second), then its value sweeps from 0 to 1 every $R/f$ samples (where $R$ is the sample rate). The difference between successive samples is thus $f/R$. If we let $x[n]$ denote the output of the sawtooth oscillator, then

\begin{displaymath}
x[n+1] - x[n] = {f \over R}
\end{displaymath}

(except at the wraparound points). If we scale the output of the sawtooth oscillator to span a range of $w$ (as is done in the figure) we get a new slope:

\begin{displaymath}
w \cdot x[n+1] - w \cdot x[n] = {{wf} \over R}
\end{displaymath}

Adding the constant $d_0$ has no effect on this slope. The momentary transposition is then:

\begin{displaymath}
t = 1 - {{wf} \over R}
\end{displaymath}
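The slope argument can be checked numerically. The following is a minimal sketch, assuming an ideal sawtooth computed directly from the sample index; the function names and the parameter values are illustrative assumptions and do not come from the text.

```python
def sawtooth(n, f, R):
    """Sawtooth phase in [0, 1) at sample n, for frequency f in Hz."""
    return (n * f / R) % 1.0

def delay_time(n, f, R, d0, w):
    """Delay in samples: the sawtooth rescaled to run from d0 to d0 + w."""
    return d0 + w * sawtooth(n, f, R)

# Illustrative values (assumptions): 50 ms window at 44100 Hz, 2 Hz sawtooth.
R = 44100.0
f = 2.0
w = 2205.0
d0 = 1.0

# Slope of the delay between two samples away from a wraparound point.
slope = delay_time(101, f, R, d0, w) - delay_time(100, f, R, d0, w)
t = 1.0 - slope   # momentary transposition, 1 - w*f/R
```

Here $wf/R = 0.1$, so the transposition comes out as 0.9, slightly less than a whole tone down. Changing `d0` leaves `slope` and `t` unchanged.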

To complete the design of the pitch shifter we must add the other copy halfway out of phase. This gives rise to a delay reading pattern as shown in Figure 7.22.

Figure 7.22: The pitch shifter's delay reading pattern using two delay lines, so that one is at maximum amplitude exactly when the other is switching.
\begin{figure}\psfig{file=figs/fig07.22.ps}\end{figure}

The pitch shifter can transpose either upward (using negative sawtooth frequencies, as in the figure) or downward, using positive ones. Pitch shift is usually controlled by changing $f$ with $w$ fixed. To get a desired transposition interval $t$, set

\begin{displaymath}
f = {{(1 - t) R} \over w}
\end{displaymath}

The window size $w$ should be chosen small enough, if possible, so that the two delayed copies ($w/2$ samples apart) do not sound as distinct echoes. However, very small values of $w$ will force $f$ upward; values of $f$ greater than about 5 cycles per second (in absolute value) result in very audible modulation. So if very large transpositions are required, the value of $w$ may need to be increased. Typical values range from 30 to 100 milliseconds (about $R/30$ to $R/10$ samples).
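The trade-off between $w$ and $f$ can be made concrete by solving $t = 1 - wf/R$ for $f$. The following is a small sketch under stated assumptions: the helper names are hypothetical, the conversion from semitones assumes equal temperament, and the 5 Hz ceiling is the rough figure quoted above rather than a hard limit.

```python
def semitones_to_ratio(semitones):
    """Equal-tempered interval as a frequency ratio t (hypothetical helper)."""
    return 2.0 ** (semitones / 12.0)

def sawtooth_frequency(t, w, R):
    """Sawtooth frequency f in Hz giving transposition t, from t = 1 - w*f/R."""
    return (1.0 - t) * R / w

def min_window(t, R, f_max=5.0):
    """Smallest window w, in samples, keeping |f| at or below f_max Hz."""
    return abs(1.0 - t) * R / f_max

R = 44100.0
w = 0.050 * R                     # 50 ms window, in samples
t = semitones_to_ratio(7)         # up a fifth
f = sawtooth_frequency(t, w, R)   # negative, since the shift is upward
w_needed = min_window(t, R)       # window that would bring |f| down to 5 Hz
```

For a fifth upward with a 50 ms window, `f` is close to -10 Hz, well past the point where modulation becomes audible. `w_needed` works out to roughly 4400 samples, or about 100 milliseconds, at the top of the typical range.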

Although the frequency may be changed at will, even discontinuously, $w$ must be handled more carefully. The most common choice is to mute the output while changing $w$ discontinuously; alternatively, $w$ may be ramped continuously but this causes hard-to-predict Doppler shifts.

The choice of envelope is usually one half cycle of a sinusoid. If we assume on average that the two delay outputs have neither positive nor negative correlation, the signal power from the two delay lines, after enveloping, will add to a constant (since the sum of squares of the two envelopes is one).
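Putting the pieces together, the following is a minimal offline sketch of the two-copy design, not a definitive implementation. It assumes linear interpolation in place of four-point interpolation (for brevity), processes a whole list at once rather than a circular delay buffer, and uses hypothetical function names. The sawtooth frequency is obtained from $t = 1 - wf/R$.

```python
import math

def envelope(phase):
    """Half cycle of a sinusoid: 0 at the wraparound, 1 at phase 0.5."""
    return math.sin(math.pi * phase)

def pitch_shift(x, t, w, R, d0=1.0):
    """Transpose the list x by ratio t using two enveloped delay taps.

    w is the window size in samples, R the sample rate, and d0 the
    minimum delay in samples (must be at least 1 here). Linear
    interpolation is an assumption made to keep the sketch short.
    """
    f = (1.0 - t) * R / w               # sawtooth frequency in Hz
    y = []
    for n in range(len(x)):
        out = 0.0
        for offset in (0.0, 0.5):       # two copies, half a cycle apart
            phase = (n * f / R + offset) % 1.0
            pos = n - (d0 + w * phase)  # fractional read position
            if pos >= 0.0:              # nothing to read before the input
                i = int(pos)
                frac = pos - i
                sample = (1.0 - frac) * x[i] + frac * x[i + 1]
                out += envelope(phase) * sample
        y.append(out)
    return y

# Usage: shift a 400 Hz sine up by the ratio 3/2 (illustrative values).
R = 44100.0
x = [math.sin(2.0 * math.pi * 400.0 * n / R) for n in range(4410)]
y = pitch_shift(x, t=1.5, w=2205.0, R=R)

# The two envelopes are sin and cos of the same angle, so their
# squares sum to one at every phase.
power = [envelope(p / 100.0) ** 2 + envelope((p / 100.0 + 0.5) % 1.0) ** 2
         for p in range(100)]
```

With `t = 1` the sawtooth stops, one copy sits at zero gain and the other at unit gain, and the output reduces to the input delayed by $d_0 + w/2$ samples.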

Many variations exist on this pitch shifting algorithm. One widely used variant is to use a single delay line, with no enveloping at all. In this situation it is necessary to choose the point at which the delay time jumps, and the point it jumps to, so that the output stays continuous. For example, one could wait for the output signal to pass through zero (a ``zero crossing'') and jump discontinuously to another one. Using only one delay line has the advantage that the signal output sounds more ``present''. A disadvantage is that, since the delay time is a function of input signal value, the output is no longer a linear function of the input.
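One ingredient of the zero-crossing variant, choosing where to jump, can be sketched as follows. This is only an illustration under assumptions: both helpers are hypothetical, only upward-going crossings are considered (so the jump lands where the signal is heading in the same direction), and the policy of taking the crossing nearest a target position is one possible choice among many.

```python
import math

def rising_zero_crossings(x):
    """Indices i where the signal goes from negative to non-negative."""
    return [i for i in range(len(x) - 1) if x[i] < 0.0 <= x[i + 1]]

def nearest_crossing(crossings, target):
    """The crossing index closest to a desired read position."""
    return min(crossings, key=lambda i: abs(i - target))

# A sine with a period of 100 samples, offset by half a sample so that
# no sample falls exactly on zero.
x = [math.sin(2.0 * math.pi * (n + 0.5) / 100.0) for n in range(500)]
crossings = rising_zero_crossings(x)
jump_to = nearest_crossing(crossings, 250)
```

Because the candidate positions depend on the signal itself, scaling or summing inputs changes where the jumps occur, which is the nonlinearity noted above.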


Miller Puckette 2006-03-03