We have made a simple test of these ideas, in which the synthesis technique is a phase-vocoder-based time-base correction of a forty-second sample of speech. Two such samples were used: the voice of a well-known politician, and a short vocal improvisation by Trevor Wishart. Each was tested as a control of itself (which worked) and then as a control of the other. A third control source, a Zeta violin, was also tested, using the politician as output.
Using Pd, each corpus described above was analyzed in 30-millisecond frames, yielding about 1300 analyses. The sample correlations among the eleven channels were measured using Octave.
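As a rough illustration of the correlation measurement, the following Python sketch computes a sample correlation matrix for an 11-channel corpus of about 1300 frames. It is not the Octave code that was used. The data here are synthetic stand-ins, and the function name correlation_matrix is hypothetical.

```python
import math
import random

def correlation_matrix(frames):
    """Sample correlation between channels; frames is a list of equal-length rows."""
    n = len(frames)
    k = len(frames[0])
    means = [sum(f[c] for f in frames) / n for c in range(k)]
    centered = [[f[c] - means[c] for c in range(k)] for f in frames]
    # Unbiased sample covariance between every pair of channels.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(k)]
           for i in range(k)]
    std = [math.sqrt(cov[i][i]) for i in range(k)]
    return [[cov[i][j] / (std[i] * std[j]) for j in range(k)] for i in range(k)]

# Synthetic stand-in for a corpus: 1300 frames of 11 channels that share a
# common component, so the channels are strongly correlated (an assumption
# made only so the example has something to measure).
random.seed(0)
frames = []
for _ in range(1300):
    base = random.gauss(0, 1)
    frames.append([base + random.gauss(0, 0.5) for _ in range(11)])

corr = correlation_matrix(frames)
```

The diagonal of corr is 1 by construction, and large off-diagonal entries indicate channels that carry redundant information, which is the motivation for decorrelating them before any lookup.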
A Pd extern, searchvec, was written to take real-time 11-channel timbre estimates from the bonk~ object, decorrelate the 11 channels, and look the result up in the database of analyses of the target sound. The Pd phase vocoder (FFT example 10.phaselockedvoc.pd in the Pd distribution) was used to resynthesize output. The resulting instrument can be played live from the violin, or by playing back either of the two voices.
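The decorrelate-and-look-up step can be sketched in Python as follows. This is not the searchvec source. It assumes that decorrelation is a whitening transform derived from a Cholesky factor of the target's covariance matrix, that the lookup is an exhaustive Euclidean nearest-neighbor search, and that the control vector is whitened with the target's statistics. None of these details is stated above. All function names and the synthetic data are hypothetical.

```python
import math
import random

def mean_and_cov(frames):
    """Per-channel mean and unbiased sample covariance of a list of frames."""
    n, k = len(frames), len(frames[0])
    mean = [sum(f[c] for f in frames) / n for c in range(k)]
    cov = [[sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in frames) / (n - 1)
            for j in range(k)] for i in range(k)]
    return mean, cov

def cholesky(a):
    """Lower-triangular factor low such that low * low^T equals a."""
    k = len(a)
    low = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def whiten(vec, mean, low):
    """Decorrelate one frame by solving low * y = vec - mean (forward substitution)."""
    y = []
    for i in range(len(vec)):
        s = sum(low[i][m] * y[m] for m in range(i))
        y.append((vec[i] - mean[i] - s) / low[i][i])
    return y

def nearest_frame(query, database):
    """Index of the database row closest to query in squared Euclidean distance."""
    best_index, best_dist = 0, float("inf")
    for index, row in enumerate(database):
        dist = sum((a - b) ** 2 for a, b in zip(query, row))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

# Synthetic stand-in for the target corpus: 1300 correlated 11-channel frames.
random.seed(1)
target = []
for _ in range(1300):
    base = random.gauss(0, 1)
    target.append([base + random.gauss(0, 0.5) for _ in range(11)])

mean, cov = mean_and_cov(target)
low = cholesky(cov)
database = [whiten(f, mean, low) for f in target]

# A control frame that is a slightly perturbed copy of target frame 500.
control = [x + 0.001 for x in target[500]]
frame_index = nearest_frame(whiten(control, mean, low), database)
# frame_index * 0.030 would then be the read position, in seconds, in the target soundfile.
```

With these synthetic data the lookup should land on frame 500, since the control vector is nearly identical to that frame. In the instrument described here, the chosen frame index would set the phase vocoder's read location in the target sound.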
To maximize responsiveness, no attempt was made to make the output continuous. As a result, the resynthesis jumps frequently from one place to another in the soundfile.
Especially when one voice controlled the other, but at least somewhat when the violin was the controller, the shape of the controlling sound could be heard in the output. The timbral nature of the output remained audibly that of the resynthesis sample. Not surprisingly, no semblance of phonetic intelligibility remained in the output.