Legacy: CASY

CASY: Configurable Articulatory Synthesis

CASY vocal tract (via MRI of C. Best)

The Configurable Articulatory Synthesizer (CASY) is a version of the articulatory synthesis program (ASY) that lets the user superimpose an outline of our vocal tract model on an acquired sagittal image (typically an MRI). The user can then graphically adjust the model parameters to fit the dimensions of the image, and transfer functions and acoustic output can be generated from those parameters. CASY's model parameters are a superset of those in ASY. They include values that were present in the original Mermelstein model but could not be adjusted by the user. In addition, the fixed surfaces of the vocal tract are represented parametrically, so that they can be adjusted to match an arbitrary speaker's vocal tract. Finally, the parameters include the coefficients of Mermelstein's equations that determine dependencies among parts of the vocal tract geometry.

CASY also includes several improvements over the ASY model: (1) The vocal tract surfaces are computed using 2nd-order Bézier curves, with a user-specified number of Bézier segments. This defines the surfaces as shapes, independent of any set of gridlines that may be superimposed on them. (2) The outline of the soft palate and uvula is improved by the addition of an outline parameter. (3) The relationship between soft palate position and nasal coupling is made easier to control by the addition of parameters that assign maximal and minimal coupling values to particular velum heights. (4) A new model of the tongue tip eliminates many of the artifacts that appeared in ASY when the tongue tip was placed in certain configurations. (5) A new algorithm for computing the area function from a sagittal outline eliminates many of the shortcomings and quirks of the standard ASY algorithm. (6) The new algorithm also provides a graphical display of how the area is computed at each vocal tract section, and allows the user to flexibly change the parameter values of the equations that relate sagittal distance to area in various parts of the tract.
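As an illustration of item (1), the sketch below evaluates a 2nd-order (quadratic) Bézier segment and samples a chain of segments into a polyline. It is not CASY's code: the function names, the control-point layout, and the sampling scheme are assumptions made for this example. It only shows why a surface defined this way is a shape in its own right that can be sampled at any resolution, independent of a gridline system.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point, Point]  # (start, control, end); hypothetical layout


def quadratic_bezier(p0: Point, p1: Point, p2: Point, t: float) -> Point:
    """Evaluate a 2nd-order Bezier segment at t in [0, 1].

    B(t) = (1 - t)^2 * P0 + 2 * (1 - t) * t * P1 + t^2 * P2
    """
    u = 1.0 - t
    x = u * u * p0[0] + 2.0 * u * t * p1[0] + t * t * p2[0]
    y = u * u * p0[1] + 2.0 * u * t * p1[1] + t * t * p2[1]
    return (x, y)


def sample_outline(segments: List[Segment], samples_per_segment: int) -> List[Point]:
    """Sample a chain of Bezier segments into a list of outline points.

    Assumes each segment starts where the previous one ends, so the first
    sample of every segment after the first is skipped to avoid duplicates.
    """
    points: List[Point] = []
    for i, (p0, p1, p2) in enumerate(segments):
        start = 0 if i == 0 else 1
        for k in range(start, samples_per_segment + 1):
            points.append(quadratic_bezier(p0, p1, p2, k / samples_per_segment))
    return points
```

Calling `sample_outline` with a larger `samples_per_segment` gives a finer polyline of the same underlying curve, which is the sense in which the shape does not depend on the sampling grid.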

Time-varying vocal tract shapes can now be specified, and an acoustic signal calculated from them. Time-varying shapes may be specified in three ways: (a) graphical editor: a vocal tract shape specified on the screen by means of the graphical editing capabilities can be "pasted" at a particular time into a larger script; (b) ASY script and control tables: CASY can read the files used by ASY to specify dynamic articulations; (c) CASY sequence files: similar in format to the older ASY script and control tables, these files allow all of CASY's model parameters (including those not available under ASY) to be specified dynamically. An important feature of the system is that any parameters not specified in a time-varying sequence are "inherited" from the current graphical specification. With this capability it is possible, for example, to set the parameters that specify the size of the vocal tract and the outline of its fixed structures by matching an image obtained from a particular speaker, and then read in a file, not specific to that speaker, that specifies how the articulators move to produce a particular utterance.
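The inheritance rule described above can be sketched as a simple merge. This is a minimal illustration, not CASY's file format or implementation: the parameter names (`tract_scale`, `jaw_angle`, `tongue_tip_x`) are hypothetical, and the sketch assumes each frame of a sequence falls back independently to the current graphical specification for anything it does not set.

```python
from typing import Dict, List

Params = Dict[str, float]


def resolve_frames(current_spec: Params, sequence: List[Params]) -> List[Params]:
    """Fill in each frame of a sequence from the current specification.

    Values given in a frame win; everything else is inherited from
    current_spec. Neither input is modified.
    """
    return [{**current_spec, **frame} for frame in sequence]


# Hypothetical speaker-specific setup, e.g. fitted to an image.
speaker_spec = {"tract_scale": 1.1, "jaw_angle": 0.0}

# Hypothetical speaker-independent articulation sequence.
utterance = [
    {"jaw_angle": 0.2},
    {"jaw_angle": 0.4, "tongue_tip_x": 1.5},
]

frames = resolve_frames(speaker_spec, utterance)
```

Here every resolved frame carries the speaker's `tract_scale`, even though the utterance file never mentions it, which mirrors the speaker-plus-utterance workflow described in the paragraph.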

CASY can also display dynamic pellet data, that is, the positions of X-ray microbeam pellets or EMMA (electromagnetic midsagittal articulometer) receivers placed in the vocal tract. The horizontal and vertical positions of the pellets or receivers can be displayed as time functions; a point in time can then be selected graphically, and the locations of the pellets or receivers at that time are displayed in the midsagittal plane, superimposed on the model vocal tract and/or any currently displayed image. This capability permits comparison of dynamic data of this kind with movements of the model vocal tract.
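The time-selection step can be sketched as a lookup of the sample nearest to the selected time. This is only an illustration under stated assumptions: the data layout (one shared, sorted time axis and a list of (x, y) positions per pellet), the pellet names, and the nearest-sample rule are all hypothetical and not taken from CASY.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def nearest_index(times: List[float], t: float) -> int:
    """Return the index of the sample in sorted `times` closest to t."""
    if not times:
        raise ValueError("times must not be empty")
    i = bisect_left(times, t)
    if i == 0:
        return 0
    if i == len(times):
        return len(times) - 1
    # Pick whichever neighbour is closer; ties go to the earlier sample.
    return i if times[i] - t < t - times[i - 1] else i - 1


def pellets_at(times: List[float], tracks: Dict[str, List[Point]], t: float) -> Dict[str, Point]:
    """Return the midsagittal (x, y) position of every pellet at time t."""
    i = nearest_index(times, t)
    return {name: positions[i] for name, positions in tracks.items()}
```

The returned positions are what would be drawn over the model outline or image; interpolating between samples instead of snapping to the nearest one would be an equally plausible design.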
