As introduced in this post, I am currently exploring novel ways of representing 3D Vocal Tract [VT] geometries in 2D, by means of extending the base concept of area function.
This is one of the many parallel branches of the project, as important as perfecting our 2D flow model [my main duty] and the excitation models [Arvind’s work].
So in mid-May I was very happy to start supervising a new student who showed interest in 3D-to-2D VT mapping.
Her name is Anna Zietlow, an undergrad from UBC Cognitive Systems; she has a good background in programming and she decided to join us to develop the research project required for her [last] summer course.
I have to say she has done a very good job so far and here are some details.
Anna is working on ArtiSynth [awesome 3D modelling platform proudly made at UBC], to export 3D VTs as a 2D contour that contains information about:
- cross-sectional area [aka area function style]
- curvature of the tract and intrinsic asymmetries
- lobes and parallel tracts [e.g., piriform fossae (scientific image) and valleculae (less scientific image)]
So far she has hit the first two points; here is the workflow.
We started working on a small portion of a VT 3D scan, the junction between the glottal laryngeal cavity and the pharynx. I’ll drop some images:
This nice guy [or “spaceship”, as we like to call it] sits vertically right at the end of the larynx. It is short enough to allow for quick geometrical computation, but it includes all those features that make handling the biomechanics a difficult task [highly asymmetric, several forks, etc.].
Anna started by tracing a line that runs through the center of the spaceship, as precisely as possible. She then used an ArtiSynth class that automatically slices the volume of the spaceship, sweeping planes normal to the central line at a constant parametric distance. This class automatically discards the intersections with any side branches and returns the areas of the cross sections of the main cavity, i.e., the area function of the geometry. Here is a screenshot of the central line of the spaceship and the resulting area function:
At this point, Anna designed a new class [that extends the one previously used] capable of keeping track of the curvature of the shape it is fed with and of combining this piece of information with the area function. The output is a 2D contour of the main cavity, where at each sample point the distance between the lower and the upper contour is equal to the diameter of the circle whose area exactly matches the area function at that abscissa. And this is the result on the spaceship:
These curves are really interesting, as they are arguably more similar to the original shape than the mere area function. The curvature also makes it possible to preserve some intrinsic asymmetries of the main cavity.
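To make the construction a bit more concrete, here is a minimal Python sketch of the last step only. It is not Anna's ArtiSynth class: it assumes the central line has already been flattened into 2D points, and the function names [`equivalent_diameter`, `contour_from_centerline`] are made up for illustration. At each sample it places the two contour points half an equivalent diameter away from the central line, along the local normal.

```python
import numpy as np

def equivalent_diameter(area):
    """Diameter of the circle whose area equals the given cross-sectional area."""
    return 2.0 * np.sqrt(np.asarray(area, dtype=float) / np.pi)

def contour_from_centerline(centerline_2d, areas):
    """Offset a flattened 2D central line by +/- d/2 along its normals.

    centerline_2d: (N, 2) sample points of the (assumed already flattened) central line
    areas:         (N,)   area function sampled at the same points
    """
    pts = np.asarray(centerline_2d, dtype=float)
    # Tangents by finite differences along the polyline, then normalised.
    tangents = np.gradient(pts, axis=0)
    tangents /= np.linalg.norm(tangents, axis=1, keepdims=True)
    # Rotate each tangent by 90 degrees to get the in-plane normal.
    normals = np.column_stack((-tangents[:, 1], tangents[:, 0]))
    half = 0.5 * equivalent_diameter(areas)[:, None]
    upper = pts + half * normals
    lower = pts - half * normals
    return upper, lower

# Toy check: a straight central line with circular sections of radius 1.
xs = np.linspace(0.0, 4.0, 5)
centerline = np.column_stack((xs, np.zeros_like(xs)))
areas = np.full(5, np.pi)
upper, lower = contour_from_centerline(centerline, areas)
# upper sits at y = +1 and lower at y = -1, i.e. one diameter apart
```

With a straight central line this collapses to the usual 2D area function; bending the central line is what carries the curvature into the contour.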
Despite the simplicity of the result, implementing the algorithm was no easy task!
Anna had to play around with quite a bit of trigonometry in 3D, and then flatten everything out while trying to preserve as much information as possible.
Also, the cross sections of the spaceship [and in general of any VT] often resemble a very skewed ellipse more than a perfect circle. This means that, when we represent the area as a circle [i.e., using a diameter as distance between the 2 contour lines], diameters at neighbouring sample points may cross each other and form spurious profiles in the contour.
You can see one of these artefacts in the lower curve in the picture. One possible fix is to shift the central line along that sample point, but more tests need to be done to understand the consequences on the rest of the shape.
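A quick way to spot these artefacts automatically is to check whether the diameters drawn at two consecutive sample points intersect. The snippet below is only a sketch of that check in Python, under the assumption that the contour is available as two lists of 2D points; the names `segments_cross` and `crossing_samples` are hypothetical and not part of ArtiSynth.

```python
def _cross(o, a, b):
    """z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def segments_cross(p1, p2, q1, q2):
    """True if segment p1-p2 properly intersects segment q1-q2."""
    d1 = _cross(q1, q2, p1)
    d2 = _cross(q1, q2, p2)
    d3 = _cross(p1, p2, q1)
    d4 = _cross(p1, p2, q2)
    # Each segment's endpoints must lie strictly on opposite sides of the other.
    return (d1 * d2 < 0) and (d3 * d4 < 0)

def crossing_samples(upper, lower):
    """Indices i where the diameter at sample i crosses the one at sample i + 1."""
    return [
        i
        for i in range(len(upper) - 1)
        if segments_cross(lower[i], upper[i], lower[i + 1], upper[i + 1])
    ]

# Toy contour: the second diameter is tilted enough to cross the first one.
upper = [(0.0, 1.0), (-0.5, 1.0), (2.0, 1.0)]
lower = [(0.0, -1.0), (0.5, -1.0), (2.0, -1.0)]
bad = crossing_samples(upper, lower)  # [0]
```

The flagged indices only say where the contour folds over itself; deciding how to repair it [shifting the central line, shrinking the diameter, etc.] is a separate question.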
So, what’s next?
First, we need to test this, and to do so the algorithm should run on a whole vocal tract, possibly one that comes with a known frequency response. At that point we will export both this curved contour and a straight one [aka 2D area function] and compare the resulting acoustics of the two. This is the only way to verify whether this extra processing on the geometry benefits our articulatory synthesizer in any way.
Also, as I highlighted before, the starting class we are using discards the cross-sectional information from the parallel branches. This is handy to shape the main tract in 2D, but does not allow us to remap the piriform fossae and the valleculae. These cavities have remarkable effects on voice production, as studied in this paper. Anna is already working on a new version of the class that still separates the contributions from the different tracts but, instead of ignoring the additional cross sections, combines them into the main 2D profile.
Go Anna Go!