VokalJäger Enhanced Algorithmic Tool Box: VJ.EAT

Latest version: 0.12 as of 05/07/2020.

VJ.EAT (VokalJäger Enhanced Algorithmic Tool box) re-implements the PRAAT and R kernel algorithms from Keil 2017: Der VokalJäger: Eine phonetisch-algorithmische Methode zur Vokaluntersuchung. Exemplarisch angewendet auf historische Tondokumente der Frankfurter Stadtmundart, Deutsche Dialektgeographie Vol. 122. The core idea of VJ.EAT is to offer a one-stop solution for robust, automated formant analysis. It provides algorithms for formant trajectory calculation, formant extraction (formant picking), robust formant normalization and, finally, phonetic formant classification using machine learning. The algorithms are packaged and tested for Microsoft Windows 10 but can be adjusted for any platform. Compared with the original version, the re-implementation is faster by a factor of 10, more flexible and more robust, and rearranged for modular use.

The 0.x series is still a pre-production version, although fully functional. The final version will be published on CRAN.

In particular, VJ.EAT contains the following phonetic algorithms and phonetic calculation and analytics packages:

  • Using PRAAT, it calculates the intra-sample formant trajectories for a given sound file and a text grid that defines the samples (time ranges) to analyze. This is done for a series of upper formant ceilings.
  • Using R, it performs the so-called “sweep”. Here the formant trajectories from the preceding step are smoothed with a DTT, and formant values are extracted from a specific point on the curve. Each sample has several trajectory bundles, one per upper formant ceiling; the bundle that is optimal under a certain heuristic is chosen.
  • Using R, it performs a formant normalization. Here the formant values extracted in the preceding step are normalized using the (optionally robust) Lobanov or Gerstman procedures.
  • Using R, it classifies the phonemes (not implemented yet).
  • Using R, it creates a series of statistical analyses and formant plots.
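
To make the normalization step concrete, here is a minimal sketch of the two textbook procedures named above. It is written in Python for illustration only; VJ.EAT itself does this in R, and none of the function names below come from the package. Lobanov normalization is taken to be a per-speaker, per-formant z-score, and Gerstman normalization a rescaling of the speaker's range to 0..999. The `robust` option is an assumption: it swaps mean and standard deviation for median and scaled MAD, which may differ from the robust estimator VJ.EAT actually uses.

```python
from statistics import mean, median, stdev


def lobanov(values, robust=False):
    """Z-score one speaker's values for a single formant (hypothetical helper)."""
    if robust:
        # Assumption: median and scaled MAD replace mean and SD.
        # VJ.EAT's own robust variant is not specified here and may differ.
        center = median(values)
        scale = 1.4826 * median([abs(v - center) for v in values])
    else:
        center = mean(values)
        scale = stdev(values)  # sample SD; some implementations use population SD
    if scale == 0:
        raise ValueError("formant values have zero spread")
    return [(v - center) / scale for v in values]


def gerstman(values):
    """Rescale one speaker's values for a single formant to the range 0..999."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("formant values have zero range")
    return [999.0 * (v - lo) / (hi - lo) for v in values]


# Illustrative F1 values in Hz for one speaker (made up, not from VJ.EAT.demo).
f1_hz = [300.0, 500.0, 700.0]
print(lobanov(f1_hz))   # [-1.0, 0.0, 1.0]
print(gerstman(f1_hz))  # [0.0, 499.5, 999.0]
```

Both functions operate on one speaker and one formant at a time, so F1 and F2 are normalized separately for each speaker.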

The core algorithms: VJ.EAT.CORE

Holds all files required for the VokalJäger algorithms to work.


Download VJ.EAT algorithms ZIP package here.

Download VJ.EAT documentation PDF here.

Sample graphics produced with VJ.EAT

Simple F1 distribution statistic.
Simple F2 distribution statistic.
Measured formants of the 2 sample files in VJ.EAT.demo (left: male; right: female), created by VJ.EAT as a classical formant plot in the F1/F2 plane. The triangles correspond to the “expected” ranges of formants in the F1/F2 plane (smaller: male; larger: female).
Same measurement as above, but now overlaid by VJ.EAT with ellipses that show the expected distribution of the formants (assuming a bivariate normal distribution).
Same measurement as above, but now showing normalized formants that are “backprojected” as hypothetical androgynous speakers to the F1/F2 plane. As can clearly be seen, the range differences between male and female speakers are eliminated.
The idealized spectrum of an [u:] sample, as implied by the first 3 formants and bandwidths, calculated by VJ.EAT by evaluating the frequency response of the corresponding vocal tract transfer function with 2 times 3 poles.
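
The last caption describes evaluating an all-pole transfer function built from formants and bandwidths. A minimal sketch of that idea follows, in Python for illustration; it is not VJ.EAT's code. It assumes the common digital-resonator mapping in which each formant F with bandwidth B contributes a complex-conjugate pole pair with radius exp(-pi*B/fs) and angle 2*pi*F/fs, so 3 formants give 2 times 3 poles. The function name, the sampling rate and the [u:]-like formant values are all hypothetical.

```python
import cmath
import math


def all_pole_spectrum_db(formants_hz, bandwidths_hz, freqs_hz, fs_hz=11000.0):
    """Magnitude response in dB of an all-pole model, one pole pair per formant."""
    spectrum = []
    for f in freqs_hz:
        z_inv = cmath.exp(-2j * math.pi * f / fs_hz)  # z^-1 on the unit circle
        h = 1.0 + 0.0j
        for fk, bk in zip(formants_hz, bandwidths_hz):
            r = math.exp(-math.pi * bk / fs_hz)   # pole radius from bandwidth
            theta = 2.0 * math.pi * fk / fs_hz    # pole angle from formant frequency
            # Denominator of one second-order section (a conjugate pole pair).
            h /= 1.0 - 2.0 * r * math.cos(theta) * z_inv + r * r * z_inv * z_inv
        spectrum.append(20.0 * math.log10(abs(h)))
    return spectrum


# Illustrative values only (assumed, not measured by VJ.EAT).
formants = [300.0, 800.0, 2300.0]
bandwidths = [60.0, 80.0, 120.0]
freqs = [float(f) for f in range(50, 5001, 50)]
spec = all_pole_spectrum_db(formants, bandwidths, freqs)
peak_hz = freqs[spec.index(max(spec))]
print(f"strongest peak near {peak_hz:.0f} Hz")
```

Plotting `spec` against `freqs` gives a smooth envelope with one peak near each formant; narrower bandwidths produce sharper peaks.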

VJ.EAT.CORE is published under CC BY 4.0.

A worked example: VJ.EAT.demo

Holds an exemplary, fully operational workspace set-up with all input sound and text grid files plus all output formant files and plots. It comes with workspace-specific batch scripts and config files. Explore it and, if needed, clone it for your own tasks.

Download VJ.EAT demo ZIP package here.

VJ.EAT.demo sound files are published under Public Domain Mark 1.0, all other files under CC BY 4.0.