Employing all the techniques described in the preceding phonetics section, it is possible to analyze large speech corpora with the VokalJäger in a fully automated way. This was done in Keil (2018, p. 112-165) for the Kiel PHONDAT Corpus, which serves as a reference for High German [Kohler 1994].

The results show the patterns expected for High German. The diagram below documents the distribution of the formants in the F1/F2 plane as measured with the VokalJäger. The plotted ellipses are idealized areas containing 50% of the respective measurements. Note that the values are shown after normalization and are re-projected to a hypothesized androgynous speaker of High German; the chart therefore pools the data of both genders. The patterns align well with those published in other studies, most notably Pätzold, Simpson (1997), who also investigated the Kiel PHONDAT Corpus [Keil 2017, p. 159-164]. There is one exception to the expected pattern: a centralization of [uː], or alternatively a backing of [oː]. This may be related to the words selected in the Kiel corpus, and thus to the consonant contexts of those utterances [Keil 2017, p. 150; Pätzold, Simpson 1997, p. 13].

F1/F2 formant distribution for the High German Kiel Phondat corpus [Keil 2017, figure 56, p. 151; colored version]
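How such a 50% ellipse can be derived from a cloud of (F1, F2) measurements is sketched below. This is a minimal illustration under the assumption that the points of one vowel are roughly bivariate normal; the construction actually used for the figure is not stated here, and the function name `ellipse_50` is hypothetical.

```python
import math
from statistics import fmean


def ellipse_50(f1, f2):
    """Return centre, semi-axes and major-axis angle of a 50% ellipse.

    Assumption (not from the source): the (F1, F2) points are roughly
    bivariate normal. In that case the ellipse with squared Mahalanobis
    radius 2*ln(2) contains half of the probability mass.
    """
    n = len(f1)
    m1, m2 = fmean(f1), fmean(f2)
    # Sample covariance matrix [[s11, s12], [s12, s22]]
    s11 = sum((x - m1) ** 2 for x in f1) / (n - 1)
    s22 = sum((y - m2) ** 2 for y in f2) / (n - 1)
    s12 = sum((x - m1) * (y - m2) for x, y in zip(f1, f2)) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form
    half_trace = (s11 + s22) / 2
    root = math.hypot((s11 - s22) / 2, s12)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)  # guard against rounding below zero
    scale = math.sqrt(2 * math.log(2))  # about 1.177 standard deviations
    # Angle of the major axis relative to the F1 axis, in radians
    angle = 0.5 * math.atan2(2 * s12, s11 - s22)
    semi_axes = (scale * math.sqrt(lam_major), scale * math.sqrt(lam_minor))
    return (m1, m2), semi_axes, angle
```

The returned centre, semi-axes and angle can be passed to any plotting routine that draws rotated ellipses. For strongly skewed formant distributions, the normal assumption and hence the 50% coverage would only hold approximately.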


For reference, the measured High German formant values are documented in the following tables, which list key formant statistics: median F1/F2 plus lower and upper quartiles. The first table below documents the original F1/F2 values by gender, before any normalization was applied:

The second table below shows the F1/F2 values after normalization (these are the values underlying the F1/F2 plot above). The results are re-projected (German: “rückprojiziert”) to the vowel space occupied by both sexes. Hence the values correspond to those of a hypothesized androgynous speaker of High German, and there is no distinction between male and female speakers [Keil 2017, p. 92].
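The summary statistics listed in both tables (median plus lower and upper quartile) can be computed per vowel and formant along the following lines. This is only a sketch: the quartile interpolation method used for the tables is not stated here, so the "inclusive" method of Python's standard library is an assumption, and the function name and sample values are purely illustrative, not measured data.

```python
from statistics import quantiles


def formant_summary(values_hz):
    """Median and quartiles of one formant for one vowel.

    Assumption: 'inclusive' quartile interpolation; other conventions
    give slightly different quartiles for small samples.
    """
    q1, med, q3 = quantiles(values_hz, n=4, method="inclusive")
    return {"lower_quartile": q1, "median": med, "upper_quartile": q3}


# Illustrative F1 values in Hz (made up for this example)
print(formant_summary([300, 320, 340, 360, 380]))
```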

Many more results can be found in chapter 4 of Keil (2017), which is available here in its entirety.


With a fully calibrated set of binary classifiers, the VokalJäger is equipped to measure floating phonetic feature values ζ in speech signals.

The chart below shows the result of an experiment: floating phonetic feature values ζ are measured for the openness of back vowels and the roundness of front vowels within the data on which the VokalJäger was calibrated, the Kiel PHONDAT Corpus (here: panel “HG1”) [Kohler 1994]. The results are as expected: apart from some statistical variation, the ζ values match what one would expect from the underlying vowels and their elementary phonetic feature values.

Values ζ for the floating phonetic feature openness of back vowels on the High German Kiel Corpus (left) and on Frankfurterisch recordings from the REDE material (right). ζ=4 represents open, ζ=1 closed [Keil 2017, figure 85, p. 228; colored version].

Of particular interest is the reaction of the VokalJäger to vowels on which the algorithm was not calibrated. For example, the sound [a] was omitted from training but shows exactly the same behavior as the sound [a:], on which the VokalJäger was trained, as expected from phonetic experience.

Another data set was fed into the VokalJäger for confirmation (here: panel “DFRA”). These are dialect recordings from the REDE project, which had been phonetically transcribed [Schmidt, Herrgen, Kehrein 2008 f.]. Two challenges could thus be tested: first, can the VokalJäger, which was calibrated on High German, deal with dialect signals; second, does it produce the ζ values one would expect from the transcriptions? With the exception of the U-sounds, whose deviation may have to do with their representation in the Kiel Corpus, the results are as expected.


A similar experiment was conducted with the floating phonetic feature roundness of front vowels. The results are encouraging as well: the ζ values settle at the expected values:

Values ζ for the floating phonetic feature roundness of front vowels on the High German Kiel Corpus (left) and on Frankfurterisch recordings from the REDE material (right). ζ = 1 represents not-round, ζ = 2 round [Keil 2017, figure 85, p. 228; colored version].

The VokalJäger was calibrated in this setting on the long front vowels, so the results are most accurate for long vowels. The short front vowels deviate noticeably from their long counterparts but still lie within the ζ bands one would deem round (ζ > 1.5) or not-round (ζ < 1.5).

Consequently, one may conclude that the VokalJäger was indeed decently calibrated on High German.
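The band check described above amounts to a simple threshold rule on ζ. The following sketch illustrates it; the function names are hypothetical, the sample values are made up, and the treatment of a value exactly at 1.5 is an assumption, since the text leaves that case open.

```python
def rounding_band(zeta, boundary=1.5):
    """Map a roundness value zeta (1 = not-round, 2 = round) to a band."""
    if zeta > boundary:
        return "round"
    if zeta < boundary:
        return "not-round"
    return "undecided"  # exactly on the boundary (assumption)


def share_in_band(zetas, expected):
    """Fraction of measurements that fall into the expected band."""
    return sum(rounding_band(z) == expected for z in zetas) / len(zetas)


# Made-up zeta values for a short rounded front vowel
print(share_in_band([1.6, 1.9, 1.4, 1.7], "round"))
```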


The key target use case of the VokalJäger is to measure phonetic distances Δζ between phonologically assembled contrast groups. This section describes how this approach was used to track phonetic and phonological language development in the German dialect of Frankfurt, Frankfurterisch (for more details on Frankfurterisch, visit the sister website). Examples and summaries are printed below; for a comprehensive documentation of the experiment and the corresponding phonological reasoning concerning Frankfurterisch, please refer to Keil (2017, p. 231–432).


Around 1937, German dialect recordings were made by the researcher Bernhard Martin from Marburg. The resulting set of 300 records, each about 3 minutes in length, was assembled to form the Lautdenkmal reichsdeutscher Mundarten (German, literally: Sound monument of the dialects of the German Reich) and handed to the “Führer” Adolf Hitler as a birthday present. The recordings are problematic due to their fascist content but represent the “earliest area-wide” mass recording of German dialects [Schmidt, Herrgen 2011, p. 117].

The Lautdenkmal recordings have now been digitized and are available for linguistic research. The project is run by Christoph Purschke with the website [Purschke 2012; Purschke 2014 f.]. Christoph provided the Lautdenkmal recordings for the VokalJäger analysis employed in my PhD thesis and documented here (thanks!).

Luckily, a Lautdenkmal recording exists for Frankfurt, henceforth the Frankfurt Lautdenkmal. It was recorded as number MD-17 on April 22, 1937. A 38-year-old fascist praises Hitler’s seizure of power (the “Machtergreifung” of 1933). This Frankfurt Lautdenkmal tape constitutes the oldest recording of true historic Frankfurterisch in decent quality. The Frankfurt Lautdenkmal was analyzed with the VokalJäger alongside other Lautdenkmal recordings, most notably from Klein-Gerau. Here the Frankfurt Lautdenkmal is referenced as FraLD and the Klein-Gerau ones as KG1/KG2.


Within the REDE project, the Akademievorhaben (REDE) [Schmidt, Herrgen, Kehrein 2008 f.], several recordings associated with Frankfurt have been produced from the 2000s onward. They are labelled FraF1L to FraF4L and FraFJ1 for the “younger” speakers, while FraFA1L and FraFA2L represent “older” speakers. These recordings document a more recent state of Frankfurterisch, Neu-Frankfurterisch.


One of the most defining characteristics of historic Frankfurterisch is the prevalence of a “dark A”, a velar, back-shifted A-sound similar to [ɔ] but with less lip rounding and some A-like quality. A-sounds that originate from old long A-s, denoted â in Middle High German and labelled here Â or “old-long” (“altlang”), are prominent examples. For full phonological details concerning â in Frankfurterisch, please refer to Keil (2017, p. 306–318).

If one now measures old-long Â-s with the VokalJäger, one gets the ζ values for the floating phonetic feature openness of back vowels documented in the chart below. A striking pattern can be observed: for the Frankfurt Lautdenkmal FraLD one measures a ζ of 3.3, while for all of the High German Kiel Corpus speakers (K07MR-K10MR) and most of the more recent REDE speakers (right side in the charts) ζ ranges near 4. A ζ of 3 corresponds to an open [ɔ], whereas a ζ of 4 represents the neutral [a].

Hence one may suppose that the dark A, clearly measurable in the historic Frankfurterisch of 1937, is virtually non-existent in modern Frankfurterisch.

ζ values for old-long A-s – floating phonetic feature openness of back vowels: ζ = 4 corresponds to maximally open, ζ = 1 to maximally closed [Keil 2017, picture 98, p. 316; colored version].

To test the hypothesis that the old dark A has disappeared, which is already fairly obvious from the floating phonetic feature value measurements, one has to conduct a Δζ feature distance analysis. This is depicted in the next chart. One observes, for example, that for the Frankfurt Lautdenkmal there is a difference of Δζ = 0.7 between the contrast groups of old-long Â-s and short A-s: the Frankfurt speaker from 1937 significantly separates (at the 99% level: **) the sounds of long Â-s (O-like dark A-s with a ζ of 3.3) from those of short A-s (neutral A with a ζ of 4). That separation, indicated by different ζ values and hence a Δζ in excess of 0, exists neither for the High German speakers nor for most of the REDE tapes (for example K07MR or FraFA2L, both with Δζ = 0).

Δζ values within contrast groups of old-long A-s on one side and neutral A, short O, long O: and O-s before R on the other side. The fat symbol represents the contrast group of old-long Â-s. The y-axis represents the floating phonetic feature openness of back vowels: ζ = 4 corresponds to maximally open, ζ = 1 to maximally closed [Keil 2017, picture 99, p. 317; colored version].
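The idea of a Δζ distance with a significance check can be illustrated as follows. This is a hedged sketch, not the procedure of Keil (2017): it assumes Δζ is the absolute difference of the group means and swaps in a generic two-sided permutation test, since the statistical test actually used is not stated here. The function names and sample values are hypothetical.

```python
import random
from statistics import fmean


def delta_zeta(group_a, group_b):
    """Absolute difference of mean zeta between two contrast groups (assumption)."""
    return abs(fmean(group_a) - fmean(group_b))


def permutation_p_value(group_a, group_b, n_perm=10_000, seed=0):
    """Two-sided permutation test for the difference of group means."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    observed = delta_zeta(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    k = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Count random regroupings at least as extreme as the observed split
        if delta_zeta(pooled[:k], pooled[k:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Made-up zeta values: dark old-long A-s versus neutral short A-s
old_long = [3.2, 3.3, 3.4, 3.3, 3.2, 3.4]
short_a = [4.0, 3.9, 4.1, 4.0, 3.9, 4.1]
print(delta_zeta(old_long, short_a), permutation_p_value(old_long, short_a))
```

A p-value below 0.01 would correspond to the 99% level marked with ** in the chart. A permutation test is chosen here only because it needs no distributional assumptions and no external libraries.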


If one conducts the above analysis for a wider range of contrast groups, one can empirically identify two main phonetic changes that occurred in Frankfurterisch between 1937, as represented by the Frankfurt Lautdenkmal, and today, as represented by the REDE recordings. The findings are discussed in detail in Keil (2017, p. 398–408) and summarized in the following diagram:

Summary for changes measured in Frankfurterisch [Keil 2017, picture 116, p. 405; colored version/ excerpt].

Performing the floating phonetic feature analysis with the VokalJäger concept, one may thus conclude:

It appears that the historic dark A for formerly long A-s has disappeared from Frankfurterisch (at least in the recordings analyzed here). While the Frankfurt Lautdenkmal shows phonetic feature distances Δζ > 0 between long and short A-s concerning openness (picture: 1), which indicate the prevalence of a dark, O-like sound for (formerly) long A-s, the differences are much smaller in the modern and High German recordings (picture: 2).

It further appears that, while the old Frankfurterisch was de-rounded, the new Frankfurterisch allows for more rounding in front vowels. Concerning rounding of front vowels, the Frankfurt Lautdenkmal shows no phonetic feature distances (Δζ = 0; picture: 3) between words that have Ü/Ö-like and I/E-like counterparts in High German, indicating the de-rounded state of Frankfurterisch. The differences are much larger in the modern and High German recordings, indicating the prevalence of rounded utterances (picture: 4).