The VokalJäger (German, literally "vowel hunter") is an algorithm that I designed to measure the prevalence of dialect patterns, more precisely features of the Frankfurt dialect, within a speech signal. To that end it measures phonetic differences in speech signals by applying machine-learning techniques. Its calculation kernel is implemented in the statistical programming language R [R Development Core Team 2015]. The initial version of the VokalJäger formed the basis of my 2016 PhD thesis Der VokalJäger. Eine phonetisch-algorithmische Methode zur Vokaluntersuchung, at Marburg University (FB09 Germanistik und Kunstgeschichte). This document, henceforth referred to as Keil (2017), gives the full introduction to and description of the algorithm, together with the results of the experiments conducted in 2014–2016.

My PhD thesis is freely available under a CC-BY license and can be downloaded here as a PDF. You can also browse the book online on archive.org.


The thesis was first published in 2017 by Olms as Volume 122 of the Deutsche Dialektgeographie.

ROBUST ACOUSTIC VARIABLES PROCESSING

To process acoustic variables, the VokalJäger automatically determines the most “appropriate” set of configuration parameters for performing phonetic measurements with the software Praat [Boersma, Weenink 2015]. It then applies robust statistics to extract and normalize formant values and related phonetic variables, so that the results are independent of both corpus vocabulary and speaker physiognomy.
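
As an illustration of what robust normalization can look like, here is a minimal Python sketch. It is not the VokalJäger's R kernel, and the source does not say which robust estimators are used. The sketch assumes a per-speaker robust z-score, centered on the median and scaled by the median absolute deviation (MAD). The function names and the F1 values are hypothetical.

```python
from statistics import median


def mad(values):
    """Median absolute deviation, scaled by 1.4826 so that it is
    comparable to a standard deviation for normally distributed data."""
    center = median(values)
    return 1.4826 * median([abs(v - center) for v in values])


def robust_normalize(formants_by_speaker):
    """Return per-speaker robust z-scores: (x - median) / MAD.

    Assumption: normalizing each speaker separately is one plausible way
    to reduce the influence of speaker-specific vocal tract differences.
    """
    normalized = {}
    for speaker, values in formants_by_speaker.items():
        center = median(values)
        scale = mad(values)
        if scale == 0:
            raise ValueError(f"MAD is zero for {speaker}; cannot normalize")
        normalized[speaker] = [(v - center) / scale for v in values]
    return normalized


# Hypothetical F1 measurements in Hz; each speaker has one outlier that
# would distort a mean/standard-deviation normalization.
f1_hz = {
    "speaker_a": [300.0, 320.0, 310.0, 700.0, 305.0],
    "speaker_b": [400.0, 420.0, 410.0, 900.0, 405.0],
}
normalized_f1 = robust_normalize(f1_hz)
```

Because the median and the MAD are barely affected by a single extreme measurement, the two hypothetical speakers end up on a comparable scale despite their different raw formant ranges.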

MACHINE LEARNING

Machine learning is used to “train” the VokalJäger to detect the prevalence of so-called binary features: can a given phonetic feature, such as roundness or the highest degree of openness, be found in a segment? For this purpose the generic R package caret [Kuhn 2015] is employed. Once trained, the VokalJäger can automatically calculate the probability that the binary feature is present in an unknown signal. From this probability an expected value can be calculated, the floating phonetic feature value, here called ζ (zeta). Using this ζ-value, quantitative, probability-weighted phonetic differences, here labelled Δζ (delta-zeta), can then be derived. This makes it possible to test statistically whether two groups of speech segments, usually phonologically assembled at different points in real time, differ significantly with respect to a certain floating phonetic feature. From such a result one may infer the occurrence of a merger or a split in the development of the language.
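
The step from probabilities to ζ, Δζ and a significance test can be sketched in Python as follows. This is an illustration under stated assumptions, not the definitions from Keil (2017). It assumes the binary feature is coded 1 (present) and 0 (absent), so that the expected value ζ equals the predicted probability. It takes Δζ to be the difference of the group means and uses a permutation test as a stand-in for whichever test the VokalJäger actually applies. The probability values are invented.

```python
import random
from statistics import mean


def zeta(p_feature):
    """Expected feature value, assuming 'present' = 1 and 'absent' = 0."""
    return 1.0 * p_feature + 0.0 * (1.0 - p_feature)


def delta_zeta(group_a, group_b):
    """Difference between the mean zeta values of two segment groups."""
    return mean(group_a) - mean(group_b)


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference of means."""
    rng = random.Random(seed)
    observed = abs(delta_zeta(group_a, group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if abs(delta_zeta(pooled[:n_a], pooled[n_a:])) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical classifier probabilities for one binary feature in segments
# taken from an older and a newer recording.
p_old = [0.91, 0.85, 0.88, 0.93, 0.80, 0.87]
p_new = [0.22, 0.30, 0.15, 0.28, 0.35, 0.19]

zeta_old = [zeta(p) for p in p_old]
zeta_new = [zeta(p) for p in p_new]
d_zeta = delta_zeta(zeta_old, zeta_new)
p_value = permutation_p_value(zeta_old, zeta_new)
```

A large Δζ together with a small p-value would indicate that the two groups differ with respect to the feature. Any conclusion about a merger or split still depends on how the groups were assembled phonologically.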

USE CASES: HIGH GERMAN AND FRANKFURTERISCH

A prototypical setting is to train the VokalJäger on High German and then to apply it to dialect recordings. In Keil (2017) the VokalJäger was used to analyze historical audio recordings of Frankfurterisch, the Frankfurt city dialect that is no longer spoken. It is shown that certain phonetic features of monophthongal vowels that are characteristic of the dialect were still measurable in the old Frankfurt Lautdenkmal recording from 1937 but are missing in the newer tapes from the regionalsprache.de (REDE) project [Purschke 2014 f.; Schmidt, Herrgen, Kehrein 2008 f.]. In this setting the VokalJäger was trained on the Kiel PHONDAT Corpus of Read Speech as a representation of High German [IPDS 1994].

Full documentation of the Frankfurt dialect can be found on the sister website frankfurterisch.org.

VOKALJÄGER 2.0: THE NEXT GENERATION

The original VokalJäger was a monolithic standalone application, trained on the High German Kiel PHONDAT Corpus and limited to monophthongs. In an ongoing project with the Deutscher Sprachatlas at the Philipps-Universität Marburg, the VokalJäger is being converted into a flexible and generic toolbox bridging phonetics and machine learning. This “new” tool is code-named VokalJäger 2.0 – Enhanced Algorithmic Tool Box (VJ.EAT).

VJ.EAT can be downloaded from this page.