The EMU Speech Database System
EMU and ToBI

ToBI is a system for transcribing the intonation patterns of spoken language. It defines a number of annotation levels and the criteria for placing labels on each. These labels include a segmentation into words along with tone labels which mark prosodic events such as tones and phrase boundaries. ToBI annotations have largely been made using the Unix-based Waves+ toolkit from Entropic, and the example materials are made available in the ESPS format, which is readable only by Waves+ (and by Emu when an ESPS licence is available, i.e. on a Unix platform). This page describes some tools for using ToBI annotations in the Emu system.

The ToBI training materials available from the Ohio State web site are in the ESPS format. We have converted these files to the SSFF format read by Emu on both Unix and Windows. These files are now available on our server:

The smaller files consist of only 12 utterances from the database (those beginning with `a').

These packages contain the original label files (augmented with a dummy label at the start of the word level so that Emu can treat the words as segments rather than events). Two Emu template files are provided: one mimics the traditional ToBI annotation scheme, which presents four independent tiers; the other adds domination relations and two additional levels for intonational and intermediate phrases. A script is provided to convert traditional flat ToBI annotations into hierarchical ones. An example hierarchical annotation is shown below.

In this scheme, the Tone level is preserved, non-phrasal tones are linked to the word in which they occur, and words are grouped into Intonational and Intermediate phrases based on the position of phrase boundary events.
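
The grouping described above can be sketched roughly as follows. This is not the conversion script distributed with the packages, only a minimal illustration of the idea in Python. The names Word, boundary_strength and build_hierarchy are hypothetical. The sketch assumes the usual ToBI labelling convention in which a tone label ending in "-" is a phrase accent (closing an intermediate phrase) and one ending in "%" is a boundary tone (closing an intonational phrase), and it assumes a tone belongs to the first word whose time span contains it.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    """A word segment with start/end times in seconds (hypothetical type)."""
    label: str
    start: float
    end: float
    tones: list = field(default_factory=list)  # non-phrasal tones linked to this word


def boundary_strength(tone_label):
    """Return 2 for an intonational boundary, 1 for an intermediate one, else 0.

    Assumption: labels ending in '%' (e.g. 'L-L%') are boundary tones and
    labels ending in '-' (e.g. 'L-') are phrase accents.
    """
    if tone_label.endswith("%"):
        return 2
    if tone_label.endswith("-"):
        return 1
    return 0


def build_hierarchy(words, tones):
    """Group words into intonational phrases made of intermediate phrases.

    words: list of Word, sorted by time.
    tones: list of (time, label) pairs, sorted by time.
    Returns a list of intonational phrases; each is a list of intermediate
    phrases; each of those is a list of Word objects.
    """
    phrases = []        # completed intonational phrases
    inter_phrases = []  # completed intermediate phrases in the open intonational phrase
    open_words = []     # words in the open intermediate phrase
    claimed = set()     # indices of tones already assigned to a word

    for word in words:
        strength = 0
        for i, (time, label) in enumerate(tones):
            if i in claimed or not (word.start <= time <= word.end):
                continue
            claimed.add(i)
            s = boundary_strength(label)
            if s == 0:
                word.tones.append(label)  # link non-phrasal tone to its word
            strength = max(strength, s)
        open_words.append(word)
        if strength >= 1:   # any phrase boundary closes the intermediate phrase
            inter_phrases.append(open_words)
            open_words = []
        if strength == 2:   # a boundary tone also closes the intonational phrase
            phrases.append(inter_phrases)
            inter_phrases = []

    # Keep any trailing words that were never closed by a boundary event.
    if open_words:
        inter_phrases.append(open_words)
    if inter_phrases:
        phrases.append(inter_phrases)
    return phrases


# Made-up example data, for illustration only.
words = [
    Word("Anna", 0.0, 0.4),
    Word("came", 0.4, 0.7),
    Word("with", 0.8, 1.0),
    Word("Manny", 1.0, 1.5),
]
tones = [(0.2, "H*"), (0.7, "L-"), (1.2, "H*"), (1.5, "L-L%")]

phrases = build_hierarchy(words, tones)
grouped = [[[w.label for w in ip] for ip in phrase] for phrase in phrases]
print(grouped)  # [[['Anna', 'came'], ['with', 'Manny']]]
```

Here the phrase accent on "came" closes the first intermediate phrase and the boundary tone on "Manny" closes both the second intermediate phrase and the enclosing intonational phrase. A real converter would also need to decide how to treat tones that fall in pauses and initial boundary tones such as "%H", which this sketch does not handle specially.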

Pitch Tracking on Windows

Researchers investigating prosody have relied on ESPS/Waves+ for both labelling and pitch tracking. Emu can manage the labelling role but does not provide a pitch tracker. There are a number of possible pitch trackers that might be of use.

One that is available now and can write the SSFF format read by Emu is the pitch tracker included with the Edinburgh Speech Tools toolkit, which runs on Unix and Windows systems. Edinburgh do not provide a binary distribution of the toolkit for Windows; we are currently investigating providing at least the pitch tracker for download here for use on Windows.

Another alternative is the Speech Filing System (SFS) from UCL, which provides an environment for data capture on Windows and Unix systems. SFS includes both a pitch tracker and a formant tracker and stores data in its own file format. We are currently looking at supporting the SFS file format in Emu, which would enable researchers to use SFS to generate pitch and formant tracks and then use Emu to annotate and query their databases.

