Brille/Links

Relevant links for the Brille project

Conceptual

People

Libraries and Tools

Sphinx-based systems

Docs

C/C++ based feature analysis

Snack

A multi-platform real-time sound acquisition package (C++) that can perform formant frequency and pitch analysis.

Snack Docs

CLAM

This seems like quite an advanced package.

It includes Python bindings:

Somebody has recently added a formant frequency analysis module:

SPRACH

Neural network (connectionist) speech recognition tools and speech feature extraction tools. Source code is available, with separate core and GUI code. Components cover neural networks, feature calculation, sound/audio interfaces, conversion, etc.

The structure of the neural net system is rather simple. Whereas Gaussian mixture systems typically rely on a rather fine sub-phonetic state division, and in consequence have complicated state-tying infrastructure to maximise training efficiency, a connectionist acoustic model can be a single neural net with a few tens of outputs, each of which has a direct interpretation as a particular phone.
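As a rough illustration of the idea above (one net, one output per phone), here is a minimal Python sketch. It is not SPRACH code: the phone list, the feature size, and the random weights are all made up for the example, and a real acoustic model would have hidden layers and trained weights.

```python
import math
import random

# Hypothetical, tiny phone inventory; a real system has a few tens of phones.
PHONES = ["sil", "aa", "iy", "s", "t"]
N_FEATURES = 13  # assumed size of one acoustic feature frame


def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def phone_posteriors(frame, weights, biases):
    """Single-layer net: each output unit is read directly as one phone."""
    scores = [
        sum(w * x for w, x in zip(row, frame)) + b
        for row, b in zip(weights, biases)
    ]
    return dict(zip(PHONES, softmax(scores)))


# Untrained, random parameters, used only to show the shape of the model.
rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(N_FEATURES)] for _ in PHONES]
biases = [0.0] * len(PHONES)

frame = [rng.uniform(-1, 1) for _ in range(N_FEATURES)]
posteriors = phone_posteriors(frame, weights, biases)
best_phone = max(posteriors, key=posteriors.get)
print(best_phone, posteriors[best_phone])
```

Because every output maps to exactly one phone, no state-tying step is needed to interpret the result.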

SFS

Speech Filing System (SFS). Unfortunately not open source, but the source code is available. The key winner:

ISIP Automatic Speech Recognition library

http://www.isip.piconepress.com/projects/speech/index.html : a freely available, modular, state-of-the-art speech recognition system that can be easily modified to suit your research needs. The system is built on top of a vast hierarchy of general purpose C++ classes that implement generic math, data structure, and signal processing concepts.

CSLU toolkit

The CSLU Toolkit was created to provide the basic framework and tools for people to build, investigate and use interactive language systems. These systems incorporate leading-edge speech recognition, natural language understanding, speech synthesis and facial animation technologies. The toolkit provides a comprehensive, powerful and flexible environment for building interactive language systems that use these technologies, and for conducting research to improve them.

Python based formant frequency analysis

ESPS

Kyle Gorman has developed a Python-based interface to Praat. Features include:

  • F_0 analysis
  • Signal intensity
  • Spectral slices
  • Formant analysis
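To give a feel for two of the features listed above, the following is a minimal pure-Python sketch of F0 estimation (by autocorrelation) and signal intensity (RMS level in dB). It does not use or reproduce the interface mentioned here; the function names, the 75-500 Hz search range, and the synthetic test tone are assumptions chosen for the example.

```python
import math


def rms_intensity_db(samples):
    """RMS level in dB relative to full scale (amplitude 1.0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(max(rms, 1e-12))  # floor avoids log(0)


def estimate_f0(samples, sample_rate, f0_min=75.0, f0_max=500.0):
    """Estimate F0 as the lag with the largest (biased) autocorrelation.

    The sum is deliberately not divided by the number of terms, so longer
    lags are slightly penalised; this reduces octave errors.
    """
    min_lag = int(sample_rate / f0_max)
    max_lag = int(sample_rate / f0_min)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(
            samples[i] * samples[i + lag] for i in range(len(samples) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic test signal: 0.1 s of a 200 Hz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 200 * i / SAMPLE_RATE) for i in range(800)]

f0 = estimate_f0(tone, SAMPLE_RATE)
level = rms_intensity_db(tone)
print(f0, level)
```

On real speech one would run this frame by frame on windowed audio and add a voicing decision; dedicated tools handle those details far more robustly.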

Links:

ESPS requirements

Other systems

Papers

Many of these papers are directly available here

Phoneme Extraction

Formant Frequency Estimation

Formants - a good article describing what formants are and the different concepts involved

Related projects

  • VoxForge - freely licensed audio of speech with transcriptions