Speech recognition demo: Dictation

This is a web interface to the Finnish large-vocabulary continuous speech recognizer at Aalto University.
Your speech file will be recognized on the laboratory's grid computing system, and the results will be sent to you by e-mail.
If you are interested in trying the system, please request a password from asrdemo@cis.hut.fi.
For some thoughts about the system and its performance, see the FAQ.
We reserve the right to use your file for research purposes.

You can find the demo submission form here.

Speech recognition applications

  1. Our speech recognizer transcribing broadcast news. The raw transcripts can be used for indexing and browsing the video content. In this demo, the transcripts are overlaid at the bottom of the video as subtitles to visualize the recognition performance. Other overlaid annotations include automatic video event labeling (top), face recognition with names obtained by OCR (right), and speaker segmentation, clustering, and identification (names from OCR). [See projects for more about NextMedia.]
  2. Speech recognition, machine translation, speech synthesis, and cross-lingual speaker adaptation in real-time speech-to-speech translation: trial 1, trial 2, trial 3. [See projects for more about EMIME.]

Speech synthesis demos

  1. Speech synthesis adaptation using noisy data: samples related to the listening test in November 2012 and the associated paper submission.

  2. Speech synthesis for children, related to the paper "Creating Synthetic Voices For Children By Adapting Adult Average Voice Using Stacked Transformations And VTLN" at ICASSP 2012 (paper in IEEE Xplore).

  3. Accent in foreign-language speech synthesis, related to the paper "Speaker similarity evaluation of foreign-accented speech synthesis using HMM-based speaker adaptation" at ICASSP 2011 (paper in IEEE Xplore).
  4. Mixed-language speech synthesis.