VT Speech processing

From EGIWiki
Revision as of 10:26, 13 March 2012 by Sipos (talk | contribs)

General Project Information

  • Leader: Ing. Milan Rusko <milan.rusko@savba.sk>, IISAS, Slovakia (Administration: Gergely Sipos <gergely.sipos@egi.eu>)
  • Mailing List: vt-speech-processing@mailman.egi.eu
  • Status: Active
  • Start Date: 7/Mar/2011
  • End Date: not yet
  • Meetings: not yet

Motivation

Current automatic speech processing technology relies heavily on data-driven approaches that demand huge computational power, especially in the training and testing phases. Evaluating an automatic speech recognition (ASR) system with a single setting typically requires several hours of computing on a one-hundred-core computer cluster. Since there are tens of parameters and settings, most iteration-based optimization approaches appear too computationally expensive. Moreover, the optimization of one part of the recognizer is not independent of the settings of the other parts. The speech processing community should therefore exploit Grid technology and its enormous computing power to achieve satisfactory optimization of contemporary ASR systems. Furthermore, approaches useful for ASR can be readily extended to modern speech synthesis systems, since both problems are commonly based on very similar modeling principles.
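To make the scale of the problem concrete, the following minimal sketch estimates the cost of an exhaustive sweep over interdependent settings. Because the parts of the recognizer cannot be tuned independently, the number of combinations grows multiplicatively. All numbers below (10 parameters, 3 candidate values each, 3 hours per evaluation) are illustrative assumptions, not measurements from the project; only the 100-core cluster size is taken from the text above.

```python
def exhaustive_evaluations(num_parameters: int, values_per_parameter: int) -> int:
    """Number of full evaluations needed to try every combination of settings."""
    return values_per_parameter ** num_parameters


def core_hours(evaluations: int, hours_per_evaluation: float, cores: int) -> float:
    """Total core-hours if each evaluation occupies `cores` cores for the given time."""
    return evaluations * hours_per_evaluation * cores


# Illustrative values only (assumptions, not project measurements).
NUM_PARAMETERS = 10          # "tens of parameters" in practice
VALUES_PER_PARAMETER = 3     # hypothetical number of candidate values per parameter
HOURS_PER_EVALUATION = 3.0   # hypothetical stand-in for "several hours"
CORES = 100                  # cluster size mentioned in the text

evaluations = exhaustive_evaluations(NUM_PARAMETERS, VALUES_PER_PARAMETER)
total = core_hours(evaluations, HOURS_PER_EVALUATION, CORES)
print(f"{evaluations} evaluations, about {total:,.0f} core-hours")
```

Even under these modest assumptions the sweep needs tens of thousands of evaluations, which is why a single cluster is insufficient and smarter search strategies or much larger pooled resources are needed.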

Output

The expected output is twofold. First, a dedicated user interface will make Grid computing available to the wide scientific community of researchers working on speech processing. Second, the project will develop a set of optimization and diagnostic methods specific to speech processing, together with tools that implement these methods on the Grid platform.

Tasks

The required project output will be achieved through the following tasks:

  1. Establishment of contacts, investigation of the state of the art, formation of a consortium
  2. Methodology development for
    1. holistic optimization
      1. ASR (may include speaker identification, speaker recognition and language recognition)
      2. Text to Speech (TTS) systems
    2. holistic diagnostics
      1. ASR
      2. TTS
  3. Implementation aspects
    1. porting the computations in the Automatic Speech Processing domain to the Grid platform
    2. solving particular domain-dependent problems of using Grid computing in automatic speech processing
      1. The large data transfers required and their influence on Grid computing speed
      2. Data security and program security
  4. Storage possibilities for large databases in Grid
  5. Porting commercial applications to Grid

Members

  • NGIs - confirmed:
    • Slovakia:
      • Ing. Milan Rusko, Institute of Informatics of the Slovak Academy of Sciences (Leader)
      • Ladislav Hluchy, SAVBA
      • Technical University in Košice, Slovak Republic
        • Speech processing group
        • Grid computing group
    • Switzerland:
      • Milos Cernak, Idiap Research Institute
  • EGI.eu:
    • Nuno Ferreira
    • Gergely Sipos
    • Karolis Eigelis

Resources

Progress