SpeechTek Resources


  • CAV3D (Co-located Audio-Visual streams with 3D tracks) dataset here.
  • TLT-school: a corpus of non-native children's speech. Click here to download.
  • DIRHA data: dataset collected during the DIRHA project, details here.


  • A MATLAB implementation of AV3T, a tool for audio-visual tracking of multiple targets, is available here: AV3T MATLAB code
  • ConflictNet: an end-to-end CNN-LSTM architecture with an attention mechanism that estimates the level of verbal conflict from raw speech signals (git repo with code for the SSPNet Conflict Corpus)
  • Online supervised speaker diarization using an extension of UIS-RNN based on the sample mean loss: git repo
    Check my video presentation: https://www.youtube.com/watch?v=N6fpRrt1lgo&t=11s