Audio

10 Datasets

2000 HUB5 English

English-only speech data used most recently in the Deep Speech paper from Baidu.

speech

LibriSpeech

Audiobook dataset of text and speech. Nearly 500 hours of clean speech from various audiobooks read by multiple speakers, organized by chapters of the...

speech

VoxForge

Clean speech dataset of accented English. Useful when you need robustness to different accents or intonations.

speech

TIMIT

English-only speech recognition dataset.

speech

CHIME

Noisy speech recognition challenge dataset. The dataset contains real, simulated, and clean voice recordings. Real being actual recordings of 4 speakers in ne...

speech

TED-LIUM

Transcribed audio of TED talks. 1,495 audio recordings of TED talks along with full-text transcriptions of those recordings.

speech

Piano-midi.de

Classical piano pieces.

symbolic music

Nottingham

Over 1,000 folk tunes.

symbolic music

MuseData

Electronic library of classical music scores.

symbolic music

JSB Chorales

Set of four-part harmonized chorales.

symbolic music