Czech - Czechia

Czech Speech Data - Spontaneous Dialogue - 112h

Telecommunication

Dataset Details

About

Domain
Telecommunication
Use case(s)
call centre, conversational AI
Model Applications
Acoustic Modelling, ASR Testing, Benchmarking
Total recordings
1549
Hours
112
Word error rate (%): a measure of how far the actual transcript of the audio deviates from a perfect one, counting words omitted, inserted, or wrongly replaced.
0%
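The word error rate described above is conventionally computed as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The following is a minimal sketch of that standard calculation, not the procedure used to produce the figure listed here; the function name `word_error_rate` and the whitespace tokenisation are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()  # assumption: words are separated by whitespace
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (word-level Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word omitted
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

A value of 0 means the transcript matches the reference word for word.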
Average amount of recordings per speaker
14.15
License Type Link
Published date
Jun 21, 2022
File size
12.12 GB
Packaging description
A zip file containing metadata files in TSV format and a folder with all the audio files
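As a rough illustration of working with the packaging described above, the sketch below reads every TSV member of a zip archive into a list of row dictionaries. The listing does not give file names, column names, or text encoding, so the helper name `read_tsv_metadata`, the UTF-8 encoding, and the presence of a header row in each TSV file are all assumptions.

```python
import csv
import io
import zipfile


def read_tsv_metadata(archive):
    """Return {member name: list of row dicts} for every .tsv file in the zip.

    `archive` may be a path or a binary file object.
    """
    tables = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".tsv"):
                continue  # skip audio files and anything else
            with zf.open(name) as raw:
                # Assumption: the metadata is UTF-8 text with a header row.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                tables[name] = list(csv.DictReader(text, delimiter="\t"))
    return tables
```

Calling `read_tsv_metadata("dataset.zip")` with a hypothetical archive path would return one entry per metadata file, keyed by its path inside the archive.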

Demographic

Number of speakers
219
Locale: the language(s) and country or countries applicable to the speakers in the dataset.
cs-cz
Language
Czech
Speaker Origin Language
Czech
Country
Czechia
Speaker Origin Country
Czechia
Female | Male | Unspecified
49% | 31% | 20%
Age range
18-73
Accent(s)
Benešov, Beroun, Blansko, Brno-město, Brno-venkov, Bruntál, České Budějovice, Český Krumlov, Cheb, Chomutov, Chrudim, Děčín, Domažlice, Frýdek-Místek, Havlíčkův Brod, Hradec Králové, Jičín, Jihlava, Jindřichův Hradec, Karlovy Vary, Karviná, Kladno, Kolín, Kroměříž, Liberec, Litoměřice, Most, Náchod, Nový Jičín, Olomouc, Opava, Ostrava-město, Other, Pardubice, Pelhřimov, Plzeň-město, Praha-východ, Praha-západ, Přerov, Prostějov, Svitavy, Třebíč, Trutnov, United States, Ústí nad Labem, Vsetín, Vyškov, Žďár nad Sázavou, Zlín

Audio Details

Words
1426054
Recording environment
noisy, silent
Audio format
WAV
Bits per sample
16
Device type
mobile
Communication band
narrowband
Sample rate
8 kHz
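The audio properties listed above can be checked on an individual file with Python's standard `wave` module. This is a minimal sketch under the assumption that the files are uncompressed PCM WAV, which is the only WAV encoding `wave` reads; the helper names `describe_wav` and `matches_listing` are made up for illustration.

```python
import wave


def describe_wav(source):
    """Return basic PCM parameters of a WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,  # sample width is in bytes
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
        }


def matches_listing(info):
    """True if the file has the 8 kHz sample rate and 16 bits per sample listed above."""
    return info["sample_rate_hz"] == 8000 and info["bits_per_sample"] == 16
```

Running `matches_listing(describe_wav(path))` over the audio folder is one way to confirm that every recording matches the stated format before training or evaluation.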
