English - Spanish

English Speech Data - Scripted Monologue - 20h


Dataset Details

About

Domain
Generic
Use case(s)
mobile speech
Model Applications
Acoustic Modelling, ASR Testing, Benchmarking
Total recordings
12045
Hours
20
Word error rate (%) (mismatch between the actual transcript of the audio and a perfect one, counting words omitted, inserted, or wrongly replaced)
4.1%
Total prompts
12045
Unique prompts
7797
Average number of recordings per speaker
49.36
License Type Link
Published date
Mar 27, 2022
File size
2.15 GB
Packaging description
A zip file containing metadata files in tsv format and a folder with all the audio files
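The word error rate listed above counts omitted, inserted, and wrongly replaced words relative to a reference transcript. A minimal sketch of that calculation follows, assuming whitespace-tokenized text and the standard word-level edit distance. The function name `word_error_rate` is illustrative, and this is not necessarily the procedure used to produce the 4.1% figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word omitted
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.1%}")
```

Real evaluations usually normalize case and punctuation before scoring; that step is left out here to keep the sketch short.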

Demographic

Number of speakers
244
Locale (the language and country or countries of the speakers in the dataset)
en-es
Language
English
Country
Spanish
Female | Male | Unspecified
43% | 57% | 0%
Age range
19-64
Accent(s)
Aguascalientes, Aisén del General Carlos Ibañez del Campo, Alicante, Antioquia, Antofagasta, Anzoátegui, Aragua, Atlántico, Ávila, Baja California, Baja California Sur, Barcelona, Biobío, Bolívar, Buenos Aires, Carabobo, Castellón, Chihuahua, Ciudad de México, Coahuila de Zaragoza, Colima, Distrito Capital, Distrito Capital de Bogotá, Durango, Falcón, Florida, Guadalajara, Guanajuato, Guatemala, Guerrero, Huelva, Jaén, Jalisco, Lara, Lima, Los Lagos, Madrid, Managua, Mérida, México, Miranda, Morelos, Nayarit, Nuevo León, Oregon, Portuguesa, Puebla, Quintana Roo, Región Metropolitana de Santiago, San Juan, Sinaloa, Sonora, Sucre, Tamaulipas, Tarragona, Valparaíso, Veracruz de Ignacio de la Llave, Washington, Yucatán, Zacatecas, Zulia

Audio Details

Words
101421
Recording environment
silent
Audio format
WAV
Bits per sample
16
Device type
mobile
Communication band
broadband
Sample rate
16 kHz
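The listed figures can be cross-checked with simple arithmetic. The sketch below derives the recordings per speaker, average recording length, average words per recording, and the raw audio size implied by the sample rate and bit depth. Mono audio (`CHANNELS = 1`) is an assumption not stated in the listing, and WAV headers and zip compression are ignored, so the size is only a rough estimate.

```python
# Figures copied from the dataset listing.
SAMPLE_RATE_HZ = 16_000
BITS_PER_SAMPLE = 16
HOURS = 20
TOTAL_RECORDINGS = 12_045
NUM_SPEAKERS = 244
TOTAL_WORDS = 101_421

# Assumption: single-channel audio (the listing does not state the channel count).
CHANNELS = 1

total_seconds = HOURS * 3600
bytes_per_second = SAMPLE_RATE_HZ * (BITS_PER_SAMPLE // 8) * CHANNELS

# Uncompressed PCM payload, excluding WAV headers and metadata files.
raw_pcm_bytes = total_seconds * bytes_per_second

recordings_per_speaker = TOTAL_RECORDINGS / NUM_SPEAKERS
seconds_per_recording = total_seconds / TOTAL_RECORDINGS
words_per_recording = TOTAL_WORDS / TOTAL_RECORDINGS

if __name__ == "__main__":
    print(f"Recordings per speaker: {recordings_per_speaker:.2f}")
    print(f"Average recording length: {seconds_per_recording:.2f} s")
    print(f"Average words per recording: {words_per_recording:.2f}")
    print(f"Raw PCM size: {raw_pcm_bytes / 1e9:.2f} GB "
          f"({raw_pcm_bytes / 2**30:.2f} GiB)")
```

Under these assumptions the raw audio comes to about 2.3 billion bytes, which is close to the listed file size if that figure is expressed in binary gigabytes.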
