Portuguese - Brazil

Portuguese Speech Data - Scripted Monologue - 43h

Retail

Dataset Details

About

Domain
Retail
Use case(s)
mobile speech
Model Applications
Acoustic Modelling, ASR Testing, Benchmarking
Total recordings
26590
Hours
43
Word error rate (%) (errors in the text transcript of the audio relative to a perfect transcript, counting words omitted, inserted, or wrongly substituted)
5.1%
Total prompts
26590
Unique prompts
10482
Average number of recordings per speaker
44.1
License Type Link
Published date
Sep 1, 2021
File size
4.63 GB
Packaging description
A zip file containing metadata files in tsv format and a folder with all the audio files
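The word error rate listed above is conventionally computed as the number of substituted, deleted (omitted), and inserted words divided by the number of words in the reference transcript. The sketch below shows that standard calculation with a word-level edit distance. The function name and the whitespace tokenization are assumptions made for illustration; this page does not state how the 5.1% figure was measured or what text normalization was applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Assumption: words are separated by whitespace and compared exactly;
    casing and punctuation normalization are left to the caller.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word omitted
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("o preço do produto", "o preço produto")` returns `0.25`, since one of the four reference words was omitted.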

Demographic

Number of speakers
603
Locale (the language(s) and country or countries applicable to the speakers in the dataset)
pt-BR
Language
Portuguese
Country
Brazil
Female | Male | Unspecified
38% | 62% | 0%
Age range
18-60
Accent(s)
Amazônia (Amazonas, Acre), Baiano (Bahia), Brasiliense (Brasília), Caipira (Estado de São Paulo), Carioca (Estado e Cidade do Rio de Janeiro), Costa Norte (Ceará), Florianópolis (Cidade de Florianópolis), Fluminense (Espírito Santo), Gaúcho (Rio Grande do Sul), Mineiro (Minas Gerais), Nordestino (Rio Grande do Norte, Paraíba, Alagoas, Pernambuco, Maranhão, Piauí), Norte (Roraima, Amapá, Pará), Paulistano (Cidade de São Paulo), Recifense (Cidade de Recife), Sertanejo (Tocantins, Goiás, Mato Grosso, Rondônia), Sul (Mato Grosso do Sul, Paraná, Santa Catarina)

Audio Details

Words
268033
Recording environment
noisy, silent
Audio format
WAV
Bits per sample
16
Device type
mobile
Communication band
broadband
Sample rate
16 kHz
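The audio properties listed above (WAV, 16 bits per sample, 16 kHz) can be checked on a downloaded file with Python's standard `wave` module. This is a minimal sketch: the function name `check_wav` and the returned dictionary keys are illustrative choices, not part of the dataset. The channel count is not stated on this page, so it is reported but not validated.

```python
import wave

EXPECTED_RATE_HZ = 16_000        # "Sample rate: 16 kHz"
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # "Bits per sample: 16"


def check_wav(source):
    """Inspect one WAV file (path string or binary file object).

    Returns a dict saying whether the sample rate and bit depth match
    the values listed for this dataset, plus the clip duration.
    """
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    return {
        "matches_spec": (rate == EXPECTED_RATE_HZ
                         and width == EXPECTED_SAMPLE_WIDTH_BYTES),
        "channels": channels,            # reported only; not specified above
        "duration_s": frames / rate,
    }
```

Summing `duration_s` over all files in the audio folder gives a figure that can be compared with the 43 hours listed for the dataset.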
