TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese

Bibliographic Details
Main Author: Casanova, Edresson
Publication Date: 2022
Other Authors: Junior, Arnaldo Candido; Shulby, Christopher; Oliveira, Frederico Santos de; Teixeira, João Paulo; Ponti, Moacir Antonelli; Aluísio, Sandra
Format: Article
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: http://hdl.handle.net/10198/25428
Summary: Speech provides a natural way for human–computer interaction. In particular, speech synthesis systems are popular in different applications, such as personal assistants, GPS applications, screen readers and accessibility tools. However, not all languages are on the same level in terms of resources and systems for speech synthesis. This work creates publicly available resources for Brazilian Portuguese in the form of a novel dataset along with deep learning models for end-to-end speech synthesis. The dataset contains 10.5 h of speech from a single speaker, on which a Tacotron 2 model with the RTISI-LA vocoder performed best, achieving a MOS of 4.03. The results are comparable to related work on English and to the state of the art in European Portuguese.
id RCAP_b73cf9e4e170a54adc46d822d4c7af99
oai_identifier_str oai:bibliotecadigital.ipb.pt:10198/25428
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
Funding: This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Brasil (CAPES), Finance Code 001, as well as CNPq (National Council of Technological and Scientific Development), Grant 304266/2020-5.
dc.contributor.none.fl_str_mv Biblioteca Digital do IPB
dc.contributor.author.fl_str_mv Casanova, Edresson
Junior, Arnaldo Candido
Shulby, Christopher
Oliveira, Frederico Santos de
Teixeira, João Paulo
Ponti, Moacir Antonelli
Aluísio, Sandra
dc.subject.por.fl_str_mv Corpora
Speech synthesis
TTS
Portuguese
publishDate 2022
dc.date.none.fl_str_mv 2022-05-11T10:56:15Z
2022
2022-01-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10198/25428
url http://hdl.handle.net/10198/25428
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Casanova, Edresson; Junior, Arnaldo Candido; Shulby, Christopher; Oliveira, Frederico Santos de; Teixeira, João Paulo; Ponti, Moacir Antonelli; Aluísio, Sandra (2022). IN PRESS - TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese. Language Resources and Evaluation.
ISSN: 1574-0218
DOI: 10.1007/s10579-021-09570-4
ISSN: 1574-020X
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt