Mixed-source multi-document speech-to-text summarization
Main Author: | |
---|---|
Publication Date: | 2008 |
Other Authors: | |
Language: | eng |
Source: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
Download full: | http://hdl.handle.net/10071/28158 |
Summary: | Speech-to-text summarization systems usually take as input the output of an automatic speech recognition (ASR) system that is affected by issues like speech recognition errors, disfluencies, or difficulties in the accurate identification of sentence boundaries. We propose the inclusion of related, solid background information to cope with the difficulties of summarizing spoken language and the use of multi-document summarization techniques in single document speech- to-text summarization. In this work, we explore the possibilities offered by pho- netic information to select the background information and conduct a perceptual evaluation to better assess the relevance of the inclusion of that information. Results show that summaries generated using this approach are considerably better than those produced by an up-to-date latent semantic analysis (LSA) summarization method and suggest that humans prefer summaries restricted to the information conveyed in the input source. |
id |
RCAP_5335fdc84b200b8503311fcfa8ec96ab |
---|---|
oai_identifier_str |
oai:repositorio.iscte-iul.pt:10071/28158 |
network_acronym_str |
RCAP |
network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository_id_str |
https://opendoar.ac.uk/repository/7160 |
spelling |
Mixed-source multi-document speech-to-text summarizationSpeech-to-text summarization systems usually take as input the output of an automatic speech recognition (ASR) system that is affected by issues like speech recognition errors, disfluencies, or difficulties in the accurate identification of sentence boundaries. We propose the inclusion of related, solid background information to cope with the difficulties of summarizing spoken language and the use of multi-document summarization techniques in single document speech- to-text summarization. In this work, we explore the possibilities offered by pho- netic information to select the background information and conduct a perceptual evaluation to better assess the relevance of the inclusion of that information. Results show that summaries generated using this approach are considerably better than those produced by an up-to-date latent semantic analysis (LSA) summarization method and suggest that humans prefer summaries restricted to the information conveyed in the input source.Coling 2008 Organizing Committee2023-03-03T09:45:22Z2008-01-01T00:00:00Z20082023-03-03T09:43:31Zconference objectinfo:eu-repo/semantics/publishedVersionapplication/pdfhttp://hdl.handle.net/10071/28158eng978-1-905593-51-4Ribeiro, R.Matos, D. M. de.info:eu-repo/semantics/openAccessreponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologiainstacron:RCAAP2024-07-07T02:28:56Zoai:repositorio.iscte-iul.pt:10071/28158Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireinfo@rcaap.ptopendoar:https://opendoar.ac.uk/repository/71602025-05-28T17:59:08.325915Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologiafalse |
dc.title.none.fl_str_mv |
Mixed-source multi-document speech-to-text summarization |
title |
Mixed-source multi-document speech-to-text summarization |
spellingShingle |
Mixed-source multi-document speech-to-text summarization Ribeiro, R. |
title_short |
Mixed-source multi-document speech-to-text summarization |
title_full |
Mixed-source multi-document speech-to-text summarization |
title_fullStr |
Mixed-source multi-document speech-to-text summarization |
title_full_unstemmed |
Mixed-source multi-document speech-to-text summarization |
title_sort |
Mixed-source multi-document speech-to-text summarization |
author |
Ribeiro, R. |
author_facet |
Ribeiro, R. Matos, D. M. de. |
author_role |
author |
author2 |
Matos, D. M. de. |
author2_role |
author |
dc.contributor.author.fl_str_mv |
Ribeiro, R. Matos, D. M. de. |
description |
Speech-to-text summarization systems usually take as input the output of an automatic speech recognition (ASR) system that is affected by issues like speech recognition errors, disfluencies, or difficulties in the accurate identification of sentence boundaries. We propose the inclusion of related, solid background information to cope with the difficulties of summarizing spoken language and the use of multi-document summarization techniques in single document speech- to-text summarization. In this work, we explore the possibilities offered by pho- netic information to select the background information and conduct a perceptual evaluation to better assess the relevance of the inclusion of that information. Results show that summaries generated using this approach are considerably better than those produced by an up-to-date latent semantic analysis (LSA) summarization method and suggest that humans prefer summaries restricted to the information conveyed in the input source. |
publishDate |
2008 |
dc.date.none.fl_str_mv |
2008-01-01T00:00:00Z 2008 2023-03-03T09:45:22Z 2023-03-03T09:43:31Z |
dc.type.driver.fl_str_mv |
conference object |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10071/28158 |
url |
http://hdl.handle.net/10071/28158 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
978-1-905593-51-4 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Coling 2008 Organizing Committee |
publisher.none.fl_str_mv |
Coling 2008 Organizing Committee |
dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP |
instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
repository.mail.fl_str_mv |
info@rcaap.pt |
_version_ |
1833597113237241856 |