Bibliographic details
Year of defense: |
2014 |
Lead author: |
Martins, Débora Beatriz de Jesus |
Advisor: |
Caseli, Helena de Medeiros
 |
Defense committee: |
Not provided by the institution |
Document type: |
Master's thesis
|
Access type: |
Open access |
Language: |
por |
Defending institution: |
Universidade Federal de São Carlos
|
Graduate program: |
Programa de Pós-Graduação em Ciência da Computação - PPGCC
|
Department: |
Not provided by the institution
|
Country: |
BR
|
Keywords in Portuguese: |
|
Keywords in English: |
|
CNPq knowledge area: |
|
Access link: |
https://repositorio.ufscar.br/handle/20.500.14289/563
|
Abstract: |
The project described in this document focuses on the post-editing of automatically translated texts. Machine Translation (MT) is the task of translating natural-language texts by computer, and it is part of the Natural Language Processing (NLP) research field, which is linked to the area of Artificial Intelligence (AI). Research in MT using different approaches, such as linguistic and statistical ones, has advanced greatly since its beginning in the 1950s. Nonetheless, automatically translated texts, except when used only to provide a basic understanding of a text, still need to go through post-editing to become well written in the target language. At present, the most common form of post-editing is that performed by humans, whether professional translators or the users of the MT system themselves. Manual post-editing is more accurate, but it is costly and time-consuming and can be prohibitive when too many changes have to be made. As an attempt to advance the state of the art in MT research, mainly regarding Brazilian Portuguese, this research aims to verify the effectiveness of an Automated Post-Editing (APE) system for translations from English to Portuguese. Using a training corpus containing reference translations (good translations produced by humans) and translations produced by a phrase-based statistical MT system, machine learning techniques were applied to build the APE system. The resulting APE system is able to: (i) automatically identify MT errors and (ii) automatically correct MT errors, with or without prior error identification. The effectiveness of the APE system was evaluated with the automatic evaluation metrics BLEU and NIST, calculated for post-edited and non-post-edited sentences. The sentences were also verified manually.
Despite the limited results achieved due to the small size of our training corpus, we can conclude that the resulting APE system improves the quality of MT from English to Portuguese. |