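É possível distinguir a tradução automática da tradução humana? Uma perspectiva baseada em corpus e aprendizagem de máquina [Is it possible to distinguish machine translation from human translation? A corpus-based and machine learning perspective]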

Bibliographic details
Year of defense: 2023
Main author: Souza, Aline Tomasuolo
Advisor: Sardinha, Antonio Paulo Berber
Examining committee: Not provided by the institution
Document type: Dissertation
Access type: Open access
Language: Portuguese
Institution of defense: Pontifícia Universidade Católica de São Paulo
Graduate program: Programa de Estudos Pós-Graduados em Linguística Aplicada e Estudos da Linguagem
Department: Faculdade de Filosofia, Comunicação, Letras e Artes
Country: Brazil
Access link: https://repositorio.pucsp.br/jspui/handle/handle/39453
Abstract: Recent advances in machine translation technology have raised questions about how it compares with human translation. This master's dissertation explores the issue through a corpus-based, machine-learning approach. The compiled corpus consists of English-language financial texts from listed companies, comprising texts translated from Portuguese into English and texts written in English by native speakers. It was divided into three subcorpora: a corpus of texts written by native English speakers (comparable corpus), a human translation corpus, and a machine translation corpus (parallel corpora). The Biber Tagger was used to analyze the grammatical structures of the corpora, and Weka was used for lexical analysis, identifying differences and similarities among machine translation, human translation, and texts written by native English speakers. This approach yielded a probabilistic model that predicts, with 85% accuracy, whether a translation was produced by a machine or by a human translator. We conclude that machine translation can be distinguished from human translation lexically; grammatically, however, the two kinds of translation are nearly identical and at levels comparable to texts written by native English speakers.
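To illustrate the kind of lexical, probabilistic classification the abstract describes, the following is a minimal sketch in Python. It is not the dissertation's method: the study used Weka, and the abstract does not name the classifier, so the multinomial naive Bayes model here is an assumption chosen for simplicity. The function names (`tokenize`, `train`, `predict_proba`) and the toy sentences in `TOY_DOCS` are hypothetical and invented for illustration only; they are not drawn from the study's corpus.

```python
import math
from collections import Counter

# Hypothetical toy examples, invented for illustration (not from the study's corpus).
TOY_DOCS = [
    ("the company registered net revenue growth in the period", "machine"),
    ("the company verified an increase of the net revenue in the period", "machine"),
    ("net revenue rose over the quarter", "human"),
    ("the company posted higher net revenue for the quarter", "human"),
]


def tokenize(text):
    """Lowercase and split on whitespace; a real study would use a proper tokenizer."""
    return text.lower().split()


def train(docs):
    """Count words per label from (text, label) pairs for a multinomial naive Bayes model."""
    word_counts = {}
    doc_counts = Counter()
    vocab = set()
    for text, label in docs:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}


def predict_proba(model, text):
    """Return a dict mapping each label to its posterior probability for the text."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    log_scores = {}
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokenize(text):
            if tok in model["vocab"]:  # words never seen in training are ignored
                # Laplace (add-one) smoothing avoids zero probabilities
                score += math.log((counts[tok] + 1) / (total_words + vocab_size))
        log_scores[label] = score
    # Convert log scores to normalized probabilities in a numerically stable way
    top = max(log_scores.values())
    exp_scores = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {label: e / norm for label, e in exp_scores.items()}


if __name__ == "__main__":
    model = train(TOY_DOCS)
    print(predict_proba(model, "revenue rose for the quarter"))
```

With only four toy sentences the probabilities carry no evidential weight. The sketch only shows how word frequencies alone can separate two classes of text, which is the general idea behind the lexical result reported in the abstract.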