Detecting translingual plagiarism and the backlash against translation plagiarists

Bibliographic Details
Main Author: Sousa-Silva, Rui
Publication Date: 2017
Format: Article
Language: por
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: https://ojs.letras.up.pt/index.php/LLLD/article/view/2444
Summary: Plagiarism detection methods have improved significantly over the last decades, and as a result of the advanced research conducted by computational and mostly forensic linguists, simple and sophisticated textual borrowing strategies can now be identified more easily. In particular, simple text comparison algorithms developed by computational linguists allow literal, word-for-word plagiarism (i.e. where identical strings of text are reused across different documents) to be easily detected (semi-)automatically (e.g. Turnitin or SafeAssign), although these methods tend to perform less well when the borrowing is obfuscated by introducing edits to the original text. In this case, more sophisticated linguistic techniques, such as an analysis of lexical overlap (Johnson, 1997), are required to detect the borrowing. However, these have limited applicability in cases of ‘translingual’ plagiarism, where a text is translated and borrowed without acknowledgment from an original in another language. Considering that (a) traditionally non-professional translation (e.g. literal or free machine translation) is the method used to plagiarise; (b) the plagiarist usually edits the text for grammar and syntax, especially when machine-translated; and (c) lexical items are those that tend to be translated more correctly, and carried over to the derivative text, this paper proposes a method for ‘translingual’ plagiarism detection that is grounded on translation and interlanguage theories (Selinker, 1972; Bassnett and Lefevere, 1998), as well as on the principle of ‘linguistic uniqueness’ (Coulthard, 2004). Empirical evidence from the CorRUPT corpus (Corpus of Reused and Plagiarised Texts), a corpus of real academic and non-academic texts that were investigated and accused of plagiarising originals in other languages, is used to illustrate the applicability of the methodology proposed for ‘translingual’ plagiarism detection. Finally, applications of the method as an investigative tool in forensic contexts are discussed.
id RCAP_c60ed30f9331d92db2f04dcaf1fc3188
oai_identifier_str oai:ojs.pkp.sfu.ca:article/2444
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.title.none.fl_str_mv Detecting translingual plagiarism and the backlash against translation plagiarists
author Sousa-Silva, Rui
author_facet Sousa-Silva, Rui
author_role author
dc.contributor.author.fl_str_mv Sousa-Silva, Rui
dc.subject.por.fl_str_mv Artigos/Articles
topic Artigos/Articles
publishDate 2017
dc.date.none.fl_str_mv 2017-05-30T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://ojs.letras.up.pt/index.php/LLLD/article/view/2444
url https://ojs.letras.up.pt/index.php/LLLD/article/view/2444
dc.language.iso.fl_str_mv por
language por
dc.relation.none.fl_str_mv 2183-3745
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Faculdade de Letras da Universidade do Porto
publisher.none.fl_str_mv Faculdade de Letras da Universidade do Porto
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833590662300172288