
Comparison of record linkage methods

Bibliographic Details
Main Author: Vieira, Marcus André Alves Zimmermann
Publication Date: 2023
Other Authors: Louise e Silva, Karoline
Format: Article
Language: eng
Source: GeSec
Download full: https://ojs.revistagesec.org.br/secretariado/article/view/2171
Summary: Record linkage is an important tool for database integration. It becomes even more valuable as budget cuts deepen and response rates in traditional surveys continue to fall. Linking records makes it possible to cross-tabulate with variables that are not present in the original database. However, the literature describes many different record pairing methods. The objective of this paper is therefore to compare well-known record linkage methods. The comparison was carried out on a synthetic dataset. The methods were compared quantitatively using the Precision, Recall, and F-measure metrics, with two string comparison functions: Levenshtein and Jaro-Winkler. Among the six types of classifiers analyzed, the supervised methods performed best.
id SINSESP_ee6910cecf5b5a87bfdff82b40e4c85e
oai_identifier_str oai:ojs2.revistagesec.org.br:article/2171
network_acronym_str SINSESP
network_name_str GeSec
repository_id_str
dc.title.none.fl_str_mv Comparison of record linkage methods
title Comparison of record linkage methods
author Vieira, Marcus André Alves Zimmermann
author_facet Vieira, Marcus André Alves Zimmermann
Louise e Silva, Karoline
author_role author
author2 Louise e Silva, Karoline
author2_role author
dc.contributor.author.fl_str_mv Vieira, Marcus André Alves Zimmermann
Louise e Silva, Karoline
dc.subject.por.fl_str_mv Record Linkage
Data Cleaning
Comparison
Classification
Quality
publishDate 2023
dc.date.none.fl_str_mv 2023-05-18
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://ojs.revistagesec.org.br/secretariado/article/view/2171
10.7769/gesec.v14i5.2171
url https://ojs.revistagesec.org.br/secretariado/article/view/2171
identifier_str_mv 10.7769/gesec.v14i5.2171
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv https://ojs.revistagesec.org.br/secretariado/article/view/2171/1142
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Revista de Gestão e Secretariado
publisher.none.fl_str_mv Revista de Gestão e Secretariado
dc.source.none.fl_str_mv Revista de Gestão e Secretariado (Management and Administrative Professional Review); Vol. 14 No. 5 (2023): Revista de Gestão e Secretariado v.14, n.5, 2023; 7999-8004
Revista de Gestão e Secretariado; Vol. 14 Núm. 5 (2023): Revista de Gestão e Secretariado v.14, n.5, 2023; 7999-8004
Revista de Gestão e Secretariado; v. 14 n. 5 (2023): Revista de Gestão e Secretariado v.14, n.5, 2023; 7999-8004
2178-9010
reponame:GeSec
instname:Sindicato das Secretárias do Estado de São Paulo (SINSESP)
instacron:SINSESP
instname_str Sindicato das Secretárias do Estado de São Paulo (SINSESP)
instacron_str SINSESP
institution SINSESP
reponame_str GeSec
collection GeSec
repository.name.fl_str_mv GeSec - Sindicato das Secretárias do Estado de São Paulo (SINSESP)
repository.mail.fl_str_mv editor@revistagesec.org.br | gestoreditorial@revistagesec.org.br | rf.sabino@gmail.com
_version_ 1838625559898226688