When External Knowledge Does Not Aggregate in Named Entity Recognition

Bibliographic Details
Main Author: Privatto, Pedro Ivo Monteiro [UNESP]
Publication Date: 2021
Other Authors: Guilherme, Ivan Rizzo [UNESP]
Format: Conference object
Language: eng
Source: Repositório Institucional da UNESP
Download full: http://dx.doi.org/10.1007/978-3-030-91699-2_42
http://hdl.handle.net/11449/233938
Summary: Textual data are an important source of information across areas of knowledge, and Information Extraction methods have been developed to identify and structure the information present in textual documents. One such task is Named Entity Recognition (NER), which uses techniques from Natural Language Processing and Machine Learning to identify Named Entities, such as Person and Place, in text. Recent work has explored external sources of knowledge to supply Machine Learning models with domain-specific information relevant to the NER task. This work evaluates the aggregation of external knowledge, in the form of Gazetteers and Knowledge Graphs, for the NER task. Our approach is composed of two steps: i) generation of embeddings, and ii) definition and training of the Machine Learning methods. The experiments were conducted on four English datasets, and the results show that the applied strategies for external knowledge integration did not bring large gains to the models, as measured by the F1-score. The F1-score increased in 17 of the 32 cases where external knowledge was used, but in most cases the gain was less than 0.5% in F1-score. In some scenarios the aggregated external knowledge does not capture relevant content and is therefore not necessarily beneficial to the methodology.
id UNSP_138468a84eee9f53c3f7454a65ebd78d
oai_identifier_str oai:repositorio.unesp.br:11449/233938
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
spelling When External Knowledge Does Not Aggregate in Named Entity Recognition
Information extraction
Knowledge embeddings
Named entity recognition
Institute of Geosciences and Exact Sciences UNESP - São Paulo State University, SP
Universidade Estadual Paulista (UNESP)
Privatto, Pedro Ivo Monteiro [UNESP]
Guilherme, Ivan Rizzo [UNESP]
2022-05-01T11:54:06Z
2021-01-01
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/conferenceObject
616-627
http://dx.doi.org/10.1007/978-3-030-91699-2_42
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 13074 LNAI, p. 616-627.
1611-3349
0302-9743
http://hdl.handle.net/11449/233938
10.1007/978-3-030-91699-2_42
2-s2.0-85121798857
Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
eng
info:eu-repo/semantics/openAccess
2024-11-27T14:10:17Z
oai:repositorio.unesp.br:11449/233938
Repositório Institucional
PUB
http://repositorio.unesp.br/oai/request
repositoriounesp@unesp.br
opendoar:2946
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
false
dc.title.none.fl_str_mv When External Knowledge Does Not Aggregate in Named Entity Recognition
title When External Knowledge Does Not Aggregate in Named Entity Recognition
spellingShingle When External Knowledge Does Not Aggregate in Named Entity Recognition
Privatto, Pedro Ivo Monteiro [UNESP]
Information extraction
Knowledge embeddings
Named entity recognition
author Privatto, Pedro Ivo Monteiro [UNESP]
author_facet Privatto, Pedro Ivo Monteiro [UNESP]
Guilherme, Ivan Rizzo [UNESP]
author_role author
author2 Guilherme, Ivan Rizzo [UNESP]
author2_role author
dc.contributor.none.fl_str_mv Universidade Estadual Paulista (UNESP)
dc.contributor.author.fl_str_mv Privatto, Pedro Ivo Monteiro [UNESP]
Guilherme, Ivan Rizzo [UNESP]
dc.subject.por.fl_str_mv Information extraction
Knowledge embeddings
Named entity recognition
publishDate 2021
dc.date.none.fl_str_mv 2021-01-01
2022-05-01T11:54:06Z
2022-05-01T11:54:06Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/conferenceObject
format conferenceObject
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.1007/978-3-030-91699-2_42
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 13074 LNAI, p. 616-627.
1611-3349
0302-9743
http://hdl.handle.net/11449/233938
10.1007/978-3-030-91699-2_42
2-s2.0-85121798857
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 616-627
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
instname_str Universidade Estadual Paulista (UNESP)
instacron_str UNESP
institution UNESP
reponame_str Repositório Institucional da UNESP
collection Repositório Institucional da UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
repository.mail.fl_str_mv repositoriounesp@unesp.br
_version_ 1834483162804125696