When External Knowledge Does Not Aggregate in Named Entity Recognition

Privatto, Pedro Ivo Monteiro [UNESP]; Guilherme, Ivan Rizzo [UNESP]

Bibliographic Details
Main Author:	Privatto, Pedro Ivo Monteiro [UNESP]
Publication Date:	2021
Other Authors:	Guilherme, Ivan Rizzo [UNESP]
Format:	Conference object
Language:	eng
Source:	Repositório Institucional da UNESP
Download full:	http://dx.doi.org/10.1007/978-3-030-91699-2_42 http://hdl.handle.net/11449/233938
Summary:	Textual data are an important source of information across many areas of knowledge, and Information Extraction methods have been developed to identify and structure the information present in textual documents. One such task is Named Entity Recognition (NER), which uses techniques from Natural Language Processing and Machine Learning to identify Named Entities, such as Person and Place, in text. Recent work has explored external sources of knowledge to supply Machine Learning models with domain-specific information relevant to the NER task. This work evaluates the aggregation of external knowledge, in the form of gazetteers and Knowledge Graphs, for the NER task. Our approach has two steps: (i) generation of embeddings and (ii) definition and training of the Machine Learning methods. The experiments were conducted on four English datasets, and the results show that the applied strategies for integrating external knowledge did not bring large gains to the models, as measured by F1-score. F1-score increased in 17 of the 32 cases where external knowledge was used, but in most cases the gain was less than 0.5% in F1-score. In some scenarios the aggregated external knowledge does not capture relevant content and is therefore not necessarily beneficial to the methodology.

Item metadata

id	UNSP_138468a84eee9f53c3f7454a65ebd78d
oai_identifier_str	oai:repositorio.unesp.br:11449/233938
network_acronym_str	UNSP
network_name_str	Repositório Institucional da UNESP
repository_id_str	2946
affiliation	Institute of Geosciences and Exact Sciences, UNESP - São Paulo State University, SP
dc.title.none.fl_str_mv	When External Knowledge Does Not Aggregate in Named Entity Recognition
author	Privatto, Pedro Ivo Monteiro [UNESP]
author_facet	Privatto, Pedro Ivo Monteiro [UNESP] Guilherme, Ivan Rizzo [UNESP]
author_role	author
author2	Guilherme, Ivan Rizzo [UNESP]
author2_role	author
dc.contributor.none.fl_str_mv	Universidade Estadual Paulista (UNESP)
dc.contributor.author.fl_str_mv	Privatto, Pedro Ivo Monteiro [UNESP] Guilherme, Ivan Rizzo [UNESP]
dc.subject.por.fl_str_mv	Information extraction; Knowledge embeddings; Named entity recognition
publishDate	2021
dc.date.none.fl_str_mv	2021-01-01 2022-05-01T11:54:06Z 2022-05-01T11:54:06Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/conferenceObject
format	conferenceObject
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://dx.doi.org/10.1007/978-3-030-91699-2_42 Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 13074 LNAI, p. 616-627. 1611-3349 0302-9743 http://hdl.handle.net/11449/233938 10.1007/978-3-030-91699-2_42 2-s2.0-85121798857
url	http://dx.doi.org/10.1007/978-3-030-91699-2_42 http://hdl.handle.net/11449/233938
identifier_str_mv	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 13074 LNAI, p. 616-627. 1611-3349 0302-9743 10.1007/978-3-030-91699-2_42 2-s2.0-85121798857
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	616-627
dc.source.none.fl_str_mv	Scopus reponame:Repositório Institucional da UNESP instname:Universidade Estadual Paulista (UNESP) instacron:UNESP
instname_str	Universidade Estadual Paulista (UNESP)
instacron_str	UNESP
institution	UNESP
reponame_str	Repositório Institucional da UNESP
collection	Repositório Institucional da UNESP
repository.name.fl_str_mv	Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
repository.mail.fl_str_mv	repositoriounesp@unesp.br
_version_	1834483162804125696
