Detalhes bibliográficos
Ano de defesa: |
2016 |
Autor(a) principal: |
Santos, Jadson da Silva
 |
Orientador(a): |
Rocha Júnior, João Batista da
 |
Banca de defesa: |
Não Informado pela instituição |
Tipo de documento: |
Dissertação
|
Tipo de acesso: |
Acesso aberto |
Idioma: |
por |
Instituição de defesa: |
Universidade Estadual de Feira de Santana
|
Programa de Pós-Graduação: |
Mestrado em Computação Aplicada
|
Departamento: |
DEPARTAMENTO DE TECNOLOGIA
|
País: |
Brasil
|
Palavras-chave em Português: |
|
Palavras-chave em Inglês: |
|
Área do conhecimento CNPq: |
|
Link de acesso: |
http://localhost:8080/tede/handle/tede/554
|
Resumo: |
The Named Entity Recognition (NER) process is the task of identifying relevant termsintextsandassigningthemalabel.Suchwordscanreferencenamesofpeople, organizations, and places. The variety of techniques that can be used in the named entityrecognitionprocessislarge.Thetechniquescanbeclassifiedintothreedistinct approaches: rule-based, machine learning and hybrid. Concerning to the machine learningapproaches,severalfactorsmayinfluenceitsaccuracy,includingtheselected classifier, the set of features extracted from the terms, the characteristics of the textual bases, and the number of entity labels. In this work, we compared classifiers that use machine learning applied to the NER task. The comparative study includes classifiers based on CRF (Conditional Random Fields), MEMM (MaximumEntropy Markov Model) and HMM (Hidden Markov Model), which are compared in two corpora in Portuguese derived from WikiNer, and HAREM, and two corporas in English derived from CoNLL-03 and WikiNer. The comparison of the classifiers shows that the CRF is superior to the other classifiers, both with Portuguese and English texts. This study also includes the comparison of the individual and joint contribution of features, including contextual features, besides the comparison ofthe NER per named entity labels, between classifiers andcorpora. |