Análise lexicográfica da produção acadêmica da Fiocruz: uma proposta de metodologia

Detalhes bibliográficos
Ano de defesa:	2016
Autor(a) principal:	Lima, Jefferson da Costa
Orientador(a):	Souza, Renato Rocha
Banca de defesa:	Não Informado pela instituição
Tipo de documento:	Dissertação
Tipo de acesso:	Acesso aberto
Idioma:	por
Instituição de defesa:	Não Informado pela instituição
Programa de Pós-Graduação:	Não Informado pela instituição
Departamento:	Não Informado pela instituição
País:	Não Informado pela instituição
Palavras-chave em Português:	Processamento de linguagem natural Mineração de textos Descritores em ciências da saúde
Link de acesso:	http://hdl.handle.net/10438/17458
Resumo:	With the objective to meet the health needs of the population, a huge amount of publications are generated each year. Scientific papers, thesis and dissertations become available digitally, but make them accessible to the user requires an understanding of the indexing process, which is usually made manually. This work proposes an experiment on the feasibility of automatically identify valid descriptors for the documents in the field of health. Are extracted n-grams of the texts and, after comparison with terms of vocabulary Health Sciences Descriptors (DeCS), are identified those who can act as descriptors for the works. We believe that this process can be applied to classification of document sets with deficiencies in their indexing and, even, in supporting the re-indexing processes, improving the precision and recall of the searches, and the possibility of establishing metrics of relevance.

Registros relacionados