Use of natural language for querying information from the Brazilian School Census microdata
Year of defense: | 2021 |
---|---|
Main author: | |
Advisor: | |
Defense committee: | |
Document type: | Master's thesis |
Access type: | Open access |
Language: | Portuguese (por) |
Defending institution: |
Universidade Federal de Santa Maria
Brazil; Ciência da Computação; UFSM; Programa de Pós-Graduação em Ciência da Computação; Centro de Tecnologia |
Graduate program: |
Not provided by the institution
|
Department: |
Not provided by the institution
|
Country: |
Not provided by the institution
|
Keywords in Portuguese: | |
Access link: | http://repositorio.ufsm.br/handle/1/23051 |
Abstract: | The volume of data collected and stored has grown rapidly for many years, motivating research into new forms of querying that make information accessible and useful across several knowledge domains. In this context, Question Answering (QA) is a specialized area of Information Retrieval whose objective is to return precise, direct answers that satisfy the user’s information need, given a question expressed in Natural Language (NL). For this task, a set of Natural Language Processing (NLP) techniques is applied to interpret human language. Although NLP is mature for some languages (such as English), the area still presents numerous challenges, because NL is hard to interpret when it contains words with similar meanings, slang or regional terms, misspellings, or ambiguity. Moreover, there is still a research gap for the Portuguese language, possibly because of its complexity in comparison with other languages. This research therefore presents an exploratory study of NLP applied to QA systems. To that end, a QA system was designed and developed for querying information from the open data of the Brazilian School Census (Censo Escolar), the largest and most important statistical survey conducted by the Anísio Teixeira National Institute of Educational Studies and Research (INEP). The system applies a hybrid approach to understand the meaning of a question: it combines linguistic and rule-based approaches, which are constructed manually from the data dictionary and current educational legislation. An evaluation carried out with Education professionals suggests that the QA system is easy to use and that the tool is important for querying information in this data set. However, many difficulties remain, both in NLP itself and in particularities of the educational data set used. |
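
The hybrid approach mentioned in the abstract can be illustrated with a minimal sketch. It is not the system described in the dissertation: the rule tables, the field names (`dependency`, `enrollments`, `schools`, `teachers`) and the function names are hypothetical, chosen only to show how a light linguistic step (lowercasing and accent stripping) might be combined with manually written rules that map Portuguese terms to fields of a data dictionary.

```python
import re
import unicodedata

# Hypothetical rules: keyword pattern -> (field, value) filter.
# Field names and values are illustrative, not the real data dictionary.
FILTER_RULES = [
    (r"\bmunicipa(l|is)\b", ("dependency", "municipal")),
    (r"\bestadua(l|is)\b", ("dependency", "state")),
    (r"\bprivadas?\b", ("dependency", "private")),
]

# Hypothetical rules for the measure being asked about (first match wins).
MEASURE_RULES = [
    (r"\bmatriculas?\b|\balunos?\b", "enrollments"),
    (r"\bescolas?\b", "schools"),
    (r"\bprofessor(es)?\b|\bdocentes?\b", "teachers"),
]


def normalize(question: str) -> str:
    """Lowercase and strip accents so rules tolerate missing diacritics."""
    decomposed = unicodedata.normalize("NFKD", question.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def parse_question(question: str) -> dict:
    """Map a question to a structured query using the rule tables above."""
    text = normalize(question)
    measure = next(
        (name for pattern, name in MEASURE_RULES if re.search(pattern, text)),
        None,
    )
    filters = dict(
        pair for pattern, pair in FILTER_RULES if re.search(pattern, text)
    )
    year = re.search(r"\b(19|20)\d{2}\b", text)
    return {
        "measure": measure,
        "filters": filters,
        "year": int(year.group()) if year else None,
    }
```

Under these assumptions, `parse_question("Quantas matrículas em escolas municipais em 2020?")` returns `{"measure": "enrollments", "filters": {"dependency": "municipal"}, "year": 2020}`. A real system would need far richer linguistic processing to handle the synonyms, regional terms and ambiguity that the abstract lists as open difficulties.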