Leitura da web em português em ambiente de aprendizado sem-fim (Reading the Web in Portuguese in a Never-Ending Learning Environment)

Bibliographic details
Year of defense: 2016
Main author: Duarte, Maísa Cristina
Advisor: Hruschka Júnior, Estevam Rafael
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: por
Defending institution: Universidade Federal de São Carlos
São Carlos Campus
Graduate program: Programa de Pós-Graduação em Ciência da Computação - PPGCC
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Keywords in English:
CNPq knowledge area:
Access link: https://repositorio.ufscar.br/handle/20.500.14289/8414
Abstract: NELL (Never-Ending Language Learner) is a computer system whose goal is to learn continuously, 24 hours a day, learning more and better each day than the day before, in order to build its knowledge base (KB). NELL has been running since January 12, 2010. NELL also aims for high precision so that it can keep learning. NELL was developed in a macro-reading context, and for this reason it needs a great deal of redundancy to run. The first step in running NELL is to have a large all-pairs-data. An all-pairs-data is a base preprocessed with Natural Language Processing (NLP) that holds all the sufficient statistics about a corpus of web pages. The proposal of this project was to create an instance of NELL (currently available in English) in Portuguese. To that end, the first goal was to develop an all-pairs-data in Portuguese. The second goal was to create a new Portuguese version of NELL. Finally, the third goal was to develop a hybrid coreference resolution method based on semantic and morphological features. This method does not depend on a specific language; it can be applied to other languages that use the same alphabet as Portuguese. The Portuguese NELL was developed, but its all-pairs-data is not large enough. Because of this, the Portuguese NELL does not run forever, as the English version does. Even so, this project presents the steps for developing a NELL in another language and some ideas on how to improve the all-pairs-data. In addition, this project presents a hybrid coreference resolution method that gives good results for NELL.