Agile method for the semantic integration of scientific data based on ontologies (original title: Método ágil de integração semântica de dados científicos baseado em ontologias)

Bibliographic details
Year of defense: 2020
Main author: José Eugênio de Assis Goncalves
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Doctoral thesis
Access type: Open access
Language: por
Defending institution: Universidade Federal de Minas Gerais
Brazil
ECI - ESCOLA DE CIENCIA DA INFORMAÇÃO
Programa de Pós-Graduação em Gestão e Organização do Conhecimento
UFMG
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/1843/34013
Abstract: Integrating data generated by scientific research is an increasingly important activity for the evolution of Data Science. Such integration can be accomplished with the aid of data schemas (models), which define how the data should be understood, related, and formatted, and thus how they are organized. On the one hand, predefined relational schemas can favor the integration, sharing, and reuse of data by members of a scientific community; on the other hand, they remove flexibility in data representation, since researchers must respect the predefined schema if they intend to share their data with the community. This research explores and proposes a way to integrate data without the rigidity of predefined relational schemas. It proposes using ontologies so that each scientific study can use its own conceptual design and still retain the ability to integrate and reuse the data it collects. Integration is obtained from concepts common to the studies, formally defined by ontologies. The use of ontologies is expected to contribute to the interoperability of data and systems. Instead of rigid relational schemas, canonical structures in the form of "subject", "predicate", and "object" triples are used, interconnected to form a graph. The objective of the research is to develop an iterative method that facilitates the semantic integration of data produced during scientific research. The method allows the researcher to design the domain ontology (which integrates the data) in short development cycles throughout the research. This is the main contribution of the proposed method: it frees the researcher from having to develop the integration ontology before any data can be integrated. Based on the Agile Design Science Research Methodology, it allows the data to be integrated and the ontology to evolve with each cycle, with the participation of all the actors involved.
During the validation phase of this research, it was noted that the proposed method facilitated collaboration among all those involved, and that decisions could be made more readily thanks to early access to semantically integrated data and information, which were analyzed with the aid of artifacts designed for this purpose. The method was validated using a survey that integrated socioeconomic and environmental data with information on cases of dengue and schistosomiasis in Brazil.
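
The idea of integrating studies through shared concepts expressed as subject, predicate, object triples can be illustrated with a minimal sketch. It is not the thesis's implementation: every identifier (the `a:`, `b:`, and `geo:` prefixes, the property names, and the helper functions) and every value below is hypothetical and chosen only for illustration. Each study keeps its own vocabulary, and the two graphs connect only through the common concept `geo:Municipality_X`.

```python
from typing import Dict, Set, Tuple

# A triple is (subject, predicate, object). All identifiers are hypothetical.
Triple = Tuple[str, str, str]

# Study A records disease cases using its own property names.
study_a: Set[Triple] = {
    ("a:record1", "a:locatedIn", "geo:Municipality_X"),
    ("a:record1", "a:dengueCases", "120"),  # made-up value
}

# Study B records socioeconomic data with a different conceptual design.
study_b: Set[Triple] = {
    ("b:row7", "b:municipality", "geo:Municipality_X"),
    ("b:row7", "b:sanitationCoverage", "0.62"),  # made-up value
}


def merge_graphs(*graphs: Set[Triple]) -> Set[Triple]:
    """Integrate graphs by set union; no shared relational schema is needed."""
    merged: Set[Triple] = set()
    for graph in graphs:
        merged |= graph
    return merged


def facts_about_concept(graph: Set[Triple], concept: str) -> Dict[str, str]:
    """Collect the properties of every subject that points at the shared concept."""
    subjects = {s for s, _, o in graph if o == concept}
    return {p: o for s, p, o in graph if s in subjects and o != concept}


integrated = merge_graphs(study_a, study_b)
profile = facts_about_concept(integrated, "geo:Municipality_X")
```

Here `profile` combines the values contributed by both studies for the same municipality, even though neither study was constrained to the other's property names. In practice, an RDF library and a formally defined ontology would replace the plain tuples and string identifiers used in this sketch.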