Processamento de consulta em um framework baseado em mediador para integração de dados no padrão de linked data

Pinheiro, João Carlos

Bibliographic details
Year of defense:	2011
Main author:	Pinheiro, João Carlos
Advisor:	Not provided by the institution
Defense committee:	Not provided by the institution
Document type:	Thesis
Access type:	Open access
Language:	por
Defending institution:	Not provided by the institution
Graduate program:	Not provided by the institution
Department:	Not provided by the institution
Country:	Not provided by the institution
Keywords (translated from Portuguese):	Electronic data processing; Framework (computer file); Semantic Web; Distributed processing
Access link:	http://www.repositorio.ufc.br/handle/riufc/61237
Abstract:	The Web has evolved from a global information space of hypertext documents into the Linked Data network, also known as the Web of Data. RDF, one of the cornerstones of the Semantic Web, has been crucial for storing and publishing Linked Data, which is accessible through SPARQL endpoints using the SPARQL query language. This makes it possible to answer distributed queries that could not be answered by a single data source or even by Web search engines. However, the difficulty of formulating distributed queries has been an obstacle to exploiting these data, because the data sources are autonomous, distributed, and use heterogeneous vocabularies. This scenario confirms the need for data integration mechanisms that make such data simple and efficient to reuse. In this context, this work presents a mediator-based framework for integrating Linked Data accessible via SPARQL endpoints, in which the global schema is represented by a domain ontology that provides a shared vocabulary. Each data source, published on the Web according to the Linked Data principles, is described by an application ontology whose vocabulary is restricted to a subset of the domain ontology vocabulary. Within this framework, this work proposes a method for processing distributed SPARQL queries, comprising: a) a query reformulation algorithm that addresses two key issues: restricting the search to data sources that may contribute some intermediate result, without resorting to inference mechanisms for query expansion, and using same-as and URI links to deal with incomplete information; and b) an execution step that explores algorithms and techniques for reducing the volume of intermediate data, parallel query processing, pull and push models for data delivery, and processing that effectively combines adaptive join algorithms.
These techniques are essential in the highly dynamic Linked Data environment, which has two characteristics that challenge distributed SPARQL query evaluation: large scale and unpredictable data delivery times. The optimization strategy was evaluated through several experiments, and the results provide empirical evidence of its scalability and performance gains for data integration.
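
The abstract does not give the reformulation algorithm itself, so the following is only a minimal sketch of the general idea of vocabulary-based source selection: a triple pattern is sent only to the endpoints whose application ontology declares the pattern's predicate. The endpoint URLs, the `ex:` predicates, and the names `SOURCE_VOCABULARY`, `select_sources`, and `plan` are all hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Set, Tuple

# A triple pattern is (subject, predicate, object); terms starting with "?" are variables.
TriplePattern = Tuple[str, str, str]

# Hypothetical application-ontology vocabularies: endpoint URL -> predicates it declares.
# In the framework described above, each vocabulary would be a subset of the domain ontology.
SOURCE_VOCABULARY: Dict[str, Set[str]] = {
    "http://example.org/drugs/sparql": {"ex:name", "ex:treats"},
    "http://example.org/diseases/sparql": {"ex:name", "ex:symptom"},
}


def select_sources(pattern: TriplePattern, vocab: Dict[str, Set[str]]) -> List[str]:
    """Return the endpoints that may contribute results for one triple pattern."""
    _, predicate, _ = pattern
    if predicate.startswith("?"):
        # An unbound predicate could match any source, so no source can be pruned.
        return sorted(vocab)
    # Keep only the sources whose declared vocabulary contains the predicate.
    return sorted(url for url, predicates in vocab.items() if predicate in predicates)


def plan(query: List[TriplePattern], vocab: Dict[str, Set[str]]) -> Dict[TriplePattern, List[str]]:
    """Map every triple pattern of a query to its candidate endpoints."""
    return {pattern: select_sources(pattern, vocab) for pattern in query}


if __name__ == "__main__":
    query = [("?drug", "ex:treats", "?disease"), ("?disease", "ex:symptom", "?s")]
    for pattern, sources in plan(query, SOURCE_VOCABULARY).items():
        print(pattern, "->", sources)
```

Under these assumptions, a pattern whose predicate appears in no application ontology is mapped to an empty source list, so no endpoint is contacted for it; the use of same-as and URI links and the adaptive join execution mentioned in the abstract are outside the scope of this sketch.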
