Construção automática de grafo de conhecimento no domínio do e-commerce

Bibliographic Details
Main Author: Barbirato, João Gabriel Melo
Publication Date: 2022
Format: Master's thesis
Language: Portuguese (por)
Source: Repositório Institucional da UFSCAR
Download full: https://repositorio.ufscar.br/handle/20.500.14289/16322
Summary: Extracting knowledge efficiently when large volumes of data are generated daily remains a challenge. Most of these data are unstructured: they appear in textual or visual form, without a clear delimitation of the information they contain or of the relationships among its parts. Representing and storing the knowledge so that it is useful is therefore as important as extracting it correctly. One way to represent (store) this knowledge is through knowledge graphs. These structures represent semantic relationships (edges) between entities (vertices); for example, the is_a relationship between the entities apple and fruit is represented by the triple is_a(apple,fruit). This work addresses the automatic construction of a knowledge graph for the e-commerce domain, in which the vertices represent products and their characteristics and the edges describe the relationships between them. One of the challenges faced was dealing with the unstructured, noisy, and incomplete data generated by users in the e-commerce domain. The domain also poses semantic challenges, since e-commerce data carry more semantic value: they refer to real entities drawn from widely varied categories and contexts. To advance the investigation of methods for dealing with these challenges and peculiarities, two graph models were trained for product recommendation: one following a distributive approach using the RedisGraph tool, and another that exploits latent properties of distributed knowledge graph embedding methods. The results show that the latter can contribute to e-commerce tasks that aim at product diversity.
id SCAR_72b760a58c08b8ce7045b46bfa408ddc
oai_identifier_str oai:repositorio.ufscar.br:20.500.14289/16322
network_acronym_str SCAR
network_name_str Repositório Institucional da UFSCAR
repository_id_str 4322
dc.title.por.fl_str_mv Construção automática de grafo de conhecimento no domínio do e-commerce
dc.title.alternative.eng.fl_str_mv Automatic knowledge graph construction in the e-commerce domain
title Construção automática de grafo de conhecimento no domínio do e-commerce
spellingShingle Construção automática de grafo de conhecimento no domínio do e-commerce
Barbirato, João Gabriel Melo
Processamento de língua natural
Representação de conhecimento
Grafo de conhecimento
Natural language processing
Knowledge representation
Knowledge graph
E-commerce
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
author Barbirato, João Gabriel Melo
author_facet Barbirato, João Gabriel Melo
author_role author
dc.contributor.authorlattes.por.fl_str_mv http://lattes.cnpq.br/7014175217181346
dc.contributor.author.fl_str_mv Barbirato, João Gabriel Melo
dc.contributor.advisor1.fl_str_mv Caseli, Helena de Medeiros
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/6608582057810385
dc.contributor.authorID.fl_str_mv c3486581-d3ae-4ba8-bc5a-ee972cb78dbd
contributor_str_mv Caseli, Helena de Medeiros
dc.subject.por.fl_str_mv Processamento de língua natural
Representação de conhecimento
Grafo de conhecimento
topic Processamento de língua natural
Representação de conhecimento
Grafo de conhecimento
Natural language processing
Knowledge representation
Knowledge graph
E-commerce
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
dc.subject.eng.fl_str_mv Natural language processing
Knowledge representation
Knowledge graph
E-commerce
dc.subject.cnpq.fl_str_mv CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
publishDate 2022
dc.date.accessioned.fl_str_mv 2022-06-27T18:00:33Z
dc.date.available.fl_str_mv 2022-06-27T18:00:33Z
dc.date.issued.fl_str_mv 2022-04-27
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv BARBIRATO, João Gabriel Melo. Construção automática de grafo de conhecimento no domínio do e-commerce. 2022. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, São Carlos, 2022. Disponível em: https://repositorio.ufscar.br/handle/20.500.14289/16322.
dc.identifier.uri.fl_str_mv https://repositorio.ufscar.br/handle/20.500.14289/16322
identifier_str_mv BARBIRATO, João Gabriel Melo. Construção automática de grafo de conhecimento no domínio do e-commerce. 2022. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, São Carlos, 2022. Disponível em: https://repositorio.ufscar.br/handle/20.500.14289/16322.
url https://repositorio.ufscar.br/handle/20.500.14289/16322
dc.language.iso.fl_str_mv por
language por
dc.relation.confidence.fl_str_mv 600
600
dc.relation.authority.fl_str_mv e36d4e63-960d-4f5c-9c93-f8b7f5f93d65
dc.rights.driver.fl_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de São Carlos
Câmpus São Carlos
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Ciência da Computação - PPGCC
dc.publisher.initials.fl_str_mv UFSCar
publisher.none.fl_str_mv Universidade Federal de São Carlos
Câmpus São Carlos
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFSCAR
instname:Universidade Federal de São Carlos (UFSCAR)
instacron:UFSCAR
instname_str Universidade Federal de São Carlos (UFSCAR)
instacron_str UFSCAR
institution UFSCAR
reponame_str Repositório Institucional da UFSCAR
collection Repositório Institucional da UFSCAR
bitstream.url.fl_str_mv https://repositorio.ufscar.br/bitstreams/87ce7d1c-bc8c-4419-9572-b7971a5e11fd/download
https://repositorio.ufscar.br/bitstreams/8f484660-3c50-4618-a07d-9cbdee162832/download
https://repositorio.ufscar.br/bitstreams/3e838ff9-eb06-42f1-a297-42b73f436528/download
https://repositorio.ufscar.br/bitstreams/c350726c-142b-4dab-bfa8-257f2f28dc29/download
https://repositorio.ufscar.br/bitstreams/0292d1a4-7f4c-4f21-a05f-cc06de8cfe65/download
https://repositorio.ufscar.br/bitstreams/70c0e241-7706-4980-9764-b78f0a649daa/download
https://repositorio.ufscar.br/bitstreams/dd3f7895-9a43-4722-bf2b-8905040d9156/download
bitstream.checksum.fl_str_mv 3d2af6d0bc89fab382df507d0bf704da
1a6f488ffd64d884cb9db7493d98f25f
e39d27027a6cc9cb039ad269a5db8e34
c9e8c6394913282384d698bbad3b0c46
2ab512b0a997d4207ae5c2a1883f4af4
8c8e5c3de19016c59c9852b1b7856266
8ff0859c315e258df3b35d6ce4c75bc2
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR)
repository.mail.fl_str_mv repositorio.sibi@ufscar.br
_version_ 1834468934723567616