Construção automática de grafo de conhecimento no domínio do e-commerce
Main Author: | Barbirato, João Gabriel Melo |
---|---|
Publication Date: | 2022 |
Format: | Master thesis |
Language: | por |
Source: | Repositório Institucional da UFSCAR |
Download full: | https://repositorio.ufscar.br/handle/20.500.14289/16322 |
Summary: | Extracting knowledge efficiently from the large volumes of data generated daily remains a challenge. In most cases these data are unstructured: they are presented in textual or visual form, with no clear delimitation of the information they contain or of the relationships within that information. Representing and storing knowledge so that it is useful is therefore as important as extracting it correctly. One way to represent (store) this knowledge is through knowledge graphs. These structures represent semantic relationships (edges) between entities (vertices); for example, the semantic relationship is_a between the entities apple and fruit is represented by the triple is_a(apple,fruit). This work addresses the automatic construction of a knowledge graph for the e-commerce domain, in which the vertices represent products and their characteristics and the edges describe the relationships between them. One of the challenges faced was dealing with the unstructured, noisy, and incomplete data generated by users in the e-commerce domain. Added to this are the semantic challenges of the domain, since e-commerce data carry more semantic value because they refer to real entities drawn from highly varied categories and contexts. To advance the investigation of methods for dealing with these challenges and peculiarities of the e-commerce domain, two graph models were trained for product recommendation: one following a distributive approach through the RedisGraph tool, and another that explores latent properties of the distributed methods of knowledge graph embeddings. The results show that the latter can contribute to e-commerce tasks that aim at product diversity. |
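
The triple notation used in the summary, is_a(apple,fruit), can be illustrated with a minimal sketch. This is not the thesis's implementation (which used RedisGraph and knowledge graph embeddings); the `TripleStore` class, its method names, and the `has_feature` relation are hypothetical, introduced only to show how entities become vertices and relations become labeled edges.

```python
from collections import defaultdict


class TripleStore:
    """Hypothetical in-memory knowledge graph: vertices are entities,
    edges are (relation, tail) pairs stored per head entity."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, relation, head, tail):
        # Store the triple relation(head, tail) as a labeled edge head -> tail.
        self._edges[head].add((relation, tail))

    def tails(self, head, relation):
        # Return, in sorted order, every entity linked to `head` by `relation`.
        return sorted(t for r, t in self._edges.get(head, ()) if r == relation)


kg = TripleStore()
kg.add("is_a", "apple", "fruit")        # the triple from the summary
kg.add("has_feature", "apple", "red")   # assumed product-characteristic edge

print(kg.tails("apple", "is_a"))
```

A query such as `kg.tails("apple", "has_feature")` would then list the characteristics attached to a product vertex, which is the kind of product-to-characteristic edge the summary describes.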
id |
SCAR_72b760a58c08b8ce7045b46bfa408ddc |
---|---|
oai_identifier_str |
oai:repositorio.ufscar.br:20.500.14289/16322 |
network_acronym_str |
SCAR |
network_name_str |
Repositório Institucional da UFSCAR |
repository_id_str |
4322 |
dc.title.por.fl_str_mv |
Construção automática de grafo de conhecimento no domínio do e-commerce |
dc.title.alternative.eng.fl_str_mv |
Automatic knowledge graph construction in the e-commerce domain |
title |
Construção automática de grafo de conhecimento no domínio do e-commerce |
author |
Barbirato, João Gabriel Melo |
author_facet |
Barbirato, João Gabriel Melo |
author_role |
author |
dc.contributor.authorlattes.por.fl_str_mv |
http://lattes.cnpq.br/7014175217181346 |
dc.contributor.author.fl_str_mv |
Barbirato, João Gabriel Melo |
dc.contributor.advisor1.fl_str_mv |
Caseli, Helena de Medeiros |
dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/6608582057810385 |
dc.contributor.authorID.fl_str_mv |
c3486581-d3ae-4ba8-bc5a-ee972cb78dbd |
contributor_str_mv |
Caseli, Helena de Medeiros |
dc.subject.por.fl_str_mv |
Processamento de língua natural Representação de conhecimento Grafo de conhecimento |
topic |
Processamento de língua natural Representação de conhecimento Grafo de conhecimento Natural language processing Knowledge representation Knowledge graph E-commerce CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO |
dc.subject.eng.fl_str_mv |
Natural language processing Knowledge representation Knowledge graph E-commerce |
dc.subject.cnpq.fl_str_mv |
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO |
publishDate |
2022 |
dc.date.accessioned.fl_str_mv |
2022-06-27T18:00:33Z |
dc.date.available.fl_str_mv |
2022-06-27T18:00:33Z |
dc.date.issued.fl_str_mv |
2022-04-27 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
BARBIRATO, João Gabriel Melo. Construção automática de grafo de conhecimento no domínio do e-commerce. 2022. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, São Carlos, 2022. Disponível em: https://repositorio.ufscar.br/handle/20.500.14289/16322. |
dc.identifier.uri.fl_str_mv |
https://repositorio.ufscar.br/handle/20.500.14289/16322 |
identifier_str_mv |
BARBIRATO, João Gabriel Melo. Construção automática de grafo de conhecimento no domínio do e-commerce. 2022. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, São Carlos, 2022. Disponível em: https://repositorio.ufscar.br/handle/20.500.14289/16322. |
url |
https://repositorio.ufscar.br/handle/20.500.14289/16322 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.relation.confidence.fl_str_mv |
600 600 |
dc.relation.authority.fl_str_mv |
e36d4e63-960d-4f5c-9c93-f8b7f5f93d65 |
dc.rights.driver.fl_str_mv |
Attribution-NonCommercial-NoDerivs 3.0 Brazil http://creativecommons.org/licenses/by-nc-nd/3.0/br/ info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Attribution-NonCommercial-NoDerivs 3.0 Brazil http://creativecommons.org/licenses/by-nc-nd/3.0/br/ |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Universidade Federal de São Carlos Câmpus São Carlos |
dc.publisher.program.fl_str_mv |
Programa de Pós-Graduação em Ciência da Computação - PPGCC |
dc.publisher.initials.fl_str_mv |
UFSCar |
publisher.none.fl_str_mv |
Universidade Federal de São Carlos Câmpus São Carlos |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFSCAR instname:Universidade Federal de São Carlos (UFSCAR) instacron:UFSCAR |
instname_str |
Universidade Federal de São Carlos (UFSCAR) |
instacron_str |
UFSCAR |
institution |
UFSCAR |
reponame_str |
Repositório Institucional da UFSCAR |
collection |
Repositório Institucional da UFSCAR |
bitstream.url.fl_str_mv |
https://repositorio.ufscar.br/bitstreams/87ce7d1c-bc8c-4419-9572-b7971a5e11fd/download https://repositorio.ufscar.br/bitstreams/8f484660-3c50-4618-a07d-9cbdee162832/download https://repositorio.ufscar.br/bitstreams/3e838ff9-eb06-42f1-a297-42b73f436528/download https://repositorio.ufscar.br/bitstreams/c350726c-142b-4dab-bfa8-257f2f28dc29/download https://repositorio.ufscar.br/bitstreams/0292d1a4-7f4c-4f21-a05f-cc06de8cfe65/download https://repositorio.ufscar.br/bitstreams/70c0e241-7706-4980-9764-b78f0a649daa/download https://repositorio.ufscar.br/bitstreams/dd3f7895-9a43-4722-bf2b-8905040d9156/download |
bitstream.checksum.fl_str_mv |
3d2af6d0bc89fab382df507d0bf704da 1a6f488ffd64d884cb9db7493d98f25f e39d27027a6cc9cb039ad269a5db8e34 c9e8c6394913282384d698bbad3b0c46 2ab512b0a997d4207ae5c2a1883f4af4 8c8e5c3de19016c59c9852b1b7856266 8ff0859c315e258df3b35d6ce4c75bc2 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR) |
repository.mail.fl_str_mv |
repositorio.sibi@ufscar.br |
_version_ |
1834468934723567616 |