Transformer-based language models for semantic search and mobile applications retrieval

Bibliographic details
Main author: Coelho, J.
Publication date: 2021
Other authors: Neto, A., Tavares, M., Coutinho, C., Oliveira, J., Ribeiro, R., Batista, F.
Language: eng
Source title: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Full text: http://hdl.handle.net/10071/23678
Abstract: Search engines are used extensively by mobile app stores, where millions of users worldwide rely on them every day. However, some stores still resort to simple lexical-based search engines, despite recent advances in Machine Learning, Information Retrieval, and Natural Language Processing, which allow for richer semantic strategies. This work proposes an approach for semantic search of mobile applications that relies on transformer-based language models, fine-tuned with the existing textual information about known mobile applications. Our approach relies solely on the application name and on the unstructured textual information contained in its description. A dataset of about 500 thousand mobile apps was extended in the scope of this work with a test set, and all the available textual data was used to fine-tune our neural language models. We have evaluated our models using a public dataset that includes information about 43 thousand applications and 56 manually annotated non-exact queries. The results show that our model surpasses the performance of all the other retrieval strategies reported in the literature. Tests with users have confirmed the performance of our semantic search approach when compared with an existing deployed solution.
id RCAP_7fadbbc81cca3d2c487d42e597f5fee8
oai_identifier_str oai:repositorio.iscte-iul.pt:10071/23678
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.title.none.fl_str_mv Transformer-based language models for semantic search and mobile applications retrieval
title Transformer-based language models for semantic search and mobile applications retrieval
spellingShingle Transformer-based language models for semantic search and mobile applications retrieval
Coelho, J.
Semantic search
Word embeddings
ElasticSearch
Mobile applications
Transformer-based models
author Coelho, J.
author_facet Coelho, J.
Neto, A.
Tavares, M.
Coutinho, C.
Oliveira, J.
Ribeiro, R.
Batista, F.
author_role author
dc.contributor.author.fl_str_mv Coelho, J.
Neto, A.
Tavares, M.
Coutinho, C.
Oliveira, J.
Ribeiro, R.
Batista, F.
dc.subject.por.fl_str_mv Semantic search
Word embeddings
ElasticSearch
Mobile applications
Transformer-based models
publishDate 2021
dc.date.none.fl_str_mv 2021-12-10T10:57:44Z
2021-01-01T00:00:00Z
2021
2021-12-10T10:53:49Z
dc.type.driver.fl_str_mv conference object
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10071/23678
url http://hdl.handle.net/10071/23678
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 978-989-758-533-3
2184-3228
10.5220/0010657300003064
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv SCITEPRESS – Science and Technology Publications, Lda
publisher.none.fl_str_mv SCITEPRESS – Science and Technology Publications, Lda
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833597434427604992