Semantic search of mobile applications using word embeddings

Bibliographic Details
Main Author: Coelho, J.
Publication Date: 2021
Other Authors: Neto, A., Tavares, M., Coutinho, C., Ribeiro, R., Batista, F.
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: http://hdl.handle.net/10071/23673
Summary: This paper proposes a set of approaches for the semantic search of mobile applications, based on their name and on the unstructured textual information contained in their description. The proposed approaches use word-level, character-level, and contextual word embeddings that have been trained or fine-tuned on a dataset of about 500 thousand mobile apps, collected in the scope of this work. The approaches have been evaluated on a public dataset that includes information about 43 thousand applications and 56 manually annotated non-exact queries. Our results show that both character-level embeddings trained on our data and fine-tuned RoBERTa models outperform the other retrieval strategies reported in the literature.
id RCAP_7906551448a76b6f8b1d4be990ecc9fe
oai_identifier_str oai:repositorio.iscte-iul.pt:10071/23673
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
author Coelho, J.
author_role author
author2 Neto, A.
Tavares, M.
Coutinho, C.
Ribeiro, R.
Batista, F.
author2_role author
author
author
author
author
dc.subject.por.fl_str_mv Semantic search
Word embeddings
Elasticsearch
Mobile applications
publishDate 2021
dc.date.none.fl_str_mv 2021-12-10T10:00:53Z
2021-01-01T00:00:00Z
2021
2021-12-10T09:56:36Z
dc.type.driver.fl_str_mv conference object
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10071/23673
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv ISBN 978-3-95977-202-0
ISSN 2190-6807
DOI 10.4230/OASIcs.SLATE.2021.12
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt