Semantic search of mobile applications using word embeddings
Main Author: | Coelho, J. |
---|---|
Publication Date: | 2021 |
Other Authors: | Neto, A.; Tavares, M.; Coutinho, C.; Ribeiro, R.; Batista, F. |
Language: | eng |
Source: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
Download full: | http://hdl.handle.net/10071/23673 |
Summary: | This paper proposes a set of approaches for the semantic search of mobile applications, based on their names and on the unstructured text of their descriptions. The approaches use word-level, character-level, and contextual word embeddings that were trained or fine-tuned on a dataset of about 500 thousand mobile apps collected for this work. They were evaluated on a public dataset covering 43 thousand applications and 56 manually annotated non-exact queries. The results show that both character-level embeddings trained on the collected data and fine-tuned RoBERTa models outperform the other retrieval strategies reported in the literature. |
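The summary describes ranking apps against a query by comparing vector representations of the query and of each app's name and description. The following is a minimal sketch of that general idea only, not the authors' method: it replaces the learned word-level, character-level, and RoBERTa embeddings with raw character trigram counts compared by cosine similarity, so it captures surface overlap rather than meaning. The function names (`char_ngrams`, `cosine`, `search`) and the three sample apps are hypothetical and do not come from the paper or its datasets.

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    """Bag of character n-grams for a lowercased, space-padded string.

    Assumption: raw n-gram counts stand in for a learned embedding.
    """
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query, apps, top_k=3):
    """Rank (name, description) pairs by similarity to the query."""
    query_vec = char_ngrams(query)
    scored = [
        (cosine(query_vec, char_ngrams(f"{name} {description}")), name)
        for name, description in apps
    ]
    return sorted(scored, reverse=True)[:top_k]

# Hypothetical sample data, invented for illustration only.
APPS = [
    ("PhotoFix", "Edit photos with filters and crop tools"),
    ("RunTrack", "Track running distance and pace with GPS"),
    ("BudgetPal", "Manage expenses and monthly budget"),
]

if __name__ == "__main__":
    for score, name in search("photo editing", APPS):
        print(f"{score:.3f}  {name}")
```

In a setup closer to the one the summary describes, `char_ngrams` would presumably be replaced by an embedding model trained or fine-tuned on app data, with the ranking step left unchanged.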
id |
RCAP_7906551448a76b6f8b1d4be990ecc9fe |
---|---|
oai_identifier_str |
oai:repositorio.iscte-iul.pt:10071/23673 |
network_acronym_str |
RCAP |
network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository_id_str |
https://opendoar.ac.uk/repository/7160 |
spelling |
Semantic search of mobile applications using word embeddings; Semantic search; Word embeddings; Elasticsearch; Mobile applications; This paper proposes a set of approaches for the semantic search of mobile applications, based on their names and on the unstructured text of their descriptions. The approaches use word-level, character-level, and contextual word embeddings that were trained or fine-tuned on a dataset of about 500 thousand mobile apps collected for this work. They were evaluated on a public dataset covering 43 thousand applications and 56 manually annotated non-exact queries. The results show that both character-level embeddings trained on the collected data and fine-tuned RoBERTa models outperform the other retrieval strategies reported in the literature.; Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing; 2021-12-10T10:00:53Z; 2021-01-01T00:00:00Z; 2021; 2021-12-10T09:56:36Z; conference object; info:eu-repo/semantics/publishedVersion; http://hdl.handle.net/10071/23673; eng; 978-3-95977-202-0; 2190-6807; 10.4230/OASIcs.SLATE.2021.12; Coelho, J.; Neto, A.; Tavares, M.; Coutinho, C.; Ribeiro, R.; Batista, F.; info:eu-repo/semantics/openAccess; reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP; 2024-07-07T03:07:47Z; oai:repositorio.iscte-iul.pt:10071/23673; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; info@rcaap.pt; opendoar:https://opendoar.ac.uk/repository/7160; 2025-05-28T18:16:29.674307; Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; false |
dc.title.none.fl_str_mv |
Semantic search of mobile applications using word embeddings |
title |
Semantic search of mobile applications using word embeddings |
spellingShingle |
Semantic search of mobile applications using word embeddings Coelho, J. Semantic search Word embeddings Elasticsearch Mobile applications |
title_short |
Semantic search of mobile applications using word embeddings |
title_full |
Semantic search of mobile applications using word embeddings |
title_fullStr |
Semantic search of mobile applications using word embeddings |
title_full_unstemmed |
Semantic search of mobile applications using word embeddings |
title_sort |
Semantic search of mobile applications using word embeddings |
author |
Coelho, J. |
author_facet |
Coelho, J. Neto, A. Tavares, M. Coutinho, C. Ribeiro, R. Batista, F. |
author_role |
author |
author2 |
Neto, A. Tavares, M. Coutinho, C. Ribeiro, R. Batista, F. |
author2_role |
author author author author author |
dc.contributor.author.fl_str_mv |
Coelho, J. Neto, A. Tavares, M. Coutinho, C. Ribeiro, R. Batista, F. |
dc.subject.por.fl_str_mv |
Semantic search Word embeddings Elasticsearch Mobile applications |
topic |
Semantic search Word embeddings Elasticsearch Mobile applications |
description |
This paper proposes a set of approaches for the semantic search of mobile applications, based on their names and on the unstructured text of their descriptions. The approaches use word-level, character-level, and contextual word embeddings that were trained or fine-tuned on a dataset of about 500 thousand mobile apps collected for this work. They were evaluated on a public dataset covering 43 thousand applications and 56 manually annotated non-exact queries. The results show that both character-level embeddings trained on the collected data and fine-tuned RoBERTa models outperform the other retrieval strategies reported in the literature. |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021-12-10T10:00:53Z 2021-01-01T00:00:00Z 2021 2021-12-10T09:56:36Z |
dc.type.driver.fl_str_mv |
conference object |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10071/23673 |
url |
http://hdl.handle.net/10071/23673 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
978-3-95977-202-0 2190-6807 10.4230/OASIcs.SLATE.2021.12 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing |
publisher.none.fl_str_mv |
Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing |
dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP |
instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
repository.mail.fl_str_mv |
info@rcaap.pt |
_version_ |
1833597299709706240 |