Transformer-based language models for semantic search and mobile applications retrieval
| Main author: | Coelho, J. |
|---|---|
| Publication date: | 2021 |
| Other authors: | Neto, A.; Tavares, M.; Coutinho, C.; Oliveira, J.; Ribeiro, R.; Batista, F. |
| Language: | eng |
| Source title: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| Full text: | http://hdl.handle.net/10071/23678 |
Abstract: | Search engines are used extensively by mobile app stores, where millions of users worldwide rely on them every day. However, some stores still resort to simple lexical search engines, despite recent advances in Machine Learning, Information Retrieval, and Natural Language Processing that allow for richer semantic strategies. This work proposes an approach for semantic search of mobile applications that relies on transformer-based language models, fine-tuned with the existing textual information about known mobile applications. Our approach relies solely on the application name and on the unstructured textual information contained in its description. A dataset of about 500 thousand mobile apps was extended in the scope of this work with a test set, and all the available textual data was used to fine-tune our neural language models. We evaluated our models using a public dataset that includes information about 43 thousand applications and 56 manually annotated non-exact queries. The results show that our model surpasses the performance of all the other retrieval strategies reported in the literature. Tests with users have confirmed the performance of our semantic search approach when compared with an existing deployed solution. |
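Illustrative sketch (not the authors' implementation): the abstract describes representing each app by its name and description and retrieving apps for a free-text query. A minimal way to picture that pipeline is to encode each app's text and the query as vectors and rank apps by similarity. Everything named below is an assumption made for illustration: the `toy_embed` placeholder, the cosine-similarity ranking, and the three invented apps. In particular, `toy_embed` is a plain bag-of-words counter, which is lexical rather than semantic; in a real system it would be replaced by a fine-tuned transformer encoder.

```python
import math
import re
from collections import Counter

# Hypothetical catalogue: (name, description) pairs invented for this example.
APPS = [
    ("PhotoFix", "Edit photos with filters and crop tools"),
    ("RunTrack", "Track your running distance and pace"),
    ("ChefMate", "Recipes and meal planning for home cooks"),
]


def toy_embed(text: str) -> Counter:
    """Placeholder encoder (assumption): sparse word-count vector.

    A real system would return a dense vector from a transformer model.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, apps, top_k: int = 3):
    """Rank apps by similarity between the query and 'name + description'."""
    q_vec = toy_embed(query)
    scored = [
        (name, cosine(q_vec, toy_embed(f"{name} {description}")))
        for name, description in apps
    ]
    # Highest similarity first; keep only the top_k results.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    for name, score in search("edit photos", APPS):
        print(f"{name}: {score:.2f}")
```

Only the shape of the pipeline carries over to a semantic setting: swapping `toy_embed` for a learned encoder would let a query match apps whose descriptions share meaning but not exact words.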
| id |
RCAP_7fadbbc81cca3d2c487d42e597f5fee8 |
|---|---|
| oai_identifier_str |
oai:repositorio.iscte-iul.pt:10071/23678 |
| network_acronym_str |
RCAP |
| network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository_id_str |
https://opendoar.ac.uk/repository/7160 |
| dc.title.none.fl_str_mv |
Transformer-based language models for semantic search and mobile applications retrieval |
| title |
Transformer-based language models for semantic search and mobile applications retrieval |
| title_short |
Transformer-based language models for semantic search and mobile applications retrieval |
| title_full |
Transformer-based language models for semantic search and mobile applications retrieval |
| title_fullStr |
Transformer-based language models for semantic search and mobile applications retrieval |
| title_full_unstemmed |
Transformer-based language models for semantic search and mobile applications retrieval |
| title_sort |
Transformer-based language models for semantic search and mobile applications retrieval |
| author |
Coelho, J. |
| author_facet |
Coelho, J. Neto, A. Tavares, M. Coutinho, C. Oliveira, J. Ribeiro, R. Batista, F. |
| author_role |
author |
| author2 |
Neto, A. Tavares, M. Coutinho, C. Oliveira, J. Ribeiro, R. Batista, F. |
| author2_role |
author author author author author author |
| dc.contributor.author.fl_str_mv |
Coelho, J. Neto, A. Tavares, M. Coutinho, C. Oliveira, J. Ribeiro, R. Batista, F. |
| dc.subject.por.fl_str_mv |
Semantic search Word embeddings ElasticSearch Mobile applications Transformer-based models |
| topic |
Semantic search Word embeddings ElasticSearch Mobile applications Transformer-based models |
| publishDate |
2021 |
| dc.date.none.fl_str_mv |
2021-12-10T10:57:44Z 2021-01-01T00:00:00Z 2021 2021-12-10T10:53:49Z |
| dc.type.driver.fl_str_mv |
conference object |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10071/23678 |
| url |
http://hdl.handle.net/10071/23678 |
| dc.language.iso.fl_str_mv |
eng |
| language |
eng |
| dc.relation.none.fl_str_mv |
978-989-758-533-3 2184-3228 10.5220/0010657300003064 |
| dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.publisher.none.fl_str_mv |
SCITEPRESS – Science and Technology Publications, Lda |
| publisher.none.fl_str_mv |
SCITEPRESS – Science and Technology Publications, Lda |
| dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP |
| instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| instacron_str |
RCAAP |
| institution |
RCAAP |
| reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| repository.mail.fl_str_mv |
info@rcaap.pt |
| _version_ |
1833597434427604992 |