Combining computational linguistics with sentence embedding to create a zero-shot NLIDB
| Main Author: | Perezhohin, Yuriy |
|---|---|
| Publication Date: | 2024 |
| Other Authors: | Peres, Fernando; Castelli, Mauro |
| Format: | Article |
| Language: | eng |
| Source: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| Download full: | http://hdl.handle.net/10362/174658 |
| Summary: | Perezhohin, Y., Peres, F., & Castelli, M. (2024). Combining computational linguistics with sentence embedding to create a zero-shot NLIDB. Array, 24, 1-11. Article 100368. https://doi.org/10.1016/j.array.2024.100368 --- This work was supported by MyNorth AI Research. This work was partially supported by national funds through the FCT (Fundação para a Ciência e a Tecnologia) by the project UIDB/04152/2020 - Centro de Investigação em Gestão de Informação (MagIC)/NOVA IMS. |
| id | RCAP_8a6c84292de933f8234e1fb8e81442d6 |
|---|---|
| oai_identifier_str | oai:run.unl.pt:10362/174658 |
| network_acronym_str | RCAP |
| network_name_str | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository_id_str | https://opendoar.ac.uk/repository/7160 |
| spelling | Combining computational linguistics with sentence embedding to create a zero-shot NLIDB; Text to SQL; Natural language processing; Computational linguistics; Sentence embeddings; Computer Science(all); SDG 9 - Industry, Innovation, and Infrastructure; Perezhohin, Y., Peres, F., & Castelli, M. (2024). Combining computational linguistics with sentence embedding to create a zero-shot NLIDB. Array, 24, 1-11. Article 100368. https://doi.org/10.1016/j.array.2024.100368 --- This work was supported by MyNorth AI Research. This work was partially supported by national funds through the FCT (Fundação para a Ciência e a Tecnologia) by the project UIDB/04152/2020 - Centro de Investigação em Gestão de Informação (MagIC)/NOVA IMS.; Accessing relational databases using natural language is a challenging task, with existing methods often suffering from poor domain generalization and high computational costs. In this study, we propose a novel approach that eliminates the training phase while offering high adaptability across domains. Our method combines structured linguistic rules, a curated vocabulary, and pre-trained embedding models to accurately translate natural language queries into SQL. Experimental results on the SPIDER benchmark demonstrate the effectiveness of our approach, with execution accuracy rates of 72.03% on the training set and 70.83% on the development set, while maintaining domain flexibility. Furthermore, the proposed system outperformed two extensively trained models by up to 28.33% on the development set, demonstrating its efficiency. This research presents a significant advancement in zero-shot Natural Language Interfaces for Databases (NLIDBs), providing a resource-efficient alternative for generating accurate SQL queries from plain language inputs.; NOVA Information Management School (NOVA IMS); Information Management Research Center (MagIC) - NOVA Information Management School; RUN; Perezhohin, Yuriy; Peres, Fernando; Castelli, Mauro; 2024-11-05T23:20:34Z; 2024-12; 2024-12-01T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; 11; application/pdf; http://hdl.handle.net/10362/174658; eng; 2590-0056; PURE: 101919824; https://doi.org/10.1016/j.array.2024.100368; info:eu-repo/semantics/openAccess; reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP; 2024-12-02T01:35:36Z; oai:run.unl.pt:10362/174658; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; info@rcaap.pt; opendoar:https://opendoar.ac.uk/repository/7160; 2025-05-28T19:12:54.370116; Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; false |
| dc.title.none.fl_str_mv | Combining computational linguistics with sentence embedding to create a zero-shot NLIDB |
| title | Combining computational linguistics with sentence embedding to create a zero-shot NLIDB |
| spellingShingle | Combining computational linguistics with sentence embedding to create a zero-shot NLIDB; Perezhohin, Yuriy; Text to SQL; Natural language processing; Computational linguistics; Sentence embeddings; Computer Science(all); SDG 9 - Industry, Innovation, and Infrastructure |
| title_short | Combining computational linguistics with sentence embedding to create a zero-shot NLIDB |
| title_full | Combining computational linguistics with sentence embedding to create a zero-shot NLIDB |
| title_fullStr | Combining computational linguistics with sentence embedding to create a zero-shot NLIDB |
| title_full_unstemmed | Combining computational linguistics with sentence embedding to create a zero-shot NLIDB |
| title_sort | Combining computational linguistics with sentence embedding to create a zero-shot NLIDB |
| author | Perezhohin, Yuriy |
| author_facet | Perezhohin, Yuriy; Peres, Fernando; Castelli, Mauro |
| author_role | author |
| author2 | Peres, Fernando; Castelli, Mauro |
| author2_role | author; author |
| dc.contributor.none.fl_str_mv | NOVA Information Management School (NOVA IMS); Information Management Research Center (MagIC) - NOVA Information Management School; RUN |
| dc.contributor.author.fl_str_mv | Perezhohin, Yuriy; Peres, Fernando; Castelli, Mauro |
| dc.subject.por.fl_str_mv | Text to SQL; Natural language processing; Computational linguistics; Sentence embeddings; Computer Science(all); SDG 9 - Industry, Innovation, and Infrastructure |
| topic | Text to SQL; Natural language processing; Computational linguistics; Sentence embeddings; Computer Science(all); SDG 9 - Industry, Innovation, and Infrastructure |
| description | Perezhohin, Y., Peres, F., & Castelli, M. (2024). Combining computational linguistics with sentence embedding to create a zero-shot NLIDB. Array, 24, 1-11. Article 100368. https://doi.org/10.1016/j.array.2024.100368 --- This work was supported by MyNorth AI Research. This work was partially supported by national funds through the FCT (Fundação para a Ciência e a Tecnologia) by the project UIDB/04152/2020 - Centro de Investigação em Gestão de Informação (MagIC)/NOVA IMS. |
| publishDate | 2024 |
| dc.date.none.fl_str_mv | 2024-11-05T23:20:34Z; 2024-12; 2024-12-01T00:00:00Z |
| dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv | info:eu-repo/semantics/article |
| format | article |
| status_str | publishedVersion |
| dc.identifier.uri.fl_str_mv | http://hdl.handle.net/10362/174658 |
| url | http://hdl.handle.net/10362/174658 |
| dc.language.iso.fl_str_mv | eng |
| language | eng |
| dc.relation.none.fl_str_mv | 2590-0056; PURE: 101919824; https://doi.org/10.1016/j.array.2024.100368 |
| dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
| eu_rights_str_mv | openAccess |
| dc.format.none.fl_str_mv | 11; application/pdf |
| dc.source.none.fl_str_mv | reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP |
| instname_str | FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| instacron_str | RCAAP |
| institution | RCAAP |
| reponame_str | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| collection | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository.name.fl_str_mv | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| repository.mail.fl_str_mv | info@rcaap.pt |
| _version_ | 1833597947278786560 |