Lexicon annotation with LLM: a proof of concept with ChatGPT
| Main author: | Marcondes, Francisco Supino |
|---|---|
| Publication date: | 2025 |
| Other authors: | Gala, Adelino de C.O.S.; Rodrigues, Manuel; Almeida, J. J.; Novais, Paulo |
| Language: | eng |
| Source title: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| Full text: | https://hdl.handle.net/1822/95171 |
| Abstract: | Lexicon annotation is a critical yet time-consuming task that can hold back the progress of language-intensive projects. This paper explores the potential of Large Language Models (LLMs) to automate lexicon annotation, traditionally performed by humans. We present a proof of concept by evaluating ChatGPT's performance on annotating VADER's sentiment lexicon. Our findings demonstrate that ChatGPT achieves fair performance in this task, suggesting that LLMs can operate as a valuable tool for initial annotations, with subsequent refinements by domain specialists. This approach could significantly accelerate lexicon development and maintenance while balancing efficiency and accuracy. Our study provides insights into the capabilities and limitations of LLMs in lexicon annotation, leading the way for further research in automating linguistic resources development. |
| id | RCAP_fd889da75f22bd099bc4864d30732819 |
|---|---|
| oai_identifier_str | oai:repositorium.sdum.uminho.pt:1822/95171 |
| network_acronym_str | RCAP |
| network_name_str | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository_id_str | https://opendoar.ac.uk/repository/7160 |
| spelling | This work has been supported by FCT - Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020. |
| dc.title.none.fl_str_mv | Lexicon annotation with LLM: a proof of concept with ChatGPT |
| author | Marcondes, Francisco Supino |
| author_facet | Marcondes, Francisco Supino; Gala, Adelino de C.O.S.; Rodrigues, Manuel; Almeida, J. J.; Novais, Paulo |
| author_role | author |
| author2 | Gala, Adelino de C.O.S.; Rodrigues, Manuel; Almeida, J. J.; Novais, Paulo |
| author2_role | author; author; author; author |
| dc.contributor.none.fl_str_mv | Universidade do Minho |
| dc.contributor.author.fl_str_mv | Marcondes, Francisco Supino; Gala, Adelino de C.O.S.; Rodrigues, Manuel; Almeida, J. J.; Novais, Paulo |
| dc.subject.por.fl_str_mv | ChatGPT; Lexicon annotation; LLMs; NLP; Ciências Naturais::Ciências da Computação e da Informação |
| topic | ChatGPT; Lexicon annotation; LLMs; NLP; Ciências Naturais::Ciências da Computação e da Informação |
| description | Lexicon annotation is a critical yet time-consuming task that can hold back the progress of language-intensive projects. This paper explores the potential of Large Language Models (LLMs) to automate lexicon annotation, traditionally performed by humans. We present a proof of concept by evaluating ChatGPT's performance on annotating VADER's sentiment lexicon. Our findings demonstrate that ChatGPT achieves fair performance in this task, suggesting that LLMs can operate as a valuable tool for initial annotations, with subsequent refinements by domain specialists. This approach could significantly accelerate lexicon development and maintenance while balancing efficiency and accuracy. Our study provides insights into the capabilities and limitations of LLMs in lexicon annotation, leading the way for further research in automating linguistic resources development. |
| publishDate | 2025 |
| dc.date.none.fl_str_mv | 2025; 2025-01-01T00:00:00Z |
| dc.type.driver.fl_str_mv | conference paper |
| dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
| status_str | publishedVersion |
| dc.identifier.uri.fl_str_mv | https://hdl.handle.net/1822/95171 |
| url | https://hdl.handle.net/1822/95171 |
| dc.language.iso.fl_str_mv | eng |
| language | eng |
| dc.relation.none.fl_str_mv | Marcondes, F.S., Gala, A., Rodrigues, M., Almeida, J.J., Novais, P. (2025). Lexicon Annotation with LLM: A Proof of Concept with ChatGPT. In: Quintián, H., et al. Hybrid Artificial Intelligent Systems. HAIS 2024. Lecture Notes in Computer Science, vol 14858. Springer, Cham. https://doi.org/10.1007/978-3-031-74186-9_16; 978-3-031-74185-2; 0302-9743; 10.1007/978-3-031-74186-9_16; 978-3-031-74186-9; https://link.springer.com/chapter/10.1007/978-3-031-74186-9_16 |
| dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
| eu_rights_str_mv | openAccess |
| dc.format.none.fl_str_mv | application/pdf |
| dc.publisher.none.fl_str_mv | Springer, Cham |
| publisher.none.fl_str_mv | Springer, Cham |
| dc.source.none.fl_str_mv | reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP |
| instname_str | FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| instacron_str | RCAAP |
| institution | RCAAP |
| reponame_str | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| collection | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository.name.fl_str_mv | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| repository.mail.fl_str_mv | info@rcaap.pt |
| _version_ | 1833602660853350400 |