The impact of NLP techniques in the multilabel text classification problem

Gonçalves, Teresa; Quaresma, Paulo

Bibliographic Details
Main Author:	Gonçalves, Teresa
Publication Date:	2004
Other Authors:	Quaresma, Paulo
Format:	Article
Language:	eng
Source:	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full:	http://hdl.handle.net/10174/2558
Summary:	Support Vector Machines have been used successfully to classify text documents into sets of concepts. However, linguistic information is typically not used in the classification process, or its use has not been fully evaluated. We apply and evaluate two basic linguistic procedures (stop-word removal and stemming/lemmatization) on the multilabel text classification problem. These procedures are applied to the Reuters dataset and to Portuguese juridical documents from the Supreme Courts and the Attorney General’s Office.

Item metadata

id	RCAP_4f6e13ffb7f611e5be0ca3dc916f5273
oai_identifier_str	oai:dspace.uevora.pt:10174/2558
network_acronym_str	RCAP
network_name_str	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str	https://opendoar.ac.uk/repository/7160
dc.title.none.fl_str_mv	The impact of NLP techniques in the multilabel text classification problem
title	The impact of NLP techniques in the multilabel text classification problem
spellingShingle	The impact of NLP techniques in the multilabel text classification problem Gonçalves, Teresa machine learning Text classification
author	Gonçalves, Teresa
author_facet	Gonçalves, Teresa Quaresma, Paulo
author_role	author
author2	Quaresma, Paulo
author2_role	author
dc.contributor.author.fl_str_mv	Gonçalves, Teresa; Quaresma, Paulo
dc.subject.por.fl_str_mv	machine learning; Text classification
topic	machine learning; Text classification
description	Support Vector Machines have been used successfully to classify text documents into sets of concepts. However, linguistic information is typically not used in the classification process, or its use has not been fully evaluated. We apply and evaluate two basic linguistic procedures (stop-word removal and stemming/lemmatization) on the multilabel text classification problem. These procedures are applied to the Reuters dataset and to Portuguese juridical documents from the Supreme Courts and the Attorney General’s Office.
publishDate	2004
dc.date.none.fl_str_mv	2004-01-01T00:00:00Z 2011-02-15T11:25:04Z 2011-02-15
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10174/2558
url	http://hdl.handle.net/10174/2558
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	424-428; Advances in Soft Computing; livre; tcg@uevora.pt; pq@uevora.pt; IIPWM-04, Intelligent Information Processing and Web Mining; Klopotek, M.; Wierzchon, S.; Trojanowski, K.; 498
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	168602 bytes application/pdf
dc.publisher.none.fl_str_mv	Springer-Verlag
publisher.none.fl_str_mv	Springer-Verlag
dc.source.none.fl_str_mv	reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP
instname_str	FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv	info@rcaap.pt
_version_	1833592299719753728

Similar Items