Using graph-kernels to represent semantic information in text classification

Bibliographic Details
Main Author: Gonçalves, Teresa
Publication Date: 2009
Other Authors: Quaresma, Paulo
Format: Article
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: http://hdl.handle.net/10174/2439
Summary: Most text classification systems use a bag-of-words representation of documents to find the classification target function. Linguistic structures such as morphology, syntax and semantics are completely neglected in the learning process. This paper proposes a new document representation that, while including its context-independent sentence meaning, can be used by a structured kernel function, namely the direct product kernel. The proposal is evaluated using a dataset of articles from a Portuguese daily newspaper, and classifiers are built using the SVM algorithm. The results show that this structured representation, while only partially describing a document's significance, has the same discriminative power over classes as the traditional bag-of-words approach.
id RCAP_7eb6286bcef958358d1a19b0e22dde86
oai_identifier_str oai:dspace.uevora.pt:10174/2439
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.subject.por.fl_str_mv graph-kernels
text classification
machine learning
dc.date.none.fl_str_mv 2009-07-01T00:00:00Z
2011-01-12T09:06:50Z
2011-01-12
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.relation.none.fl_str_mv 632-646
5632
Lecture Notes on Artificial Intelligence
livre
tcg@uevora.pt
pq@uevora.pt
MLDM'09 - International Conference on Machine Learning and Data Mining
283
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 295456 bytes
application/pdf
dc.publisher.none.fl_str_mv Springer-Verlag
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
repository.mail.fl_str_mv info@rcaap.pt