Using graph-kernels to represent semantic information in text classification
Main Author: | Gonçalves, Teresa |
---|---|
Publication Date: | 2009 |
Other Authors: | Quaresma, Paulo |
Format: | Article |
Language: | eng |
Source: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
Download full: | http://hdl.handle.net/10174/2439 |
Summary: | Most text classification systems use bag-of-words representation of documents to find the classification target function. Linguistic structures such as morphology, syntax and semantic are completely neglected in the learning process. This paper proposes a new document representation that, while including its context independent sentence meaning, is able to be used by a structured kernel function, namely the direct product kernel. The proposal is evaluated using a dataset of articles from a Portuguese daily newspaper and classifiers are built using the SVM algorithm. The results show that this structured representation, while only partially describing document’s significance has the same discriminative power over classes as the traditional bag-of-words approach. |
id |
RCAP_7eb6286bcef958358d1a19b0e22dde86 |
---|---|
oai_identifier_str |
oai:dspace.uevora.pt:10174/2439 |
network_acronym_str |
RCAP |
network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository_id_str |
https://opendoar.ac.uk/repository/7160 |
dc.title.none.fl_str_mv |
Using graph-kernels to represent semantic information in text classification |
title |
Using graph-kernels to represent semantic information in text classification |
author |
Gonçalves, Teresa |
author_facet |
Gonçalves, Teresa; Quaresma, Paulo |
author_role |
author |
author2 |
Quaresma, Paulo |
author2_role |
author |
dc.contributor.author.fl_str_mv |
Gonçalves, Teresa; Quaresma, Paulo |
dc.subject.por.fl_str_mv |
graph-kernels; text classification; machine learning |
topic |
graph-kernels; text classification; machine learning |
description |
Most text classification systems use bag-of-words representation of documents to find the classification target function. Linguistic structures such as morphology, syntax and semantic are completely neglected in the learning process. This paper proposes a new document representation that, while including its context independent sentence meaning, is able to be used by a structured kernel function, namely the direct product kernel. The proposal is evaluated using a dataset of articles from a Portuguese daily newspaper and classifiers are built using the SVM algorithm. The results show that this structured representation, while only partially describing document’s significance has the same discriminative power over classes as the traditional bag-of-words approach. |
publishDate |
2009 |
dc.date.none.fl_str_mv |
2009-07-01T00:00:00Z; 2011-01-12T09:06:50Z; 2011-01-12 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10174/2439 http://hdl.handle.net/10174/2439 |
url |
http://hdl.handle.net/10174/2439 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
632-646; 5632; Lecture Notes on Artificial Intelligence; livre; tcg@uevora.pt; pq@uevora.pt; MLDM'09 - International Conference on Machine Learning and Data Mining; 283 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
295456 bytes; application/pdf |
dc.publisher.none.fl_str_mv |
Springer-Verlag |
publisher.none.fl_str_mv |
Springer-Verlag |
dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP |
instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
repository.mail.fl_str_mv |
info@rcaap.pt |
_version_ |
1833592282771619840 |