Categorical Attribute traNsformation Environment (CANE): A python module for categorical to numeric data preprocessing
| Main Author: | Matos, Luís Miguel |
|---|---|
| Publication Date: | 2022 |
| Other Authors: | Azevedo, João; Matta, Arthur; Pilastri, André; Cortez, Paulo; Mendes, Rui |
| Format: | Article |
| Language: | eng |
| Source: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| Download full: | https://hdl.handle.net/1822/81434 |
| Summary: | Categorical Attribute traNsformation Environment (CANE) is a simple but powerful Python package for preprocessing categorical data. The package is valuable because many Machine Learning (ML) algorithms can only be trained on numerical data (e.g., Deep Learning, Support Vector Machines), while several real-world ML applications involve categorical data attributes. Currently, CANE offers three categorical-to-numeric transformation methods: Percentage Categorical Pruned (PCP), Inverse Document Frequency (IDF), and a simpler One-Hot Encoding method. Additionally, the CANE module is well documented, with several code examples that can help its adoption by non-expert users. |
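The three transformations named in the summary can be sketched in plain Python. This is a minimal illustration, not CANE's own API: the function names `pcp`, `idf`, and `one_hot`, the `perc` threshold, and the "Others" label are hypothetical, and the definitions are assumptions (PCP is taken to keep the most frequent levels until they cover a chosen fraction of the rows and merge the remainder into a single level; IDF is taken to map each level to ln(N / count), where N is the number of rows).

```python
import math
from collections import Counter


def pcp(values, perc=0.9, other="Others"):
    """Assumed PCP: keep the most frequent levels until they cover
    `perc` of the rows, then merge every remaining level into `other`."""
    counts = Counter(values)
    total = len(values)
    kept, covered = set(), 0
    for level, count in counts.most_common():
        if covered / total >= perc:
            break
        kept.add(level)
        covered += count
    return [v if v in kept else other for v in values]


def idf(values):
    """Assumed IDF: map each level to ln(N / count), so rare levels
    receive larger numeric values than frequent ones."""
    counts = Counter(values)
    total = len(values)
    return [math.log(total / counts[v]) for v in values]


def one_hot(values):
    """One binary column per distinct level, in sorted level order."""
    levels = sorted(set(values))
    return [[int(v == level) for level in levels] for v in values]


# Toy attribute: 6 "red", 3 "blue", 1 "green".
colors = ["red"] * 6 + ["blue"] * 3 + ["green"]
pruned = pcp(colors, perc=0.9)   # "green" falls outside the 90% coverage
encoded = idf(colors)            # one float per row
binary = one_hot(colors)         # columns ordered blue, green, red
```

Under these assumptions, PCP and IDF each yield far fewer columns than one-hot encoding when an attribute has many levels: PCP bounds the number of levels before any binary encoding, and IDF produces a single numeric column.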
| id |
RCAP_698376a86f20dbf9cf3131a8035a2d32 |
|---|---|
| oai_identifier_str |
oai:repositorium.sdum.uminho.pt:1822/81434 |
| network_acronym_str |
RCAP |
| network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository_id_str |
https://opendoar.ac.uk/repository/7160 |
| spelling |
Categorical Attribute traNsformation Environment (CANE): A python module for categorical to numeric data preprocessing; CANE; Data preprocessing; Machine learning; Python programming language; Science & Technology; Categorical Attribute traNsformation Environment (CANE) is a simple but powerful Python package for preprocessing categorical data. The package is valuable because many Machine Learning (ML) algorithms can only be trained on numerical data (e.g., Deep Learning, Support Vector Machines), while several real-world ML applications involve categorical data attributes. Currently, CANE offers three categorical-to-numeric transformation methods: Percentage Categorical Pruned (PCP), Inverse Document Frequency (IDF), and a simpler One-Hot Encoding method. Additionally, the CANE module is well documented, with several code examples that can help its adoption by non-expert users.; The authors are grateful for project NORTE-01-0247-FEDER-017497, supported by Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF). This work was also supported by FCT – Fundação para a Ciência e Tecnologia, Portugal, within the Project Scope: UID/CEC/00319/2019. The authors are also grateful to all the contributors who assisted in making CANE more intuitive.; Elsevier; Universidade do Minho; Matos, Luís Miguel; Azevedo, João; Matta, Arthur; Pilastri, André; Cortez, Paulo; Mendes, Rui; 2022-08-01; 2022-08-01T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; application/pdf; https://hdl.handle.net/1822/81434; eng; Matos, L. M., Azevedo, J., Matta, A., Pilastri, A., Cortez, P., & Mendes, R. (2022, August). Categorical Attribute traNsformation Environment (CANE): A python module for categorical to numeric data preprocessing. Software Impacts. Elsevier BV. http://doi.org/10.1016/j.simpa.2022.100359; 2665-9638; 10.1016/j.simpa.2022.100359; https://www.sciencedirect.com/science/article/pii/S2665963822000720; info:eu-repo/semantics/openAccess; reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP; 2025-04-12T05:19:22Z; oai:repositorium.sdum.uminho.pt:1822/81434; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; info@rcaap.pt; opendoar:https://opendoar.ac.uk/repository/7160; 2025-05-28T16:23:18.438872; Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; false |
| dc.title.none.fl_str_mv |
Categorical Attribute traNsformation Environment (CANE): A python module for categorical to numeric data preprocessing |
| title |
Categorical Attribute traNsformation Environment (CANE): A python module for categorical to numeric data preprocessing |
| spellingShingle |
Categorical Attribute traNsformation Environment (CANE): A python module for categorical to numeric data preprocessing; Matos, Luís Miguel; CANE; Data preprocessing; Machine learning; Python programming language; Science & Technology |
| author |
Matos, Luís Miguel |
| author_facet |
Matos, Luís Miguel; Azevedo, João; Matta, Arthur; Pilastri, André; Cortez, Paulo; Mendes, Rui |
| author_role |
author |
| author2 |
Azevedo, João; Matta, Arthur; Pilastri, André; Cortez, Paulo; Mendes, Rui |
| author2_role |
author; author; author; author; author |
| dc.contributor.none.fl_str_mv |
Universidade do Minho |
| dc.contributor.author.fl_str_mv |
Matos, Luís Miguel; Azevedo, João; Matta, Arthur; Pilastri, André; Cortez, Paulo; Mendes, Rui |
| dc.subject.por.fl_str_mv |
CANE; Data preprocessing; Machine learning; Python programming language; Science & Technology |
| topic |
CANE; Data preprocessing; Machine learning; Python programming language; Science & Technology |
| description |
Categorical Attribute traNsformation Environment (CANE) is a simple but powerful Python package for preprocessing categorical data. The package is valuable because many Machine Learning (ML) algorithms can only be trained on numerical data (e.g., Deep Learning, Support Vector Machines), while several real-world ML applications involve categorical data attributes. Currently, CANE offers three categorical-to-numeric transformation methods: Percentage Categorical Pruned (PCP), Inverse Document Frequency (IDF), and a simpler One-Hot Encoding method. Additionally, the CANE module is well documented, with several code examples that can help its adoption by non-expert users. |
| publishDate |
2022 |
| dc.date.none.fl_str_mv |
2022-08-01 2022-08-01T00:00:00Z |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
| format |
article |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/1822/81434 |
| url |
https://hdl.handle.net/1822/81434 |
| dc.language.iso.fl_str_mv |
eng |
| language |
eng |
| dc.relation.none.fl_str_mv |
Matos, L. M., Azevedo, J., Matta, A., Pilastri, A., Cortez, P., & Mendes, R. (2022, August). Categorical Attribute traNsformation Environment (CANE): A python module for categorical to numeric data preprocessing. Software Impacts. Elsevier BV. http://doi.org/10.1016/j.simpa.2022.100359; 2665-9638; 10.1016/j.simpa.2022.100359; https://www.sciencedirect.com/science/article/pii/S2665963822000720 |
| dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.publisher.none.fl_str_mv |
Elsevier |
| publisher.none.fl_str_mv |
Elsevier |
| dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP |
| instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| instacron_str |
RCAAP |
| institution |
RCAAP |
| reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| repository.mail.fl_str_mv |
info@rcaap.pt |
| _version_ |
1833595913631694848 |