Bibliographic details
Year of defense: |
2004 |
Main author: |
Souza, Jerffeson Teixeira de |
Advisor: |
Not provided by the institution |
Defense committee: |
Not provided by the institution |
Document type: |
Thesis
|
Access type: |
Open access |
Language: |
por |
Defending institution: |
University of Ottawa
|
Graduate program: |
Not provided by the institution
|
Department: |
Not provided by the institution
|
Country: |
Not provided by the institution
|
Keywords in Portuguese: |
|
Access link: |
https://siduece.uece.br/siduece/trabalhoAcademicoPublico.jsf?id=83883
|
Abstract: |
The Feature Selection problem involves discovering a subset of features such that a classifier built only with this subset would have better predictive accuracy than a classifier built from the entire set of features. A large number of algorithms have already been proposed for the feature selection problem. Although they differ significantly with regard to (1) the search strategy they use to determine the right subset of features and (2) how each subset is evaluated, feature selection algorithms are usually classified into three general groups: Filters, Wrappers and Hybrid solutions.

In this thesis, we propose a new hybrid system for the problem of feature selection in machine learning. The idea behind this new algorithm, FortalFS, is to extract and combine the best characteristics of filters and wrappers in one algorithm. FortalFS uses results from another feature selection system as a starting point in the search through subsets of features that are evaluated by a machine learning algorithm. With an efficient search heuristic, we can decrease the number of subsets of features to be evaluated by the learning algorithm, thereby reducing computational effort while still selecting an accurate subset. We have also designed a variant of the original algorithm intended to work with feature weighting algorithms.

In order to evaluate this new algorithm, a number of experiments were run and the results compared to well-known feature selection filter and wrapper algorithms, such as Focus, Relief, LVF, and others. These experiments were run over a number of datasets from the UCI Repository. Results showed that FortalFS significantly outperforms most of these algorithms. However, its running time is comparable to that of wrappers.

Additional experiments using specially designed artificial datasets demonstrated that FortalFS is able to identify and remove both irrelevant, redundant and randomly |