Detalhes bibliográficos
Ano de defesa: |
2012 |
Autor(a) principal: |
Jr., Juscelino Izidoro de Oliveira |
Orientador(a): |
Rocha, Jose Carlos Ferreira da
 |
Banca de defesa: |
Mathias, Ivo Mario
,
Kikuti, Daniel
 |
Tipo de documento: |
Dissertação
|
Tipo de acesso: |
Acesso aberto |
Idioma: |
por |
Instituição de defesa: |
UNIVERSIDADE ESTADUAL DE PONTA GROSSA
|
Programa de Pós-Graduação: |
Programa de Pós Graduação Computação Aplicada
|
Departamento: |
Computação para Tecnologias em Agricultura
|
País: |
BR
|
Palavras-chave em Português: |
|
Palavras-chave em Inglês: |
|
Área do conhecimento CNPq: |
|
Link de acesso: |
http://tede2.uepg.br/jspui/handle/prefix/152
|
Resumo: |
Multivariate data analysis allows the researcher to verify the interaction among a lot of attributes that can influence the behavior of a response variable. That analysis uses models that can be induced from experimental data set. An important issue in the induction of multivariate regressors and classifers is the sample size, because this determines the reliability of the model for tasks of regression or classification of the response variable. This work approachs the sample size issue through the Theory of Probably Approximately Correct Learning, that comes from problems about machine learning for induction of models. Given the importance of agricultural modelling, this work shows two procedures to select variables. Variable Selection by Principal Component Analysis is an unsupervised procedure and allows the researcher to select the most relevant variables from the agricultural data by considering the variation in the data. Variable Selection by Supervised Principal Component Analysis is a supervised procedure and allows the researcher to perform the same process as in the previous procedure, but concentrating the focus of the selection over the variables with more influence in the behavior of the response variable. Both procedures allow the sample complexity informations to be explored in variable selection process. Those procedures were tested in five experiments, showing that the supervised procedure has allowed to induce models that produced better scores, by mean, than that models induced over variables selected by unsupervised procedure. Those experiments also allowed to verify that the variables selected by the unsupervised and supervised procedure showed reduced indices of multicolinearity. |