Analysis of the impacts of label dependence in multi-label learning
Year of defense: | 2021 |
---|---|
Main author: | |
Advisor: | |
Defense committee: | |
Document type: | Tese (Thesis) |
Access type: | Acesso aberto (Open access) |
Language: | por (Portuguese) |
Defending institution: | Universidade Federal do Espírito Santo (UFES), BR, Doutorado em Ciência da Computação, Centro Tecnológico, Programa de Pós-Graduação em Informática |
Graduate program: | Not informed by the institution |
Department: | Not informed by the institution |
Country: | Not informed by the institution |
Keywords in Portuguese: | |
Access link: | http://repositorio.ufes.br/handle/10/15475 |
Abstract: | Conclusions in the field of multi-label learning are often drawn from experiments on real benchmark datasets, which is good practice for comparing results. However, this hardly proves, or clearly shows, how dependencies among class labels impact the performance and behaviour of multi-label algorithms. One reasonable approach to this issue is to adopt a mathematical or statistical formulation of the problem and use it to elaborate theoretical proofs. Another is to design experiments in a well-controlled environment where the dependence among labels can be more easily controlled and analyzed, as is done in many works based on artificial datasets. Both approaches are adopted in this thesis to understand the role of label dependence in multi-label learning. The thesis comprises several contributions to the analysis of multi-label algorithms from a statistical perspective. One contribution shows that calibrated label ranking can perform extremely poorly in particular scenarios where label dependence is present, due to the way the algorithm performs pairwise comparison of labels. Another shows that the label dependence present in multi-label learning makes optimizing the expected coverage an NP-hard problem, even under restricted conditions. Finally, a proposal is presented for building an experimental environment in which label dependence can be conveniently controlled when comparing the performance of multi-label learning algorithms. |
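The abstract's final contribution concerns artificial datasets whose label dependence can be dialed up or down. A minimal sketch of that idea is shown below; the function `sample_labels` and its `dep` parameter are hypothetical illustrations of one simple mixture construction, not the generator actually proposed in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_labels(n, p1=0.5, p2=0.5, dep=0.0):
    """Sample two binary labels whose dependence is controlled by `dep`.

    dep = 0.0 yields independent labels; as dep grows toward 1.0,
    Y2 is increasingly forced to copy Y1, so their correlation rises.
    (Illustrative construction only; not the thesis's generator.)
    """
    y1 = rng.random(n) < p1
    # With probability `dep`, copy Y1; otherwise draw Y2 independently.
    copy = rng.random(n) < dep
    y2 = np.where(copy, y1, rng.random(n) < p2)
    return y1.astype(int), y2.astype(int)

# Empirical correlation between the two labels tracks `dep`:
# near 0 when dep == 0, and high when dep is close to 1.
y1, y2 = sample_labels(100_000, dep=0.8)
print(np.corrcoef(y1, y2)[0, 1])
```

With such a knob, two multi-label algorithms can be compared across a sweep of `dep` values, isolating how each one responds to increasing label dependence rather than to incidental properties of benchmark data.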