Using a fairness-utility trade-off metric to systematically benchmark non-generative fair adversarial learning strategies
| Defense year | 2022 |
|---|---|
| Main author | |
| Advisor | |
| Defense committee | |
| Document type | Master's thesis (Dissertação) |
| Access type | Open access |
| Language | Portuguese (por) |
| Defending institution | Universidade Federal da Paraíba, Brasil, Informática, Programa de Pós-Graduação em Informática UFPB |
| Graduate program | Not informed by the institution |
| Department | Not informed by the institution |
| Country | Not informed by the institution |
| Keywords (Portuguese) | |
| Access link | https://repositorio.ufpb.br/jspui/handle/123456789/26323 |
| Abstract | Artificial intelligence systems for decision-making have become increasingly popular in several areas. However, biased decisions can be identified in many applications, which has become a concern for the computer science, artificial intelligence, and law communities. Researchers are therefore proposing solutions to mitigate bias and discrimination in decision-making models. Some strategies are based on generative adversarial networks that generate fair data; others are based on adversarial learning, achieving fairness in machine learning by encoding fairness constraints through an adversarial model. Moreover, each proposal usually assesses its model with a different metric, which makes comparing current approaches a complex task. This work therefore proposes a benchmark procedure with a systematic method to assess fair machine learning models. We define the FU-score metric to evaluate the utility-fairness trade-off, the utility and fairness metrics that compose this assessment, the datasets used and the data preparation applied, and the statistical test. We also performed this benchmark evaluation on the non-generative adversarial models, analyzing the models from the literature under the same metric. The assessment could not indicate a single model that performs best on all datasets. However, we built an understanding of how each model performs on each dataset, with statistical confidence. |
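The record does not include the FU-score definition itself. As an illustration only, one common way to express a utility-fairness trade-off as a single number is a harmonic-mean combination (analogous to the F1 score) of a utility metric and a fairness metric, both scaled to [0, 1]. The function name `fu_score` and the harmonic-mean form below are assumptions for the sketch, not the thesis's actual definition:

```python
def fu_score(utility: float, fairness: float) -> float:
    """Harmonic mean of a utility metric (e.g., accuracy) and a fairness
    metric (e.g., 1 - demographic parity difference), both in [0, 1].

    NOTE: illustrative sketch only; the thesis defines its own FU-score,
    which is not reproduced in this metadata record.
    """
    if utility + fairness == 0:
        return 0.0
    return 2 * utility * fairness / (utility + fairness)

# Example: a model with accuracy 0.85 and demographic parity
# difference 0.10 (so fairness = 1 - 0.10 = 0.90).
score = fu_score(0.85, 1.0 - 0.10)
print(round(score, 4))  # -> 0.8743
```

A harmonic mean penalizes imbalance: a model that is highly accurate but very unfair (or vice versa) scores low, which is the usual motivation for F1-style trade-off metrics.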