Data clustering techniques applied to occupational accident data
Year of defense: | 2020 |
---|---|
Main author: | |
Advisor: | |
Examining committee: | |
Document type: | Dissertation |
Access type: | Open access |
Language: | por |
Defending institution: | Universidade Federal de Uberlândia, Brazil, Programa de Pós-graduação em Ciência da Computação |
Graduate program: | Not provided by the institution |
Department: | Not provided by the institution |
Country: | Not provided by the institution |
Keywords in Portuguese: | |
Access link: | https://repositorio.ufu.br/handle/123456789/29531 http://doi.org/10.14393/ufu.di.2020.487 |
Abstract: | Brazil ranks fourth worldwide in recorded labor accidents. Among other harms, these occurrences burden the injured workers, reduce work productivity, and strain the public budget through accident-related benefits and compensation. This dissertation aims to find and characterize groups of labor accidents and to make the results interpretable, so as to extract information that can be relevant to public managers. To that end, the methodology comprises the following steps: data pre-processing; creation of subsets from the original dataset; selection of the best attributes for the clustering task; application of two hierarchical clustering algorithms, HDBSCAN* and COBWEB; and evaluation of the results using the Simplified Silhouette validation measure and the Power BI tool, whose charts support the evaluation and the inspection of the composition of the clusters found. This required proposing a measure of the distance between two instances described by both numerical and categorical attributes. On the dataset of the present study, this measure made it possible to run relational algorithms such as HDBSCAN* and to compute validation measures that rely on the distance between instances, such as the Simplified Silhouette. The results show that the proposed distance measure made it hard for HDBSCAN* to find clusters: in some cases no clusters were found, and in the others the algorithm grouped only identical instances. The COBWEB algorithm did not have this drawback: it required no adaptation to work with the kind of data in the dataset and was able to group not only identical instances but also similar ones. 
The research showed that male workers aged 18 to 34 are susceptible to labor accidents that injure the fingers, caused by handling machines and equipment and/or hand tools, particularly those who work in activities such as Fishing and Fish Farming. Occurrences of this kind stood out both in the largest clusters of each year and in the two mesoregions analyzed, Triângulo Mineiro/Alto Paranaíba and Metropolitan São Paulo. The clusters composed mostly of female victims, however, have a slightly different profile, especially for those who work in the production of pulp, paper and related products. Although the fingers remain the most affected body part, the female workers in this segment are prone to accidents caused by handling chemical agents, biological agents and/or hand tools. |
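
The abstract does not spell out the proposed distance measure, so the following is only a minimal sketch of the general idea, using a Gower-style distance as a stand-in (an assumption, not the dissertation's measure). The Simplified Silhouette here compares each instance with cluster prototypes, taken as the mean of numerical attributes and the mode of categorical ones, which is also an assumption. The attribute names (`age`, `sex`, `body_part`) and the toy records are hypothetical.

```python
from collections import Counter


def mixed_distance(a, b, numeric_ranges):
    """Gower-style distance between two instances given as dicts.

    numeric_ranges maps each numerical attribute to its (max - min) range;
    every other attribute is treated as categorical (0 if equal, 1 otherwise).
    """
    total = 0.0
    for key in a:
        if key in numeric_ranges:
            span = numeric_ranges[key]
            total += abs(a[key] - b[key]) / span if span else 0.0
        else:
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)


def prototype(cluster, numeric_ranges):
    """Cluster representative: mean of numerical attributes, mode of categorical ones."""
    proto = {}
    for key in cluster[0]:
        values = [inst[key] for inst in cluster]
        if key in numeric_ranges:
            proto[key] = sum(values) / len(values)
        else:
            proto[key] = Counter(values).most_common(1)[0][0]
    return proto


def simplified_silhouette(clusters, numeric_ranges):
    """Mean of (b - a) / max(a, b) over all instances (needs at least two clusters).

    a is the distance to the instance's own cluster prototype and
    b is the distance to the nearest other cluster prototype.
    """
    protos = [prototype(c, numeric_ranges) for c in clusters]
    scores = []
    for idx, cluster in enumerate(clusters):
        for inst in cluster:
            a = mixed_distance(inst, protos[idx], numeric_ranges)
            b = min(mixed_distance(inst, p, numeric_ranges)
                    for j, p in enumerate(protos) if j != idx)
            denom = max(a, b)
            scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy records, invented for illustration only.
young = [{"age": 20, "sex": "M", "body_part": "finger"},
         {"age": 24, "sex": "M", "body_part": "finger"}]
older = [{"age": 50, "sex": "F", "body_part": "hand"},
         {"age": 54, "sex": "F", "body_part": "hand"}]
ranges = {"age": 34}  # max age minus min age in the toy data

score = simplified_silhouette([young, older], ranges)
print(f"Simplified Silhouette: {score:.3f}")
```

Values close to 1 indicate that instances sit much nearer to their own prototype than to any other. A full pairwise matrix built with `mixed_distance` is the kind of input a relational algorithm consumes, which is the role the abstract assigns to the proposed measure.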