Bibliographic details
Year of defense: |
2024 |
Main author: |
Rozin, Bionda [UNESP] |
Advisor: |
Not provided by the institution |
Examination committee: |
Not provided by the institution |
Document type: |
Dissertation
|
Access type: |
Open access |
Language: |
eng |
Defense institution: |
Universidade Estadual Paulista (Unesp)
|
Graduate program: |
Not provided by the institution
|
Department: |
Not provided by the institution
|
Country: |
Not provided by the institution
|
Keywords in Portuguese: |
|
Access link: |
https://hdl.handle.net/11449/257737
|
Abstract: |
Because time series are widely applicable in diverse scenarios, such as medicine, agriculture, economics, and science, the analysis and processing of this kind of data are in high demand. Tools such as information retrieval, classification, and clustering are crucial for analyzing time series in different contexts and with different objectives. Information retrieval tasks on time series data can identify patterns and rank data by similarity, classification can label time series based on a training set, and clustering can group time series by similarity. Semi-supervised classification, in turn, considers both labeled and unlabeled data. In general, machine learning and information retrieval tasks depend heavily on a good computational representation of the data, which yields more effective results and more accurate conclusions about the task performed. In this scenario, one of the main challenges is obtaining good features from time series. In addition, similarity metrics usually consider only pairwise relations and ignore important information in the neighborhood of the analyzed items in the dataset. The objective of this research is to apply machine learning and information retrieval techniques to obtain effective results in time series analysis. Four different methods are employed, and different feature extractors are evaluated in all tasks. First, a comparative study of univariate time series representation and ranking through contextual rank-based distance learning is conducted on 10 different datasets, leading to mAP gains of up to 31.78%. Continuing this line of research, we propose multivariate time series analysis that processes each dimension of the series individually and uses contextual rank aggregation methods to merge the results into a similarity representation used for retrieval and classification, obtaining results competitive with two state-of-the-art (SOTA) methods. A clustering-based framework for data analysis based on temporal graph encoding is also proposed, in which data is split using time segmentation criteria; applied to ball possession analysis in football matches, the framework yields highly interpretable results. Last, semi-supervised classification of univariate time series using imaging methods and label propagation is proposed, reaching results similar to those of supervised classification. |