Bibliographic details
Year of defense: 2020
Main author: Pagliosa, Lucas de Carvalho
Advisor: Not informed by the institution
Defense committee: Not informed by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Biblioteca Digitais de Teses e Dissertações da USP
Graduate program: Not informed by the institution
Department: Not informed by the institution
Country: Not informed by the institution
Keywords in Portuguese:
Access link: https://www.teses.usp.br/teses/disponiveis/55/55134/tde-10062020-100009/
Abstract:
Advances in technology have enabled and inspired the study of data produced over time in applications such as health treatment, biology, sentiment analysis, and entertainment. Such data, typically referred to as time series or data streams, have motivated several studies, mainly in the areas of Machine Learning and Statistics, on inferring models for prediction and classification. However, several studies either employ batch-driven strategies to address temporal data or do not consider chaotic observations, thus missing recurrent patterns and other temporal dependencies, especially in real-world data. In that scenario, we rely on tools from Dynamical Systems and Chaos Theory to improve data-stream modeling and forecasting by investigating time-series phase spaces, reconstructed according to Takens' embedding theorem. This theorem relies on two essential embedding parameters, known as the embedding dimension and the time delay, which are difficult to estimate in real-world scenarios. Such difficulty derives from inconsistencies related to phase-space partitioning, computation of probabilities, the curse of dimensionality, and noise. Moreover, an optimal phase space may be represented by attractors with different structures for different systems, which further compounds the problem. Our research confirmed those issues, especially for entropy. Although we verified that a well-reconstructed phase space can be described in terms of low entropy of its phase states, the converse is not necessarily true: a set of phase states with low entropy does not necessarily describe an optimal phase space. As a consequence, we learned that defining a set of features to describe an optimal phase space is not a trivial task. As an alternative, this Ph.D. research proposed a new approach to estimate the embedding parameters by training an artificial neural network on an overestimated phase space. Then, without explicitly defining any phase-space features, we let the network filter out non-relevant dimensions and learn those features implicitly, whatever they are. After training, we infer the embedding dimension and time delay from the skeletal architecture of the neural network. As we show, this method was consistent on benchmark datasets and robust to different random initializations of neuron weights and to the chosen parameters. After obtaining the embedding parameters and reconstructing the phase space, we show how we can model time-series recurrences more effectively and in a wider scope, thereby enabling a deeper analysis of the underlying data.
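For readers unfamiliar with the reconstruction step the abstract relies on, below is a minimal sketch of a time-delay (Takens) embedding. The function name, the logistic-map example, and the parameter values m=3 and tau=1 are illustrative assumptions for this sketch, not code or values from the thesis.

```python
import numpy as np

def delay_embedding(series, m, tau):
    """Reconstruct a phase space from a scalar time series via
    time-delay (Takens) embedding.

    Each phase state is the vector
        x_t = (s_t, s_{t+tau}, ..., s_{t+(m-1)*tau}),
    where m is the embedding dimension and tau is the time delay.
    """
    series = np.asarray(series, dtype=float)
    n_states = len(series) - (m - 1) * tau
    if n_states <= 0:
        raise ValueError("series too short for the chosen m and tau")
    return np.column_stack(
        [series[i * tau : i * tau + n_states] for i in range(m)]
    )

# Example series: logistic map in a chaotic regime (r = 3.9).
s = np.empty(1000)
s[0] = 0.5
for t in range(1, len(s)):
    s[t] = 3.9 * s[t - 1] * (1.0 - s[t - 1])

# Hypothetical embedding parameters, chosen only for illustration.
phase_space = delay_embedding(s, m=3, tau=1)
print(phase_space.shape)  # (998, 3): 998 phase states in 3 dimensions
```

Each row of the resulting matrix is one phase state; estimating good values for m and tau is precisely the problem the thesis addresses.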
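The abstract's strategy of training on an overestimated phase space and letting the model discard non-relevant dimensions can be illustrated, in a deliberately simplified form, by swapping the thesis's neural network for an L1-regularized linear predictor: delayed coordinates whose weights shrink to zero are treated as non-relevant, and the surviving ones hint at the embedding dimension. The toy AR(3) series, the choice of m_max, the regularization strength, and the weight threshold below are all hypothetical; this sketch shows only the general principle, not the method proposed in the thesis.

```python
import numpy as np
from sklearn.linear_model import Lasso

def delay_embedding(series, m, tau):
    """Time-delay embedding (same construction as the previous sketch)."""
    n = len(series) - (m - 1) * tau
    return np.column_stack([series[i * tau : i * tau + n] for i in range(m)])

rng = np.random.default_rng(0)

# Toy AR(3) series: only the three most recent values matter, so a sparse
# predictor should discard the extra, overestimated coordinates.
n = 2000
s = np.zeros(n)
for t in range(3, n):
    s[t] = (0.6 * s[t - 1] - 0.4 * s[t - 2] + 0.2 * s[t - 3]
            + 0.1 * rng.standard_normal())

m_max, tau = 8, 1                                # deliberately overestimated dimension (assumption)
X = delay_embedding(s[:-1], m=m_max, tau=tau)    # overestimated phase states
y = s[(m_max - 1) * tau + 1:]                    # one-step-ahead targets

model = Lasso(alpha=1e-3).fit(X, y)

# Coordinates whose weights survive the L1 penalty are kept; typically only
# the three most recent delayed coordinates (the last columns) remain here.
relevant = np.flatnonzero(np.abs(model.coef_) > 1e-2)
print("surviving delayed coordinates:", relevant)
print("rough dimension estimate:", len(relevant))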