Improvement of TestH: a C Library for Generating Self-Similar Series and for Estimating the Hurst Parameter

Bibliographic details
Main author: Ramos, Cristiano Duarte Gonçalves
Publication date: 2017
Document type: Dissertation
Language: eng
Source title: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Full text: http://hdl.handle.net/10400.6/7836
Abstract: The discovery of consistent dependencies between values in certain data series paved the way for algorithms that classify the degree of self-similarity between values and support conclusions about the behavior of these series. This self-similarity metric is typically known as the Hurst Parameter, and it allows the behavior of a data series to be classified as persistent, anti-persistent, or purely random. This discovery was highly relevant in the field of computer networks, where it has even helped companies develop equipment and infrastructure that suit their needs more efficiently. The Hurst Parameter is relevant in many other fields; for example, it has been applied to the study of geologic phenomena [KTC07] and in areas related to the health sciences [VAJ08, HPS+12]. There are several algorithms for estimating the Hurst Parameter [Hur51, Hig88, RPGC06], each with its own strengths and weaknesses. These algorithms are sometimes difficult to use, which motivates the creation of tools and libraries that provide them in a more user-friendly manner. Unfortunately, although the area has been studied for decades, the available tools have limitations and do not implement all of the algorithms available in the literature. The work presented in this dissertation consists of the improvement of TestH, a library written in ANSI C for the study of self-similarity in time series, which was initially developed by Fernandes et al. [FNS+14]. These improvements take the form of new algorithms to estimate the Hurst Parameter and to generate self-similar sequences. Additionally, auxiliary functions were implemented, the code was refactored, the application programming interface was documented, and a website was created for the project.
This dissertation focuses mostly on the algorithms that were introduced in TestH, namely the Periodogram, the Higuchi method, the Hurst Exponent by Autocorrelation Function and the Detrended Fluctuation Analysis estimators, and the Davies and Harte method for generating self-similar sequences. To make TestH a robust and trustworthy library, several tests were performed comparing the results of these implementations with the values provided by similar tools. The overall results of these tests are in line with expectations, and the algorithms that are implemented both in TestH and in the other tools analyzed (for example, the Periodogram) returned very similar results, which supports the belief that the methods were implemented correctly.
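The classification mentioned in the abstract can be illustrated with a minimal sketch in C. It relies on the standard interpretation of the Hurst Parameter H on the open interval (0, 1): values above 0.5 indicate persistence, values below 0.5 indicate anti-persistence, and H = 0.5 corresponds to a purely random series. The names `hurst_class` and `classify_hurst` and the tolerance argument `tol` are assumptions made for this illustration; they are not part of the TestH interface.

```c
/* Illustrative only: these names are hypothetical and not taken from TestH. */
typedef enum {
    HURST_INVALID,          /* estimate outside the open interval (0, 1) */
    HURST_ANTI_PERSISTENT,  /* H below 0.5 */
    HURST_RANDOM,           /* H approximately 0.5 */
    HURST_PERSISTENT        /* H above 0.5 */
} hurst_class;

/*
 * Classify an estimated Hurst Parameter h.
 * tol is an assumed tolerance around 0.5 inside which the estimate is
 * treated as "purely random", since estimators rarely return exactly 0.5.
 */
hurst_class classify_hurst(double h, double tol)
{
    double diff;

    /* The negated test also rejects NaN, which fails every comparison. */
    if (!(h > 0.0 && h < 1.0)) {
        return HURST_INVALID;
    }

    /* Absolute distance from 0.5, computed without the math library. */
    diff = h - 0.5;
    if (diff < 0.0) {
        diff = -diff;
    }

    if (diff <= tol) {
        return HURST_RANDOM;
    }
    return (h > 0.5) ? HURST_PERSISTENT : HURST_ANTI_PERSISTENT;
}
```

Under these assumptions, an estimate of 0.8 with a tolerance of 0.05 would be reported as persistent and an estimate of 0.3 as anti-persistent. A suitable tolerance depends on the estimator and on the length of the series, so it is left to the caller.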
id RCAP_aff70ad9428ee48958e1b87890a1118f
oai_identifier_str oai:ubibliorum.ubi.pt:10400.6/7836
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.title.none.fl_str_mv Improvement of TestH: a C Library for Generating Self-Similar Series and for Estimating the Hurst Parameter
author Ramos, Cristiano Duarte Gonçalves
author_facet Ramos, Cristiano Duarte Gonçalves
author_role author
dc.contributor.none.fl_str_mv Inácio, Pedro Ricardo Morais
uBibliorum
dc.contributor.author.fl_str_mv Ramos, Cristiano Duarte Gonçalves
dc.subject.por.fl_str_mv Self-Similarity
Long-Range Dependence
Hurst Parameter Estimators
Self-Similar Sequence Generators
Fractional Brownian Motion
Random Walk
Fractional Gaussian Noise
TestH
publishDate 2017
dc.date.none.fl_str_mv 2017-11-20
2017-10-03
2017-11-20T00:00:00Z
2019-12-16T16:22:39Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.6/7836
urn:tid:202336921
url http://hdl.handle.net/10400.6/7836
identifier_str_mv urn:tid:202336921
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833600918365405184