Multiresolution motif discovery in time series

Bibliographic details
Main author: Castro, Nuno Constantino
Publication date: 2010
Other authors: Azevedo, Paulo J.
Language: eng
Source title: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Full text: http://hdl.handle.net/1822/36013
Abstract: Time series motif discovery is an important problem with applications in a variety of areas that range from telecommunications to medicine. Several algorithms have been proposed to solve the problem. However, these algorithms rely heavily on expensive random disk accesses or assume the data can fit into main memory. They only consider motifs at a single resolution and are not suited to interactivity. In this work, we tackle the motif discovery problem as an approximate Top-K frequent subsequence discovery problem. We fully exploit the multiresolution capability of the state-of-the-art iSAX representation to obtain motifs at different resolutions. This property yields interactivity, allowing the user to navigate along the Top-K motifs structure. This permits a deeper understanding of the time series database. Further, we apply the Top-K Space-Saving algorithm to our frequent subsequences approach. A scalable algorithm is obtained that is suitable for data-stream-like applications where small memory devices such as sensors are used. Our approach is scalable and disk-efficient since it needs only a single pass over the time series database. We provide empirical evidence of the validity of the algorithm on datasets from different areas that aim to represent practical applications.
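The Space-Saving step mentioned in the abstract can be sketched as follows. This is a minimal Python illustration of the general Space-Saving technique for approximate Top-K counting in a single pass, not the authors' implementation. The function name `space_saving_top_k`, the `capacity` parameter, and the two-letter words standing in for discretized subsequences (for example, iSAX words) are assumptions made for this example.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def space_saving_top_k(
    stream: Iterable[Hashable], capacity: int
) -> Dict[Hashable, Tuple[int, int]]:
    """Approximate frequent-item counting with at most `capacity` counters.

    Returns {item: (estimated_count, max_overestimate)}. For every monitored
    item, the true count lies between estimated_count - max_overestimate
    and estimated_count.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")

    # item -> [estimated_count, max_overestimate]
    counters: Dict[Hashable, List[int]] = {}

    for item in stream:
        if item in counters:
            # Already monitored: increment its counter.
            counters[item][0] += 1
        elif len(counters) < capacity:
            # Free slot available: start monitoring with an exact count.
            counters[item] = [1, 0]
        else:
            # No free slot: evict the item with the smallest estimate and
            # let the newcomer inherit that count as its possible error.
            victim = min(counters, key=lambda k: counters[k][0])
            min_count = counters.pop(victim)[0]
            counters[item] = [min_count + 1, min_count]

    return {item: (count, err) for item, (count, err) in counters.items()}


# Hypothetical stream of discretized subsequence words, seen in one pass.
words = ["ab", "ab", "cd", "ab", "ef", "cd", "ab"]
result = space_saving_top_k(words, capacity=2)
print(sorted(result.items(), key=lambda kv: -kv[1][0]))
```

With only two counters, the word "ab" is counted exactly, while the estimate for "cd" carries a nonzero error bound because it re-entered the summary after an eviction. Memory use is fixed by `capacity` regardless of stream length, which is the property that makes this kind of summary attractive for small-memory devices.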
id RCAP_b7b48b4a9ee3d141b0ccbf5a438bece0
oai_identifier_str oai:repositorium.sdum.uminho.pt:1822/36013
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.title.none.fl_str_mv Multiresolution motif discovery in time series
title Multiresolution motif discovery in time series
spellingShingle Multiresolution motif discovery in time series
Castro, Nuno Constantino
Time series
Motif discovery
Frequent patterns
Multiresolution
title_short Multiresolution motif discovery in time series
title_full Multiresolution motif discovery in time series
title_fullStr Multiresolution motif discovery in time series
title_full_unstemmed Multiresolution motif discovery in time series
title_sort Multiresolution motif discovery in time series
author Castro, Nuno Constantino
author_facet Castro, Nuno Constantino
Azevedo, Paulo J.
author_role author
author2 Azevedo, Paulo J.
author2_role author
dc.contributor.none.fl_str_mv Universidade do Minho
dc.contributor.author.fl_str_mv Castro, Nuno Constantino
Azevedo, Paulo J.
dc.subject.por.fl_str_mv Time series
Motif discovery
Frequent patterns
Multiresolution
topic Time series
Motif discovery
Frequent patterns
Multiresolution
description Time series motif discovery is an important problem with applications in a variety of areas that range from telecommunications to medicine. Several algorithms have been proposed to solve the problem. However, these algorithms rely heavily on expensive random disk accesses or assume the data can fit into main memory. They only consider motifs at a single resolution and are not suited to interactivity. In this work, we tackle the motif discovery problem as an approximate Top-K frequent subsequence discovery problem. We fully exploit the multiresolution capability of the state-of-the-art iSAX representation to obtain motifs at different resolutions. This property yields interactivity, allowing the user to navigate along the Top-K motifs structure. This permits a deeper understanding of the time series database. Further, we apply the Top-K Space-Saving algorithm to our frequent subsequences approach. A scalable algorithm is obtained that is suitable for data-stream-like applications where small memory devices such as sensors are used. Our approach is scalable and disk-efficient since it needs only a single pass over the time series database. We provide empirical evidence of the validity of the algorithm on datasets from different areas that aim to represent practical applications.
publishDate 2010
dc.date.none.fl_str_mv 2010
2010-01-01T00:00:00Z
dc.type.driver.fl_str_mv conference paper
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1822/36013
url http://hdl.handle.net/1822/36013
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 10.1137/1.9781611972801.73
http://epubs.siam.org/doi/pdf/10.1137/1.9781611972801.73
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833595277221560320