Stochastic modeling of data storage systems for evaluating performance, dependability, and energy consumption

Bibliographic details
Year of defense: 2023
Main author: BORBA, Eric Rodrigues
Advisor: TAVARES, Eduardo Antonio Guimarães
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Institution of defense: Universidade Federal de Pernambuco
Graduate program: Programa de Pos Graduacao em Ciencia da Computacao
Department: Not provided by the institution
Country: Brazil
Keywords in Portuguese:
Access link: https://repositorio.ufpe.br/handle/123456789/53808
Abstract: Improvements in data storage systems may be limited by the low performance of hard disk drives (HDDs) and the high cost per gigabyte of solid-state drives (SSDs). To mitigate these issues, several architectures based on hybrid storage systems have been proposed. However, energy consumption is usually neglected, and new approaches may not consider the impact on the mechanical components of HDDs, which can result in malfunctions and data loss. Similarly, the lifetime of SSDs can be reduced owing to their limited number of flash memory operations. This thesis presents an approach based on generalized stochastic Petri nets (GSPNs) to evaluate the performance and energy consumption of homogeneous (HDD and SSD) and hybrid storage systems. Two analytical models are proposed to represent distinct workloads and estimate throughput, energy consumption, and response time. In addition, a performability model has been conceived using the GSPN and reliability block diagram (RBD) formalisms to evaluate the impact of failures on the performance of storage systems. A hierarchical modeling approach is adopted, and the proposed model can estimate availability and response time. A benchmark tool is used to generate workloads and collect data to characterize storage devices. In parallel, the power demand of HDDs and SSDs is estimated from measurements. The results are used to validate the GSPN models through statistical analysis and experiments based on industry-standard benchmarks. A design of experiments (DoE) is performed to investigate the most important factors assumed in this study. An exploratory analysis was conducted using industry datasets from Alibaba and Backblaze to investigate the distinct effects of applications on storage failures. The results demonstrate the feasibility of the proposed models and provide important observations regarding storage solutions for different applications.
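As a rough illustration of the kind of availability estimate an RBD can provide, the sketch below combines steady-state availabilities of independent blocks in series and in parallel. It is not the thesis's model: the function names, the two-device layout, and all MTTF/MTTR figures are hypothetical and chosen only for illustration.

```python
from math import prod


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of one repairable block: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def series(*blocks: float) -> float:
    """Series RBD: the system is up only if every block is up."""
    return prod(blocks)


def parallel(*blocks: float) -> float:
    """Parallel RBD: the system is up if at least one block is up."""
    return 1.0 - prod(1.0 - a for a in blocks)


# Hypothetical figures, not taken from the thesis or from any dataset.
hdd = availability(mttf_hours=100_000, mttr_hours=24)
ssd = availability(mttf_hours=150_000, mttr_hours=24)
controller = availability(mttf_hours=200_000, mttr_hours=8)

# Assumed layout: a controller in series with two redundant devices.
system = series(controller, parallel(hdd, ssd))
print(f"Estimated system availability: {system:.9f}")
```

The series and parallel formulas assume that blocks fail and are repaired independently. In a hierarchical approach, an availability obtained this way would typically feed a higher-level model that also captures performance.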