Fault Injection to Generate Failure Data for Failure Prediction: A Case Study

Bibliographic details
Main author: Campos, João R.
Publication date: 2020
Other authors: Costa, Ernesto
Document type: Article
Language: eng
Source title: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Full text: https://hdl.handle.net/10316/117481
https://doi.org/10.1109/ISSRE5003.2020.00020
Abstract: Due to the complexity of modern software, identifying every fault before deployment is extremely difficult or even impossible. Such residual faults can ultimately lead to failures, often incurring considerable risks or costs. Online Failure Prediction (OFP) is a fault-tolerance technique that attempts to predict the occurrence of failures in the near future and thus prevent or mitigate their consequences. Combined with recent technological developments, Machine Learning (ML) has been successfully used to create predictive models for OFP. However, as failures are rare events, failure data are often not available for building accurate models. Although fault injection has been accepted as a viable solution to generate realistic failure data, fault injectors are difficult to implement and update, and thus research on Operating System (OS)-level OFP has become stale, with most works using data from outdated OSs. In this paper, we conduct a comprehensive fault injection campaign on an up-to-date Linux kernel and thoroughly study its behavior in the presence of faults. We then transform the data to explore and assess the predictive performance of various ML techniques for OFP. Finally, we study the influence of different OFP parameters (i.e., lead-time, prediction-window) and compare the results with existing related work. Results suggest that the various failures observed can be grouped into categories that can then be accurately predicted and distinguished by diverse ML models.
id RCAP_00f51032ff8c884d70a6040e47c2a7ee
oai_identifier_str oai:estudogeral.uc.pt:10316/117481
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
spelling Work partially funded by FCT grant SFRH/BD/140221/2018 and project AIDA - Adaptive, Intelligent and Distributed Assurance Platform (reference POCI-01-0247-FEDER-045907), co-financed by the ERDF and COMPETE 2020 and by the FCT under CMU Portugal.
2024-12-27T16:28:05Z
Aggregator Portal
ONG
https://www.rcaap.pt/oai/openaire
2025-05-29T06:11:25.245709
dc.subject.por.fl_str_mv Dependability
Fault Injection
Failure Prediction
Machine Learning
publishDate 2020
dc.date.none.fl_str_mv 2020
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/10316/117481
https://doi.org/10.1109/ISSRE5003.2020.00020
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 978-1-7281-9870-5
https://ieeexplore.ieee.org/document/9251077
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv IEEE
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833602607474540544