Fault Injection to Generate Failure Data for Failure Prediction: A Case Study
Main Author: | |
---|---|
Publication Date: | 2020 |
Other Authors: | |
Format: | Article |
Language: | eng |
Source: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
Download full: | https://hdl.handle.net/10316/117481 https://doi.org/10.1109/ISSRE5003.2020.00020 |
Summary: | Due to the complexity of modern software, identifying every fault before deployment is extremely difficult or even not possible. Such residual faults can ultimately lead to failures, often incurring considerable risks or costs. Online Failure Prediction (OFP) is a fault-tolerance technique that attempts to predict the occurrence of failures in the near future and thus prevent/mitigate their consequences. Combined with recent technological developments, Machine Learning (ML) has been successfully used to create predictive models for OFP. However, as failures are rare events, failure data are often not available for building accurate models. Although fault injection has been accepted as a viable solution to generate realistic failure data, fault injectors are difficult to implement/update and thus research on Operating System (OS)-level OFP has become stale, with most works using data from outdated OSs. In this paper, we conduct a comprehensive fault injection campaign on an up-to-date Linux kernel and thoroughly study its behavior in the presence of faults. We then transform the data to explore and assess the predictive performance of various ML techniques for OFP. Finally, we study the influence of different OFP parameters (i.e., lead-time, prediction-window) and compare the results with existing related work. Results suggest that the various failures observed can be grouped into categories that can then be accurately predicted and distinguished by diverse ML models. |
id |
RCAP_00f51032ff8c884d70a6040e47c2a7ee |
---|---|
oai_identifier_str |
oai:estudogeral.uc.pt:10316/117481 |
network_acronym_str |
RCAP |
network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository_id_str |
https://opendoar.ac.uk/repository/7160 |
spelling |
Fault Injection to Generate Failure Data for Failure Prediction: A Case StudyDependabilityFault InjectionFailure PredictionMachine LearningDue to the complexity of modern software, identifying every fault before deployment is extremely difficult or even not possible. Such residual faults can ultimately lead to failures, often incurring considerable risks or costs. Online Failure Prediction (OFP) is a fault-tolerance technique that attempts to predict the occurrence of failures in the near future and thus prevent/mitigate their consequences. Combined with recent technological developments, Machine Learning (ML) has been successfully used to create predictive models for OFP. However, as failures are rare events, failure data are often not available for building accurate models. Although fault injection has been accepted as a viable solution to generate realistic failure data, fault injectors are difficult to implement/update and thus research on Operating System (OS)-level OFP has become stale, with most works using data from outdated OSs. In this paper, we conduct a comprehensive fault injection campaign on an up-to-date Linux kernel and thoroughly study its behavior in the presence of faults. We then transform the data to explore and assess the predictive performance of various ML techniques for OFP. Finally, we study the influence of different OFP parameters (i.e., lead-time, prediction-window) and compare the results with existing related work. Results suggest that the various failures observed can be grouped into categories that can then be accurately predicted and distinguished by diverse ML models.Work partially funded by FCT grant SFRH/BD/140221/2018 and project AIDA - Adaptive, Intelligent and Distributed Assurance Platform (reference POCI-01-0247-FEDER-045907) co-financed by the ERDF and COMPETE 2020 and by the FCT under CMU Portugal.IEEE2020info:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/articlehttps://hdl.handle.net/10316/117481https://hdl.handle.net/10316/117481https://doi.org/10.1109/ISSRE5003.2020.00020eng978-1-7281-9870-5https://ieeexplore.ieee.org/document/9251077Campos, João R.Costa, Ernestoinfo:eu-repo/semantics/openAccessreponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologiainstacron:RCAAP2024-12-27T16:28:05Zoai:estudogeral.uc.pt:10316/117481Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireinfo@rcaap.ptopendoar:https://opendoar.ac.uk/repository/71602025-05-29T06:11:25.245709Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologiafalse |
dc.title.none.fl_str_mv |
Fault Injection to Generate Failure Data for Failure Prediction: A Case Study |
title |
Fault Injection to Generate Failure Data for Failure Prediction: A Case Study |
spellingShingle |
Fault Injection to Generate Failure Data for Failure Prediction: A Case Study Campos, João R. Dependability Fault Injection Failure Prediction Machine Learning |
title_short |
Fault Injection to Generate Failure Data for Failure Prediction: A Case Study |
title_full |
Fault Injection to Generate Failure Data for Failure Prediction: A Case Study |
title_fullStr |
Fault Injection to Generate Failure Data for Failure Prediction: A Case Study |
title_full_unstemmed |
Fault Injection to Generate Failure Data for Failure Prediction: A Case Study |
title_sort |
Fault Injection to Generate Failure Data for Failure Prediction: A Case Study |
author |
Campos, João R. |
author_facet |
Campos, João R. Costa, Ernesto |
author_role |
author |
author2 |
Costa, Ernesto |
author2_role |
author |
dc.contributor.author.fl_str_mv |
Campos, João R. Costa, Ernesto |
dc.subject.por.fl_str_mv |
Dependability Fault Injection Failure Prediction Machine Learning |
topic |
Dependability Fault Injection Failure Prediction Machine Learning |
description |
Due to the complexity of modern software, identifying every fault before deployment is extremely difficult or even not possible. Such residual faults can ultimately lead to failures, often incurring considerable risks or costs. Online Failure Prediction (OFP) is a fault-tolerance technique that attempts to predict the occurrence of failures in the near future and thus prevent/mitigate their consequences. Combined with recent technological developments, Machine Learning (ML) has been successfully used to create predictive models for OFP. However, as failures are rare events, failure data are often not available for building accurate models. Although fault injection has been accepted as a viable solution to generate realistic failure data, fault injectors are difficult to implement/update and thus research on Operating System (OS)-level OFP has become stale, with most works using data from outdated OSs. In this paper, we conduct a comprehensive fault injection campaign on an up-to-date Linux kernel and thoroughly study its behavior in the presence of faults. We then transform the data to explore and assess the predictive performance of various ML techniques for OFP. Finally, we study the influence of different OFP parameters (i.e., lead-time, prediction-window) and compare the results with existing related work. Results suggest that the various failures observed can be grouped into categories that can then be accurately predicted and distinguished by diverse ML models. |
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/10316/117481 https://hdl.handle.net/10316/117481 https://doi.org/10.1109/ISSRE5003.2020.00020 |
url |
https://hdl.handle.net/10316/117481 https://doi.org/10.1109/ISSRE5003.2020.00020 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
978-1-7281-9870-5 https://ieeexplore.ieee.org/document/9251077 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
IEEE |
publisher.none.fl_str_mv |
IEEE |
dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP |
instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
repository.mail.fl_str_mv |
info@rcaap.pt |
_version_ |
1833602607474540544 |