How dependable are distributed f fault/intrusion-tolerant systems?
Main Author: | Sousa, Paulo |
---|---|
Publication Date: | 2005 |
Other Authors: | Neves, Nuno Ferreira; Veríssimo, Paulo |
Format: | Report |
Language: | por |
Source: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
Download full: | http://hdl.handle.net/10451/14135 |
Summary: | Fault-tolerant protocols, asynchronous and synchronous alike, make stationary fault assumptions: only a fraction f of the total n nodes may fail. Whereas a synchronous protocol is expected to have a bounded execution time, an asynchronous one may execute for an arbitrary amount of time, possibly long enough for f+1 nodes to fail. This can compromise the safety of the protocol and ultimately the safety of the system. Recent papers propose asynchronous protocols that can tolerate any number of faults over the lifetime of the system, provided that at most f nodes become faulty during a given window of time. This is achieved through so-called proactive recovery, which consists of periodically rejuvenating the system. Proactive recovery in asynchronous systems, though a major breakthrough, has some limitations that had not been identified before. In this paper, we introduce a system model expressive enough to represent these problems, which went unnoticed under the classical models. We introduce a classification of system correctness based on the predicate exhaustion-safe, meaning freedom from resource exhaustion. Based on it, we predict the extent to which fault/intrusion-tolerant distributed systems (synchronous and asynchronous) can be made to work correctly. Namely, our model predicts the impossibility of guaranteeing correct behavior of asynchronous proactive recovery systems as they exist today. To prove our point, we give an example of how these problems impact an existing fault/intrusion-tolerant distributed system and, having identified the problem, we suggest one (certainly not the only) way to tackle it. |
id |
RCAP_f4718bcae0239423780fbd5a2cdd2f5f |
---|---|
oai_identifier_str |
oai:repositorio.ulisboa.pt:10455/3029 |
network_acronym_str |
RCAP |
network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository_id_str |
https://opendoar.ac.uk/repository/7160 |
dc.title.none.fl_str_mv |
How dependable are distributed f fault/intrusion-tolerant systems? |
author |
Sousa, Paulo |
author_facet |
Sousa, Paulo; Neves, Nuno Ferreira; Veríssimo, Paulo |
author_role |
author |
author2 |
Neves, Nuno Ferreira; Veríssimo, Paulo |
author2_role |
author; author |
dc.contributor.none.fl_str_mv |
Repositório da Universidade de Lisboa |
dc.contributor.author.fl_str_mv |
Sousa, Paulo; Neves, Nuno Ferreira; Veríssimo, Paulo |
dc.subject.por.fl_str_mv |
Dependability assessment; fault tolerance; synchrony assumptions; proactive recovery; wormholes |
topic |
Dependability assessment; fault tolerance; synchrony assumptions; proactive recovery; wormholes |
publishDate |
2005 |
dc.date.none.fl_str_mv |
2005-02; 2005-02-01T00:00:00Z; 2009-02-10T13:12:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/report |
format |
report |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10451/14135 |
url |
http://hdl.handle.net/10451/14135 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Department of Informatics, University of Lisbon |
publisher.none.fl_str_mv |
Department of Informatics, University of Lisbon |
dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP |
instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
repository.mail.fl_str_mv |
info@rcaap.pt |
_version_ |
1833601431481876480 |