How dependable are distributed f fault/intrusion-tolerant systems?

Sousa, Paulo; Neves, Nuno Ferreira; Veríssimo, Paulo

Bibliographic Details
Main Author:	Sousa, Paulo
Publication Date:	2005
Other Authors:	Neves, Nuno Ferreira; Veríssimo, Paulo
Format:	Report
Language:	por
Source:	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full:	http://hdl.handle.net/10451/14135
Summary:	Fault-tolerant protocols, asynchronous and synchronous alike, make stationary fault assumptions: at most f of the n nodes may fail. Whereas a synchronous protocol is expected to have a bounded execution time, an asynchronous one may execute for an arbitrary amount of time, possibly long enough for f+1 nodes to fail. This can compromise the safety of the protocol and, ultimately, of the system. Recent papers propose asynchronous protocols that tolerate any number of faults over the lifetime of the system, provided that at most f nodes become faulty within a given window of time. This is achieved through proactive recovery, which consists of periodically rejuvenating the system. Proactive recovery in asynchronous systems, though a major breakthrough, has limitations that had not been identified before. In this paper, we introduce a system model expressive enough to represent these problems, which went unnoticed under the classical models. We introduce a classification of system correctness based on the predicate exhaustion-safe, meaning freedom from resource exhaustion. Based on it, we predict the extent to which fault/intrusion-tolerant distributed systems (synchronous and asynchronous) can be made to work correctly. In particular, our model predicts that the correct behavior of asynchronous proactive recovery systems, as they exist today, cannot be guaranteed. To support this claim, we give an example of how these problems affect an existing fault/intrusion-tolerant distributed system and, having identified the problem, we suggest one way (certainly not the only one) to tackle it.
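
Illustration (not part of the report): the window-based fault bound described in the summary can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the authors' system model. The function names, the fixed rejuvenation period, and the `recovery_delay` parameter are all hypothetical and were chosen only to show how a postponed rejuvenation can let more than f nodes be faulty at once.

```python
from typing import List, Tuple


def max_simultaneously_faulty(fault_times: List[float],
                              recovery_times: List[float]) -> int:
    """Largest number of nodes that are faulty at the same instant.

    Node i is treated as faulty on the half-open interval
    [fault_times[i], recovery_times[i]).
    """
    events: List[Tuple[float, int]] = []
    for start, end in zip(fault_times, recovery_times):
        events.append((start, +1))   # node becomes faulty
        events.append((end, -1))     # node is rejuvenated
    # At equal timestamps, recoveries (-1) sort before new faults (+1).
    events.sort()
    faulty = worst = 0
    for _, delta in events:
        faulty += delta
        worst = max(worst, faulty)
    return worst


def stays_within_bound(fault_times: List[float], period: float, f: int,
                       recovery_delay: float = 0.0) -> bool:
    """True if at most f nodes are ever faulty at once.

    Assumption: each node is rejuvenated at the first multiple of `period`
    after its fault, plus `recovery_delay`. The delay is a hypothetical
    stand-in for an asynchronous system in which the rejuvenation itself
    has no guaranteed completion time.
    """
    recovery_times = [
        (int(t // period) + 1) * period + recovery_delay for t in fault_times
    ]
    return max_simultaneously_faulty(fault_times, recovery_times) <= f


if __name__ == "__main__":
    faults = [3, 12, 25, 31]  # one fault per rejuvenation period of length 10
    print(stays_within_bound(faults, period=10, f=1))                    # timely recovery
    print(stays_within_bound(faults, period=10, f=1, recovery_delay=5))  # delayed recovery
```

With timely rejuvenation, each fault is cleared before the next one occurs, so the bound f = 1 holds. With the same fault pattern and a delay of 5 time units, two faulty intervals overlap and the bound is violated, which is the kind of situation the summary attributes to asynchronous proactive recovery.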

Item metadata

id	RCAP_f4718bcae0239423780fbd5a2cdd2f5f
oai_identifier_str	oai:repositorio.ulisboa.pt:10455/3029
network_acronym_str	RCAP
network_name_str	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str	https://opendoar.ac.uk/repository/7160
dc.title.none.fl_str_mv	How dependable are distributed f fault/intrusion-tolerant systems?
title	How dependable are distributed f fault/intrusion-tolerant systems?
author	Sousa, Paulo
author_facet	Sousa, Paulo; Neves, Nuno Ferreira; Veríssimo, Paulo
author_role	author
author2	Neves, Nuno Ferreira; Veríssimo, Paulo
author2_role	author author
dc.contributor.none.fl_str_mv	Repositório da Universidade de Lisboa
dc.contributor.author.fl_str_mv	Sousa, Paulo; Neves, Nuno Ferreira; Veríssimo, Paulo
dc.subject.por.fl_str_mv	Dependability assessment; fault tolerance; synchrony assumptions; proactive recovery; wormholes
topic	Dependability assessment; fault tolerance; synchrony assumptions; proactive recovery; wormholes
publishDate	2005
dc.date.none.fl_str_mv	2005-02 2005-02-01T00:00:00Z 2009-02-10T13:12:00Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/report
format	report
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10451/14135
url	http://hdl.handle.net/10451/14135
dc.language.iso.fl_str_mv	por
language	por
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Department of Informatics, University of Lisbon
publisher.none.fl_str_mv	Department of Informatics, University of Lisbon
dc.source.none.fl_str_mv	reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP
instname_str	FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv	info@rcaap.pt
_version_	1833601431481876480

Similar Items