Adaptive Replication of Large-Scale Multi-Agent Systems - Towards a Fault-Tolerant Multi-Agent Platform

Abstract : To construct and deploy large-scale multi-agent systems, we must address one of the fundamental issues of distributed systems: the possibility of partial failures. Fault tolerance is therefore an unavoidable concern for large-scale multi-agent systems. In this paper, we discuss the issues and propose an approach to fault tolerance for multi-agent systems. The starting idea is to apply replication strategies to agents, replicating the most critical agents to guard against failures. Because the criticality of agents may evolve during the course of computation and problem solving, and because resources are bounded, we need to adapt the number of replicas of each agent dynamically and automatically in order to maximize reliability and availability. We describe our approach and the related mechanisms for evaluating the criticality of a given agent (based on application-level semantic information, e.g., interdependences, and on system-level statistical information, e.g., communication load), for deciding which strategy to apply (e.g., active or passive replication), and for deciding how to parameterize it (e.g., number of replicas). We also report on experiments conducted with our prototype architecture, named DimaX.
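
As a rough illustration of adapting replica counts to criticality under bounded resources, the sketch below splits a fixed replica budget across agents in proportion to their criticality scores. It is not taken from the paper or from DimaX: the function name `allocate_replicas`, the proportional rule, the largest-remainder rounding, and the example agent names and scores are all assumptions made for this example.

```python
from typing import Dict


def allocate_replicas(criticality: Dict[str, float],
                      budget: int,
                      min_replicas: int = 1) -> Dict[str, int]:
    """Split `budget` replicas across agents in proportion to criticality.

    Hypothetical rule for illustration only: every agent gets
    `min_replicas`, and the rest is shared proportionally, with
    largest-remainder rounding so the total equals `budget`.
    Criticality scores are assumed to be non-negative.
    """
    base = min_replicas * len(criticality)
    if budget < base:
        raise ValueError("budget too small for the minimum replica count")

    alloc = {agent: min_replicas for agent in criticality}
    remaining = budget - base
    total = sum(criticality.values())
    if remaining == 0 or total <= 0:
        return alloc

    # Ideal (fractional) share of the remaining budget for each agent.
    shares = {a: remaining * c / total for a, c in criticality.items()}
    for agent, share in shares.items():
        alloc[agent] += int(share)

    # Hand out replicas lost to truncation, largest fractional part first.
    leftover = budget - sum(alloc.values())
    by_fraction = sorted(shares, key=lambda a: shares[a] - int(shares[a]),
                         reverse=True)
    for agent in by_fraction[:leftover]:
        alloc[agent] += 1
    return alloc


# Hypothetical agents and scores; rerun whenever the scores change.
scores = {"broker": 6, "planner": 3, "logger": 1}
plan = allocate_replicas(scores, budget=10)
```

Calling the function again with updated scores gives a new plan, which is one simple way to picture the dynamic adaptation the abstract describes. Choosing between active and passive replication is left out of this sketch.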
Document type :
Conference papers
https://hal.inria.fr/hal-00684961
Contributor : Ist Rennes
Submitted on : Tuesday, April 3, 2012 - 3:47:36 PM
Last modification on : Thursday, March 21, 2019 - 2:43:56 PM

Citation

Zahia Guessoum, Nora Faci, Jean-Pierre Briot. Adaptive Replication of Large-Scale Multi-Agent Systems - Towards a Fault-Tolerant Multi-Agent Platform. ACM Electronic Proceedings of the ICSE'05 4th International Workshop on Software Engineering for Large-Scale Multi-Agent Systems (SELMAS'05), May 2005, Saint Louis, United States. ⟨10.1145/1082960.1082977⟩. ⟨hal-00684961⟩
