Philip Heidelberger, Perwez Shahabuddin, et al.
ACM Transactions on Modeling and Computer Simulation (TOMACS)
In this paper we consider three aspects of modeling multiversion software. First, we propose the Beta-Binomial distribution to model correlated failures in multiversion software. Second, we present a combinatorial model to predict the reliability of a multiversion software configuration. This model can take as input failure distributions obtained either from measurements or from a selected distribution (e.g., Beta-Binomial), and various recovery methods can be incorporated in it. Third, we investigate the effectiveness of the Community Error Recovery method based on checkpointing, as suggested in [13]. This method appears to be effective only when the failure behavior of the program versions is lightly correlated. We also consider two different types of checkpoint failures: an omission failure, where the correct output is recognized at a checkpoint but the checkpoint fails to correct the wrong outputs, and a destructive failure, where the good versions get corrupted at a checkpoint. The former merely reduces the effectiveness of the checkpoints, while the latter has a catastrophic effect on reliability. © 1990 IEEE
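To make the first two points concrete, the following is a minimal sketch and not the paper's own model. It assumes that the number of versions failing on a given input follows a Beta-Binomial distribution, and that the system output is correct whenever fewer than a majority of the versions fail. The function names, the majority-voting rule, and the parameter values are illustrative assumptions; recovery methods and checkpoints are not modeled.

```python
from math import comb, exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k: int, n: int, alpha: float, beta: float) -> float:
    """P(exactly k of n versions fail) under a Beta-Binomial(n, alpha, beta) model."""
    return comb(n, k) * exp(log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))


def majority_vote_reliability(n: int, alpha: float, beta: float) -> float:
    """P(fewer than a majority of n versions fail), with correlated failures.

    Assumption: the voter delivers a correct result whenever a majority
    of versions is correct.
    """
    majority = n // 2 + 1
    return sum(beta_binomial_pmf(k, n, alpha, beta) for k in range(majority))


def independent_reliability(n: int, p: float) -> float:
    """Same quantity when versions fail independently with probability p."""
    majority = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(majority))


if __name__ == "__main__":
    # Hypothetical parameters: mean failure probability alpha / (alpha + beta) = 0.1.
    n, alpha, beta = 3, 0.1, 0.9
    p = alpha / (alpha + beta)
    print("correlated :", majority_vote_reliability(n, alpha, beta))
    print("independent:", independent_reliability(n, p))
```

In this parameterization the mean per-version failure probability is alpha / (alpha + beta), and the pairwise correlation between version failures is 1 / (alpha + beta + 1), so shrinking alpha + beta at a fixed mean strengthens the correlation. Comparing the two printed values shows how correlation alone changes the predicted reliability of the same configuration.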