Publication
HICSS 1986
Conference paper
EFFECTIVE CONCURRENT RECOVERY MECHANISMS FOR SOFT ERRORS IN MINs.
Abstract
In parallel computer systems, an interconnection network is used to share memory between processors, to exchange information between them, or both. This means that much of the system's data and control information is communicated across this network. Therefore, to avoid severe performance degradation, it is important for the network to be resilient to soft errors (transient and intermittent errors). In this paper we propose mechanisms for recovery from soft errors in multistage interconnection networks (MINs). To reduce the work done by these mechanisms, we propose localized concurrent error detection and recovery.