Publication
DSN 2008
Conference paper

Silent data corruption - Myth or reality?

Abstract

The growing complexity of the hardware and software in modern computing systems, together with semiconductor technology scaling, is increasing the likelihood of Silent Data Corruption (SDC). SDC occurs when incorrect data is delivered to the user, e.g., written to memory or the I/O system, and no error is triggered. Such events may have catastrophic effects in life-critical applications and represent a significant cost penalty for businesses. The purpose of this panel is to provide real examples of silent corruption and to discuss solutions for avoiding it. The presentations address SDC generated at the semiconductor device level as well as at the virtualization software level. Techniques for reducing SDC, from the circuit level to the system level, will be covered. Results of an extensive SDC study carried out at Los Alamos National Laboratory (LANL) on high-performance computing (HPC) platforms are also given. © 2008 IEEE.
