Classifying sensor data with CALCHAS
Abstract
Learning how to classify sensor data is one of the basic learning tasks in engineering. Data from sensors are usually made available over time and are classified according to the behavior they exhibit in specific time intervals. This paper addresses the problem of classifying finite, univariate time series that are governed by unknown deterministic processes contaminated by noise. Time series in the same class are allowed to follow different processes. In this context, the appropriateness of using induction algorithms not specifically designed for temporal data is investigated. The paper presents CALCHAS, a simple supervised induction algorithm that uses serial correlation as its inductive bias in a Bayesian framework, and compares it empirically to a popular general-purpose classifier in a NASA telemetry monitoring application. Two comparisons were performed: one in which the general-purpose classifier was applied directly to the data, and another in which features that captured serial correlations were extracted before the induction. Serial correlation appeared to be an important form of inductive bias, most effectively utilized as an integral part of the learning algorithm. Feature extraction occurs too early in the training process to utilize correlation knowledge effectively.
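
To make the notion of features that capture serial correlations concrete, the following is a minimal sketch of one common choice: the sample autocorrelation of a univariate series at lags 1 through max_lag. It is an illustration only, not the CALCHAS algorithm, and not necessarily the feature set used in the experiments; the function names lag_autocorrelation and serial_correlation_features and the max_lag parameter are hypothetical.

```python
from typing import List, Sequence


def lag_autocorrelation(series: Sequence[float], lag: int) -> float:
    """Sample autocorrelation of `series` at the given lag (illustrative helper)."""
    n = len(series)
    if lag < 0 or lag >= n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    centered = [x - mean for x in series]
    # Sum of squared deviations; zero only for a constant series.
    denom = sum(c * c for c in centered)
    if denom == 0.0:
        # Assumption: report 0.0 for a constant series instead of dividing by zero.
        return 0.0
    # Sum of products of deviations that are `lag` steps apart.
    numer = sum(centered[t] * centered[t + lag] for t in range(n - lag))
    return numer / denom


def serial_correlation_features(series: Sequence[float], max_lag: int) -> List[float]:
    """Feature vector of autocorrelations at lags 1..max_lag (hypothetical feature set)."""
    return [lag_autocorrelation(series, k) for k in range(1, max_lag + 1)]


if __name__ == "__main__":
    # A strictly alternating signal is negatively correlated at lag 1
    # and positively correlated at lag 2.
    alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
    print(serial_correlation_features(alternating, 2))
```

A vector like this could be passed to a general-purpose classifier in place of the raw samples, which corresponds to the second comparison described above. Under that setup the classifier sees only the summary statistics, so any correlation structure not captured by the chosen lags is unavailable during induction.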