Binary Classification by Stochastic Neural Nets
Abstract
We classify points in R^d (feature vectors) by functions related to feedforward artificial neural networks (ANNs). These functions, dubbed "stochastic neural nets," arise in a natural way from probabilistic as well as statistical considerations. The probabilistic idea is to define a classifying bit locally by using the sign of a hidden-state-dependent noisy linear function of the feature vector as a new (d + 1)st coordinate of the vector. This (d + 1)-dimensional distribution is approximated by a mixture distribution. The statistical idea is that the approximating mixtures, and hence the a posteriori class probability functions (stochastic neural nets) defined by them, can be conveniently trained either by maximum likelihood or by a Bayes criterion through the use of an appropriate Expectation-Maximization (EM) algorithm. © 1995 IEEE
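To make the probabilistic idea concrete, the following is a minimal sketch of an a posteriori class probability function induced by a mixture, under assumptions that the abstract does not state: each hidden state has a unit-variance isotropic Gaussian feature density, the noise on the linear function is Gaussian, and all names (class_probability, weights, means, slopes, offsets, noise_sd) are hypothetical. It is an illustration of the general construction, not the paper's exact model, and it omits EM training entirely.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gaussian_density(x, mean):
    """Unit-variance isotropic Gaussian density (an assumed component form)."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return math.exp(-0.5 * sq_dist) / (2.0 * math.pi) ** (d / 2.0)


def class_probability(x, weights, means, slopes, offsets, noise_sd=1.0):
    """Return P(bit = +1 | x) under the assumed mixture.

    For hidden state k the bit is sign(slopes[k] . x + offsets[k] + noise),
    with noise ~ N(0, noise_sd^2), so P(bit = +1 | x, k) is a normal CDF.
    The result averages these over the posterior of k given x.
    """
    # Joint weight of each hidden state with the observed feature vector.
    joint = [p * gaussian_density(x, m) for p, m in zip(weights, means)]
    total = sum(joint)

    prob = 0.0
    for j, a, b in zip(joint, slopes, offsets):
        score = sum(ai * xi for ai, xi in zip(a, x)) + b
        prob += (j / total) * normal_cdf(score / noise_sd)
    return prob


if __name__ == "__main__":
    # Two hidden states with opposite linear rules, centered at (+3, 0) and (-3, 0).
    p = class_probability(
        x=[3.0, 0.0],
        weights=[0.5, 0.5],
        means=[[3.0, 0.0], [-3.0, 0.0]],
        slopes=[[1.0, 0.0], [-1.0, 0.0]],
        offsets=[0.0, 0.0],
    )
    print(p)
```

The returned function of x has the shape of a one-hidden-layer network with input-dependent gating: the posterior state weights act as gates and the normal CDFs act as sigmoid-like units, which is one way to read the abstract's link to feedforward ANNs.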