Recurrent genetic programming
Abstract
A typical pattern recognition system consists of two stages: the pre-processing stage to extract features from the data, and the classification stage to assign the feature vector a class label. There are two kinds of feature extraction techniques with respect to the kind of data: the fixed number of features per sample generating a fixed length feature vector, and the fixed number of features per sub-sample generating a variable length feature vector due to variable number of sub-samples (frames) for each input pattern. The first kind is the most commonly used feature vector for classification methods. The second kind is usually extracted in domains where the input sample is time-variant. Examples for such domains are speech recognition, on-line handwriting recognition, time-series data and click-stream analysis in web-mining. Traditionally a separate class of machine learning algorithms consisting of Hidden Markov Models, Recurrent-Neural Networks, etc. have been employed for classification of time variant signals. Evolutionary computation techniques like genetic algorithms and genetic programming have also been used previously to optimize the architecture for HMMs or learning the weights for recurrent-neural networks. It is difficult to model such problems using genetic programming because traditional implementation of genetic programming requires a fixed length feature vector. A trivial solution is to pro-compute the cardinality of the feature vector and pad-up the empty features by a constant and use the feature vector as input to the classifier. This severely affects the learning performance of genetic programming since the temporal variation (which is the most important aspect in these applications) is difficult to represent. In this paper we describe a recurrent framework for genetic programming(GP). This framework helps place GP in the class of machine learning algorithms alongside recurrent neural networks and hidden Markov models. We describe the application of recurrent genetic programming (henceforth called R-GP) for the classification of on-line handwritten numerals obtained from tablet-based input.