Efficient implementation of morphological finite-state transition networks employing their statistical properties
Abstract
Here we study numerically the structure of directed state transition graphs for several types of finite-state devices representing morphology of 16 languages. In all numerical experiments we have found that the distribution of incoming and outcoming links is highly skewed and is modeled well by the power law, not by the Poisson distribution typical for classical random graphs. Studied for three languages, distribution of nodes according to the traffic they experience during corpora processing obeys the power law as well. Traffic and out-degree are the parameters, which affect performance of finite-state devices. We discuss how specific properties of power law, like distribution of these parameters (coexistence of small number of "hubs" with large number of "small events"), can be exploited for efficient computer implementation of finite-state devices used in morphology.