PhD DATABASE

Title:  
Comparison of Recurrent Neural Networks with Markov Models on Complex Symbolic Sequences
Abstract:  
Recurrent neural networks can be used for processing structured data such as sequences, trees, or graphs. In practice, however, it is often difficult to train a network for a given task. Problems with training algorithms and a lack of understanding of the inner dynamics have prevented recurrent networks from becoming widespread and generally accepted computing devices. Recently, much attention has been devoted to neural models with contractive dynamics based on fixed-point attractors. Several authors have proposed that even an untrained recurrent neural network can be a source of useful representations of the history of input symbols.
In this thesis, the behavior of an untrained, randomly initialized recurrent neural network is studied, and the phenomenon called Markovian architectural bias is explained. Models based on this property are then described, including the novel echo state network. The thesis demonstrates that the network exploits the history of presented symbols in much the same way as Markov models do, and a correspondence with variable-length Markov models is shown.
The theory is supported by a large number of experiments. Different training approaches and different architectures are compared on next-symbol prediction tasks over several complex symbolic sequences. Advanced training approaches based on Kalman filtering often outperform standard training algorithms when processing symbolic sequences as well. On the other hand, many results achieved with advanced and thorough training can often be attributed to the Markovian bias already present in untrained networks.
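The echo-state idea summarized in the abstract (an untrained, contractive recurrent layer whose state encodes recent symbol history, with only a linear readout trained for next-symbol prediction) can be illustrated with a minimal sketch. This is a hypothetical toy example, not code from the thesis: all sizes, the spectral-radius scaling of 0.9, the ridge-regression readout, and the periodic 3-symbol sequence are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: predict the next symbol of a periodic sequence over a
# 3-symbol alphabet, using an UNTRAINED random reservoir.
n_symbols, n_res = 3, 50

# Random recurrent weights, rescaled to spectral radius < 1 so the
# dynamics are contractive (the "echo state" condition).
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))
W_in = rng.normal(size=(n_res, n_symbols))

seq = np.tile([0, 1, 2], 200)            # training sequence
X = np.zeros((len(seq) - 1, n_res))      # collected reservoir states
x = np.zeros(n_res)
for t, s in enumerate(seq[:-1]):
    u = np.eye(n_symbols)[s]             # one-hot encoding of symbol s
    x = np.tanh(W @ x + W_in @ u)        # reservoir update (never trained)
    X[t] = x

# Only the linear readout is trained, here by ridge regression, to map
# the reservoir state to the next symbol.
Y = np.eye(n_symbols)[seq[1:]]
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ Y)

pred = (X @ W_out).argmax(axis=1)
acc = (pred == seq[1:]).mean()
```

On this trivially predictable periodic sequence the accuracy should be close to 1: the contractive state depends mainly on the few most recent symbols, so a linear readout suffices, which is exactly the Markov-like behavior the thesis analyzes.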
URL:  
Area of Science:  
Applied Informatics
PhD Student:  
Michal Čerňanský
E-mail:  
Scientific Adviser:  
associate prof. RNDr. Ľubica Beňušková, PhD.
E-mail:  
University:  
Slovak University of Technology
City:  
Bratislava
Country:  
The Slovak Republic