Adaptive processing of data structures


Paolo Frasconi DSI, University of Florence, Italy
Christoph Goller University of München, Germany
Marco Gori DII, University of Siena, Italy
Andreas Küchler University of Ulm, Germany
Alessandro Sperduti DI, University of Pisa, Italy
Ah Chung Tsoi University of Wollongong, Australia
Michelangelo Diligenti DI, University of Siena, Italy
 


Description

Connectionist and probabilistic models for machine learning are often limited to processing relatively poor data types. For example, feedforward neural networks and adaptive mixture models assume that instances are represented by fixed-size (static) tuples of attributes (when using neural networks, a tuple is typically encoded by a real valued array). While some domains naturally lead to static data representations, other domains are better described by variable length representations. In speech recognition, for example,  utterances are normally represented as a discrete time sequences of acoustic features (e.g., 12 kepstral parameters are computed every 10 milliseconds). Similarly, in molecular biology, proteins are described by variable length strings of symbols (each symbol is one of 20 amino acids). There are two important novelties when using sequences instead of static patterns. First, the number of atomic entities in a sequence is not fixed a priori, i.e. a sequence is a dynamic entity. Second, atoms in a sequence are serially ordered and the serial order relation provide us with some information which is not contained into the variables themselves. Models for sequence processing, such as recurrent neural networks (RNNs) and hidden Markov models (HMMs) are a generalization of feedforward neural networks and probabilistic mixture models that can exploit such additional information.

However, data in the real world often come with a much richer structure and serial order can be generalized to more complex relations. These relations can be naturally represented by graphs. A data structure is therefore a dynamic collection of related variables and can be conveniently represented as a graph whose nodes are labeled by variables. A sequence is just a very special case of data structure: the relation among variables is a (quite simple) graph whose topology is restricted to the class of linear chains:

Clearly, it is not difficult to imagine a generalization from serial order to partial orders (directed acyclic graphs):

Learning domains yielding  structured instances are not as unusual as one might think. The reason why structured representations are not very popular in connectionism is perhaps the lack of adaptive models that can effectively deal with graphical inputs.
For example, complex chemical compounds can be naturally represented using graphs whose nodes are labeled by atoms or simple molecules.

 

Another interesting example is syntactic pattern recognition. In syntactic pattern recognition, objects are described as labeled graphs, where labels can be chosen from a finite alphabet of symbols. In the simplest case, recognition is a merely symbolic procedure based on a formal generative definition of languages. A set of production rules (defining a formal grammar) are applied to ascertain whether the object under examination belongs to a given language and the yes/no answer is used to predict the object class. While the most widely known theory on formal language is restricted to strings (for example, the theory commonly used for programming languages), researchers in the syntactic pattern recognition community more often use languages defined on graphical domains, i.e. labeled trees or labeled webs are used instead of strings. This is because graphs have a much richer description power.

This is an example of tree representation of a logo image:

 

In this case the image was recursively decomposed into a tree whose nodes correspond to sub-objects and edges express the contained-in relation. For each sub-object, some local features are measured. In this example, shape, perimeter and area of each sub-object are local features attached to nodes.

While neural network and probabilistic architectures have been extensively studied for both static and sequential data types, architectures for data structures have received relatively little attention until recently.
 



 

References


Related events

World conference on computational intelligence WCCI-98 (Anchorage, AK). Tutorial on "Learning data structures," May 5, 1998.
NIPS*97 Postconference workshop on "Learning dynamical data structures: From sequences to graphs", Brekenridge (CO), December 1997.
The 2nd International summer school on Neural Networks ,Vietri sul Mare (Italy), September 1997. See also the book published by Springer: "Adaptive processing of sequences and data structures".
ECAI '96 Workshop on Neural Networks and Structured Knowledge (NNSK) August 12, 1996. Budapest, Hungary.

Paolo Frasconi

Last modified: Tue 30 Dec 14:38:13 MET DST 1997