Adaptive processing of data
structures
| Paolo Frasconi |
DSI, University of
Florence, Italy |
| Christoph Goller |
University of München,
Germany |
| Marco Gori |
DII, University of Siena,
Italy |
| Andreas Küchler |
University of Ulm, Germany |
| Alessandro Sperduti |
DI, University of Pisa,
Italy |
| Ah Chung Tsoi |
University of Wollongong,
Australia |
| Michelangelo Diligenti |
DI, University of Siena,
Italy |
Description
Connectionist and probabilistic
models for machine learning are often limited to processing relatively
poor data types. For example, feedforward neural networks and adaptive
mixture models assume that instances are represented by fixed-size
(static) tuples of attributes (when using neural networks, a tuple
is typically encoded by a real valued array). While some domains naturally
lead to static data representations, other domains are better described
by variable length representations. In speech recognition, for example,
utterances are normally represented as a discrete time sequences of acoustic
features (e.g., 12 kepstral parameters are computed every 10 milliseconds).
Similarly, in molecular biology, proteins are described by variable length
strings of symbols (each symbol is one of 20 amino acids). There are two
important novelties when using sequences instead of static patterns. First,
the number of atomic entities in a sequence is not fixed a priori, i.e.
a sequence is a dynamic entity. Second, atoms in a sequence are serially
ordered and the serial order relation provide us with some information
which is not contained into the variables themselves. Models for sequence
processing, such as recurrent neural networks (RNNs) and hidden Markov
models (HMMs) are a generalization of feedforward neural networks and probabilistic
mixture models that can exploit such additional information.
However, data in the real
world often come with a much richer structure and serial order can be generalized
to more complex relations. These relations can be naturally represented
by graphs. A data structure is therefore a dynamic collection of
related variables and can be conveniently represented as a graph whose
nodes are labeled by variables. A sequence is just a very special case
of data structure: the relation among variables is a (quite simple) graph
whose topology is restricted to the class of linear chains:
Clearly, it is not difficult
to imagine a generalization from serial order to partial orders (directed
acyclic graphs):
Learning domains yielding
structured instances are not as unusual as one might think. The reason
why structured representations are not very popular in connectionism is
perhaps the lack of adaptive models that can effectively deal with graphical
inputs.
For example, complex chemical
compounds can be naturally represented using graphs whose nodes are labeled
by atoms or simple molecules.
Another interesting example
is syntactic pattern recognition. In syntactic pattern recognition, objects
are described as labeled graphs, where labels can be chosen from a finite
alphabet of symbols. In the simplest case, recognition is a merely symbolic
procedure based on a formal generative definition of languages. A set of
production rules (defining a formal grammar) are applied to ascertain whether
the object under examination belongs to a given language and the yes/no
answer is used to predict the object class. While the most widely known
theory on formal language is restricted to strings (for example, the theory
commonly used for programming languages), researchers in the syntactic
pattern recognition community more often use languages defined on graphical
domains, i.e. labeled trees or labeled webs are used instead of strings.
This is because graphs have a much richer description power.
This is an example of tree
representation of a logo image:
In this case the image was
recursively decomposed into a tree whose nodes correspond to sub-objects
and edges express the contained-in relation. For each sub-object, some
local features are measured. In this example, shape, perimeter and area
of each sub-object are local features attached to nodes.
While neural network and
probabilistic architectures have been extensively studied for both static
and sequential data types, architectures for data structures have received
relatively little attention until recently.
References
- P. Frasconi and M. Gori and
A. Sperduti (1997). A general framework for adaptive processing of data structures
Tech. Report 15/97, Dipartimento di Sistemi e Informatica, Universita' di
Firenze. (gziped PostScript. 183k) .
-
A. Sperduti and A. Starita (1997)
"Supervised Neural Networks for the Classification of Structures," IEEE
Transactions on Neural Networks, 8(3).
- P. Frasconi and M. Gori and
A. Sperduti (1997). "On the Efficient Classification
of Data Structures by Neural Networks", Proceedings of the International
Joint Conference on Artificial Intelligence (IJCAI-97).
-
A. Sperduti, A. Starita, and
C. Goller (1995). "Learning
Distributed Representations for the Classification of Terms" In: Proceedings
of the 14th International Joint Conference on Artificial Intelligence
(IJCAI-95).
-
A. Sperduti (1995). "Stability
Properties of Labeling Recursive Auto-Associative Memory" IEEE Transactions
on Neural Networks 6(6), 1452-1460.
-
A. Sperduti (1994). "Labeling
RAAM" Connection Science, 6(4), 429-459.
-
C. Goller (1997). A
Connectionist Approach for Learning Search-Control Heuristics for Automated
Deduction Systems, Ph.D. Thesis,Technische Universität
München (Germany).
-
T. Schmitt, C. Goller (1997).
"Relating
Chemical Structure to Activity: An Application of the Neural Folding Architecture",
Submitted.
-
A.D. Blair (1997). "Scaling
Up RAAMs". Brandeis University Computer Science Technical Report
CS-97-192.
-
C. Goller, A. Küchler (1996).
"Learning
Task-Dependent Distributed Representations by Backpropagation Through Structure"
International Conference on Neural Networks (ICNN-96) .
-
J. B. Pollack (1990). "Recursive
Distributed Representations". Artificial Intelligence 46,
1, 77-105.
Related
events
World
conference on computational intelligence WCCI-98 (Anchorage, AK). Tutorial
on "Learning data structures," May 5, 1998.
NIPS*97
Postconference workshop on "Learning dynamical data structures: From sequences
to graphs", Brekenridge (CO), December 1997.
The
2nd International
summer school on Neural Networks ,Vietri sul Mare (Italy), September
1997. See also the book published by Springer: "Adaptive processing of
sequences and data structures".
ECAI
'96 Workshop on Neural Networks and Structured Knowledge (NNSK) August
12, 1996. Budapest, Hungary.
Paolo Frasconi
Last
modified: Tue 30 Dec 14:38:13 MET DST 1997