Suppose that many of the decisions we and other animals make are fundamentally non-deterministic. Suppose, that is, that the mechanisms in our nervous systems that make these decisions are irreducibly stochastic: no amount of information about the structure of a nervous system, its components, its past history, its current sensory inputs, etc., would allow us to predict the outcome of a decision any more definitely than as a collection of probabilities.
Suppose further that the decision-making apparatus is a network of simple but inter-communicating stochastic decision-making parts. Can we imagine and define such a part that is simple enough to be understandable, but complex enough to embody the essence of stochastic decision making? Can these parts be interconnected into complex networks capable of doing complicated things? Can we understand what such networks do? Can we describe precisely what they do mathematically? Can we design such networks to behave in ways that we specify? Can we learn anything about real, living neural networks by creating and studying models consisting of networks of these simple parts?
Answers to these questions are the subject matter of this tutorial. We will find that we can, indeed, answer all of these questions affirmatively while, at the same time, developing a very powerful and mathematically rigorous body of theoretical concepts.
The objective of this tutorial is to give participants enough information about networks of stochastic artificial neurons that they can decide whether they want to learn about the concepts in greater depth, apply them to their own work, or even build on and extend these ideas.
This tutorial is expected to be of interest to:
We will begin by reviewing key concepts from discrete probability theory: the concept of a trial, the concept of a sample space, and the concept of a measure. We will highlight the key role that time plays in these concepts, a role that can often be safely ignored when dealing with deterministic systems. We will emphasize the fact that any theory dealing with stochastic ANNs has to be supported by a mathematical formalism that allows both sample spaces and measures to be manipulated at the same time.
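A minimal sketch in Python (our own toy example, not part of the tutorial materials) can make these three concepts concrete: a sample space lists the possible outcomes of a decision, a measure assigns a probability to each outcome, and a trial draws one outcome.

```python
import random

# Toy example: a foraging decision with three possible outcomes.
# The sample space is the set of outcomes; the measure assigns each
# outcome a probability; a trial draws a single outcome.
sample_space = ["left", "right", "stay"]
measure = {"left": 0.5, "right": 0.3, "stay": 0.2}

# A measure over the whole sample space must sum to 1.
assert abs(sum(measure.values()) - 1.0) < 1e-12

def trial(rng):
    """Perform one trial: draw a single outcome according to the measure."""
    weights = [measure[s] for s in sample_space]
    return rng.choices(sample_space, weights=weights)[0]

rng = random.Random(0)
outcomes = [trial(rng) for _ in range(10000)]
```

Each call to `trial` corresponds to one decision made at one point in time; the role time plays becomes essential once many such trials are chained together.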
Next, we will define the notion of a stochastic artificial neuron (SAN), formalizing this abstract concept with a precise definition. Doing so will require that we introduce a mathematical description that completely abstracts the decision-making behavior of a SAN.
This mathematical description is a stochastic matrix which we will call a Stochastic Neural Function (SNF). We will note that a SNF, in the form of a matrix, contains information about sample spaces in the positions of its elements and about measures in the numerical values of those elements.
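As a rough sketch (the matrix values below are invented purely for illustration), a SNF for a SAN with two binary inputs and one binary output can be written as a stochastic matrix whose rows are indexed by input patterns and whose columns by output values:

```python
import numpy as np

# Hypothetical SNF for a SAN with two binary inputs and one binary output.
# Rows: input patterns 00, 01, 10, 11; columns: output 0, output 1.
snf = np.array([
    [0.9, 0.1],   # input 00 -> output 1 with probability 0.1
    [0.6, 0.4],   # input 01
    [0.4, 0.6],   # input 10
    [0.1, 0.9],   # input 11 -> output 1 with probability 0.9
])

# Each row is a probability distribution (a measure over the output
# sample space), so every row must sum to 1.
assert np.allclose(snf.sum(axis=1), 1.0)

def fire(snf, input_index, rng):
    """Sample one output decision of the SAN for a given input pattern."""
    return rng.choice(snf.shape[1], p=snf[input_index])

rng = np.random.default_rng(0)
outputs = [fire(snf, 3, rng) for _ in range(10000)]
```

The row index encodes the sample-space information (which input pattern occurred) and the numbers in the row encode the measure, mirroring the observation above.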
Next, we will consider feed-forward (FF) networks of SANs. Here again, we have to pay careful attention to time. We will introduce a formal definition of a FF network that involves a layered structure with the decisions of individual SANs being made in sequence through the layers. Feed-forward means that the inputs to a SAN in the network can come only from the inputs to the entire network or from the outputs of SANs in previous layers of the network. We will note that there is a SNF in the form of a stochastic matrix that describes the behavior of the entire network.
At this point, we will address the question: if we know the structure of a FF network and the SNFs describing its individual SANs, what does the whole network do? That is, what is the SNF that describes the entire network? Informal, ad hoc methods can be used to attack this question when networks are simple or have structures that make analysis easy. But we need a method that gives an answer in any and all cases. This is the topic we take up next.
This is a large topic. It requires the development of the algebra of stochastic matrices in some new ways and it requires the introduction of some new, but very useful, concepts. We will find that the algebra of stochastic matrices is very powerful: it allows us to manipulate sample spaces and measures at the same time; it allows us to describe networks that have one output and networks that have more than one output; it allows us to deal seamlessly with networks that are a mixture of deterministic and stochastic ANs; it allows us to deal seamlessly with stochastic (or deterministic) networks interacting with stochastic (or deterministic) environments; and it allows us to deal seamlessly with networks that have no inputs (i.e., information sources) and with networks that have no outputs (i.e., information sinks).
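To give a flavor of this algebra with a deliberately minimal example (the matrices are invented, and general networks require the fuller machinery developed in the tutorial): when two single-input, single-output SANs are connected in series, the SNF of the chain is simply the matrix product of the component SNFs.

```python
import numpy as np

# Two hypothetical single-input, single-output SANs in series.
# Rows: input value 0 or 1; columns: output value 0 or 1.
snf_a = np.array([[0.8, 0.2],
                  [0.3, 0.7]])
snf_b = np.array([[0.9, 0.1],
                  [0.2, 0.8]])

# Chaining: P(out = k | in = i) = sum over j of P_a(j | i) * P_b(k | j),
# which is exactly the matrix product of the two SNFs.
snf_chain = snf_a @ snf_b

# The product of stochastic matrices is again a stochastic matrix.
assert np.allclose(snf_chain.sum(axis=1), 1.0)
```

Multi-output networks, mixed deterministic/stochastic parts, and sources and sinks all require more of the algebra than this single chain rule, but the chain rule already shows sample spaces (indices) and measures (values) being manipulated together.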
Next, we will address the question: if we can specify what we want a FF network to do, can we design a network that meets this specification? Again, informal, ad hoc methods can be used if the SNF is simple or is structured in ways that make synthesis easy. But we also need methods that will always work for any SNF. Here, we will introduce several methods that are generalizations of methods from deterministic switching theory. We will include methods that can be used for networks with a single output and methods that can be used when more than one output is required. We will find that there are many networks that can realize any given SNF. And we will highlight several important insights, such as the fact that, given any SNF, it is always possible to find a realization consisting of a purely deterministic network, some of whose inputs are the outputs of simple information sources.
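That last insight can be sketched with a toy example of our own devising: a SAN that fires with probability 3/4, independent of its input, can be realized as a purely deterministic OR gate whose two inputs are fair coin-flip information sources.

```python
import random

def fair_bit(rng):
    """A simple information source: a fair coin flip (0 or 1)."""
    return rng.randint(0, 1)

def or_gate(a, b):
    """A purely deterministic part."""
    return a | b

# Deterministic network + stochastic information sources = stochastic
# behavior: the OR of two independent fair bits is 1 with probability 3/4.
rng = random.Random(42)
samples = [or_gate(fair_bit(rng), fair_bit(rng)) for _ in range(20000)]
```

All of the stochasticity lives in the sources; the network itself is deterministic, which is the point of the insight.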
At this point, we will be ready to discuss recurrent networks. These are networks in which the outputs of one or more SANs loop back to become inputs to SANs in earlier layers of the network. Recurrent networks have the ability to store information in "internal states": their current output decisions can be influenced not only by current inputs, as in FF networks, but also by previous inputs, previous outputs, and/or previous internal states. We will carefully define what we mean by a recurrent network and then discuss two key results. The first is that any recurrent network is equivalent to a FF network with some outputs of the entire network fed back as inputs to the entire network. This reduces the analysis of a recurrent network to the analysis of a FF network. The second is that if we know what we want a recurrent network to do, we can always describe the problem as a FF network with some outputs fed back as inputs. So we can always turn the problem of synthesizing a recurrent network into a problem of synthesizing a FF network.
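A minimal simulation sketch (with invented transition probabilities) illustrates the first result: a recurrent network can be run as a feed-forward "slice" whose state output is fed back as a state input at the next time step.

```python
import random

# A hypothetical one-step feed-forward "slice": given the fed-back state
# and nothing else, produce the current output and the next state.
# The transition probabilities are invented purely for illustration.
def ff_step(state, rng):
    if state == 0:
        next_state = 1 if rng.random() < 0.3 else 0
    else:
        next_state = 0 if rng.random() < 0.1 else 1
    output = next_state          # here the state itself is the output
    return output, next_state

# The recurrent network is just this FF slice iterated, with one output
# fed back as an input on the next time step.
rng = random.Random(1)
state, trace = 0, []
for _ in range(50000):
    out, state = ff_step(state, rng)
    trace.append(out)
```

The feedback loop is what gives the network memory: the distribution of the current output depends on the stored state, not just on external inputs.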
As an example of how the concepts outlined above can be applied to model the observed behavior of a biological system, we will next discuss the distributions of step lengths in the random walks of foraging animals. Experimental data frequently support theoretical predictions that these step lengths should tend to follow one of two patterns: an exponential distribution or a power-law distribution.
We will find that for each of these distributions, there are simple stochastic ANNs that can make the decisions that lead to the observed behavior. We will find that a single neuron, acting as an information source, can produce the decisions leading to an exponential distribution of step lengths. We will find that a simple recurrent network can make the decisions leading to a power-law distribution. Using such models, we will find that we are able to make inferences about the underlying neural structures of the animals while at the same time uncovering possible explanations for details of the observed behaviors that are otherwise puzzling. These models can also be used to make predictions about other characteristics of foraging behavior and thus suggest future experiments.
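The single-neuron case can be sketched as follows (a toy model under our own assumptions): if an information-source neuron emits a "turn" decision with fixed probability q at each time step, the run lengths between turns are geometrically distributed, the discrete analogue of an exponential step-length distribution.

```python
import random

# Toy model: a single information-source neuron decides "turn" with
# probability q and "continue" with probability 1 - q at each time step.
# The resulting step lengths follow a geometric distribution,
# the discrete analogue of an exponential distribution.
def step_length(q, rng):
    length = 1
    while rng.random() >= q:     # "continue" with probability 1 - q
        length += 1
    return length

rng = random.Random(7)
q = 0.25
lengths = [step_length(q, rng) for _ in range(20000)]
# The mean of a geometric distribution with parameter q is 1/q.
```

The power-law case requires memory, which is why a recurrent network, rather than a single memoryless neuron, is needed there.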
We will next note some of the connections of the nondeterministic theory presented here with deterministic ANN theory. In particular, we will discuss the nondeterministic generalizations of deterministic threshold logic and adaptive threshold logic. These concepts will lead into a discussion of using nondeterministic ANNs to model the adaptive behavior of players in simple repetitive games such as "matching pennies."
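As a rough illustration of the kind of adaptive play involved (the update rule below is our own simplification, not the tutorial's adaptive-threshold-logic treatment): in matching pennies, a player that tracks its opponent's empirical choice frequencies can exploit any bias in the opponent's decisions.

```python
import random

# Matching pennies: each player shows heads (1) or tails (0); the
# "matcher" wins when the choices agree. The adaptive rule below is a
# deliberate simplification: play the opponent's empirically most
# frequent choice so far, which exploits any bias in the opponent.
def adaptive_matcher(history):
    if not history:
        return 0
    return 1 if 2 * sum(history) >= len(history) else 0

rng = random.Random(3)
opponent_bias = 0.7      # a biased opponent plays heads 70% of the time
history, wins = [], 0
for _ in range(10000):
    opp = 1 if rng.random() < opponent_bias else 0
    if adaptive_matcher(history) == opp:
        wins += 1
    history.append(opp)
```

Against an unbiased opponent this rule gains nothing, which is consistent with the game-theoretic result that the unexploitable strategy in matching pennies is to randomize uniformly.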
We will conclude with an overview of the broader context within which to view networks of SANs. We will emphasize how our models inter-work smoothly and naturally with other well-established areas of knowledge, filling gaps in them and linking them together. Some of these areas are ones we have already touched upon and are, by definition, non-deterministic: information theory, game theory, and the theory of stochastic automata. Others are deterministic, such as switching theory and deterministic automata theory. Finally, coming back to human beings and, by implication, other animals, we will briefly touch on some of the key events in the long history of philosophical thought regarding the possibility that animal and human behavior may be inherently nondeterministic. Historically, this idea has been intimately intertwined with the longstanding, still-argued philosophical debate over free will and determinism. We will find that our paradigm for modeling complex automata using simple nondeterministic parts suggests that this controversy can be resolved in a way that has seldom, if ever, been considered.
Richard C. (Dick) Windecker earned his Ph.D. in experimental solid state physics at the University of Illinois at Urbana-Champaign. He retired from Lucent Technologies (Bell Labs). In between, he taught physics to undergraduate physics majors (Chiang Mai University, Thailand), did research in the area of stochastic ANNs (University of Guelph, Ontario), was a systems engineer, and was a manager of systems engineers (AT&T and Lucent Technologies). After retiring, he did systems engineering consulting for the U.S. Army (Ft. Monmouth). Currently, he is continuing to do research in the area of stochastic ANNs. He may be contacted by phone at +1 732 233-0838.