129 Corrie Road, Cambridge, CB1 3QQ, UK.
It is widely accepted that symbolic systems are useful in understanding the working of the brain, and there are many symbolic models of functions of the brain. This is based on the assumption, commonly implicit, that in the brain itself there is a symbolic system. In this article I challenge this belief, by showing that symbolic systems cannot be implemented by neurons in the brain. I base the argument on textbook knowledge from neurobiology, and the basic requirements for implementing symbolic systems. In particular, I show that there is no way to implement symbol tokens in a neuronal substrate, where the individual connections of individual neurons (as opposed to cell populations) are not well defined.
It is common for cognitive scientists and researchers in related areas to assume that human cognition can be described as a symbolic system. For example, Eysenck and Keane (1989), in a `Student's Handbook', list the basic characteristics of the information-processing framework, which they say is agreed to be the appropriate way to study human cognition (p. 9). The second and the third items on this list are:
(They do mention connectionism later).
Other cognitive psychology textbooks are less blunt, but they also tend to regard the symbolic view as the appropriate way of looking at the brain, with connectionism as an alternative. For example, Stillings et al. (1995) spend more than 2/3 of the chapter titled "The architecture of mind" (pp. 15-63) on discussion of the "symbolic paradigm", and the rest of the chapter (pp. 63-86) on connectionism and comparison between the two approaches. Note that this is also "an introduction" book, rather than a speculative effort. In line with all other textbooks, they don't discuss the question of implementation in the brain at all.
The general appeal of symbolic systems stems mainly from two related reasons:
However, these two characteristics are irrelevant to the brain, because brain systems are not necessarily simple or easy to implement on computers. The processing in the brain is done by neurons (possibly with some modulation by neuroglia), and every mechanism the brain uses must be implemented by neurons. Therefore, models of the mechanisms of the brain must be, in principle, implementable by neurons with the characteristics of the neurons in the brain. In this article I will show that symbolic systems are in principle unimplementable by neurons with these characteristics, and hence that symbolic systems are unlikely to be relevant to the brain.
Symbolic systems are based on symbols. Symbols are, according to Newell (1990) (p. 77):
"Patterns that provide access to distal structures" and "A symbol token is the occurrence of a pattern in a structure". Thus the implementation of a symbolic system requires tokens, which must have these two characteristics:
For a symbolic system to work, the operation of storing a token must satisfy two requirements:
These requirements are normally less explicit in discussions of symbolic systems. The first of them is most explicitly stated in Newell and Simon (1976, P.116): "A symbol can be used to designate any expression whatever". It is sometimes expressed in other terms (e.g. Newell (1990) talks about completeness, P. 77). The second one is taken for granted. Nevertheless, they are essential, and all implementations of symbolic models use them. It is these two requirements which I will argue are not implementable by neurons in the brain. Systems that require only one of these, or none, are not discussed in this article.
As mentioned above, the question of implementation of symbol tokens is rarely even mentioned. For example, in Newell & reviewers (1992), which discusses the symbolic system SOAR (Newell, 1990), none of the participants raises the question of implementation of SOAR in real neurons. In Vera & Simon and reviewers (1993), which is also a multi-author discussion concerning symbolic systems, it is mentioned briefly: "The way in which symbols are represented in the brain is not known. Presumably they are patterns of neural arrangement of some kind" (Vera & Simon 1993b, p. 9).
However, implicitly it is assumed that symbolic systems are implemented in the brain, as Vera & Simon (1993c, P. 120) say: "The symbolic theories implicitly assert that there are also symbol structures (essentially changing patterns of neurons and neuronal relations) in the human brain that bear one-to-one relations to the symbols of category 4 [symbols in computer memory] in the corresponding program." These authors say later in the same article (P.126): "We are aware of no evidence (nor does Clancey provide any) that research at neuropsychological level is in conflict with the symbol system hypothesis". In the next four sections I will show that our knowledge at the neurobiological level is in conflict with the symbol system hypothesis.
It is not my intention to give a full description of what is known about neurons in the brain. The interested reader can find more in any textbook about the brain (for example, Brodal 1992, Dowling 1992, Gutnick & Mody 1995, Kandel et al. 1991, Nicholls, Martin & Wallace 1992, Shepherd (ed.) 1990, Shepherd 1994). Instead, I will list those characteristics that are essential to my argument. It is important to note that the characteristics listed here are `textbook' knowledge, supported by a large body of consistent experimental evidence, accumulated over more than 100 years of research.
In the following text, I use the term `brain' to mean `vertebrate brain', and the characteristics listed here are not necessarily true for simpler brains. When numerical values are mentioned, they are mainly based on the structure of the cerebral cortex, which is the main site of thinking in the brain. Other parts of the brain deviate from these values, but these deviations do not introduce any new principles.
The characteristics that are relevant to the argument are:
At the scale of organs, brains have a well-defined structure. Parts of the brain have a reasonably well-defined structure at a smaller scale, in the region of 1mm. The connectivity at smaller scales (low-level connectivity), however, is not well specified.
For example, when an axon from the Lateral Geniculate Nucleus enters the visual cortex, it is directed to some location in the cortex, to preserve the topographic mapping of the information. This is commonly given as an example of a highly ordered connection (e.g. Shepherd (1990), p.395). However, in the cortex the axon branches into an `axon tree' which spans more than 1mm squared, and is made of hundreds of branches (Shepherd (1990), p.396). Within this region the neuron forms contacts with only some of the neurons, depending on the type of the target neuron and the location of its dendrites (mostly layer 4, in this case). This still leaves several tens of thousands of neurons to choose from (or even more), and the axon forms connections with a few thousand of these. The selection of these few thousand is essentially stochastic, by which I mean it is not related in a consistent way to the selection that other neurons make, in the same brain or in other brains.
The evidence for this is from comparison of the axon trees of different neurons, within the same brain and from brains of different animals of the same species. It is clear that the structure of the axon trees of individual neurons is not well specified. When it comes to comparison between brains, or between the two hemispheres in the same brain, it is not even possible to match individual neurons between brains, because they are too different. Since the low-level connectivity is different between individuals, it cannot be specified during development (by the genes or otherwise), and hence must be stochastic.
This conclusion tells us more than just about differences between individual brains. It tells us that the set of neurons which will tend to become active as a result of activity of some specific neuron is stochastic, i.e. uncorrelated to the set that will tend to become active as a result of the activity of any other neuron, even in the same brain. It follows immediately that the set of neurons that will become active as a result of the activity of some specific set of neurons is stochastic. In other words, the relation between some pattern of activity |X| and the pattern of activity |Y| that it will activate (the transformation |X| -> |Y|) is stochastic, i.e. uncorrelated to the relation between any other pattern of activity |X'| and the pattern of activity |Y'| that it will activate (the transformation |X'| -> |Y'|).
It should be noted that this lack of relations applies within the same brain, and this is what is meant by the term stochastic connectivity in this article. In particular, the term does not mean variations over time, and does not mean lack of correlation between relations between patterns of activity (|X| -> |Y|) and relations between entities in the outside world.
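The meaning of `uncorrelated' here can be illustrated with a minimal sketch. It assumes, purely for illustration, that each axon tree picks its targets independently and uniformly at random from the neurons within its reach; the figures of 50,000 candidates and 3,000 contacts are placeholders in the range mentioned above, not measured values. Under these assumptions, the targets shared by two axons are no more than chance predicts.

```python
import random

N_CANDIDATES = 50_000  # assumed: neurons within reach of one axon tree
N_CONTACTS = 3_000     # assumed: neurons the axon actually contacts


def random_targets(rng):
    """Return the set of neurons one axon contacts, drawn independently at random."""
    return frozenset(rng.sample(range(N_CANDIDATES), N_CONTACTS))


def shared_fraction(targets_a, targets_b):
    """Fraction of the first axon's targets that the second axon also contacts."""
    return len(targets_a & targets_b) / len(targets_a)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
targets_a = random_targets(rng)
targets_b = random_targets(rng)

observed = shared_fraction(targets_a, targets_b)
chance = N_CONTACTS / N_CANDIDATES  # overlap expected from independent choices

print(f"observed overlap: {observed:.3f}, chance level: {chance:.3f}")
```

Knowing the targets of one axon therefore says nothing about the targets of another, which is the sense of `stochastic' used in this article. The uniform sampling is a simplification: in the cortex the choice is constrained by cell type and layer, as noted above.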
It can be argued that even though the connectivity as defined by the axon trees is not well defined, some process reduces the strength of irrelevant synapses, so they become insignificant. The problem with this possibility, however, is that this process requires that the information about the correct connectivity be stored somewhere, and then affect the modification of the synapses. This information cannot be stored in the neurons themselves (because of their stochastic connectivity), and there is no other place in the brain, body or outside the body where this information (i.e. which synapses need to be eliminated to get the right connectivity) can be stored.
Note what the argument above does not say:
In the Peripheral Nervous System (PNS) the individual connections are less stochastic, but even there, in most of the cases, the low-level connectivity is not well specified. For example, normally each muscle fiber is innervated by a single axon. Initially the fiber is innervated by several axons, and then there is a process of selection, which causes all of these, except one, to retract. Which axon stays is a stochastic choice, a conclusion that is again based on comparison between individual animals.
The stochastic nature of the low-level connectivity is almost never mentioned explicitly in neurobiological textbooks, probably because their authors don't believe that this fact has any consequences. Instead, these books emphasize the order that exists at coarser resolution, often in a confusing way.
For example, Nicholls, Martin & Wallace (1992, p. 341) ask: "What cellular mechanism enable one neuron to select another out of myriad of choices, to grow toward it, and to form synapses?". They later give examples of specific connectivity. However, in all the examples that concern the vertebrate Central Nervous System (CNS), the specificity is at the level of cell populations, rather than individual connections. Thus the answer to the question is that in the CNS a neuron does not "select another". Rather, it selects a region and cell types, which still leaves quite a large spectrum for individual choices.
Maybe the worst example is in Kandel et al. (1991). On page 20 appears, as part of the `principle of connectional specificity', which is supposed to be a general property of neurons, this assertion: ".. (3) Each cell makes specific connections of precise and specialized points of synaptic contacts - with some postsynaptic target cells but not with others." The claim of `specific connections' is true in some invertebrate systems, but it is simply false when applied to the vertebrate brain. In chapter 58, `Cell migration and axon guidance', the author tries to support this assertion, but all the examples of specific connectivity are from invertebrates. There are some examples from vertebrates, but they all show connectivity between cell populations, rather than individual cells. In addition, they are all about the peripheral nervous system, except one example from the spine of the bullfrog. The vertebrate brain is not even mentioned in this chapter. It is obvious that this is because there are no examples of specific connectivity there, but the text does not actually say this. The next chapter, `Neural Survival and synapse formation', discusses only neuron-muscle junctions, and there is no further discussion of the question of specific connectivity.
Disappointingly, this is true even in books that are explicitly about the computational aspect of the brain, e.g. Churchland & Sejnowski (1992), Baron (1987), Gutnick & Mody (Eds.)(1995). For example, in Gutnick & Mody (Eds.)(1995), Section III is about "The Cortical Neuron as Part of a Network". However, only the chapter about modeling this network (Bush & Sejnowski, 1995) mentions individual connections, by saying that they assume them to be random in their simulations (P. 187). Even they don't actually discuss the point, and none of the other chapters in this section, or in the rest of the book, touches on it.
Even though the stochastic nature is not explicitly stated, it is clear from the data that is presented in these books that this is the case. One of the `distal' targets of this article is to show the significance of this fact, and hence to convince neurobiologists (and others) to pay attention to it.
How are symbol tokens implemented?
Since it must be possible to store symbol tokens in arbitrary structures during computation (in other words, they are dynamic), they cannot be implemented by static features. This means that symbol tokens cannot be implemented by patterns of neurons and the connections between them, because these are static in the time scale of thinking. The dynamic features of the brain are the activity of neurons, and to some extent the strength of the synapses. Thus symbol tokens must be implemented by patterns of activity or strength of synapses, or both.
First, let us assume that patterns of activity are used, and see if they can fulfil the requirements for symbol tokens (section 2 above). I denote symbol tokens as |x|, |y|, and the corresponding patterns of activity as |X|, |Y|.
To store a token in some arbitrary structure, it is necessary to take the token |x|, i.e. the pattern of activity |X|, and propagate it to the appropriate `location'. Note that the need to propagate the pattern is always present, no matter what the `location' is. The propagation must happen by the pattern of activity |X| activating another pattern of activity |Y|, because there is no other way in which a pattern of activity can have any effect (in the time-scale of thinking). For the transformation (|X| -> |Y|) to be regarded as moving the symbol token |x|, the result of the propagation, i.e. the symbol token |y| which corresponds to the pattern of activity |Y|, must be a `copy' of |x|, i.e. must point to the same location as |x| does.
However, as discussed in the previous section, in the brain this propagation is stochastic. This means that if the propagation was successful for some symbol token |x|, i.e. the transformation (|X| -> |Y|) causes |y| to point to the same location as |x|, it will not work for any other symbol token |x'|, which will propagate in a different way (|X'| -> |Y'|).
The stochastic propagation of patterns of activity is the most crucial point to grasp in the whole argument. It is worth noting here that this is in stark contrast with the situation in computers (more generally, artificial devices). In these, the connectivity is defined exactly and completely, and the relation between a pattern of activity |X| and the pattern of activity that it will activate |Y| is well-defined for all |X| at any location. As a result, it is possible to propagate any pattern, to any place, without any restriction, and without changing the pattern itself.
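The contrast can be made concrete with a toy sketch. Everything in it is an assumption made for illustration: patterns of activity are modelled as 32-bit tuples, `ExactLink` stands for computer-like wiring, and `StochasticLink` (a hypothetical name) maps each pattern |X| to a pattern |Y| that is fixed over time but unrelated to the mapping of any other pattern. A token counts as `preserved' when its propagated copy still gives access to the same structure.

```python
import random

N_BITS = 32  # assumed width of a pattern of activity


class ExactLink:
    """Computer-like wiring: every pattern arrives unchanged."""

    def propagate(self, pattern):
        return pattern


class StochasticLink:
    """Toy stand-in for stochastic connectivity: each X maps to a Y that is
    stable over time but unrelated to the mapping of any other X."""

    def __init__(self, seed):
        self._rng = random.Random(seed)
        self._mapping = {}

    def propagate(self, pattern):
        if pattern not in self._mapping:
            self._mapping[pattern] = tuple(
                self._rng.randint(0, 1) for _ in range(N_BITS)
            )
        return self._mapping[pattern]


def count_preserved(link, structures):
    """Count tokens whose propagated copy still accesses the same structure."""
    preserved = 0
    for pattern, structure in structures.items():
        copy = link.propagate(pattern)
        if structures.get(copy) == structure:
            preserved += 1
    return preserved


rng = random.Random(1)
# Hypothetical tokens: each random pattern designates one named structure.
structures = {
    tuple(rng.randint(0, 1) for _ in range(N_BITS)): f"structure_{i}"
    for i in range(10)
}

exact_preserved = count_preserved(ExactLink(), structures)
stochastic_preserved = count_preserved(StochasticLink(seed=2), structures)
print(f"exact link: {exact_preserved} of {len(structures)} tokens preserved")
print(f"stochastic link: {stochastic_preserved} of {len(structures)} tokens preserved")
```

With the exact link every token survives the move. With the stochastic link a token survives only if its image happens to coincide with the pattern that designates the same structure, which for patterns of this width is vanishingly unlikely.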
The other crucial point to note is that the stochastic propagation is not a noise that is added to the signal. It is the signal itself that is transformed stochastically. This contrasts with noisy channels, where the signal is not transformed, but is contaminated by noise.
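The difference between the two cases can be sketched as follows. The pattern width, the 5% flip probability and both function names are assumptions chosen for illustration only. A noisy channel delivers a pattern close to the one that was sent, whereas a stochastic transformation, modelled here as a replacement by an unrelated pattern, delivers one that differs in about half of its bits.

```python
import random

N_BITS = 256      # assumed pattern width
FLIP_PROB = 0.05  # assumed noise level of the noisy channel


def hamming(a, b):
    """Number of positions in which two patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def noisy_channel(pattern, rng):
    """The signal gets through; each bit is flipped with a small probability."""
    return [1 - bit if rng.random() < FLIP_PROB else bit for bit in pattern]


def stochastic_transform(pattern, rng):
    """Toy assumption: the output pattern is unrelated to the input pattern."""
    return [rng.randint(0, 1) for _ in pattern]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
source = [rng.randint(0, 1) for _ in range(N_BITS)]

noisy_distance = hamming(source, noisy_channel(source, rng))
transformed_distance = hamming(source, stochastic_transform(source, rng))

print(f"noisy channel: {noisy_distance} of {N_BITS} bits differ")
print(f"stochastic transformation: {transformed_distance} of {N_BITS} bits differ")
```

In the noisy case the original pattern can in principle be recovered, for example by picking the nearest known pattern. In the transformed case there is nothing to recover, because the output carries no systematic trace of the input.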
The fact that propagation of arbitrary data on a computer is not a problem is probably the reason that most people intuitively assume that there is no problem in implementing symbolic systems in the brain. The problem with this intuition is that it does not take the stochastic nature of the low-level connectivity in the brain into account.
It can be argued that the way to propagate symbol tokens is learned, or acquired by some other process. This, however, would require some part of the brain to know (in some sense) in advance the appropriate transformation (|X| -> |Y|) for each |X| between each pair of locations, so it can direct the acquisition process. In a system with stochastic low-level connectivity, there is no way to know these transformations in advance, so this is not a possible explanation.
Hence there is no way to propagate patterns of activity to arbitrary locations, so they cannot be used to implement symbol tokens.
The other possibility of moving symbol tokens, by propagating synapse strengths, is also eliminated by the argument above, because synapse strengths can be propagated only by patterns of activity, so it is stochastic too.
This also means that patterns of synapse strengths cannot be used, because the only way to propagate them is through patterns of activity.
The `cop-out' solution, of regarding any pattern in the target location as a `copy' of the source pattern, is obviously unacceptable, because this will not fulfil the other requirement of symbol tokens, i.e. that they point (allow access) to some structure. Two patterns that are related to each other in a stochastic way cannot, in general, point to the same structure.
Thus we have reached the conclusion that there isn't any feature in the brain that can be used as symbol tokens. It is important to note that the argument is general, does not depend on specific implementation details, and is applicable both to innate (genetically programmed) and learned mechanisms.
A possible objection to the argument above is that there may be a higher level of organization that may support symbol tokens. However, this is clearly not the case.
Up to a level of ~1mm, the connectivity is clearly stochastic, so the argument in section 5 applies. This eliminates any implementation that relies on primitive elements which are smaller than ~1mm, whether localized or distributed. Thus implementing symbol tokens at higher levels of organization means that the implementation is based on primitive elements of dimension of 1mm or larger, and that the total activity of the whole element, rather than its pattern at higher resolution, is the significant variable.
At that level, however, we can easily tell that there is no coherent connectivity between different elements. By `coherent connectivity' I mean a connectivity that allows one primitive element to affect other elements separately. To do this, the output from an element to other elements has to be separable in some way, so it can be controlled separately. However, when we look at any 1mm square of the cortex, the neurons that send processes to other elements are all mixed up together, on a very small scale (tens of microns, at most). Because at that level the connectivity is stochastic, these neurons cannot be controlled separately.
It is important to note that this is true for all the connections inside the cortex. Hence it is independent of what the actual elements that are postulated to be the base for implementation of the symbolic system are, provided these elements are large (1mm or larger).
The lack of coherent connectivity means that the state of the element as a whole cannot be propagated inside the cortex. Instead, it is distributed approximately equally to all its neighbors and sometimes to further elements by intracortical projections. In the neighbors, or the further elements connected by projections, it is mixed with local activity and activity from other neighbors, in a stochastic fashion. As a result, an element which is further away, and is not connected by a projection, can never `see' (be affected by) the activity in the original element. Instead, it always `sees' a stochastic mixture of activity of many elements. In that sense, the connectivity at the 1mm level is stochastic as well, and the argument of section 5 applies to that level as well.
A more coherent connectivity is seen outside the cortex, and in connections in and out of the cortex, mainly sensory input and motor output. However, these connections clearly are not coherent enough to transfer specific activity across the cortex (in the case of sensory and motor connections, it does not transfer activity across the cortex at all).
Hence, in the brain, there is no higher level of organization that can support symbol tokens.
A possible counter-argument to the argument in the previous section is as follows: If this argument is correct, it can be used to prove that people cannot handle symbols. But people can handle symbols, so the argument must be wrong. This counter-argument is wrong, because the person as a whole is qualitatively different from the components of the brain, in two fundamental properties (at least):
Components in the brain do not have these learning capabilities, so they are limited in a way that the whole person isn't. The possibility of learning how to deal with symbol tokens inside the brain is discussed in section 5.
Since components of the brain do not have sensory input and the learning capabilities of the whole person, there are many tasks that the whole person can do that components cannot do. Thus, that the person can perform some task (e.g. written communication, symbolic operations) does not prove that components of the brain can do it.
In sections 4-6 it was shown that there is no way to implement symbol tokens in the brain, so it cannot be a symbolic system. This means that the brain is not a symbolic system, and theoretical analysis of symbolic systems is not applicable to it. However, it can still be argued that experimenting with symbolic systems is useful for understanding the brain. A typical argument would be: Both the brain and the symbolic models are information-processing systems, so experimenting with symbolic systems will tell us something about the brain.
This argument is flawed, because there is no general way to know which of the features of symbolic systems are applicable to information-processing systems in general, and therefore to the brain. Hence every feature that is found in symbolic systems has to be first tested on the brain (possibly indirectly through behavior) before we know if it is applicable to the brain.
In theory, symbolic systems can still be used to direct research on the brain by suggesting hypotheses which are worth testing, and the argument in sections 4-6 is silent about this possibility. This, however, is a heuristic approach, which may or may not work. The experience with symbolic systems in the last ~40 years suggests that this approach does not work.
In general, a model is useful when it generates useful insights into the system under investigation. Symbolic systems clearly did not generate any insight into the neurobiology or anatomy of the brain, but it can be claimed that they generated useful insights into human thinking.
It is problematic to decide what is a `useful insight', but a plausible heuristic is that useful insights will be mentioned in the basic textbooks of the relevant subject. Inspection of textbooks in cognitive psychology (Eysenck and Keane 1989, Matlin 1994, Mayer 1992, Stillings et al. 1995), and even of books that are specifically about symbolic system models (e.g. Baars 1988, Johnson-Laird 1993, Newell 1990), does not show any insight into human behavior or thinking which was generated by testing hypotheses derived from symbolic system models.
These books are full of models of human behavior, but in all the cases the behavior was first noticed, or postulated based on the researcher's knowledge, and then modeled. Thus the model was not useful in finding the behavior. It can be argued that the model was useful for testing the mechanisms underlying the behavior, but if the brain does not implement a symbolic system, these tests are invalid.
The illusion that symbolic models are useful is mostly based on the implicit assumption that the brain implements a symbolic system too, and hence that if a model can reproduce the behavior of humans in some situation, or generate hypotheses that can be tested, it is necessarily useful. When it is realized that the brain cannot be implementing a symbolic system, and the symbolic models are evaluated by the real parameter, i.e. generating insights, they seem much less useful.
A few counter-arguments that were raised in formal discussions (seminars, mail groups) of the ideas in this text are:
The discussion in sections 4-6 is fairly short and straightforward, is based on well-known facts, and leads to a quite important conclusion. Why wasn't it noticed before?
First, it is not true that it was not noticed before. It is more accurate to say that it was never put so explicitly. For example, Robinson (1995) reaches a similar conclusion. The main difference is that in this text I explicitly discuss the neurobiological evidence that I used to reach my conclusion, while Robinson (1995) assumes more or less the same without giving neurobiological evidence. This allows me to reach a much stronger conclusion, and opens the way for a discussion of the validity of the argument.
The connectionists (Rumelhart & McClelland, 1986) obviously see problems with symbolic systems, but I haven't seen anyone explicitly stating the impossibility of implementing symbolic systems. For example, Churchland and Sejnowski (1992) discuss the computational aspects of the brain, but do not touch the question of determination of individual connections. On the other hand, there were several efforts to merge the two approaches (e.g. Smolensky 1988).
Even some of those who believe in symbolic systems seem to realize that there are some problems. For example, the last item in the list of the main characteristics of the information-processing framework which is given by Eysenck and Keane (1989) (p. 9, see in the introduction above) reads: "This symbol system depends on a neurological substrate, but is not wholly constrained by it". It is not obvious what `not wholly constrained' in this statement means, but a plausible interpretation is that the authors realize that symbolic systems cannot be implemented by neurons, but don't think it is an important point.
The question why the argument was not put explicitly still remains. I think the best explanation is that the argument requires understanding both neurobiology and implementation of symbolic systems. Neurobiologists do not realize the importance of the stochastic connectivity for theories of cognition, while computer scientists don't know enough neurobiology to realize that there is a problem.
The latter is not helped by the fact that neurobiological texts tend to strongly emphasize the coarser order, and use terms like "specific" and "precise" to describe it. For computer scientists, these terms imply that all the connections are well specified, the way they are in computers. Clarification of these terms would be a great help in understanding this domain.
When it comes to Artificial Intelligence, the argument in sections 4-6 is of no great consequence. Even if the brain is not a symbolic system, symbolic systems may still be the best way of building artificial systems. It is also possible in principle that there are living intelligent creatures somewhere in the universe that have thinking systems based on symbols.
When it comes to research about the way the brain works, the argument has crucial implications. It shows that symbolic systems are incompatible with what we currently know about the brain.
Thus, these systems need very strong supporting evidence before they can be regarded as real candidates for modeling brain mechanisms. Since this evidence is lacking, symbolic systems do not deserve the attention they get, and researchers of the brain would do better to explore other avenues.
Baars, Bernard J. (1988), A Cognitive Theory Of Consciousness. Cambridge, UK: Cambridge University Press.
Baron, Robert J. (1987), The Cerebral Computer: An introduction to the computational structure of the human brain. Hillsdale, NJ: Lawrence Erlbaum Associates.
Brodal, Per (1992), The Central Nervous System, structure and function. New York, NY: Oxford University Press.
Bush, Paul & Sejnowski, Terrence J. (1995), `Models of Cortical Networks', in Gutnick, Michael J. & Mody, Istvan (Eds.) The cortical neuron. New York, NY: Oxford University Press, pp. 174-189.
Churchland, Patricia S. & Sejnowski, Terrence J. (1992). The Computational Brain. Cambridge, MA: MIT Press.
Dowling, John E. (1992). Neurons and Networks: An introduction to neuroscience. Cambridge, MA: Harvard University Press.
Eysenck, Michael W. & Keane, Mark T. (1990), Cognitive Psychology: A Student's Handbook. Hove and London: Lawrence Erlbaum Associates.
Gutnick, Michael J. & Mody, Istvan (Eds.)(1995), The cortical neuron. New York, NY: Oxford University Press.
Johnson-Laird, Philip (1993), The Computer And The Mind (second edition). London, UK: Fontana Press.
Kandel, Eric R., Schwartz, James H., Jessell, Thomas M. (Eds.) (1991), Principles of Neural Science (third edition). New York: Elsevier.
Matlin, Margaret W. (1994), Cognition (third edition). Fort Worth: Harcourt Brace Publishers.
Mayer, Richard E. (1992). Thinking, problem solving, cognition (second edition). New York: W.H. Freeman and Company.
Newell, A., & Simon, H.A. (1976), `Computer Science as Empirical Inquiry: Symbols and Search.' Communications of the Association for Computing Machinery 19, pp. 113-126.
Newell, Allen (1990), Unified Theories of Cognition. Cambridge, MA: Harvard University Press.
Newell, Allen and reviewers (1992), `Precis of Unified Theories of Cognition'. Behavioral and Brain Sciences 15, pp. 425-492.
Nicholls, John G., Martin, A. Robert & Wallace, Bruce G. (1992), From Neuron to Brain (third edition). Sunderland, MA: Sinauer Associates Inc.
Robinson, William S. (1995), `Brain Symbols and Computationalist explanation'. Minds and Machines 5 pp. 25-44.
Rumelhart, David E., & McClelland, James L. (1986), Parallel Distributed Processing. Cambridge, MA: MIT Press.
Shepherd, Gordon M. (Ed.)(1990), The synaptic organization of the brain (third edition). New York, NY: Oxford University Press.
Shepherd, Gordon M. (1994). Neurobiology (third edition). New York, NY: Oxford University Press.
Smolensky, Paul (1988), `On the proper treatment of connectionism'. Behavioral and Brain Sciences 11, pp. 1-74.
Stillings et al. (1995), Cognitive Science: An introduction. Cambridge, MA: MIT Press.
Vera, A. & Simon, H.A. and reviewers (1993a). `Special Issue: Situated Action.' Cognitive Science 17, pp. 1-133.
Vera, A. & Simon, H.A. (1993b), `Situated Action: A symbolic interpretation.' Cognitive Science 17, pp. 7-48.
Vera, A. & Simon, H.A. (1993c), `Situated Action: Reply to Clancey.' Cognitive Science 17, pp. 117-133.