[1.2] Here I am going to argue that this approach is incomplete, and needs additional conditions to be useful. The argument is based on three points:
[2.1] The reason that computer models are regarded as a useful test for cognitive models is that for a model to actually run on a computer, it must be fully specified. When a model is constructed, there is a serious danger of underspecification, i.e. of omitting essential assumptions. When the model is implemented on a computer, however, an omitted essential assumption causes the model to malfunction, so the problem of underspecification is alleviated.
[2.2] Since models of human cognition are necessarily complex, the danger of underspecification is large, and a method of eliminating it would be very useful. Testing the models on a computer seems, at first glance, to provide an appropriate test. In the following text I will try to show why this is not true unless some additional conditions are fulfilled.
[2.3] A common misconception about modeling is that if the model generates the same results as the system being modeled, this strongly supports the hypothesis that the modeled system operates on the same principles as the model. This is true only if the model has few parameters with a small range (a small parameter space), and if the number of reasonable models is small (a small model space).
[2.4] The second condition (a small model space) is commonly neglected, yet it is as important as the first. It can be neglected only when the model space is obviously small. When constructing models of human cognition, however, the model space is huge, and it is essential to restrict it as much as possible.
[2.5] Without additional restrictions, the correlation between the behavior of human cognition and some specific model cannot be used as evidence for the correctness (according to the definition in [1.1] above) of the model. This is because in a large model space it is likely that some models will replicate the behavior of the system in some situations by chance. If the model space is large enough, it is possible to find models that would replicate any set of observations, yet would still be wrong, i.e. either fail to predict anything, or generate wrong predictions in other situations.
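This point can be sketched concretely. The following is a minimal illustration, assuming polynomials of unbounded degree as a toy stand-in for an unrestricted model space, fitted to made-up 'observations' (all numbers here are invented for the example):

```python
# Sketch: in a large enough model space, some model fits any finite set of
# observations exactly, yet has no predictive value. The "model space" here
# is polynomials of unbounded degree, built by Lagrange interpolation.

def lagrange_fit(points):
    """Return a polynomial passing exactly through all (x, y) points."""
    def poly(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return poly

# Any observations at all can be "explained" perfectly...
observations = [(0, 3.0), (1, -1.0), (2, 4.0), (3, 0.5)]
model = lagrange_fit(observations)
assert all(abs(model(x) - y) < 1e-9 for x, y in observations)

# ...but the perfect fit says nothing about unobserved situations:
print(model(10))   # an arbitrary extrapolation, not a prediction
```

Since this works for any set of observations, a perfect fit by itself tells us nothing about whether the model is right.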
[2.6] Considering the size of the database of human behavior, and the size of the model space for intelligent systems, checking all of these models would take an infinite amount of time. It is therefore essential to put more constraints on the model space. These constraints can come from looking at the brain itself.
[3.1] It is quite common to claim that the brain is like a computer in being an 'information-processing system', and at that level of generality this is true. However, a closer look immediately reveals fundamental differences.
[3.2] The first significant difference is at the basic operation level. The basic operations of computers are:
[3.3] These are based on two central components:
[3.5] Note that fulfilling these conditions is not intrinsic to the r-address, but depends on a system that interprets the r-address. Thus what counts as an r-address depends on the way the computer memory is implemented.
[3.6] For a device to be used as a computer memory, it must be capable of storing a value at an r-address and later, when given the r-address, returning that value. Any device which can achieve this (at an acceptable speed) can serve as a computer memory, independently of its physical makeup. Conversely, a device which cannot respond in this way cannot serve as a computer memory.
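The store/retrieve contract just described can be sketched as code. This is a minimal illustration, using a Python dict as one hypothetical device that happens to fulfill the contract; the class and method names are invented for the example:

```python
# Sketch of the behavioral contract of a computer memory: store a value at
# an r-address, and later, given the same r-address, get the value back.
# Any device meeting this contract qualifies, regardless of physical makeup.

class Memory:
    """Minimal device satisfying the store/retrieve contract."""
    def __init__(self):
        self._cells = {}

    def store(self, r_address, value):
        self._cells[r_address] = value

    def retrieve(self, r_address):
        # The r-address alone must suffice to get the value back.
        return self._cells[r_address]

mem = Memory()
mem.store(0x2F, "some value")
assert mem.retrieve(0x2F) == "some value"
```

The argument of the following paragraphs is that no combination of neurons can satisfy this contract.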
[3.7] Even with our current limited understanding of neurons and the brain, it is already clear that there is no way to implement r-addresses in neurons, so human cognition cannot be based on r-addresses and computer memory. (I am using the term 'neurons' to include all the cells which take part in the activity of the brain, which may also include neuroglia.)
[3.8] The logic behind this claim is as follows: whatever an r-address in the brain is, it must be made of some combination of neurons, neuronal activity patterns, synapse strengths and maybe diffusible signals.
[3.9] Diffusible signals are obviously not useful for this purpose, being too slow and too spatially unspecific. Neurons and synapses cannot move on the time scale of thinking, which leaves neuronal activity patterns.
[3.10] Neuronal activity patterns cannot both move and continue to point to the same location. This is because the low-level connectivity of neurons is stochastic, so the transformation of information as it moves along is not pre-defined. This is discussed in full in brain symbols.
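A toy simulation may make this concrete. The sketch below assumes a small, hypothetical layer whose connectivity is drawn at random; it is not a model of real neurons, only an illustration that a pattern crossing stochastic connectivity is transformed in a way no downstream reader can anticipate:

```python
import random

# Toy sketch: a fixed but randomly drawn ("stochastic") connectivity layer.
# The sizes, threshold, and wiring rule are all hypothetical.

random.seed(1)
N = 16
# Each target neuron receives input from 3 randomly chosen source neurons.
wiring = [random.sample(range(N), 3) for _ in range(N)]

def propagate(pattern):
    """Activity pattern after crossing the stochastic layer (threshold 2)."""
    return [int(sum(pattern[src] for src in wiring[tgt]) >= 2)
            for tgt in range(N)]

pattern = [1, 0] * (N // 2)
moved = propagate(pattern)
# In general the moved pattern differs from the original; without knowing
# the wiring itself, there is no way to recover what it 'pointed to'.
print(pattern)
print(moved)
```

Since the wiring differs from brain to brain and is not itself readable by the system, a moving pattern cannot keep functioning as a pointer to a fixed location.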
[3.11] An immediate conclusion is that neurons cannot be used to implement computer memory. Thus any model whose implementation relies on computer memory, or in other words on r-addresses, cannot be implemented by neurons, and is therefore incorrect (according to the definition in [1.1]). This is an important conclusion, as researchers normally assume that everything that can be implemented on computers can run in the brain.
[3.12] In general, models which run on a computer must be fully specified for computers, i.e. they must rely on computer memory. Thus not only are they not more likely to be correct, they are necessarily wrong.
[3.13] This is true for the fully specified model. It is not true if the model can be implemented, at some level above the basic operations, by primitives which can be implemented by neurons (e.g. connectionist models). However, models which do not explicitly aim to be based on such a level are unlikely to fulfill this requirement, and are therefore likely to be unimplementable in neurons, and hence wrong.
[3.14] It can be argued that if the model does contain a level implementable in neurons, it avoids this problem, and to some extent this is true. However, this reopens the problem of full specification, as the possibility (and implications) of implementing this level in the brain cannot be evaluated by running the model on a computer.
[3.19] A possible objection is that the connectivity of the neurons is such that the neural code of each representation is mixed only with itself, in a conservative way (keeping at least the 'gist' of it). However, the 'gist' of a representation must include (or be completely made of) associations to other representations. These must point to the other representations, so to be transferable they must be r-addresses, which, as discussed above, cannot be implemented in the brain.
[3.20] It follows that the attributes of an arbitrary representation cannot be transferred in the brain; neither can the address of a representation, since that would require r-addresses.
[3.21] This means that any operation on a representation which relies on its attributes must happen at the location of the representation. Most importantly, comparison between arbitrary representations cannot be executed at the implementation level.
[4.2] Another characteristic of the brain is that it is self-adaptive, i.e. any change in its functionality, in particular learning, is accomplished solely by the brain itself (without guidance from an external source).
[4.3] Thus, when a model is constructed for an activity of human cognition, we are trying to mimic a complex self-adaptive system. The question, therefore, is whether a simple model can mimic the behavior of a complex self-adaptive system.
[4.4] Let us assume that a model explains the operation of responding (O) to an input (I) by the sequence:
(a) I => A => B => C => O
Where A, B and C are some internal entities, and '=>' denotes some relation between them (e.g. 'a => b' may mean 'a activates b').
[4.4] Then, for the model to be useful for understanding the complex system, the complex system has to perform the operation either by the sequence above, or by the sequence:
(b) I => A(1,2,..) => B(1,2,..) => C(1,2,..) => O
Where 'A(1,2,..)' means many items which can be grouped together (by some attribute).
[4.5] The simple model would be wrong if the complex system has a different simple sequence, or alternatively if the complex system generates the response O by a complex sequence:
(c) I => [A,B,C](1,2,...) <=> [A,B,C](1,2,...) => O
Where the terms in the middle mean 'many complex interactions between many items'.
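The three alternatives can be sketched as toy code. The arithmetic below is purely a hypothetical stand-in for the internal entities; the point is only the shape of the computation, not its content:

```python
def sequence_a(I):
    """(a) I => A => B => C => O : one fixed chain of single entities."""
    A = I + 1          # hypothetical internal entities
    B = A * 2
    C = B - 3
    return C           # O

def sequence_b(I, n_items=6):
    """(b) many items per stage, but all items in a group act alike."""
    a = [I + 1 for _ in range(n_items)]   # A(1,2,..): interchangeable items
    b = [x * 2 for x in a]
    c = [x - 3 for x in b]
    return c[0]        # all items agree, so any one of them yields O

def sequence_c(I, n_items=6, steps=4):
    """(c) many distinct items interacting repeatedly before producing O."""
    items = [I + k for k in range(n_items)]      # [A,B,C](1,2,...)
    for _ in range(steps):                       # <=> mutual interactions
        items = [(x + sum(items)) % 17 for x in items]
    return sum(items)  # O read out from the whole interacting population

# (a) and (b) behave identically; (c) also maps I to a response, but by
# a path that no simple chain of stages describes.
print(sequence_a(5), sequence_b(5), sequence_c(5))
```

Note that from the input-output behavior alone, nothing distinguishes which of the three shapes the real system uses.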
[4.6] The belief that the simple model is likely to be correct for the complex system is based on two assumptions:
[4.9] Is overlap of operations deleterious or advantageous? It is advantageous when there are interdependencies between the operations, because it allows the interaction without any additional cost. It is deleterious for operations which have no interdependencies. In the case of learned operations, even this is not true, as an operation may at any time need to become interdependent with other operations as a result of further learning (for an innate operation, evolution may have 'concluded' that it will never need to form interdependencies with other operations).
[4.10] In addition, a reduction in the number of items in an operation makes the operation more sensitive to damage to each of these items, which offsets to a large extent the possible gain in economy.
[4.11] If the system performs the operation using many items, would it use sequence (b) or sequence (c)? Intuitively, we prefer sequence (b), but this preference is based on experience with externally designed systems. For these systems, simplicity is important because it allows the external designer to evaluate changes in the system more easily.
[4.12] For this argument to be relevant to a self-adaptive system, we must postulate an internal designer which maintains and develops the operations of the system. However, the internal designer (if there is one) would keep the operations simple according to its own notion of simplicity, which may differ radically from ours.
[4.13] Without an internal preference for sequence (b), it is extremely unlikely to be used, as it is ordered in a way that confers no functional advantage. Such order can arise either by chance (extremely unlikely), or from some order in the underlying system. The latter may be true in a few very specific cases, but not in the general case.
[4.14] In addition, sequence (c) opens many more possibilities for change than sequence (b). In sequence (b), a change to any of the items in group A would yield the same effect, so the number of possible distinct changes is limited; in sequence (c), where each item performs a different action, each change would yield a different effect. For the same reason, sequence (c) allows many more kinds of interdependence with other operations.
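The counting behind this point can be sketched numerically. All numbers and weights below are hypothetical; the sketch only counts how many distinct effects on the output a single-item change can produce under the two arrangements:

```python
import random

random.seed(0)          # hypothetical fixed "wiring"
n = 12                  # hypothetical number of internal items

# (b)-style: items fall into 3 groups; within a group every item
# contributes identically, so output depends only on the group totals.
group_of = [i % 3 for i in range(n)]
def output_b(items):
    totals = [sum(v for v, g in zip(items, group_of) if g == k)
              for k in range(3)]
    return totals[0] + 2 * totals[1] + 3 * totals[2]

# (c)-style: every item has its own weight, i.e. its own distinct role.
weights = [random.randint(1, 100) for _ in range(n)]
def output_c(items):
    return sum(w * v for w, v in zip(weights, items))

base = [1] * n
def distinct_effects(f):
    """Count the distinct outputs a single-item change can produce."""
    effects = set()
    for i in range(n):
        changed = list(base)
        changed[i] = 2
        effects.add(f(changed))
    return len(effects)

# In (b) a change to any item of a group has the same effect (at most 3
# distinct effects); in (c) each item's change can have its own effect
# (up to 12 here).
print(distinct_effects(output_b), distinct_effects(output_c))
```

The gap grows with the number of items, so for operations built of many items the (c)-style arrangement offers far more room for adaptive change.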
[5.2] A model which is not constrained by these conditions is more likely to be wrong than not, i.e. it is unlikely to give a suitable description of the system. This is true even if it can nicely explain some restricted set of observations, because in an unrestricted model space there is an infinite number of models which can explain any restricted set of observations, and the probability of finding the right one is very small.
[5.3] The conclusions of section 3, as summarised in condition 1 above, put very severe constraints on possible models, and thus significantly increase their probability of being correct.
[5.4] The discussion in section 4, summarized in condition 2 above, suggests that for learned operations, and possibly for many innate ones, there cannot be a simple model, because their implementation is not simple. In addition, complex operations are likely to vary across individuals. These conclusions seem daunting, yet they agree well with the extreme difficulty of finding any useful generic models of the operations of human cognition.