Psycholinguistics blatant nonsense examples

1. Introduction

"Innatist" psycholinguists (psycholinguists who argue for the innateness of language) often present arguments for innate universals which are blatant nonsense, and in this text I present and analyze a few of these. By 'blatant nonsense' I mean a case in which a simple analysis shows that the argument is wrong.

An interesting question is how it is possible for blatant nonsense arguments to be part of what is supposedly scientific writing. The answer has several parts:

  1. Many people commit the conclusion-validation error (see Reasoning errors), i.e. they like the conclusion of the argument, so they accept the argument itself. The conclusion (that there is something innate) is part of the foundations of psycholinguistics, so only people who accept it enter psycholinguistics in the first place. This explains how experts in psycholinguistics can accept (and present) blatant nonsense, even though they could easily see that it is nonsense.

  2. The Blatant nonsense effect causes people to accept blatant nonsense arguments, especially if they are not themselves experts in the field. That explains how psycholinguistic nonsense arguments can escape criticism from outsiders.

The examples here are the most compact that I could find. They are repeated many times in the psycholinguistic literature, but in most cases they are hidden in longer and more diffuse discussions, which makes it difficult to identify them.

2.1 Binding principles.

Chomsky (1986) argues for a theory of binding consisting of these three rules, which he believes are probably innate in some way (P. 166):

  1. An anaphor is bound in a local domain.
  2. A pronominal is free in a local domain.
  3. An R-expression is free (in the domain of the head of its chain).

(Bound means referring to the same entity (noun-phrase) as another noun-phrase, and free is the inverse).

This is blatant nonsense, because these 'innate rules' are not rules, they are tautologies. For example, we call 'anaphors' those words that must be bound in the local domain (approximately), so obviously they are almost always bound in the local domain. The same applies to pronominals being free in the local domain, and to R-expressions always being free {1}.

This case relies on the fact that the definitions of anaphor, pronominal, and R-expression are not given in the same terms that are used to express the rules. Instead, they are given in an intuitive way, which camouflages the identity between the definitions and the rules.

These rules are reproduced many times with different terms, e.g. 'governing category' instead of 'local domain', and appear in introductory textbooks (e.g. Akmajian et al. (1995), P. 491).

2.2 A 'lack of negative evidence' example

In developing the binding theory above, Chomsky (1986, p. 7-8) presents this example:

  (A) I wonder who the men expected to see them.
  (B) The men expected to see them.

The point of the example is that in the first sentence the pronoun 'them' may refer to the men, while in the second it cannot. Chomsky claims that these facts 'are known without relevant experience to differentiate the case'.

Let's see. Apart from the basic knowledge of English, the hearer needs to know that:

  1. The word 'who' introduces a question, and is followed by a sentence in which a noun-phrase specifying a person is missing, and the question is about the identity of this missing noun-phrase.

  2. The combination 'expected to' is a shorthand for 'expected X to', where X is the same entity that is the subject of 'expected'.

  3. When the subject and the object of the verb 'see' refer to the same entity, the object must be one of the Xself words.

With rule (1), when sentence (A) is heard the listener looks for a missing noun-phrase in the sentence following the word 'who', and the only possibility is 'I wonder who the men expected [missing phrase] to see them'. Hence the subject of 'to see' is [missing phrase]. With rule (2), sentence (B) is interpreted as 'The men expected [the men (themselves)] to see them'. Hence the subject of the verb 'to see' is the men. Rule (2) is not applicable to sentence (A) because after the application of rule (1), it does not have the combination 'expected to' (it is now 'expected [missing phrase] to').

In both sentences 'them' is the object of 'to see'. Therefore, if 'them' in sentence (B) refers to the men, the subject and object of 'to see' are both the men, so the object must be 'themselves', which it isn't, so 'them' cannot refer to the men. In sentence (A), this problem does not arise, so 'them' may refer to the men.
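The three rules can be written down almost literally. The following is a minimal Python sketch, not a parser: the function names, the string matching and the '[missing phrase]' marker are illustrative assumptions of mine, and the two sentences are the ones quoted in the analysis above. It only covers these two sentences.

```python
# Toy encoding of rules (1)-(3) above. All names here are hypothetical
# illustrations; this handles only the two sentences under discussion.

SENTENCE_A = "I wonder who the men expected to see them"
SENTENCE_B = "The men expected to see them"


def tokens(sentence):
    """Lower-case the sentence and split it into words."""
    return sentence.lower().rstrip(".").split()


def subject_of_to_see(sentence):
    """Work out who does the seeing, using rules (1) and (2)."""
    words = tokens(sentence)
    if "who" in words:
        # Rule (1): 'who' is followed by a sentence with a missing
        # noun-phrase; the only free slot is 'expected [missing phrase] to'.
        return "[missing phrase]"
    # Rule (2): bare 'expected to' stands for 'expected X to', where X is
    # the subject of 'expected', i.e. the words before it.
    return " ".join(words[: words.index("expected")])


def them_may_refer_to(sentence, candidate):
    """Rule (3): same subject and object of 'see' requires an Xself object."""
    obj = tokens(sentence)[-1]
    if subject_of_to_see(sentence) == candidate:
        return obj.endswith(("self", "selves"))
    return True


print(them_may_refer_to(SENTENCE_A, "the men"))  # True
print(them_may_refer_to(SENTENCE_B, "the men"))  # False
```

Nothing in the sketch goes beyond the three pieces of knowledge listed above plus splitting a sentence into words.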

Hence, Chomsky's claim that the facts are known 'without relevant experience' means that English speakers don't have experience with at least one of the rules above. Which one?

It can be argued that the analysis above implicitly uses many other pieces of knowledge. This is definitely true, but what we are looking for is pieces of knowledge that English speakers have no experience with. Can you find any of the latter?

Chomsky himself gives his own analysis of the problem (P. 106), which is in broad outline the same as the analysis above, but differs in two important respects:

  1. It is much more formal. This makes it difficult to realize that the knowledge that is required is quite straightforward.

  2. Chomsky assumes implicitly that the hearer always uses generic rules, applied to the specific case. For example, he assumes that people know how to use the verb 'see' because they know the generic rules of how to use verbs, and apply these rules to 'see'. In the case of 'see', however, that is unlikely, because people have a huge amount of experience with the usage of the verb 'see', and probably learn to use it specifically.

    This assumption (that people always use generic rules) makes it difficult to see how people could have learned them without direct instruction. It is obvious, however, that when people (mainly children) learn language by experience, they first learn and use specific rules, and then generalize them.

These two aspects of Chomsky's explanation make it look complex, and hence difficult to learn without innate knowledge.

This blatant nonsense relies on the reader not trying too hard to analyze Chomsky's claim. Examples of this kind are used throughout psycholinguistics texts. Chomsky himself uses this kind of argument to prove everything, including things like the innateness of moral values (Chomsky, 1988, P. 152).

Here is another example (Culicover, 1997, P.4):

  1. Who did you buy a picture of?
  2. *Who did you buy a Mary's picture of?

It does not appear that children are provided with specific information during the course of language learning that will implicate to them relative grammaticality of such examples.

That is nonsense, because children (and adults) obviously know that the second sentence is illegal because they never hear sentences with matching structures. Humans are natural pattern matchers, and they obviously use this skill in language learning too. This is a typical demonstration of refusal to admit a general skill into language learning.

2.3 Compounds and mice

Akmajian et al. (1995, p. 468-470) say that Gordon (1985) 'provides compelling evidence for positing specific morphological principles as part of an LAD (language acquisition device)'. They also give a list of other citations of this work. The 'principles' in question are the rules for the formation of compound words in English, based on the notion of classes of suffixes. These rules may be useful for describing compounding in English.

The nonsense gets in when Gordon (1985) tries to show that they are innate. He does it by showing that children know that a monster that eats mice is a mice-eater ('mice' in the plural), but a monster that eats rats is a rat-eater ('rat' in the singular). The logic is that 'mice' is a level I word in compounding, and children know this, and therefore know that it has to stay plural in 'mice-eater'.

It is already nonsense to deduce language universals from an idiosyncrasy of English, no matter how surprising it is. It gains the distinction of blatant nonsense by the fact that there is a very simple explanation for the performance of the children (and adults) in this case: when you change '{verb}er of X' to X-{verb}er, you remove regular pluralization, if any, from X. In fact, there is an even simpler rule: if there is a simpler word which is very close to X both semantically and phonologically (e.g. 'rat' to 'rats'), use it instead of X. This rule does not require any modification of words. Of course, children may be using some mixture of these rules, potentially with other rules, and the mixture can vary between children. There may be cases where these rules do not work, but Gordon (1985) did not test any such example.
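To show how little machinery the two rules need, here is a minimal Python sketch. Everything in it is an assumption made for illustration: 'regular pluralization' is crudely approximated as a trailing 's', semantic closeness is a hand-written table (RELATED), and phonological closeness is approximated by spelling similarity using difflib from the standard library. It is not a model of what any child actually does.

```python
from difflib import SequenceMatcher

# Hypothetical vocabulary and semantic links, invented for this sketch.
VOCABULARY = ["rat", "rats", "mouse", "mice"]
RELATED = {("rat", "rats"), ("mouse", "mice")}


def strip_regular_plural(noun):
    """Rule 1 helper: drop a regular plural ending (crudely, a final 's')."""
    if noun.endswith("s") and not noun.endswith("ss"):
        return noun[:-1]
    return noun


def compound_rule_1(noun, agent):
    """'{verb}er of X' -> 'X-{verb}er', removing regular pluralization from X."""
    return f"{strip_regular_plural(noun)}-{agent}"


def simpler_close_word(noun, threshold=0.8):
    """Rule 2 helper: find a shorter word that is close in meaning and form."""
    for word in VOCABULARY:
        related = (word, noun) in RELATED or (noun, word) in RELATED
        # Spelling similarity stands in for phonological similarity here.
        close = SequenceMatcher(None, word, noun).ratio() >= threshold
        if len(word) < len(noun) and related and close:
            return word
    return noun


def compound_rule_2(noun, agent):
    """Use the simpler close word, if any; no word is modified."""
    return f"{simpler_close_word(noun)}-{agent}"


for noun in ("rats", "mice"):
    print(compound_rule_1(noun, "eater"), compound_rule_2(noun, "eater"))
```

Under these assumptions both rules give 'rat-eater' and 'mice-eater', and neither refers to levels of suffixes or to any knowledge specific to irregular nouns.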

In the discussion of the children's performance, Gordon (1985) and texts that quote him (Pinker (1994, p. 147), Akmajian et al. (1995, p. 470), "poverty of the stimulus argument" in MITECS) emphasize that the children were not exposed to many irregular nouns. However, learning one of the rules above does not require any exposure to irregular nouns. Once the child knows either of these rules, he can get it right both for mice-eater and rat-eater without even knowing the meaning of any of 'mouse', 'mice', 'rat' or 'rats' (if he knows the first rule), or without understanding the relation of plural to singular (if he knows the second rule). The linguists seem to be so blinded by their theorizing that they overlook the simpler and more straightforward explanations.

[11 Nov 2003] For a more serious discussion of the question, see here.

2.4 The innateness of recursion

In some cases psycholinguists argue for innateness simply by stating their assumption. For example, the first example that Hoekstra and Kooij (1988) give of a principle that must be innate is recursion. Their argument for the innateness of recursion is as follows (P. 33):

"The fact that all languages are characterized by recursive property reflects a predetermined faculty of the species and cannot be argued to derive from any other factor, for example, from the infinity of messages that the species is required to convey. There are several other ways in which this communicative infiniteness might be served."

That is the whole argument. Nowhere in the text before or after this do the authors give any hint of an explanation of why recursion cannot be argued to derive from any other factor, and they don't give any reference either.

That makes it nonsense. It becomes blatant nonsense by the fact that there is a very simple explanation for the fact that all languages are recursive: it is necessary for communication about the way people perceive and think about the world. Every description of the thought or perception of a situation in the world (e.g. 'I saw that the door is open', or 'He thought you are in Liverpool') is recursive, and these descriptions are very useful in communication. Hence, to be useful for communication, all languages must be able to express these descriptions, and therefore are recursive.

2.5 Imaginary problems

In the same article as in the previous item (Hoekstra and Kooij, 1988), the authors give this example (P.37):

and claim that without innate knowledge, people cannot learn that the last sentence is ungrammatical. After some blurb about lack of negative evidence, they say:

"In any event, it is up to the proponents of the hypothesis of unbiased learning to provide an explanation as to why the pattern in (1b) and (1c) do not generalize to (1d)."

That, of course, is nonsense. The generalization to (1d) must be something like: if (1a) can be modified to (1c), then (1b) can be modified to (1d). However, for this generalization, (1b) has to match (1a). Thus, as long as (1a) is not regarded as equivalent to (1b), the generalization will not happen, and there is nothing to explain.

It is blatant nonsense, because (1a) and (1b) are extremely unlikely to be matched by any system that gets anywhere in understanding language. (1a) is a complete sentence, with subject, verb and object, while (1b) isn't. Thus the problem that Hoekstra and Kooij point to will never arise.

That doesn't explain how the system learns what subject, verb and object are, but this is not what Hoekstra and Kooij discuss. The reason is that they cannot claim that the learning child has no experience with the concepts of subject, verb and object, so this problem is no good for their argument. Instead, they discuss an imaginary problem that will never arise.

2.6 The psychological reality of traces

To fit with the linguistic theory, many sentences have to be treated as if they are missing an anaphor. In the place of the missing anaphor, linguistic theorists put an 'empty category object' (trace). There are several kinds of these objects. They have a syntactic role, but are not pronounced in any way. As a way of analyzing the text, this is sometimes useful.

The nonsense sets in when the psycholinguists claim that these traces have a psychological reality, i.e. that the brain actually uses some internal entities that correspond to these empty category objects. This is, of course, nonsense, because the way the data is held in the brain is very different from the linear representation of the spoken language. In particular, the brain does not have a problem in using the same concept in several roles (i.e. associating its internal "representation" with several other concepts), while the linear representation can do this only by anaphors and traces.

It becomes blatant nonsense when the psycholinguists try to prove the psychological reality of the traces. A typical example is from Akmajian et al. (1995, p. 437). The subjects in the experiment are presented with sentences of these types, and then asked if the word 'astute' appeared in the sentence.

  1. The astute lawyer who faced the female judge hated the long speech during the trial.
  2. The astute lawyer who faced the female judge hoped he would speak during the trial.
  3. The astute lawyer who faced the female judge strongly hoped [PRO] to argue during the trial.
  4. The astute lawyer who faced the female judge was certain [e] to argue during the trial.
  5. The astute lawyer was hard for the judge to control [e] during the very long trial.

The results show that the trend (in the time it takes to answer the question, and the number of errors) in the last three sentences, which contain traces ([PRO] and [e]), is the same as in the second sentence, which contains an anaphor (he), as opposed to the first sentence, which is simple.

This is blatant nonsense, because it was not shown that anaphors have psychological reality, i.e. that the brain uses some internal entities to represent them. As discussed above, the brain does not need these internally.

Another argument that is sometimes made is that processing the trace takes time, supported by showing that humans spend time at the place where the trace is supposed to be. This is more blatant nonsense, because the time is clearly spent on figuring out how to make sense of the sentence, which involves a 'search' for the missing noun.

2.7 Lakoff's 'theory' of ICM

In his book (Lakoff, 1987), Lakoff not only attacks the traditional view (Objectivism), but also tries to present his own theory. The problem with this is that it is not a theory, because it leaves open all the important details, except that they have to 'make sense'. For example, in his study of there-constructions (which is the main example of his theory, PP. 463-585), he says on page 552:

Given all of the above predictions, we can represent the central existential there-construction in a remarkably minimal fashion:

The central existential
Based on: The Central Deictic
Semantic Element
1':a mental space

This is all that need to be said. The rest follows from the principle of inheritance, principles of language in general, and independently needed principle particular to English.

That is blatant nonsense, because the only reason that it works is that it is based on the previous ~80 pages of analysis. This analysis is there-construction specific, and cannot be deduced from the principles of inheritance and language plus general principles of English. Thus 'all that has to be said' is not only the above representation, but also the preceding ~80 pages of analysis. Lakoff seems to hope that the reader will somehow accept this analysis as general.

2.8 The Kay-Kempton experiment

In his book (Lakoff, 1987), Lakoff wants to show "that the structure of a language could influence nonlinguistic behaviour" (P. 330). While this statement looks to me obviously true, the experiment he uses to prove it is ridiculous.

Lakoff quotes an experiment by Kay and Kempton (1984). In this experiment the subjects perform two tasks. In the first, they judged which of three chips is furthest away in colour from the other two. In the second task, they made the same judgement, but the judgement was preceded by an introduction by the experimenter (verbal manipulation), and the chips were displayed such that the observer could see only two at a time (presentational manipulation).

In introducing the experiment, Lakoff states the following as one of the two conditions which are needed to verify the statement above (P. 330):

- If that difference in performance disappears in task 2, which differs from task 1 only in that the naming difference cannot be utilized, the Whorfian effect is confirmed.

But in the experiment of Kay and Kempton the tasks differ substantially in the presentation of the stimuli, because in task 2 the subjects could not see all three chips together, while in task 1 they could. Thus the experiment does not even meet the standards that Lakoff himself sets.

At minimum, Kay and Kempton should have uncoupled the two manipulations (the verbal and the presentational), by running additional tasks in which only one of the two is applied.

Neither Lakoff nor Kay and Kempton bother even to consider the possible effect of the different presentation, and the way to test it. This is quite amazing, considering the simplicity of the tests. It seems these researchers are blinded by their beliefs. As I said above, I believe that what they try to prove is actually true. However, the way they try to prove it shows that neither Lakoff nor Kay and Kempton are capable of identifying even the most obvious confounding factors.

2.9 'Universals' do not have to be universal

The following quote is the last sentence of the abstract of a research article in a respected journal (Orsolini and Marslen-Wilson, 1997):

The results of both experiments contrast with previous research in English which used the same techniques, and suggest that claims about universal morphological patterns need to be extended to include procedure which combine both high productivity and lexical specificity.

This demonstrates, in one sentence, one of the major methods by which psycholinguists find 'universals':

This is obviously a nonscientific approach, but it is widely accepted among psycholinguists. That this blatant demonstration of it passed the review process of Language and Cognitive Processes shows how pervasive it is.

2.10 What a child learns

One of the fundamental errors of "innatist" psycholinguists is their view of the learning process. Here is a typical quote (Culicover, 1997, P.5):

The learner is presented with data from a language, and has to make a decision as to what is the grammar.

This is supposed to describe the way a child learns. It is obvious nonsense, because what the child learns is to communicate, and learning the grammar happens as part of learning to communicate.

The same idea is expressed in "poverty of the stimulus argument" in MITECS:

The trouble that the child faces is thus a problem of under-determination: any finite set of example sentences is compatible with an infinite number of grammars. The child's task is to pick among those grammars.

Clearly, the child does not 'have a task'. Rather, he is trying to communicate with other people around him, and learns the best way of doing this, which includes learning the grammar.

Because "innatist" psycholinguists ignore the question of effectiveness of communication, which, as far as the child is concerned, is the important criterion, their arguments and theories are irrelevant to child acquisition of language.

2.11 Does the theory have to be constrained by the data?

At the end of the discussion of the problem of generalizability in other theories of language acquisition, Pinker (1996) says (P. 24):

It is to avoid this problem that I will often attribute more structure to the child's grammar, and thus to his or her acquisition mechanisms, than the examples I discuss strictly require.

Obviously, if there were known examples that require the structure that Pinker attributes, there is no reason why Pinker would not use them. Hence, we can deduce that there are no examples which require Pinker's attributions. In addition, there is nothing to force Pinker to attribute structure when he does not want to. Thus, the declaration above is equivalent to declaring that he is going to feel free to attribute whatever he likes, independently of the evidence. That is exactly what he does in the rest of the book.

2.12 How do you refute a theory

Pinker (1996) writes (P.33):

I take it to be noncontroversial that a theory that can explain facts in some domain has a prima facie claim to being considered true. To refute such a claim, one would be better off proposing an alternative theory than reiterating one's skepticisms or appealing to aprioristic arguments.

Pinker repeats this idea, that to refute his theory other people must come up with other theories to explain the same things, in many places.

That is obvious nonsense, because the proper way to refute a theory is to bring empirical evidence that is incompatible with it. Pinker pretends to be unaware of this, which explains why he felt free to make the declaration in the previous item. {2}

2.13 What the child uses for language acquisition

Pinker (1996) says (P. 31):

Several general properties of the learning mechanisms either are noncontroversial or have already been motivated by the arguments in chapter 1. First, I assume that the child has no memory for the input other than the current sentence-plus-inferred-meaning and whatever information about past inputs is encoded in the grammar at that point, that is, that the strategy is one-memory limited (See Osherson and Weinstein, in press; Osherson, Stob and Weinstein, 1982).

This assumption is obvious nonsense, because the child has significant non-linguistic information and understanding, and he obviously does not switch it off when trying to understand language. The non-linguistic input includes context, which is a large source of information, and any general cognitive abilities that the child has, which are quite significant by the time the child starts to learn language.

Pinker does not tell us why he thinks we can ignore this non-linguistic understanding. The way the text is written suggests that the assumption is either noncontroversial or was discussed in chapter 1, both of which are false. The references to Osherson et al. are inserted as if they support Pinker's assumptions, but they do not.

The aim of the book that this quote is from is to investigate what is learnable, and from this to make deductions about the internal mechanism of language acquisition. By making this assumption right from the start, and using it throughout the book, Pinker invalidates the whole enterprise.

2.14 Julia Kristeva: Revolution in Poetic Language (1984)

I wouldn't call Kristeva's material 'psycholinguistics'. However, a student told me that in a course in psycholinguistics the first thing they were given to study was this book, and that he has read in many places that Kristeva is one of the most important psycholinguists.

In case you are also struggling with 'Revolution in Poetic Language', the reason that you don't understand it is simple: there is nothing to understand, because this book is meaningless drivel. It is not really an example of blatant nonsense; rather, it is the 'Emperor's new clothes' effect: the 'clever' people say that Kristeva's book is 'lucid' and a 'crucially important book' (quotes from the cover), and nobody dares to say that it is drivel, because they are afraid to look stupid (or fail their exam).

Here is a typical example from the book. Of course the quote does not make sense out of context, but it does not make more sense in the context of the book either. P. 30:

We view the subject in language as decentering the transcendental ego, cutting through it, and opening it up to a dialectic in which its syntactic and categorical understanding is merely the liminary moment of the process, which is itself always acted upon by the relation to the other dominated by the death drive and its productive reiteration of the "signifier."

2.15 What is learning

Wexler discusses the question of why young children don't understand sentences like 'the fox was kicked by the lion', and says it can be either maturation (by which he means biological development) or learning. Then he says:

Borer and Wexler raised an objection to a learning analysis for a delayed construction like the passive, namely the Triggering Problem: If change occurs in grammar because of a reaction to an input trigger, why does this change often take years?

Why is that a problem for a learning analysis? Certainly, learning is not done by anything that can be called 'reaction to an input trigger'. This is a case of 'super-blatant-nonsense', because the author tries to make the reader forget the meaning of the word 'learning'. He then continues:

A tip-off that maturation may in fact be on the right track can be found in the OI stage. Note that in that stage all the properties that we know must be learned, namely the ones that differ from language to language, are learned extremely early (VEPS). It is the universal (apparently) property of finiteness of root clauses (morphology aside) that develops late. Thus a universal property is late, whereas an experience-dependent, learned property is early. This suggests maturation.

How does the order of acquisition tell us that it is maturation? It doesn't, but the author hopes the reader is confused enough by now to accept it anyway.

2.16 'the syntax of one'

This article (What infants know about syntax but couldn't have learned: experimental evidence for syntactic structure at 18 months; Jeffrey Lidz, Sandra Waxman and Jennifer Freedman; Cognition, Volume 89, Issue 3, October 2003, Pages 295-303, doi:10.1016/S0010-0277(03)00116-1) is really dumb, and if I had read it 'somewhere' (e.g. on the net) I would have dismissed it as waffling by someone clueless. But it was published by a respectable journal.

  1. The first mistake that they make is to assume that the analyses of "the red ball" as flat or nested structures are of equal probability. This is simply dumb, because constructs like "red ball" (in general, an adjective followed by a noun) are very common in language, and constitute on their own complete and context-free noun-phrases. In contrast, constructs like "the red" define a context-dependent noun-phrase which is quite difficult to interpret, and these constructs are rare, probably never used when speaking to young children. Thus the association between "red" and "ball" will necessarily be much stronger than the association between "the" and "red", so the flat analysis will never be learned by any learning mechanism.

    This straightforward logic, though, seems to be beyond the authors' range. Their discussion (including the "corpus analysis") is based on the assumption that the only way a child can learn the nested structure is by seeing a complex sentence which forces the nested structure interpretation (they also add the logical error that the sentence must include the word "one"). It is not obvious how anybody can, in 2003, ignore learning-by-statistics, but apparently they can, and it didn't bother the reviewers.

  2. The interpretation of their experiment is much worse. The result they got is easily explained by making the obvious assumption that the children interpret the phrase "another one" as referring to something that they have seen before. Note that this assumption explains the data without any reference to the effects of the rest of the verbal input in either phase.

    Obvious assumptions, though, also seem to be beyond the range of the authors. Their discussion of the results is particularly stupid, because it is based on the assumption that the response of the children was based on the verbal input in the familiarization phase, and hence that the visual input had no effect. Obviously, visual input always has a large effect, and much more so in pre-verbal children, but apparently this is not obvious to the authors (and the reviewers). It is probably an instance of "theory-driven blindness", i.e. failure to perceive facts and relations because they don't fit the theory.
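The learning-by-statistics argument in the first point can be illustrated with a minimal Python sketch. The list of utterances is invented for this illustration and is not taken from any real corpus, and the Dice coefficient is just one convenient measure of association between adjacent words, chosen here as an assumption. The numbers therefore show the mechanism, not the empirical claim.

```python
from collections import Counter

# Hypothetical utterances, invented for this sketch (not real corpus data).
UTTERANCES = [
    "the red ball",
    "a red ball",
    "red ball",
    "the dog",
    "the cup",
    "the big dog",
    "my red ball",
    "the ball",
]


def count_words_and_pairs(utterances):
    """Count single words and adjacent word pairs over all utterances."""
    words, pairs = Counter(), Counter()
    for utterance in utterances:
        tokens = utterance.split()
        words.update(tokens)
        pairs.update(zip(tokens, tokens[1:]))
    return words, pairs


def association(first, second, words, pairs):
    """Dice coefficient: how strongly 'first' and 'second' go together."""
    total = words[first] + words[second]
    return 2 * pairs[(first, second)] / total if total else 0.0


WORDS, PAIRS = count_words_and_pairs(UTTERANCES)
RED_BALL = association("red", "ball", WORDS, PAIRS)
THE_RED = association("the", "red", WORDS, PAIRS)
print(f"red+ball: {RED_BALL:.3f}  the+red: {THE_RED:.3f}")
```

With input of this kind, a learner that groups the most strongly associated adjacent words first would join "red" and "ball" before attaching "the", which yields the nested structure without any sentence that forces it.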

It is quite amazing that this kind of garbage can go through the review process, because even believers in Chomsky's ideas should see the defects in this paper, because they are so obvious.



{1} There are non-obvious features of pronouns and anaphors. The first is that they exist, which is explained by the need to identify repeated references to the same object. The second is their mostly non-overlapping usage. This is clearly to help the listener determine which noun phrase the pronominal/anaphor is referring to.

{2} Like all symbolic models, Pinker's theories are incompatible with the stochastic connectivity of neurons, as explained in brain-symbols.



