Under 'known physiological and perceptual characteristics' I include anything that is (almost) universally accepted as such. For example, the limit on the amount of phonological information that can be handled at the same time is one such characteristic.
The most serious danger in this kind of discussion is that of making ad-hoc assumptions based on the only intelligent systems that we are familiar with (humans). To avoid this danger, I try to discuss explicitly every non-trivial assumption that I introduce.
The basic assumptions that I make are:
For communication to take place, a communication event in which individual A (the sender) communicates a message X to B (the receiver) has to lead to a consistent correlation between what A is thinking about and what B is thinking about. (Note that the way this 'thinking about' is implemented is irrelevant for communication.) For this to happen, X must have a consistent correlation with what A thinks about, and B must be able to interpret X consistently. Thus X must be used in a consistent way, and the first requirement for communication is a set of messages that are used in a consistent way. I will refer to the way a message is used as the meaning of the message.
In general, messages can be sequential strings of other messages, but there must be some subset of messages which cannot be decomposed this way. I will call this subset the basic units of the language. The meaning of a basic unit must be known by each individual directly, so each individual must have a way of finding the meaning of a basic unit directly from its sensory input.
The minimum length of the basic units is the resolution of the perception of a signal in the medium. To be reliably understood, they have to be significantly longer than this minimum. The maximum length of the basic units has to be smaller than the amount of medium-specific information that an individual can handle at the same time.
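The two bounds above can be sketched as a simple range check. This is only an illustration: the function name, the parameter names and the `safety_factor` that stands in for 'significantly longer' are my assumptions, and the numbers in the usage lines are arbitrary rather than measured values.

```python
from typing import Optional, Tuple


def admissible_length_range(
    resolution: float,
    capacity: float,
    safety_factor: float = 3.0,  # assumed stand-in for "significantly longer"
) -> Optional[Tuple[float, float]]:
    """Return the (lower, upper) bounds on the length of a basic unit.

    resolution: shortest signal the receiver can resolve in the medium.
    capacity:   amount of medium-specific information that can be handled
                at the same time, in the same units as resolution.
    Returns None when the two constraints leave no room for basic units.
    """
    lower = resolution * safety_factor  # reliably above the perceptual minimum
    upper = capacity                    # a unit must stay below this limit
    if lower >= upper:
        return None
    return (lower, upper)


# Arbitrary illustrative numbers (say, milliseconds of signal).
print(admissible_length_range(10, 100))
print(admissible_length_range(10, 20))
```

The second call returns None, which corresponds to a medium in which no reliably perceivable basic unit fits within the capacity limit.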
The number of basic units may be restricted by several factors:
Below the limits above, where finding the meaning of a basic unit is fast, processing a basic unit is much more efficient than processing a message which is made of several basic units. This is because it does not require cognitive operations to combine the meaning of each basic unit together. Thus we would expect the number of basic units to grow to these limits, rather than stay small.
The number of different ideas that intelligent systems can reasonably think about is huge, effectively infinite, because these systems combine concepts together in an unrestricted way. It is not possible to match this variety by learning basic units, so to be able to serve as a communication tool effectively, a language must be able to form combinations, too. Hence, most of the messages in the language must be combinations of basic units.
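The gap between what basic units alone can express and what combinations can express can be illustrated with a small count. This is a rough sketch, not a model of any actual language: the vocabulary size and message lengths are assumed numbers, and the count is only an upper bound, since not every sequence of basic units need have an interpretation.

```python
def count_sequences(num_basic_units: int, max_length: int) -> int:
    """Count the distinct sequences of 1..max_length basic units.

    This is an upper bound on the number of combined messages, because
    the rules of a language may leave many sequences without meaning.
    """
    return sum(num_basic_units ** k for k in range(1, max_length + 1))


# Assumed, purely illustrative vocabulary size.
vocabulary_size = 10_000
for max_length in (1, 2, 3, 5):
    print(max_length, count_sequences(vocabulary_size, max_length))
```

With only single basic units, the number of expressible messages equals the vocabulary size; allowing even short combinations multiplies it by several orders of magnitude, which is the point of the argument above.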
In general, a group of ideas, each of which is expressed by a basic unit, can be combined to give several meanings. Therefore, to understand a combined message unambiguously, the receiver must have a way to decide how to combine the meanings of the basic units into the meaning of the complete message. The information for this decision can be conveyed by any combination of several methods:
The kinds of combinations of ideas that intelligent systems can think of, and therefore may need to express, are not limited, so the rules of the language must not limit the kinds of combinations that can be expressed. The easiest way to achieve this, and maybe the only general way, is to make it always possible to modify a message by adding more information to it. In other words, the rules should allow combining any message with further information. A possible restriction on this flexibility is the limit on the amount of information that an individual can handle mentally at the same time. However, there is no need for rules to exclude such cases, because combinations that exceed this limit are difficult to produce, and therefore will not be part of the language anyway. Hence, we should not expect any explicit limits on combinations.
At the macro level, where quantum mechanical effects are negligible, the world seems to be made of objects, which have attributes, act in some way, and affect the attributes of other objects (I use the word 'attribute' here in its widest meaning). Thus the basic message mostly associates an object with some attribute(s), action or effect. Therefore we would expect the bulk of the basic units to correspond to one of these, i.e. to be nouns, adjectives or verbs (and adverbs), and the rules of the language to be about combining these kinds of basic units.
In addition, as mentioned above, we would expect a language to contain specialized basic units that are used to determine how to combine the meaning of the basic units into the meaning of the full message.
References to objects in the real world are the most demanding part of the language, because the identity of the object is unrestricted, and the object is external to the language. On the other hand, attributes and effects are restricted by the kind of objects they are associated with. Thus the language must have tools to make it easier to identify objects, which may be special basic units or special modifications. The most important problem is multiple references to the same object, which are not only expensive, but also add to the effort of the receiver the task of figuring out whether they really refer to the same object or not. Thus the language must contain means of easily identifying repeated references to the same object, which should also make such references cheap in terms of time and cognitive effort.
The typical length of a meaningful message is ultimately restricted by the cognitive abilities of the individual, i.e. the amount of information that she/he can handle at the same time. If this limit is large, then the length of meaningful messages would be very variable, and restricted by the amount of information the sender wants to deliver, the amount of time she/he has to do it, and how long she/he can expect the receiver to keep receiving. If the limit is small, it will determine the typical length of a complete message. In humans, the limit seems to be quite small, corresponding to a medium-length sentence. Much longer sentences can be understood only if they are easily decomposable into shorter sentences.
The limit on the amount of medium-specific information that an individual can handle makes it easier to handle messages that fit within this limit. If this limit is smaller than the cognitive limit, it would be easier for individuals to understand messages that can be divided into sub-messages, each of which fits within the medium-specific limit. If this limit is larger than the cognitive limit, it probably will not have any effect.
While the rules of the language should allow the expression of any idea that the systems that use it would like to express, there is no reason why they should allow the interpretation of every possible signal. Thus many possible signals will have no interpretation in the language. This includes both signals which cannot be interpreted as basic units, and sequences of basic units which cannot be combined using the rules of the language.
Because intelligent systems continually develop new areas of mental activity, the language must continually evolve to deal efficiently with the new areas. The evolution is close to being 'Darwinian', because the changes, while not completely random, are not based on an understanding of the language and an effort to conserve its global efficiency. As a result, languages continually diverge from being an optimal communication tool, and are pushed back towards the communication optimum only when their efficiency starts to fall significantly.
There are also many other forces acting on languages, which are not consistent, yet may have quite large effects. In humans, at least the following forces are significant:
To summarize, the following rules (at least) apply to all languages that are used for communication:
Any language that is used for communication by humans must have these features, independently of any innate rules. Therefore, finding these rules does not support their innateness.
Other, more specific rules are more difficult to predict directly from the communicative role of language, because our understanding of language and of human cognitive performance is not good enough, but they seem plausible. These include:
---------------------- Notes ----------------------