Notwithstanding the hallucinations of Chomsky et al., there is no evidence that humans are innately sensitive to any language structures such as words and sentences. On the other hand, infants are clearly sensitive to the differences between phonemes, and to other basic attributes of sound, such as rhythm and pitch. Sensitivity to these basic attributes does not seem to be coded in the Wernicke area. Thus the Wernicke area is probably innately coded to distinguish between phonemes {1}.
What is needed to make the Wernicke area good at distinguishing between phonemes? Clearly, we need neurons that respond significantly differently to different phonemes. So the next question is what 'phonemes' are, as far as the brain is concerned.
Phonemes are distinct patterns of air vibrations, which are converted to neural signals by the inner ear (cochlea). The cochlea functions as a mechanical device for converting vibrations (originally vibrations of air, converted by the outer and middle ear mechanisms to vibrations of the oval window) to location. Due to the design of the cochlea, different frequencies cause vibrations, and hence neural signals from the hair cells, at different locations along the cochlea. Thus different phonemes are, as far as the brain is concerned, different patterns of activity in the hair cells along the cochlea.
Phonemes also have a temporal component, so a complete description of a phoneme is actually not a single pattern, but a continuous change in the pattern of activity of the hair cells. This is what the neurons in the Wernicke area have to be sensitive to.
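The frequency-to-location conversion described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not a model of the real cochlea: the sample rate, the number of places, the frequency range and all function names are hypothetical, and a 'phoneme' is reduced to a mix of two pure tones. Each place along the model cochlea responds to the signal component at its own characteristic frequency, so each toy phoneme gives a distinct pattern of activity over places, and a sequence of phonemes gives a pattern that changes over time.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed)
N_PLACES = 16       # hypothetical number of hair-cell patches along the cochlea

def place_frequencies(n=N_PLACES, low=200.0, high=3200.0):
    """Log-spaced characteristic frequency for each place along the model cochlea."""
    ratio = (high / low) ** (1.0 / (n - 1))
    return [low * ratio ** i for i in range(n)]

def tone_mix(freqs, duration=0.05):
    """A toy 'phoneme': a sum of pure tones at the given frequencies."""
    n = int(SAMPLE_RATE * duration)
    return [sum(math.sin(2 * math.pi * f * t / SAMPLE_RATE) for f in freqs)
            for t in range(n)]

def place_pattern(samples, places):
    """Activity at each place: size of the signal component at that place's frequency."""
    pattern = []
    for f in places:
        re = sum(s * math.cos(2 * math.pi * f * t / SAMPLE_RATE)
                 for t, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * f * t / SAMPLE_RATE)
                 for t, s in enumerate(samples))
        pattern.append(math.hypot(re, im) / len(samples))
    return pattern

def pattern_over_time(samples, places, frame):
    """One place pattern per consecutive time frame of `frame` samples."""
    return [place_pattern(samples[i:i + frame], places)
            for i in range(0, len(samples) - frame + 1, frame)]

def strongest_places(pattern, k=2):
    """Indices of the k most active places, in order along the cochlea."""
    ranked = sorted(range(len(pattern)), key=lambda i: pattern[i], reverse=True)
    return sorted(ranked[:k])

places = place_frequencies()
phoneme_a = tone_mix([places[2], places[10]])   # toy phoneme A
phoneme_b = tone_mix([places[5], places[13]])   # toy phoneme B

# Phoneme A followed by phoneme B: the place pattern changes over time.
frames = pattern_over_time(phoneme_a + phoneme_b, places, frame=len(phoneme_a))
for frame_pattern in frames:
    print(strongest_places(frame_pattern))
```

The two printed lists differ because the two toy phonemes excite different places, which is all that 'different phonemes' means in this sketch.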
There is no problem explaining the sensitivity of neurons in the Wernicke area to temporal changes, because any network is sensitive to temporal changes in its input. Thus the question is how neurons in the Wernicke area become sensitive to patterns of activity along the cochlea.
For that, some of the neurons in the Wernicke area should be getting input from several patches along the cochlea {2}. Different neurons should be getting input from different patches, so that they respond differently to different phonemes. Note what we do NOT need to assume:
Once we have neurons in the Wernicke area that stably respond differently to different phonemes, any phoneme that the infant hears would cause a stable subset of neurons to become active. A simple Hebbian process can cause the neurons in this subset to strengthen the connections between them, and thus to improve the response to the phoneme (unsupervised learning). Thus, assuming a Hebbian process is in action in the Wernicke area, just being exposed to language (sequences of phonemes) would cause the infant to increase his sensitivity to the phonemes that shhe hears.
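The argument of the last two paragraphs can be sketched in a few lines of code. This is a minimal illustration under loud assumptions, not a claim about actual cortical parameters: the number of patches and neurons, the fan-in, the firing threshold, the learning rate and all names are hypothetical, and a phoneme is reduced to a fixed set of active cochlear patches. Each model neuron gets input from a random but stable handful of patches, so each phoneme activates its own stable subset of neurons, and repeated exposure strengthens the connections inside that subset.

```python
import random

N_PLACES = 16     # hypothetical number of cochlear patches
N_NEURONS = 200   # hypothetical number of model neurons
FAN_IN = 4        # patches feeding each neuron (assumed)
THRESHOLD = 2     # active input patches needed for a neuron to fire (assumed)

rng = random.Random(0)  # fixed seed: the wiring is random but stable over time
# Stochastic wiring: each neuron gets input from a random handful of patches.
wiring = [rng.sample(range(N_PLACES), FAN_IN) for _ in range(N_NEURONS)]

def active_subset(phoneme_places):
    """Neurons with at least THRESHOLD input patches inside the phoneme's pattern."""
    return {n for n, patches in enumerate(wiring)
            if sum(p in phoneme_places for p in patches) >= THRESHOLD}

def hebbian_exposure(lateral, active, rate=0.1):
    """Strengthen the connection between every pair of co-active neurons."""
    for i in active:
        for j in active:
            if i != j:
                lateral[(i, j)] = lateral.get((i, j), 0.0) + rate

def lateral_support(lateral, active):
    """Total input that the active neurons receive from each other."""
    return sum(lateral.get((i, j), 0.0)
               for i in active for j in active if i != j)

phoneme_a = {0, 1, 2, 3, 4, 5}        # toy patterns of active patches
phoneme_b = {8, 9, 10, 11, 12, 13}

subset_a = active_subset(phoneme_a)
subset_b = active_subset(phoneme_b)

lateral = {}                          # learned neuron-to-neuron connections
before = lateral_support(lateral, subset_a)
for _ in range(10):                   # ten exposures to phoneme A only
    hebbian_exposure(lateral, active_subset(phoneme_a))
after = lateral_support(lateral, subset_a)

print(len(subset_a), len(subset_b), before, after)
```

Nothing in the sketch is told what a phoneme is: the subsets fall out of the random wiring, and the strengthening follows from exposure alone.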
As the child grows and becomes more adept at associating patterns of activity in his brain, shhe can start to associate temporal patterns of activity in the Wernicke area (which are the result of temporal sequences of phonemes, i.e. words) with other patterns of activity in the cortex. In other words, shhe learns the meaning of words. Note that this does not require any special mechanism beyond a generic learning mechanism that associates two patterns of activity in the cortex that tend to predict each other (see Cognition for a model of how this general learning mechanism works). Learning more complex structures in language (sentences etc.) follows, and naturally tends to concentrate in and around the Wernicke area, because the basic information about words is there.
Thus to explain the importance of the Wernicke area in understanding language, all that is needed is that some of the neurons in the Wernicke area get input from patches along the cochlea. As far as I know, the connectivity of the Wernicke area is not mapped well enough to tell if this is true, but it is a reasonable assumption.
The important point of the analysis above is that it shows that to explain the observations about the Wernicke area there is no need for elaborate models with explicit design. In addition, such models are incompatible with the stochastic connectivity of the Wernicke area (and the rest of the cortex, see here and here) and are extremely unlikely.
------------------------------------------------------------------------
{1} Most cognitive psychologists would probably say that it is coded to 'recognize phonemes'. However, there is no reason to believe that the infant 'recognizes' the phonemes in any sense except that shhe can distinguish between them, so using 'recognize' is misleading.
{2} By 'a neuron getting input from X' I mean that there is some pathway from X to the neuron, and that this pathway has a strong and stable (over time) effect on the activity of the neuron.