
==================================================================
This is the rejection e-mail I got from Neuroscience:
From: neurosci_endeavor.med.nyu.edu
Date: Thu, 10 Dec 1998 11:23:47 -0500 (EST)
In-Reply-To: <199812071124.LAA24718@gaia.cam.harlequin.co.uk>
To: Yehouda Harpaz <yeh_harlequin.co.uk>
Subject: Re: First review of manuscript A98138


Dear Dr. Harpaz,

I must begin with an apology as my last e-mail was in error.

Dr. Llinas carefully looked over the comments of the referee of your
manuscript.  In large part he agreed with the comments and has asked us to
return your materials since your manuscript cannot be published.

Because of the number of excellent manuscripts being submitted, and the
limited number of manuscripts that can be accepted each year, we can no
longer consider manuscripts that do not receive a favorable review and a
high priority.

Sincerely,
==================================================================
This is the exact text of the review that I got from Neuroscience. The same text, with my response, is available separately.
==================================================================

NEUROSCIENCE MS No. A98138

Author: Y. Harpaz

Title: Replicability of cognitive imaging...

This paper addresses an extremely important issue, and the author is to be congratulated for undertaking the endeavor. If reproducibility across studies is as bad as he claims, no real progress will be made in the field until the neuroscience community is made fully aware of the problem and furnishes a remedy or explanation. The author is correct that lack of replication has not been discussed adequately by the large majority of researchers, that many seem to be ignoring the problem, and that claims of replication are often greatly exaggerated.

Unfortunately, not many readers will be convinced by the mostly qualitative observations presented here. While objecting to the unsubstantiated, subjective claims of investigators, the author frequently resorts to equally qualitative claims regarding lack of replication. For example, on pgs. 10-11, he describes data that "seems (sic) to be randomly distributed," published figures from which "it is clear that these regions are only small part of the activation," and data that show activations that are "different" or "vary a lot." There are almost no quantitative analyses presented, and it gradually dawns on the reader that these personal judgements of the published figures constitute the majority of the critique. This paper would be considerably stronger if the author used published stereotaxic coordinates as data, defined a maximal separation distance criterion for replication, and confined the analysis to comparisons between very similar experiments. Demonstrating, for example, that less than 50% of activation foci are replicated across similar experiments (I suspect this would be the outcome) would be a truly alarming and forceful finding, and one not sullied by subjective interpretation.
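To make the proposed criterion concrete: take each study's published stereotaxic (Talairach) foci, and count a focus as replicated when the other study reports a focus within some maximal separation distance. Below is a minimal sketch of such a test; the coordinates, the 10 mm threshold, and the function name are hypothetical illustrations, not data from the manuscript or the review.

    from math import dist  # Euclidean distance between two points (Python 3.8+)

    def replication_rate(foci_a, foci_b, max_sep_mm=10.0):
        # Fraction of foci in study A that have a focus in study B within
        # max_sep_mm; the threshold is a hypothetical criterion, not a standard.
        if not foci_a:
            return 0.0
        matched = sum(1 for a in foci_a
                      if any(dist(a, b) <= max_sep_mm for b in foci_b))
        return matched / len(foci_a)

    # Invented Talairach coordinates (x, y, z in mm) for two nominally
    # similar experiments -- illustration only, not real published foci.
    study_a = [(-42, 18, 22), (38, -60, 40), (-6, 10, 48)]
    study_b = [(-40, 22, 20), (52, -8, 30)]
    print(replication_rate(study_a, study_b))  # -> 0.333..., i.e. below 50%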

Another weakness of the paper is that the author makes no clear distinction between variation between subjects and variation between studies, and even seems to imply that the former may be the main cause of the latter. There are, however, several reasons to distinguish these phenomena. First, most researchers already accept individual variability as likely (at least for functions that are more learned and less hard-wired), while ignoring or denying variability across studies. Second, even great variation at the subject level does not preclude replication at the study level, as long as there is some central tendency across subjects and a large enough sample is taken (remember the central limit theorem?). Third, the statistical analyses used in these studies are intended to identify regions that are very likely to be activated in common across all or most subjects in the study. Thus, such analyses are already, in a sense, replication tests. Because most such studies randomly select normal, healthy subjects from the same human subject pool, there is no compelling reason to think that studies with reasonable sample sizes (i.e. >10) should produce very different results merely because of normal variation among individuals. In contrast, small, systematic differences between experiment design factors in different studies could cause vastly different results, but the author devotes almost no attention to discussing this problem. In particular, there is no discussion of the role played by variations in activation and control tasks across supposedly similar studies, even though many investigators have stressed the probable importance of this factor. Poeppel, for example, has exhorted investigators to more closely analyze the cognitive demands of the tasks they use, attributing the lack of replication across phonological processing studies to a lack of uniformity of the tasks used to engage such processing. By ignoring this factor, the author perpetuates the tendency to generically label tasks and studies without looking at the details, even though this is likely to be the place where, as usual, the devil is.
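A small simulation makes the central-limit-theorem point explicit: if subjects are drawn from one population with a common central tendency, the study-level mean varies only about sigma/sqrt(n) across studies, however large the subject-level spread sigma is. The population parameters and variable names below are invented purely to illustrate the statistical argument, not drawn from any imaging data.

    import random
    import statistics

    random.seed(1)

    # Hypothetical population: a subject's activation strength at one site
    # is normal with mean 5.0 and a large subject-level spread of 4.0.
    POP_MEAN, POP_SD, N = 5.0, 4.0, 10

    def run_study(n=N):
        # One study = the mean over n randomly sampled subjects.
        return statistics.mean(random.gauss(POP_MEAN, POP_SD) for _ in range(n))

    study_means = [run_study() for _ in range(1000)]

    # Subject-level SD is 4.0, but study means spread only ~4.0/sqrt(10) = 1.26,
    # so similar studies should agree despite large individual variation.
    print(round(statistics.stdev(study_means), 2))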

The English used in this paper is not good. Almost every paragraph contains several major errors, such as incorrect use or omission of articles, or incorrect noun-verb agreement. The words "replicable" and "replicability" do not exist in English, but "reproducible/reproducibility" or "reliable/reliability" could be substituted.