Stuart Vyse’s (2017) article about Daryl Bem and p-hacking was disturbing. The most serious implication is that Daryl Bem, a famous and well-respected psychologist, has been guilty of “an unethical manipulation of data in search of statistical significance” to support claims of the paranormal. Such manipulation is especially serious in this field for three reasons:
- If evidence for the paranormal were found, the implications for the rest of science would be profound.
- There is very little evidence for the paranormal—and Bem’s claims are frequently cited as providing it.
- Many people believe in the paranormal and look for evidence to back up their belief. If a researcher as respected as Bem claims there is reliable evidence, many people will be convinced, with serious consequences for the public understanding of science.
I have further reasons for worrying about Bem’s claims, in addition to those reported by Vyse.
In 1979, the Society for Psychical Research gave me a small grant to visit Carl Sargent’s laboratory in Cambridge. His research was providing dramatically positive results for ESP in the Ganzfeld and mine was not, so the idea was for me to learn from his methods in the hope of achieving similarly good results. The story of that visit is terribly depressing, as I described in an article and a book (Blackmore 1987; 1996). After watching several trials and studying the procedures carefully, I concluded that Sargent’s experimental protocols were so well designed that the spectacular results I saw must be evidence either for ESP or for fraud. I then took various simple precautions and observed further trials, during which it became clear that Sargent had deliberately violated his own protocols and in one trial had almost certainly cheated. I waited several years for him to respond to my claims, and eventually they were published along with his denial (Harley and Matthews 1987; Sargent 1987).
By then, the “Great Ganzfeld Debate” was under way, in which skeptic and psychologist Ray Hyman carried out a meta-analysis of the forty-two published Ganzfeld experiments (Hyman 1985). Meta-analysis allows one to compare the results of many experiments, to find an overall effect size, to detect common patterns, and (of most relevance here) to test whether the overall effect can be attributed to flaws in the experiments. Hyman argued that many of the studies were flawed, and that the better the quality of the study, the smaller the apparent psi effect. Nine of the studies were Sargent’s.
Chuck Honorton (1985), originator of the Ganzfeld-psi experiments, then did his own analysis, using just twenty-eight of the forty-two studies (those that reported the number of direct hits). He concluded that there was a reliable effect that did not depend on any one experimenter and was not related to the quality of the study. This seemed to be good evidence for the reality of psi in the Ganzfeld and to show that Hyman was wrong.
What worried me was that Honorton had classified all nine of Sargent’s studies as “adequate for randomization” (one of several possible flaws considered). But seven of these nine studies had used the method I observed in Cambridge. So I repeated Honorton’s calculation, counting these seven as flawed for randomization. I found a significant correlation (r = -.32, t = 1.73, p < .05, one-tailed) between randomization and z-score, which agreed with Hyman. I submitted a brief comment on this to the Journal of Parapsychology in January 1987. In February, the editor accepted it for publication, but in May the following year, he wrote to say that they were behind schedule and unable to publish it after all.
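For readers who want to check the arithmetic, the t value follows from the correlation by the standard formula t = r·√(n − 2) / √(1 − r²). The sketch below assumes n = 28, the number of studies in Honorton’s analysis; that figure and the function name are assumptions made here for illustration, not details reported with the original calculation. The critical value of 1.706 is the usual tabled one-tailed .05 cutoff for 26 degrees of freedom.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """Convert a Pearson correlation r over n paired values to a t statistic.

    Uses t = r * sqrt(n - 2) / sqrt(1 - r**2), with n - 2 degrees of freedom.
    """
    if n < 3:
        raise ValueError("need at least 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Assumption: n = 28 studies, as in Honorton's (1985) analysis.
N_STUDIES = 28
R_REPORTED = -0.32

# Tabled one-tailed critical t for alpha = .05 with 26 degrees of freedom.
T_CRITICAL_DF26 = 1.706

t_value = t_from_r(R_REPORTED, N_STUDIES)
is_significant = abs(t_value) > T_CRITICAL_DF26

print(f"t({N_STUDIES - 2}) = {t_value:.2f}, significant one-tailed: {is_significant}")
```

With r rounded to -.32 this gives a t of about 1.72 in magnitude, just above the cutoff and consistent with the reported 1.73 once rounding of r is allowed for. The sign is negative because the correlation is negative; the text reports the magnitude.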
Meanwhile, the debate led Honorton to design the “autoganzfeld” experiments, using a completely automated procedure (Honorton et al. 1990). The methods appeared to be rigorous and the results from several labs were significant, with the effect not depending on any one experimenter or lab. Later criticisms followed, including suggestions that sensory leakage might have occurred with this method (Wiseman et al. 1996), and the Ganzfeld debate continued (Milton and Wiseman 1999; Storm and Ertel 2001).
All this assumed greater significance when Honorton began working with Daryl Bem on a review of the Ganzfeld literature. This was published in 1994 in the prestigious psychology journal Psychological Bulletin, where it was presumably read by psychologists ignorant of the history of the subject. Bem and Honorton presented the same meta-analysis and the same autoganzfeld data and concluded that “the psi ganzfeld effect is large enough to be of both theoretical interest and potential practical importance” (Bem and Honorton 1994, 8).
They also admitted that “One laboratory contributed nine of the studies. Honorton’s own laboratory contributed five. … Thus, half of the studies were conducted by only two laboratories” (Bem and Honorton 1994, 6). But they did not say which laboratory contributed those nine studies. Even worse, they did not mention Sargent, giving no references to his papers and none to mine. No one reading their review would have a clue that serious doubt had been cast on more than a quarter of the studies involved.
I have since met Bem more than once, most recently at one of the Tucson consciousness conferences where we were able to have a leisurely breakfast together and discuss the evidence for the paranormal. I told Bem how shocked I was that he had included the Sargent data without saying where it came from and without referencing either Sargent’s own papers or the debate that followed my discoveries. He simply said it did not matter.
In his article, Vyse gives a quote from an interview in Slate magazine in which Bem describes his experiments as “rhetorical devices” and says he didn’t worry about replication: “I gathered data to show how my point would be made. I used data as a point of persuasion.” This, chillingly, reminded me of Carl Sargent telling me that it wouldn’t matter if some experiments were unreliable because, after all, we know that psi exists.
But it does matter. It matters that Sargent’s experiments were seriously flawed. It matters that Bem included these data in his meta-analysis without referencing the doubt cast on them. It matters because Bem’s continued claims mislead a willing public into believing that there is reputable scientific evidence for ESP in the Ganzfeld when there is not.
- Bem, D.J. and C. Honorton. 1994. Does psi exist? Replicable evidence for an anomalous process of information transfer. Psychological Bulletin 115: 4–18.
- Blackmore, S.J. 1987. A report of a visit to Carl Sargent’s laboratory. Journal of the Society for Psychical Research 54: 186–198.
- ———. 1996. In Search of the Light: The Adventures of a Parapsychologist. Amherst, New York: Prometheus Books.
- Harley, T., and G. Matthews. 1987. Cheating, psi, and the appliance of science: A reply to Blackmore. Journal of the Society for Psychical Research 54: 199–207.
- Honorton, C. 1985. Meta-analysis of psi Ganzfeld research: A response to Hyman. Journal of Parapsychology 49(1): 51.
- Honorton, C., R.E. Berger, M.P. Varvoglis, et al. 1990. Psi communication in the Ganzfeld: Experiments with an automated testing system and a comparison with a meta-analysis of earlier studies. Journal of Parapsychology 54(2): 99.
- Hyman, R. 1985. The Ganzfeld psi experiment: A critical appraisal. Journal of Parapsychology 49: 3–49.
- Milton, J., and R. Wiseman. 1999. Does psi exist? Lack of replication of an anomalous process of information transfer. Psychological Bulletin 125: 387–391.
- Sargent, C. 1987. Sceptical fairytales from Bristol. Journal of the Society for Psychical Research 54: 208–218.
- Storm, L., and S. Ertel. 2001. Does psi exist? Comments on Milton and Wiseman’s (1999) meta-analysis of Ganzfeld research. Psychological Bulletin 127(3): 424–433.
- Wiseman, R., M. Smith, and D. Kornbrot. 1996. Exploring possible sender-to-experimenter acoustic leakage in the PRL autoganzfeld experiments. Journal of Parapsychology 60(2): 97.
- Vyse, S. 2017. P-hacker confessions: Daryl Bem and me. Skeptical Inquirer 41(5): 25–27.