"Science runs on the premise that experiments, and all the uncertainty involved in them, should be open for scrutiny." Editorial: Politics versus reality, Nature 434, 257 (17 March 2005)
3. Experiments past and future: Hirst et al. , the Burridge report, Ovelgônne et al. , the BBC Horizon "scientific experiment" and more ...
The publication of Davenas et al. was followed by a flurry of letters to Nature (Metzger, 334, 375, and Snell, Seagrave, both in 334, 559) briefly reporting failed replications. However, as pointed out by Benveniste (Nature, 335, 759), all the reported trials deviate substantially from the experimental design of Davenas et al., testing different biological systems and replacing basophil counts with supposedly equivalent measurements of mediator release (see Beauvais et al., J. Allergy Clin. Immunol. (1991), 87(5), 1020-8).
As far as the replicability of Benveniste's results is concerned, I can only recommend examining the article by Hirst et al., "Human basophil degranulation is not triggered by very dilute antiserum against human IgE", in Nature, 366 (1993), 525-527.
| Data from Hirst et al. (Table 2) | Fisher p-value | Null hypothesis rejected (%) |
|---|---|---|
| Succussed high dilution | 0.0027 | 99.73 |
| Unsuccussed high dilution | 0.086 | 91.4 |
It is possibly the most peculiar scientific paper that I have ever read. I have never read any other paper attributing all the results which are incompatible with its overall conclusions to unidentified systematic flaws in its own experiments. I have never read any other paper dismissing its own statistically significant data as "chance results". The authors appear to recognize that their data are incompatible with their null hypothesis, i.e. with the assumption that there is no difference between potentized solutions and placebo (p. 527, right column): "According to conventional scientific theory, there should be no differences within a session between the control treatment and the eight high-dilution treatments. ... This is not the case ...", but they attribute the effect to unknown causes ("a source of variation for which we cannot account"). Indeed, despite its overall conclusions, if taken seriously the paper's content provides independent confirmation of the main claims made in Benveniste's original article (as noticed by Emanuel Marin and J. Pharabod in 1994), except that no recursive waves are directly observable (see below, however). The value p=0.0027 in Table 2 of the article represents the probability of obtaining experimental data at least this extreme by chance, under the assumption that there is no difference between succussed high dilutions and control treatments. Loosely speaking, this may be reformulated as saying that the experimental data reject the hypothesis of no difference between succussed high dilutions and control treatments at the 99.73% confidence level.
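The arithmetic linking the p-values to the percentages in the table is simply 100 × (1 − p). The sketch below reproduces it; the function names and the conventional 0.05 threshold are assumptions introduced here for illustration and do not come from Hirst et al.

```python
# p-values as quoted in the table above (Table 2 of Hirst et al.).
P_VALUES = {
    "Succussed high dilution": 0.0027,
    "Unsuccussed high dilution": 0.086,
}

# Conventional significance threshold: an assumption, not taken from the paper.
ALPHA = 0.05


def rejection_percent(p: float) -> float:
    """Return 100 * (1 - p), rounded to two decimals (last column of the table)."""
    return round(100.0 * (1.0 - p), 2)


def summarize(p_values: dict, alpha: float = ALPHA) -> list:
    """For each treatment, return (label, rejection percent, whether p < alpha)."""
    return [(label, rejection_percent(p), p < alpha) for label, p in p_values.items()]


if __name__ == "__main__":
    for label, percent, significant in summarize(P_VALUES):
        print(f"{label}: {percent}% (p < {ALPHA}: {significant})")
```

With these inputs only the succussed high dilution falls below the assumed 0.05 threshold, which matches the distinction drawn in the text between the two rows.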
In a nutshell, Hirst et al. set out to verify or disprove Benveniste's claim by testing the corresponding null hypothesis that there is no difference between succussed high dilutions and controls. They somehow acknowledge that their experimental results are incompatible with the null hypothesis (NB: the null hypothesis that they chose in order to verify or disprove Davenas et al.), but nevertheless state that "no aspect of the data is consistent with the previously published claims". The paper is generally accessible, so anybody can make up his or her own mind.
It is worth noting that in Table 1 of Hirst et al. the null hypothesis being tested is not that "the treatment applied to the cells produces a response which is not different from the response in the absence of treatment", but that "the treatment applied to the cells produces a MEAN response which is not different from the MEAN response in the absence of treatment". In Table 2 of Hirst et al., on the other hand, the hypothesis being tested is essentially that "the treatment applied to the cells produces a response VARIATION which is not different from the response VARIATION in the absence of treatment", which the data clearly show to be untenable. It is hard to tell the difference between the unknown "source of variation" in Hirst et al. and the perplexing intermittency (see left) that appears in Davenas et al. and that was construed there in terms of "dilution waves". What appears to be happening is largely consistent with the findings of Davenas et al.: succussed anti-IgE strongly enhances the variation in basophil counts, while affecting the mean counts only moderately. The variations cancel out when the average is taken, so that the data in Table 1 capture only the lesser effect, which nevertheless remains significant for highly diluted anti-IgE, although Hirst et al. dismiss the result as a "chance result" of their ultra-conservative Bonferroni procedure, whose adequacy they do not even discuss.
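The distinction between a hypothesis on the MEAN and a hypothesis on the VARIATION can be illustrated with a toy example. The numbers below are synthetic, invented purely for this sketch; they are not data from Hirst et al. or Davenas et al., and the variable names are hypothetical.

```python
from statistics import mean, variance

# SYNTHETIC counts, invented for illustration only (NOT data from Hirst et al.).
control = [50, 51, 49, 50, 50, 51, 49, 50]
treated = [50, 60, 40, 50, 55, 45, 58, 42]


def compare(a, b):
    """Return (difference of means b - a, ratio of sample variances b / a)."""
    return mean(b) - mean(a), variance(b) / variance(a)


if __name__ == "__main__":
    mean_diff, var_ratio = compare(control, treated)
    # A test on the means sees nothing here, while the variances differ widely.
    print(f"difference of means: {mean_diff}")
    print(f"variance ratio (treated/control): {var_ratio:.1f}")
```

In this toy case the two samples have identical means, so a comparison of means would report no effect, while the sample variance of the second is far larger than that of the first. This is the sense in which upward and downward variations "cancel out when the average is taken".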
It may be noted that in the original report on which the published version of Hirst et al. is based (Jim Burridge, "A Repeat of the 'Benveniste' Experiment: Statistical Analysis", Research Report No. 100, Department of Statistical Science, University College London, England, March 1992, available here as zipped pdf), the author, after clearly stating that "the main aim of the experiment is to show that the results do in fact behave as expected!", acknowledges that "one interpretation [of the results] is that there are, after all, differences between the treatments" (i.e. that Benveniste's main claim is correct) and that "further work needs to be done". Such remarks, however, did not make it into the published version of Hirst et al.
As a personal comment I may add that the standards of scientific transparency in the Burridge report appear higher than those of some of the other published articles and "scientific experiments" mentioned here. First, Burridge mentions some problems with the experimental procedure and states explicitly his goal of showing that the system behaves as expected; then he more or less admits that the results do not fit his expectations and recommends that further work be done. Moreover, to my knowledge Burridge's report had never been mentioned or referenced anywhere prior to his sending a copy to this writer. Burridge's decision to engage in a constructive discussion and make his report available in 2001 is, in my opinion, a courageous and important contribution to this debate.
The p-values above and the Bonferroni-adjusted t-value provide strong quantitative evidence that the null hypothesis should be rejected, i.e. that there may be a difference between high dilution treatments and controls. At a more speculative level it is interesting to visualize some of the data through the following diagram, which is based on the data available (see the comments therein too). The data correspond to the y-coordinate (in tenths of a millimeter) of the points in Fig. 3a and Fig. 3c in Hirst et al., where, as stated therein, each point is the mean of the triplicate determinations in a single experiment. The measurement results are then averaged over the 5 and 3 sessions for succussed anti-IgE and succussed buffer respectively. The accuracy of the measurements (or lack thereof) can be verified by anyone with some goodwill and a ruler. This measurement endeavour was triggered by Hirst et al.'s adamant refusal to make their raw data available for public scrutiny and to interested parties such as Jacques Benveniste and coworkers. It goes without saying that single session data would provide a far better picture of what is going on, but Hirst et al. are unwilling (or unable) to provide them. One might claim that dilution waves are visible in the plot, even though the results have been averaged over different sessions. Visually the most unexpected feature of the plot is the apparent periodicity in basophil degranulation in the succussed buffer. Such an effect may well be an optical fluke. If the effect is real, however, then periodicity may be an intrinsic property of basophil degranulation, while highly diluted treatments increase variation and average degranulation. The time structure of measurements (i.e. basophil counts), which has never been considered in the experimental setting, may be crucial: basophils may always subsist as an oscillating superposition between degranulating and non-degranulating states, along the lines proposed in the high-dilutions quantum model. Highly diluted treatments may just boost the amplitude of the degranulating state, as revealed by increased variation and mean. This speculative guess might be checked if the basophil counts for every session were made available by Hirst et al.
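For readers unfamiliar with the Bonferroni adjustment mentioned at the start of this passage, the sketch below shows how it tightens the significance threshold. The number of comparisons (8, after the "eight high-dilution treatments" quoted from the paper), the 0.05 level and the example p-value are assumptions chosen for illustration; the actual settings used by Hirst et al. are not reproduced here.

```python
def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under a Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def bonferroni_adjust(p_values: list) -> list:
    """Bonferroni-adjusted p-values: each p multiplied by the count, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    alpha = 0.05          # assumed family-wise level
    m = 8                 # assumed number of comparisons
    example_p = 0.02      # hypothetical p-value, not from the paper
    threshold = bonferroni_threshold(alpha, m)
    print(f"per-comparison threshold: {threshold}")
    print(f"p = {example_p} significant unadjusted: {example_p < alpha}")
    print(f"p = {example_p} significant after Bonferroni: {example_p < threshold}")
```

The hypothetical p-value of 0.02 passes the unadjusted 0.05 level but fails the adjusted one, which shows why the procedure is regarded as conservative: a result that survives it has cleared a much stricter bar.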
The value p=0.086, again in Table 2 of Hirst et al., relative to unsuccussed high dilutions might point to some increased degranulation, although at a weaker level than that of succussed dilutions, so that in this case succussion would only strengthen the effect observed in Davenas et al., not cause it. This may be taken into account when analyzing the results in Ovelgônne et al. (J.H. Ovelgônne, A.W.J.M. Bol, W.C.J. Hop, R. van Wijk, "Mechanical agitation of very dilute antiserum against IgE has no effect on basophil staining properties", Experientia 48.5, 1992, 504-508, quoted in F. Wiegant's letter to Nature, 370 (1994) 322), since their partial replication of the Davenas et al. experiments does not include comparison with control treatments ("We found no evidence for a different effect of strongly agitated dilutions, compared to dilutions made with minimal physical agitation"). Equally relevant is the fact that the experimental design of Ovelgônne et al. deviates further from that of Davenas et al., since the basophil counts at different dilutions are combined, so that effects at specific dilutions cannot be measured. The unexpected positive results of Hirst et al. would not be detected by the experimental design used in Ovelgônne et al.
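The point about combining counts across dilutions can be made concrete with a small sketch. All numbers and dilution labels below are synthetic placeholders, invented for illustration; they are not taken from Ovelgônne et al., Hirst et al. or Davenas et al.

```python
from statistics import mean

# SYNTHETIC counts per dilution (hypothetical labels and values, illustration only).
control = {
    "dilution_A": [100, 102, 98],
    "dilution_B": [100, 102, 98],
    "dilution_C": [100, 102, 98],
    "dilution_D": [100, 102, 98],
}
treated = {
    "dilution_A": [80, 82, 78],
    "dilution_B": [120, 118, 122],
    "dilution_C": [100, 101, 99],
    "dilution_D": [100, 99, 101],
}


def per_dilution_difference(ctrl: dict, trt: dict) -> dict:
    """Mean treated count minus mean control count, separately for each dilution."""
    return {d: mean(trt[d]) - mean(ctrl[d]) for d in ctrl}


def pooled_difference(ctrl: dict, trt: dict) -> float:
    """Same difference after combining the counts of all dilutions into one pool."""
    pooled_ctrl = [x for counts in ctrl.values() for x in counts]
    pooled_trt = [x for counts in trt.values() for x in counts]
    return mean(pooled_trt) - mean(pooled_ctrl)


if __name__ == "__main__":
    print("per dilution:", per_dilution_difference(control, treated))
    print("pooled:", pooled_difference(control, treated))
```

In this toy setting two dilutions differ from control in opposite directions, yet the pooled difference is zero, so a design that only reports combined counts would show no effect.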
Interestingly, Wiegant is among the authors of Belon et al. (P. Belon, J. Cumps, M. Ennis, P. F. Mannaioni, J. Sainte-Laudy, M. Roberfroid and F. A. C. Wiegant, "Inhibition of human basophil degranulation by successive histamine dilutions: Results of a European multi-center trial", Inflammation Research 48, Supplement 1 (1999) 17-18), where an inhibitory effect of highly diluted histamine is reported. While several of its authors are also previous co-authors of Davenas et al., the results in Belon et al., which build on a series of published work by Boiron's researchers Belon and Sainte-Laudy among others (see the bibliography in Bellavite et al., freely available here), are not a replication of those of Davenas et al. (which is not cited among the article's references), since the techniques used are different, including automated basophil counting. Unfortunately the data presented by Belon et al. are not exhaustive, a problem affecting virtually all the papers relevant to this discussion. One of the co-authors of Belon et al., Madeleine Ennis, later conducted a related experiment (V. Brown, M. Ennis, "Flow-cytometric analysis of basophil activation: inhibition by histamine at conventional and homeopathic concentrations", Inflamm Res 2001;50:S47–8), which attracted considerable media interest, as described below.
It may be noted that results which appear consistent with Benveniste's claims were also obtained in the experiments conducted at Clamart under the supervision of John Maddox and his team. Such results are briefly described by Maddox (see Nature, 335, 760 and the striking Fig. 1 therein), but their evidential value is dismissed with the statement that Benveniste "denies (contrary to the recollection of all three of us [Maddox, Randi and Stewart]) that he remarked 'we've never seen one like that before'". The interested reader may well examine Maddox's report and weigh the scientific worth of its arguments.
The events are reported in this somewhat classic Guardian article by homeopath Lionel Milgrom. The "failed" British attempt to replicate Benveniste's findings mentioned in the article is just that by Hirst et al.
At the media level, in 2002, according to their site, BBC Horizon "decided to conduct their own scientific experiment", resulting in a failed attempt to replicate the already mentioned results of M. Ennis under James Randi's supervision (making sure "there's no room for error ...keeping the original samples"). The Horizon "scientific experiment" is documented (with the rigorous omission of any data, experimental protocol, references ...) at BBC Horizon's page Homeopathy: The Test. Madeleine Ennis expresses substantial criticism of the BBC procedure in this message, and Dana Ullman reports that "Wayne Turnbull of Guys Hospital, London, made several significant changes in the experiment without communicating this information to anyone else, including Dr. John Enderby, Vice President of the Royal Society, who supervised this study" and that "Turnbull now asserts that his experiment was never portrayed as a replication of Dr. Ennis methodology". Indeed, in this "Dialogue between Lionel Milgrom and Wayne Turnbull", available from Dana Ullman's site, Turnbull writes that "the degranulation protocol that we use was never portrayed as a replication of Dr Ennis's methodology" and that "this whole project was never discussed, presented or conducted as an exercise into the fundamental validity of either homeopathy or the memory of water. The result is NOT that homeopathy has no credibility, or that the memory of water is nonsense". The interested reader may decide on his own how these statements by Wayne Turnbull, a scientist who conducted the Horizon experiment, are to be reconciled with those on the Horizon site: "We gathered experts from some of Britain's leading scientific institutions to help us repeat Ennis's experiments. (...) Now we repeat Ennis's procedure. We take a drop of water from each of the tubes and add a sample of living human cells. Then it's time for Wayne Turnbull at Guys Hospital, to analyse the cells to see whether the homeopathic water has had any effect.". Perhaps the behavioural patterns described in C.J.S. Picart's article "Scientific Controversy as Farce: The Benveniste-Maddox Counter Trials" (Social Studies of Science, Vol. 24, No. 1, 7-37 (1994)) have not lost their relevance.
The idea of letting a former illusionist with a substantial financial stake(*) in a negative result supervise a "double-blind" experiment is perhaps questionable. Double-blind experiments are supposed to eliminate bias. They do so at the price of transparency, quite an important value in the scientific paradigm. It is unclear why experiments should be carried out double-blind when the measurement process is automated, unless, that is, blindness supervised by an illusionist is deemed preferable to responsible transparency.
(*) As stated in the Horizon website : "Sceptic James Randi is so convinced that homeopathy will not work, that he has offered $1m to anyone who can provide convincing evidence of its effects. For the first time in the programme's history, Horizon conducts its own scientific experiment, to try and win his money." In the transcripts BBC's narrator states explicitly that "with a million dollars at stake James Randi wants to make sure there's no room for error."
Other results confirming Benveniste's finding have been announced by a Slovenian research team led by Prof. Igor Jerman.
Admitting the experimental relevance of high dilutions, i.e. very small amplitudes, entails a significant paradigm shift, with substantial practical implications. For instance, according to the article in Nature (333, 816; 1988), basophils will degranulate in a solution that has been filtered so as to sieve any antibody molecule out. It appears likely that such filtering would eliminate the antibody amplitude from the sample, unless one speculatively assumes that tunnelling takes place across the filter. Tunnelling might provide an "explanation" for the perplexing experimental results reported by Endler et al. ("The Effect of Highly Diluted Agitated Thyroxine on the Climbing Activity of Frogs", Veterinary and Human Toxicology, 36 (1994), 56-59) and criticized by Robert Park in the article already mentioned. The "elusive biophoton" would then be just the result of a small thyroxine amplitude tunnelling across the container's walls. However, in the MoW setting such an "explanation" opens the Pandora's box of possible cross-container contamination. After speculatively assuming that contamination is relevant to MoW, it remains to be seen whether, when and how it will be tackled in an adequate, reproducible way at the experimental level. Indeed, intractable contamination issues haunted the last chapter of Benveniste's endeavours, "digital biology", as described in detail in Beauvais' book.
Assuming that the effect described in Benveniste's paper is real (i.e. reproducible), the model described here may be experimentally tested against the hypothesis of residual molecular order of the water molecules, which has been proposed to explain the persistence of the antibody's action (see Michel Schiff, "Un Cas de Censure dans la Science", Albin Michel 1994). The experiment would schematically go as follows. A single antibody molecule would be introduced into a bottle of water. After appropriately shaking the bottle, the water would be poured into two samples A and B. According to Benveniste's result both samples A and B would induce increased basophil degranulation, i.e. reveal the antibody's presence. Sample A would then be physically or chemically tested for the presence of the molecule so as to induce a quantum measurement of the antibody molecule's position. A microscope could conceivably be used for this purpose. The measurement would therefore induce the reduction of the molecule's wave-packet. If the antibody molecule is localized in A, then according to the model described above no antibody amplitude would be left in B and therefore the basophils there would not degranulate. If Benveniste's effect were due to molecular order, however, B would be unaffected and the basophils there would degranulate [August 2006: according to my current understanding of entanglement as a property of local information exchange between observers, the latter experimental argument may not hold; see the argument at the top of page 4 of this article, as well as this usenet post and the references therein].
Jacques Benveniste gave this webpage as a reference in his letter to the NIH. The main arguments presented here are quoted in Bellavite et al., "Immunology and Homeopathy. 2. Cells of the Immune System and Inflammation", eCAM 2006;3(1)13–24, under reference to my paper in Frontier Perspectives 8(2) (cf. my usenet posts 1, 2). Bellavite et al. also provides a good survey of the whole topic discussed here, together with an extended bibliography. The evolution in time of this site is recorded at this online archive and here. The MoW controversy is described in detail in the very interesting and readable (er, in French) book L'âme des Molécules - Une histoire de la mémoire de l'eau by Francis Beauvais, co-author of Davenas et al., published in 2006, where this writer's pitch is mentioned on p. 269, Chapter 21.
I thank Syd Baumel for drawing my attention to the Belon and Beauvais papers. Sincere thanks also to Jim Burridge for sending me a copy of his report and for his permission to post our exchange.