Guest Post by Jan Willem Nienhuys
The so-called Swiss government report of 2011 on homeopathy was actually an expanded translation of a 2006 book, which was itself an expanded version of a document submitted to a Swiss committee (PEK) in charge of evaluating alternative medicine. It has been severely criticised. A summary of the criticisms, with links, can be found in the RationalWiki entry, to which we may add Zeno’s Blog. I present here the results of my scrutiny of chapter 10 (1), although I base my report on the original German edition.
This chapter by itself shows a familiar result: the better the investigation, the less evidence in favor of homeopathy it shows. It also shows how homeopaths systematically distort unfavorable results by misrepresenting them. Chapter 10 deals with clinical investigations of homeopathy. The authors restrict their attention to an odd assortment of diseases such as acute rhinitis, allergic rhinitis, allergic asthma, sinusitis, adenoid vegetations, pharyngitis, tonsillitis, influenza-like infection and otitis media, together denoted as ‘upper respiratory tract infections/allergic reactions’ or URTI/A for short.
The number of papers reviewed is very small. The authors looked at much more than randomized clinical trials. Apparently their search did not extend beyond 2003, but even so they might have found over 150 papers, of which about one third were double blind randomized trials that compared how well highly diluted homeopathy and placebo cured one of the indicated diseases. They managed to miss 25 papers mentioned in earlier meta-analyses and about four papers that are summarized in Pubmed.
Among the papers they missed is some extremely strong support for the claim ‘homeopathy works for URTI/A’. For example Riverón-Garrote et al. (2) did a placebo controlled double blind randomized clinical trial of homeopathy (apparently individualised) for asthma. Of about 33 verum patients 32 improved, whereas of about 30 placebo patients only 4 improved. The so-called p-value for such a result is less than 10⁻¹¹. One wonders why this result wasn’t published in Science or Nature, but only in an obscure Spanish language homeopathic journal. Maybe the paper was excluded because it didn’t state that it was about allergic asthma, but note that in about three quarters of all asthma some kind of allergy is implicated.
Of course this pales in comparison to the paper by Friese and Zabalotnyi (3). Again a double blind randomised clinical trial, with 72 sinusitis sufferers each for verum and placebo. But here 71 out of 72 verum patients were free of complaints after three weeks, or at least improved, whereas this was the case for only 8 of the placebo patients. Fisher’s Exact Test gives p = 2.47 × 10⁻²⁹ (one tailed). A remarkable result, because it is well known that over 80% of sinusitis cases cure spontaneously within two weeks. Maybe placebos are dangerous in the hands of homeopaths. Again one wonders why Friese and Zabalotnyi didn’t share the Nobel prize in, say, 2008, and why it is necessary at all to meticulously analyse papers in which homeopathy shows a marginal advantage.
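For readers who want to check how extreme these two results are, a one-sided Fisher exact test can be computed directly from the hypergeometric distribution. The following is only a minimal sketch using the Python standard library: the function name `fisher_one_sided` is hypothetical, and the counts are taken at face value from the two paragraphs above (the asthma group sizes are only approximate there). The exact digits depend on the precise table and on the tail convention, so the output is best read as an order-of-magnitude check.

```python
from math import comb

def fisher_one_sided(a, n1, b, n2):
    """One-sided Fisher exact p-value for a 2x2 table.

    a of n1 patients improved in group 1, b of n2 improved in group 2.
    Returns the probability, with all margins fixed, that group 1
    contains at least `a` of the improved patients.
    """
    improved = a + b
    denom = comb(n1 + n2, improved)
    upper = min(n1, improved)
    tail = sum(comb(n1, k) * comb(n2, improved - k) for k in range(a, upper + 1))
    return tail / denom

# Counts as quoted in the text (assumed exact for this illustration).
p_asthma = fisher_one_sided(32, 33, 4, 30)      # Riveron-Garrote et al.
p_sinusitis = fisher_one_sided(71, 72, 8, 72)   # Friese and Zabalotnyi

print(f"asthma trial:    p = {p_asthma:.2e}")
print(f"sinusitis trial: p = {p_sinusitis:.2e}")
```

Both values come out many orders of magnitude below any conventional significance threshold, which is exactly what makes these papers so implausible.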
Instead, Maxion-Bergemann et al. include in their survey a paper by Bahemann (4). We quote the summary of the paper from the internet: ‘In homeopathic practice, Kalium bromatum is known as a remedy in the case of paranoid delusions, e. g. if someone suffers from the delusion of being the object of divine revenge, of being damned, or of being pursued. It is also a very important remedy in the case of nocturnal fears in children as well as in the case of convulsions, when they are hereditary, when they occur in childbed, or during teething. The following case demonstrates the successful treatment of a severe mononucleosis after studying the Materia medica.’ Mononucleosis isn’t even mentioned in the list given that specifies URTI/A. Maybe it was included because one of the symptoms of mononucleosis is a sore throat. Apparently the mononucleosis patient was given Kalium Bromatum (Maxion-Bergemann et al. state that it is Kalium Chromatum 200C, presumably Chromatum and Bromatum don’t differ too much to bother) because of something remarkable the patient said during the anamnesis. The reason for giving Kalium bromatum 200C in cases of paranoia might be that an overdose of bromide can induce psychoses. The homeopathic Materia Medica contains quite a few ‘symptoms’ from accidental poisonings reported in old medical literature; potassium bromide was liberally used in the nineteenth century for the calming of seizure and nervous disorders, according to Wikipedia.
More impressive in the list of 13 RCTs of Maxion-Bergemann are two of the largest ‘homeopathic’ trials known, namely of the remedy Oscillococcinum. These trials cannot be taken seriously. The first one, by Ferley et al. (5), has one glaring fault. They started with 478 ‘influenza’-patients (237 verum), tried to make 149 family physicians note down when the patients recovered, and then elected to restrict their attention to the 63 patients (39 verum) that recovered within 48 hours and therefore probably didn’t have flu at all. Coincidentally this was the only possibility out of 14 that gave a ‘significant’ result: correctly computed, p is just below 0.05. (Ferley et al. based their computation on 462 patients with 228 verum and applied a chi-squared test without continuity correction.) It is hardly credible that they set this 48-hour criterion in advance, because even if the remedy worked, the risk of having too few subjects to get a significant result would have been considerable. But if one picks out one result among many possibilities, one should correct for multiple outcomes. So the Ferley et al. investigation is at most an exploratory result in need of independent confirmation.
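The arithmetic in this paragraph can be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of the original analysis: the placebo counts are obtained by subtraction from the totals quoted above, the same 39 and 24 early recoveries are assumed for the 462-patient analysis, and the Bonferroni correction is merely one simple, conservative way to correct for 14 candidate outcomes (the text does not prescribe a particular method). The function name `chi2_2x2` is hypothetical.

```python
from math import erfc, sqrt

def chi2_2x2(a, b, c, d, yates=True):
    """Pearson chi-squared test for the table [[a, b], [c, d]], 1 degree of freedom.

    Returns (statistic, p-value). With yates=True the usual continuity
    correction is applied.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = erfc(sqrt(stat / 2))  # survival function of chi-squared with 1 df
    return stat, p

# All 478 patients: 39 of 237 verum and 24 of 241 placebo recovered within 48 h.
_, p_all = chi2_2x2(39, 237 - 39, 24, 241 - 24, yates=True)

# 462 patients (228 verum) as used by Ferley et al., without continuity correction.
_, p_ferley = chi2_2x2(39, 228 - 39, 24, 234 - 24, yates=False)

# Bonferroni: multiply by the number of candidate outcomes, capped at 1.
n_outcomes = 14
p_bonferroni = min(1.0, n_outcomes * p_all)

print(f"corrected test, all patients: p = {p_all:.3f}")
print(f"uncorrected test, 462 patients: p = {p_ferley:.3f}")
print(f"after Bonferroni for {n_outcomes} outcomes: p = {p_bonferroni:.2f}")
```

The corrected test on all patients lands just under 0.05, and any reasonable allowance for the 14 possible outcomes pushes the result far above that threshold.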
This ‘confirmation’ was undertaken soon afterwards, namely in the beginning of 1991, but the results were only published in 1998 and cannot be found on Pubmed (6). In this paper the definitions are somewhat different, but Papp et al. report that of 334 patients (167 verum) a total of 57 (32 verum) were cured in 48 hours. Now 25 versus 32 is not remarkable at all. One doesn’t need any elaborate computation for this. Calculation gives p=0.4. So one might think that the Ferley hypothesis was soundly refuted. But Papp et al. used something they call ‘the Krauth test’, probably some kind of automated post hoc fishing trip to select the best criteria to distinguish the placebo and verum groups. They claim that this ‘test’ gives p=0.0028. They specifically refer to ‘the null hypothesis (the number of patients free of symptoms after 48 hours is equal in both treatment groups)’, so their computation is wrong. The most remarkable thing about Papp et al. is that nobody seems to have noticed the large discrepancy between what the numbers say and the claim of the paper.
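The comparison of 32 versus 25 recoveries can be checked with a two-sided Fisher exact test. The sketch below is one reasonable choice of test, not necessarily the calculation behind the figure quoted above; the function name `fisher_two_sided` is hypothetical, and the two-sided p-value is defined here in the common way, as the sum of the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb

def fisher_two_sided(a, n1, b, n2):
    """Two-sided Fisher exact p-value for a of n1 versus b of n2 successes."""
    successes = a + b
    denom = comb(n1 + n2, successes)
    lo = max(0, successes - n2)
    hi = min(n1, successes)
    # Hypergeometric probability of every table with the same margins.
    probs = {
        k: comb(n1, k) * comb(n2, successes - k) / denom
        for k in range(lo, hi + 1)
    }
    p_observed = probs[a]
    # Small tolerance so that ties are not lost to floating-point rounding.
    return sum(p for p in probs.values() if p <= p_observed * (1 + 1e-9))

# Papp et al.: 32 of 167 verum and 25 of 167 placebo patients cured in 48 hours.
p_papp = fisher_two_sided(32, 167, 25, 167)
print(f"Papp et al., cured within 48 h: p = {p_papp:.2f}")
```

The result is close to the p=0.4 mentioned above and nowhere near the p=0.0028 claimed in the paper.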
Another paper with ‘positive’ results is the 1994 study of Reilly et al. (7), number 28 in Maxion-Bergemann et al. The group of Reilly investigated allergic diseases treated by what they called homeopathy. The typical Reilly experiment consists of administering a highly diluted causative agent such as pollen or house dust mite or cat hairs or bird feathers to persons suffering from pollen allergy (seasonal rhinitis) or allergic asthma. However, for true homeopathy one uses a substance that has been the subject of a so-called proving, and the remedy is chosen if the totality of all patient ‘symptoms’ (including things like sleeping position and fear of thunderstorms) sufficiently matches the symptoms of the proving. Let me call Reilly’s method ultra-isopathy. Reilly was already discussing this study at a symposium in 1990, but that paper is not clear. It is about 28 asthma patients, of whom only 24 were analysed. This small number in itself is already reason enough not to consider it. The main analysis compared a subjective measure of wellbeing, the Visual Analog Scale (VAS). Here we find a significant difference (p=0.003) in favor of ultra-isopathy. However, in the small print we see that the change in the very important FEV1-value (Forced Expiratory Volume in 1 second) was non-significant (p=0.08), and this refers only to the 18 patients that took such a test before and after the experiment.
Reilly attracted more attention with his first experiment in this vein (8). He started out with 79 patients in both the verum and the placebo group. The treatment was ultradiluted grass pollen for hay fever. The analysis covered only 56 verum and 52 placebo patients (in a diagram 53 placebo are shown). Such a large dropout (32%) is not good. On the basis of the VAS-scores Reilly found p=0.02. But VAS is only an ordinal scale: it is not at all clear that one person’s 60 mm means the same as another person’s 60 mm, nor that two patients with respectively 40 mm and 80 mm together can be considered equivalent to two other patients with 60 mm each. If we distinguish only better / equal / worse, then the numbers for the verum group were 34 / 9 / 13 and for the placebo group 27 / 5 / 21. One can analyse this in various ways: as a 3 by 2 contingency table (p=0.15), or as a 2 by 2 table, namely by merging the middle group either with ‘better’ (p=0.10) or with ‘worse’ (p=0.34). In this manner the difference is less impressive.
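The 3 by 2 analysis of the better / equal / worse counts can be reproduced with an ordinary Pearson chi-squared test. The sketch below is a minimal illustration with a hypothetical function name, `pearson_chi2`; it uses the fact that for 2 degrees of freedom the chi-squared survival function is simply exp(-x/2), so no statistics library is needed.

```python
from math import exp

def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Rows: verum, placebo. Columns: better, equal, worse (counts from the text).
reilly = [[34, 9, 13], [27, 5, 21]]
stat, df = pearson_chi2(reilly)

# Closed form valid only for df == 2, which is the case for a 2 x 3 table.
p_3x2 = exp(-stat / 2)
print(f"chi-squared = {stat:.2f}, df = {df}, p = {p_3x2:.2f}")
```

This gives a p-value of about 0.15, in line with the figure quoted above for the full table.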
Maxion-Bergemann et al. collected 29 articles. I take the liberty of removing from these everything that is not a double blind RCT comparing how well highly diluted homeopathy and placebo cure a URTI/A disease. I also remove all research with 50 or fewer patients, and I leave out the more or less openly fraudulent or at least grossly mistaken Oscillococcinum trials. In order of appearance we then have:
Wiesenauer 1985 (9)
Reilly 1986 (8)
Wiesenauer 1989 (10)
De Lange-de Klerk 1994 (11)
Aabel 2000 (12)
Jacobs 2001 (13)
Friese 2001 (14)
Lewith 2002 (15)
White 2003 (16)
The square brackets refer to the numbering in Maxion-Bergemann et al. A short review of these nine articles follows.
Wiesenauer 1985: one standard remedy for hayfever. Randomised 213 patients, analysed only 164. “no statistical significance was achieved”, says the abstract on Pubmed.
Reilly 1986: this we have discussed already. Ultra-isopathy for hayfever. Randomised 158 patients, analysed 108. Statistically significant, but barely so.
Wiesenauer 1989: four groups, each with their own standard remedy or placebo for sinusitis, 152 patients. “There was no remarkable difference in the therapeutic success among the investigated homeopathic drug combinations nor between the active drugs and placebo”, according to the abstract in Pubmed.
De Lange-de Klerk 1994: this research was reported more extensively in the lead author’s dissertation (17). Individualised homeopathy for recurrent URTI in children. 175 children were randomised and 170 analysed after following them for a year. 128 different remedies/potencies were prescribed and all together 1042 different prescriptions were handed out. The result was a non-significant difference between homeopathy and placebo. One striking aspect of this investigation is that it was revealed only after all computations were done which of the two groups was the placebo group and which the verum group. So the author or her thesis advisors deliberately made it impossible to give in to the temptation to start a fishing expedition in the data after the code was completely broken. See also Pubmed.
Aabel 2000: ultra-isopathy for birch pollen allergy. Strictly speaking this investigation shouldn’t be in this short list because it was partly prophylactic. From Pubmed: “Surprisingly, the verum treated patients fared worse than the placebo group”. No measure of statistical significance is mentioned. Remarkably this article is preceded by a similar article (18) that Maxion-Bergemann et al. apparently weren’t able to locate.
Jacobs 2001: 75 children with otitis media were treated with individualised homeopathy or placebo. Pubmed: “differences were not statistically significant”. It seems that Jacobs has indulged in a fishing trip because she mentions a “significant decrease in symptoms at 24 and 64 h after treatment in favor of homeopathy”. But that is wrong. Significance can only have a meaning if it refers to a single outcome that was planned before any patients were seen. Just picking two results out of many and stating they are ‘significant’ betrays a fundamental ignorance of research methodology.
Friese 2001: this article is also published elsewhere (19), at least the numbers are exactly the same according to Pubmed. 97 children randomized for either individual homeopathic treatment or placebo treatment of adenoid vegetations, 82 analysed. Apparently these 82 comprised 41 placebo and 41 verum, and of these 12 and 9 respectively required an operation in the end. This allegedly corresponds to p=0.64, “These results show no statistical significance.” Incidentally, this is the same Friese as reference 3.
Lewith 2002: again ultra-isopathy, now for asthma, 242 patients randomised, 202 completed all clinical assessments. The full article can be accessed via Pubmed and elsewhere. The main conclusion is “Homoeopathic immunotherapy is not effective in the treatment of patients with asthma.” The authors notice that the averages in both groups behave somewhat erratically, and they have no explanation for this.
White 2003: individualised homeopathy compared to placebo for 96 children with asthma, who are followed for 12 months. The conclusion is that there is no evidence that this kind of homeopathy is better than placebo.
In other words, out of nine investigations only one (Reilly 1986) obtains a barely significant result.
But the interpretation of Maxion-Bergemann et al. is totally different: “If only the placebo-controlled, randomized trials with the highest EBM evidence are considered, 12 of 16 trials show a positive result for the homeopathically treated group (significantly positive 8/16 and trend 4/16).” Even in the more restricted subset of nine discussed above they are overly optimistic. They mark Wiesenauer (1985), De Lange-de Klerk (1994) and Jacobs (2001) as showing a ‘trend for homeopathy’ and Lewith (2002) is even marked ‘significant’. The meticulous and high quality research of De Lange (1993, 1994) is judged ‘trend for homeopathy’.
In the case of De Lange it seems clear where this judgement comes from. De Lange had several outcomes (number of sick periods, total duration of sick periods, sum of all dayscores etc.), and all these showed roughly the same small non-significant difference in favor of homeopathy. This is not really strange, because these outcomes all measure about the same phenomenon. It is not remarkable that there is a small difference between the averages of the two groups that can only be noticed if the children are followed for a full year. There is not even the beginning of a reason that this has anything to do with the treatment. For example the homeopathy group had ‘significantly’ fewer pets at home. This might serve as an explanation why they as a group were slightly less sick. One might also speculate that this was retroactively caused by the homeopathic treatment. This is not really more improbable than highly diluted stuff (more than 95% D6 and higher) having an effect.
By convention ‘statistically significant’ is the lower limit where weak conclusions such as ‘worth investigating further’ can be justified, and we repeat: only if it refers to a single outcome measure or endpoint chosen before any data collection has started. De Lange chose recurrent URTI because homeopathy was reputed to be most effective for this type of complaint, especially after investigations such as those of Reilly (1986). If following 170 children for a full year cannot show a clear advantage, then that is simply a negative result. In the case of Lewith the ‘significant for homeopathy’ is probably based on partial results such as that in week 3 ‘homeopathy’ fared better in the asthma VAS. One can just as well point to week 16 where the FEV1 of the placebo group seems much better than in the homeopathy group.
Maxion-Bergemann et al. seem to have been singularly inept in collecting papers on homeopathic trials, and for no apparent reason they decided to look also at a large number of case reports and investigations without control group or blinding, even though investigators as early as 1991 had remarked that henceforth only well designed large double blind RCTs were worth considering. If we restrict our attention to the properly blinded controlled investigations, we see the same thing as in other meta-analyses of homeopathy: there is lots of rubbish in favor of homeopathy, but the good trials say plainly and clearly: homeopathy is ineffective, precisely what can be predicted from the fact that there is nothing in it.
Homeopaths nowadays have a lot to say about RCTs and how they prove homeopathy. RCTs are subtle and complicated scientific tools. It is somewhat strange to see how homeopaths resolutely ignore two centuries of basic science but then argue their cause on the basis of complicated statistics.
Homeopathy is an assortment of wildly different practices and theories. We have seen ultra-isopathy, individualised homeopathy and the practice of giving one standardised remedy for one diagnosis without asking too many personal details from the patient. These standard remedies are often branded mixtures of highly diluted ‘classical’ homeopathy, quite contrary to the opinions of homeopathy’s inventor Hahnemann. There are many more variants of homeopathy and the homeopaths themselves cannot agree which are the correct ones.
Moreover, if a treatment or trial doesn’t work out, then a number of additional hypotheses about homeopathy can be invoked, which is what Maxion-Bergemann et al. do. Homeopathic remedies supposedly are counteracted by lots of regular medications and even by strong tasting or smelling food, such as coffee, parsley, garlic and peppermint. Hahnemann even disapproved of reading in bed and long afternoon naps and prolonged suckling of infants (Organon, section 260). Poor performance of homeopathy can be blamed on something called ‘initial aggravation’ or else on lack of experience of the poorly performing homeopath.
But whether these factors are relevant at all is unknown, just as there is no proof at all for the similia principle, nor for the hundreds of thousands or even millions of ‘symptoms’ associated with highly diluted materials in the homeopathic Materia Medica. If homeopaths really want scientists to share homeopathic beliefs, they should not think up lame excuses for ‘failed’ tests, but for starters they might try to present proofs for all or at least some of their ‘symptoms’. They don’t try very hard, and insofar as it has been tried, it has also failed (20).
I would like to thank Willem Betz for helpful remarks.
I am a retired mathematician with no other interest than a desire to promote science.
1. Stefanie Maxion-Bergemann, Gudrun Bornhöft, Denise Bloch, Christina Vogt-Frank, Marco Righetti, André Thurneysen. (2011) Clinical Studies on the Effectiveness of Homeopathy for URTI/A (Upper Respiratory Tract Infections and Allergic Reactions) in: Homeopathy in Healthcare – Effectiveness, Appropriateness, Safety, Costs. G. Bornhöft and P.F. Matthiessen (eds.), Berlin etc., Springer 2011, p. 18-157.
2. Riverón-Garrote, M., Fernandez-Argüelles, R., Morán-Rodríguez, F., Campistrou-Labaut, J.L. (1998) Ensayo clínico controlado aleatorizado del tratamiento homeopático del asma bronquial, Boletín Mexicano de Homeopatía 31(2):54-61.
3. Friese, K.-H., Zabalotnyi, D.I. (2007) Homöopathie bei akuter Rhinosinusitis, Eine doppelblinde, placebokontrollierte Studie belegt die Wirksamkeit und Verträglichkeit eines homöopathischen Kombinationsarzneimittels, HNO 55(4):271-277.
4. Bahemann A. (2002) Kalium bromatum bei infektiöser Mononukleose. Zeitschrift für Klassische Homöopathie 46:232–233.
5. Ferley J.P., Zmirou D., D’Adhemar D., Balducci F. (1989). A controlled evaluation of a homoeopathic preparation in the treatment of influenza like syndromes. British Journal of Clinical Pharmacology 27:329-335.
6. Papp R., Schuback G., Beck E., Burkard G., Bengel J., Lehrl S., Belon P. (1998). Oscillococcinum in patients with influenza-like syndromes: a placebo-controlled double-blind evaluation. British Homeopathic Journal 87:69-76.
7. Reilly, D.T., Taylor, M.A., Beattie, N.G.M., Campbell, J.H., McSharry C., Aitchison T.C., Carter R., Stevenson R. (1994) Is evidence for homoeopathy reproducible?, Lancet 1994 344:1601-1606.
8. Reilly, D.T., Taylor, M.A., McSharry, C., Aitchison, T. (1986) Is Homoeopathy a Placebo Response?, Controlled Trial of Homoeopathic Potency – With Pollen in Hayfever as Model, Lancet II.2:881-886.
9. Wiesenauer, M., Gaus, W. (1985) Double-blind Trial Comparing the Effectiveness of Galphimia Potentisation D6 (Homoeopathic Preparation), Galphimia Dilution 10⁻⁶ and Placebo on Pollinosis, Arzneimittelforschung 35(11):1745-1747.
10. Wiesenauer M, Gaus W, Bohnacker U, Häussler S (1989) Wirksamkeitsprüfung von homöopathischen Kombinationspräparaten bei Sinusitis: Ergebnisse einer randomisierten Doppelblindstudie unter Praxisbedingungen. Arzneimittelforschung 39:620-625.
11. de Lange-de Klerk E.S.M., Blommers J., Kuik D.J., Bezemer P.D., Feenstra L. (1994). Effects of homoeopathic medicines on daily burden of symptoms in children with recurrent upper respiratory tract infections. BMJ 309:1329-1332.
12. Aabel, S. (2000) No beneficial effect of isopathic prophylactic treatment for birch pollen allergy during a low-pollen season, A double-blind, placebo-controlled clinical trial of homeopathic Betula 30c. British Homeopathic Journal 89(4):169-173.
13. Jacobs, J., Springer, D.A., Crothers, D. (2001) Homeopathic treatment of acute otitis media in children, A preliminary randomized placebo-controlled trial. The Pediatric Infectious Disease Journal 20(2):177-183.
14. Friese K.H., Feuchter U., Lüdtke R., Moeller H. (2001) Results of a randomised prospective double-blind trial on the homeopathic treatment of adenoid vegetations. European Journal of General Practice 7:48-54.
15. Lewith, G.T., Watkins, A.D.; Hyland, M.E.; Shaw, S.; Broomfield, J.A.; Dolan, G.; Holgate, S.T. (2002) Use of ultramolecular potencies of allergen to treat asthmatic people allergic to house dust mite: double blind randomised controlled clinical trial, BMJ 324:520-523.
16. White, A., Slade, P.; Hunt, C.; Hart, A.; Ernst, E. (2003) Individualised homeopathy as an adjunct in the treatment of childhood asthma, A randomised placebo controlled trial. Thorax 58(4):317-321
17. Lange-de Klerk, E.S.M. de, Effects of homoeopathic medicines on children with recurrent upper respiratory tract infections. Vrije Universiteit Amsterdam, 1993 (Dissertation).
18. Aabel, S., Laerum, E.; Dölvik, S.; Djupesland, P. (2000) Is homeopathic ‘immunotherapy’ effective?, A double-blind, placebo-controlled trial with the isopathic remedy Betula 30c for patients with birch pollen allergy. British Homeopathic Journal 89(4):161-168.
19. Friese K.-H., Feuchter U., Möller H. (1997). Die homöopathische Behandlung von adenoiden Vegetationen. HNO 45:618–624.
20. Brien S., Lewith G., Bryant, T. (2003) Ultramolecular homeopathy has no observable clinical effects. A randomized, double-blind, placebo-controlled proving trial of Belladonna 30C.