A recent comment on a post of mine (by a well-known and experienced German alt med researcher) made the following bold statement aimed directly at me and at my apparent lack of understanding of research methodology:
C´mon , as researcher you should know the difference between efficacy and effectiveness. This is pharmacological basic knowledge. Specific (efficacy) + nonspecific effects = effectiveness. And, in fact, everything can be effective – because of non-specific or placebo-like effects. That does not mean that efficacy is existent.
The point he wanted to make is that outcome studies – studies without a control group where the researcher simply observes the outcome of a particular treatment in a ‘real life’ situation – suffice to demonstrate the effectiveness of therapeutic interventions. This belief is very widespread in alternative medicine and tends to mislead all concerned. It is therefore worth revisiting this issue here in an attempt to create some clarity.
When a patient’s condition improves after receiving a therapy, it is very tempting to feel that this improvement reflects the effectiveness of the intervention (as the researcher mentioned above obviously does). Tempting but wrong: there are many other factors involved as well, for instance:
- the placebo effect (mainly based on conditioning and expectation),
- the therapeutic relationship with the clinician (empathy, compassion etc.),
- the regression towards the mean (outliers tend to return to the mean value),
- the natural history of the patient’s condition (most conditions get better even without treatment),
- social desirability (patients tend to say they are better to please their friendly clinician),
- concomitant treatments (patients often use treatments other than the prescribed one without telling their clinician).
So, how does this fit into the statement above ‘Specific (efficacy) + nonspecific effects = effectiveness’? Even if this formula were correct, it would not mean that outcome studies of the nature described demonstrate the effectiveness of a therapy. It all depends, of course, on what we call ‘non-specific’ effects. We all agree that placebo-effects belong to this category. Most experts would probably also include the therapeutic relationship and the regression towards the mean under this umbrella. But the last three points from my list are clearly not non-specific effects of the therapy; they are therapy-independent determinants of the clinical outcome.
The most important factor here is usually the natural history of the disease. Some people find it hard to imagine what this term actually means. Here is a little joke which, I hope, will make its meaning clear and memorable.
CONVERSATION BETWEEN TWO HOSPITAL DOCTORS:
Doc A: The patient from room 12 is much better today.
Doc B: Yes, we started his treatment just in time; a day later and he would have been cured without it!
I am sure that most of my readers now understand (and never forget) that clinical improvement cannot be equated with the effectiveness of the treatment administered (they might thus be immune to the misleading messages they are constantly exposed to). Yet, I am not at all sure that all ‘alternativists’ have got it.
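To make two of these factors – natural history and regression towards the mean – more tangible, here is a small simulation sketch in Python; all numbers are invented purely for illustration. Patients are enrolled because they happen to feel particularly bad, receive no active treatment at all, and are simply re-measured later; a sizeable ‘improvement’ appears nonetheless.

```python
import random

random.seed(1)

def symptom_score(true_severity):
    """Observed score = stable underlying severity + day-to-day fluctuation."""
    return true_severity + random.gauss(0, 15)

# A hypothetical condition that, on average, improves by itself over the
# observation period (natural history): true severity drops by 10 points.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Patients consult a therapist when they happen to feel particularly bad:
# enrol only those whose observed baseline score exceeds 70 (selection).
enrolled = [(true, score) for true in population
            if (score := symptom_score(true)) > 70]

baseline = sum(score for _, score in enrolled) / len(enrolled)

# Follow-up with NO active treatment: natural history (-10) plus a fresh
# random fluctuation (regression towards the mean).
follow_up = sum(symptom_score(true - 10) for true, _ in enrolled) / len(enrolled)

print(f"patients enrolled   : {len(enrolled)}")
print(f"mean baseline score : {baseline:.1f}")
print(f"mean follow-up score: {follow_up:.1f}")
print(f"apparent improvement: {100 * (baseline - follow_up) / baseline:.0f}% "
      "-- with zero specific treatment effect")
```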
Homeopathy has many critics who claim that there is no good evidence for this type of therapy. Homeopaths invariably find this most unfair and point to a plethora of studies that show an effect. They are, of course, correct! There are plenty of trials that suggest that homeopathic remedies do work. The question, however, is HOW RELIABLE ARE THESE STUDIES?
Here is a brand new one which might stand for dozens of others.
In this study, homeopaths treated 50 multimorbid patients with homeopathic remedies identified by a method called ‘polarity analysis’ (PA) and prospectively followed them over one year (PA enables homeopaths to calculate a relative healing probability, based on Boenninghausen’s grading of polar symptoms).
The 43 patients (86%) who completed the observation period experienced an average improvement of 91% in their initial symptoms. Six patients dropped out, and one did not achieve an improvement of 80%, and was therefore also counted as a treatment failure. The cost of homeopathic treatment was 41% of projected equivalent conventional treatment.
Good news then for enthusiasts of homeopathy? 91% improvement!
Yet, I am afraid that critics might not be bowled over. They might smell a whiff of selection bias, lament the lack of a control group or regret the absence of objective outcome measures. But I was prepared to go as far as stating that such results might be quite interesting… until I read the authors’ conclusions that is:
Polarity Analysis is an effective method for treating multimorbidity. The multitude of symptoms does not prevent the method from achieving good results. Homeopathy may be capable of taking over a considerable proportion of the treatment of multimorbid patients, at lower costs than conventional medicine.
Virtually nothing in these conclusions is based on the data provided. They are pure extrapolation and wild assumptions. Two questions seem to emerge from this:
- How on earth can we take this and so many other articles on homeopathy seriously?
- When does this sort of article cross the line between wishful thinking and scientific misconduct?
Few subjects lead to such heated debate as the risk of stroke after chiropractic manipulations (if you think this is an exaggeration, look at the comment sections of previous posts on this subject). Almost invariably, one comes to the conclusion that more evidence would be helpful for arriving at firmer conclusions. Against this background, this new publication by researchers (mostly chiropractors) from the US ‘Dartmouth Institute for Health Policy & Clinical Practice’ is noteworthy.
The purpose of this study was to quantify the risk of stroke after chiropractic spinal manipulation, as compared to evaluation by a primary care physician, for Medicare beneficiaries aged 66 to 99 years with neck pain.
The researchers conducted a retrospective cohort analysis of a 100% sample of annualized Medicare claims data on 1 157 475 beneficiaries aged 66 to 99 years with an office visit to either a chiropractor or to a primary care physician for neck pain. They compared the hazard of vertebrobasilar stroke and any stroke at 7 and 30 days after the office visit using a Cox proportional hazards model, and used direct adjusted survival curves to estimate the cumulative probability of stroke up to 30 days for the 2 cohorts.
The findings indicate that the proportion of subjects with a stroke of any type in the chiropractic cohort was 1.2 per 1000 at 7 days and 5.1 per 1000 at 30 days. In the primary care cohort, the proportion of subjects with a stroke of any type was 1.4 per 1000 at 7 days and 2.8 per 1000 at 30 days. In the chiropractic cohort, the adjusted risk of stroke was significantly lower at 7 days as compared to the primary care cohort (hazard ratio, 0.39; 95% confidence interval, 0.33-0.45), but at 30 days, a slight elevation in risk was observed for the chiropractic cohort (hazard ratio, 1.10; 95% confidence interval, 1.01-1.19).
The authors conclude that, among Medicare B beneficiaries aged 66 to 99 years with neck pain, incidence of vertebrobasilar stroke was extremely low. Small differences in risk between patients who saw a chiropractor and those who saw a primary care physician are probably not clinically significant.
I do, of course, applaud any new evidence on this rather ‘hot’ topic – but is it just me, or are the above conclusions a bit odd? Five strokes per 1000 patients is definitely not “extremely low” in my book; furthermore, I do wonder whether all experts would agree that a doubling of risk at 30 days in the chiropractic cohort is “probably not clinically significant” – particularly if we consider that chiropractic spinal manipulation has so very little proven benefit.
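For readers who want to check the arithmetic behind that last point, here is a quick sketch based only on the crude proportions reported in the abstract (the covariate-adjusted hazard ratios quoted above are calculated differently, so the two sets of figures are not directly comparable):

```python
# Crude stroke proportions per 1000 beneficiaries, as reported in the abstract
chiropractic = {"7 days": 1.2, "30 days": 5.1}
primary_care = {"7 days": 1.4, "30 days": 2.8}

for window in ("7 days", "30 days"):
    ratio = chiropractic[window] / primary_care[window]
    print(f"{window}: crude risk ratio (chiropractic vs primary care) = {ratio:.2f}")

# Roughly 0.86 at 7 days and 1.82 at 30 days; the covariate-adjusted hazard
# ratios reported by the authors were 0.39 and 1.10 respectively.
```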
My message to (chiropractic) researchers is simple: PLEASE REMEMBER THAT SCIENCE IS NOT A TOOL FOR CONFIRMING BUT FOR TESTING HYPOTHESES.
On 1/12/2014 I published a post in which I offered to give lectures to students of alternative medicine:
Getting good and experienced lecturers for courses is not easy. Having someone who has done more research than most working in the field and who is internationally known might therefore be a thrill for students and an image-boosting experience for colleges. In the true Christmas spirit, I am today making the offer of being of assistance to the many struggling educational institutions of alternative medicine.
A few days ago, I tweeted about my willingness to give free lectures to homeopathic colleges (so far without response). Having thought about it a bit, I would now like to extend this offer. I would be happy to give a free lecture to the students of any educational institution of alternative medicine.
I did not think that this would create much interest – and I was right: only the ANGLO-EUROPEAN COLLEGE OF CHIROPRACTIC has so far hoisted me on my own petard and, after some discussion (see the comment section of the original post), hosted me for a lecture. Several people seem keen on knowing how this went; so here is a brief report.
I was received, on 14/1/2015, with the utmost kindness by my host David Newell. We had a coffee and a chat and then it was time to start the lecture. The hall was packed with ~150 students and the same number was listening in a second lecture hall to which my talk was being transmitted.
We had agreed on the title CHIROPRACTIC: FALLACIES AND FACTS. So, after telling the audience about my professional background, I elaborated on 7 fallacies:
- Appeal to tradition
- Appeal to authority
- Appeal to popularity
- Subluxation exists
- Spinal manipulation is effective
- Spinal manipulation is safe
- Ad hominem attack
Numbers 3, 5 and 6 were dealt with in more detail than the rest. The organisers had asked me to finish by elaborating on what I perceive as the future challenges of chiropractic; so I did:
- Stop happily promoting bogus treatments
- Denounce obsolete concepts like ‘subluxation’
- Clarify differences between chiros, osteos and physios
- Start a culture of critical thinking
- Take action against charlatans in your ranks
- Stop attacking everyone who voices criticism
I ended by pointing out that the biggest challenge, in my view, was to “demonstrate with rigorous science which chiropractic treatments demonstrably generate more good than harm for which condition”.
We had agreed that my lecture would be followed by half an hour of discussion; this period turned out to be lively and had to be extended to a full hour. Most questions initially came from the tutors rather than the students, and most were polite – I had expected much more aggression.
In his email thanking me for coming to Bournemouth, David Newell wrote about the event: The general feedback from staff and students was one of relief that you possessed only one head, :-). I hope you may have felt the same about us. You came over as someone who had strong views, a fair amount of which we disagreed with, but that presented them in a calm, informative and courteous manner as we did in listening and discussing issues after your talk. I think everyone enjoyed the questions and debate and felt that some of the points you made were indeed fair critique of what the profession may need to do, to secure a more inclusive role in the health care arena.
My own impression of the day is that some of my messages were not really understood, that some of the questions, including some from the tutors, seemed to come from a different planet, and that people were more out to teach me than to learn from my talk. One overall impression that I took home from that day is that, even in this college which prides itself on being open to scientific evidence and unimpressed by chiropractic fundamentalism, students are strangely different from other health care professionals. The most tangible aspect of this is the openly hostile attitude towards drug therapies voiced during the discussion by some students.
The question I always ask myself after having invested a lot of time in preparing and delivering a lecture is: WAS IT WORTH IT? In the case of this lecture, I think the answer is YES. With 300 students present, I am fairly confident that I did manage to stimulate a tiny bit of critical thinking in a tiny percentage of them. The chiropractic profession needs this badly!
According to the ‘General Osteopathic Council’ (GOC), osteopathy is a primary care profession, focusing on the diagnosis, treatment, prevention and rehabilitation of musculoskeletal disorders, and the effects of these conditions on patients’ general health.
Using many of the diagnostic procedures applied in conventional medical assessment, osteopaths seek to restore the optimal functioning of the body, where possible without the use of drugs or surgery. Osteopathy is based on the principle that the body has the ability to heal, and osteopathic care focuses on strengthening the musculoskeletal systems to treat existing conditions and to prevent illness.
Osteopaths’ patient-centred approach to health and well-being means they consider symptoms in the context of the patient’s full medical history, as well as their lifestyle and personal circumstances. This holistic approach ensures that all treatment is tailored to the individual patient.
On a good day, such definitions make me smile; on a bad day, they make me angry. I can think of quite a few professions which would fit this definition just as well or better than osteopathy. What are we supposed to think about a profession that is not even able to provide an adequate definition of itself?
Perhaps I should try a different angle: what conditions do osteopaths treat? The GOC informs us that commonly treated conditions include back and neck pain, postural problems, sporting injuries, muscle and joint deterioration, restricted mobility and occupational ill-health.
This statement seems not much better than the previous one. What on earth is ‘muscle and joint deterioration’? It is not a condition that I find in any medical dictionary or textbook. Can anyone think of a broader term than ‘occupational ill health’? This could be anything from tennis elbow to allergies or depression. Do osteopaths treat all of those?
One gets the impression that osteopaths and their GOC are deliberately vague – perhaps because this would diminish the risk of being held to account on any specific issue?
The more one looks into the subject of osteopathy, the more confused one gets. The profession goes back to Andrew Still (August 6, 1828 – December 12, 1917); Palmer, the founder of chiropractic, is said to have been one of Still’s pupils and seems to have ‘borrowed’ most of his concepts from him – even though he always denied this. Still defined osteopathy as a science which consists of such exact exhaustive and verifiable knowledge of the structure and functions of the human mechanism, anatomy and physiology & psychology including the chemistry and physics of its known elements as is made discernable certain organic laws and resources within the body itself by which nature under scientific treatment peculiar to osteopathic practice apart from all ordinary methods of extraneous, artificial & medicinal stimulation and in harmonious accord with its own mechanical principles, molecular activities and metabolic processes may recover from displacements, derangements, disorganizations and consequent diseases and regain its normal equilibrium of form and function in health and strength.
This and many other of his statements seem to indicate that the art of using language for obfuscation has a long tradition in osteopathy and goes back directly to its founding father.
What makes the subject of osteopathy particularly confusing is not just the oddity that, in conventional medicine, the term means ‘disease of the bone’ (which renders any literature searches in this area a nightmare) but also the fact that, in different countries, osteopaths are entirely different professionals. In the US, osteopathy has long been fully absorbed by mainstream medicine and there is hardly any difference between MDs and DOs. In the UK, osteopaths are alternative practitioners regulated by statute but are, compared to chiropractors, of minor importance. In Germany, osteopaths are not regulated and fairly ‘low key’, while in France, they are numerous and like to see themselves as primary care physicians.
And what about the evidence base of osteopathy? Well, that’s even more confusing, in my view. Evidence for which treatment? As US osteopaths might use any therapy from drugs to surgery, it could get rather complicated. So let’s just focus on the manual treatment as used by osteopaths outside the US.
Anyone who attempts to critically evaluate the published trial evidence in this area will be struck by at least two phenomena:
- the wide range of conditions treated with osteopathic manual therapy (OMT)
- the fact that there are several groups of researchers that produce one positive result after the next.
The best example is probably the exceedingly productive research team of J. C. Licciardone from the Osteopathic Research Center, University of North Texas. Here are a few conclusions from their clinical studies:
- The large effect size for OMT in providing substantial pain reduction in patients with chronic LBP of high severity was associated with clinically important improvement in back-specific functioning. Thus, OMT may be an attractive option in such patients before proceeding to more invasive and costly treatments.
- The large effect size for short-term efficacy of OMT was driven by stable responders who did not relapse.
- Osteopathic manual treatment has medium to large treatment effects in preventing progressive back-specific dysfunction during the third trimester of pregnancy. The findings are potentially important with respect to direct health care expenditures and indirect costs of work disability during pregnancy.
- Severe somatic dysfunction was present significantly more often in patients with diabetes mellitus than in patients without diabetes mellitus. Patients with diabetes mellitus who received OMT had significant reductions in LBP severity during the 12-week period. Decreased circulating levels of TNF-α may represent a possible mechanism for OMT effects in patients with diabetes mellitus. A larger clinical trial of patients with diabetes mellitus and comorbid chronic LBP is warranted to more definitively assess the efficacy and mechanisms of action of OMT in this population.
- The OMT regimen met or exceeded the Cochrane Back Review Group criterion for a medium effect size in relieving chronic low back pain. It was safe, parsimonious, and well accepted by patients.
- Osteopathic manipulative treatment slows or halts the deterioration of back-specific functioning during the third trimester of pregnancy.
- The only consistent finding in this study was an association between type 2 diabetes mellitus and tissue changes at T11-L2 on the right side. Potential explanations for this finding include reflex viscerosomatic changes directly related to the progression of type 2 diabetes mellitus, a spurious association attributable to confounding visceral diseases, or a chance observation unrelated to type 2 diabetes mellitus. Larger prospective studies are needed to better study osteopathic palpatory findings in type 2 diabetes mellitus.
- OMT significantly reduces low back pain. The level of pain reduction is greater than expected from placebo effects alone and persists for at least three months. Additional research is warranted to elucidate mechanistically how OMT exerts its effects, to determine if OMT benefits are long lasting, and to assess the cost-effectiveness of OMT as a complementary treatment for low back pain.
Based on this brief review of the evidence originating from one of the most active research teams, one could be forgiven for thinking that osteopathy is a panacea. But such an assumption is, of course, nonsensical; a more reasonable conclusion might be the following: osteopathy is one of the most confusing and confused subjects under the already confused umbrella of alternative medicine.
I know, it’s not really original to come up with the 10000th article on “10 things…” – but you will have to forgive me, I read so many of these articles over the holiday period that I can’t help but jump on the already over-crowded bandwagon and compose yet another one.
So, here are 10 things which could, if implemented, bring considerable improvement in 2015 to my field of inquiry, alternative medicine.
- Consumers need to get better at acting as bull shit (BS) detectors. Let’s face it, much of what we read or hear about this subject is utter BS. Yet consumers frequently lap up even the worst drivel as if it were a source of deep wisdom. They could save themselves so much money, if they learnt to be just a little bit more critical.
- Dr Oz should focus on being a heart surgeon. His TV show has far too often been shown to promote dangerous quackery. Yet as a heart surgeon, he actually might do some good.
- Journalists ought to remember that they have a job that extends well beyond their ambition to sell copy. They have a responsibility to inform the public truthfully and responsibly.
- Book publishers should abstain from churning out book after book that does little else but mislead the public about alternative medicine in a way that all too often is dangerous to the readers’ health. The world does not need the 1000th book repeating nonsense on detox, wellness etc.!
- Alternative practitioners must realise that claiming that therapy x cures condition y is not just slightly over-optimistic (or based on ‘years of experience’); if the claim is not based on sound evidence, it is what most people would call an outright lie.
- Proponents of alternative medicine should learn that it is neither fair nor productive to fiercely attack everyone personally who disagrees with their enthusiasm for this or that form of alternative medicine. In fact, it merely highlights the acute lack of rational arguments.
- Researchers of alternative medicine have to remember how important it is to think critically – an uncritical scientist is at best a contradiction in terms and at worst a pseudo-scientist who is likely to cause harm.
- Authorities should amass the courage, the political power and the financial means of going after those charlatans who ruthlessly exploit the public by making a fast and easy buck on the gullibility of consumers. Only if there is the likelihood of hefty fines will we see a meaningful decrease in the current epidemic of alternative health fraud.
- Politicians should realise that alternative medicine is not just a trivial subject with which one might win votes, if one issues platitudes to please the majority; alternative medicine is used by so many people that it has become an important public health issue.
- Prince Charles needs to learn how to control himself and abstain from meddling in health politics by using every conceivable occasion to promote what he thinks is ‘integrated medicine’ but which, in fact, can easily be exposed as quackery.
As you see, my list almost instantly turned into a wish-list, and the big questions that follow from it are:
- How could we increase the likelihood of these wishes coming true?
- And would there be anything left of alternative medicine, if all of these wishes miraculously became true in 2015?
I do not pretend to have the answers, but I do feel strongly that a healthy dose of critical thinking at all levels of education – from kindergartens to schools, from colleges to universities etc. – would be a good and necessary starting point.
I know, my list is not just a wish list, it also is a wishful thinking list. It would be hopelessly naïve to assume that major advances will be made in 2015. I am realistic, sometimes even quite pessimistic, about progress in alternative medicine. But this does not mean that I or anyone else should just give up. 2015 will be a year where at least one thing is certain: you will see me continuing my fight for reason, critical analysis, rational debate and good evidence – and that’s a promise!
As promised, I will try with this post to explain my reservations regarding the new meta-analysis suggesting that individualised homeopathic remedies are superior to placebos. Before I start, however, I want to thank all those who have commented on various issues; it is well worth reading the numerous and diverse comments.
To remind us of the actual meta-analysis, it might be useful to re-publish its abstract (the full article is also available online):
BACKGROUND:
A rigorous and focused systematic review and meta-analysis of randomised controlled trials (RCTs) of individualised homeopathic treatment has not previously been undertaken. We tested the hypothesis that the outcome of an individualised homeopathic treatment approach using homeopathic medicines is distinguishable from that of placebos.
METHODS:
The review’s methods, including literature search strategy, data extraction, assessment of risk of bias and statistical analysis, were strictly protocol-based. Judgment in seven assessment domains enabled a trial’s risk of bias to be designated as low, unclear or high. A trial was judged to comprise ‘reliable evidence’ if its risk of bias was low or was unclear in one specified domain. ‘Effect size’ was reported as odds ratio (OR), with arithmetic transformation for continuous data carried out as required; OR > 1 signified an effect favouring homeopathy.
RESULTS:
Thirty-two eligible RCTs studied 24 different medical conditions in total. Twelve trials were classed ‘uncertain risk of bias’, three of which displayed relatively minor uncertainty and were designated reliable evidence; 20 trials were classed ‘high risk of bias’. Twenty-two trials had extractable data and were subjected to meta-analysis; OR = 1.53 (95% confidence interval (CI) 1.22 to 1.91). For the three trials with reliable evidence, sensitivity analysis revealed OR = 1.98 (95% CI 1.16 to 3.38).
CONCLUSIONS:
Medicines prescribed in individualised homeopathy may have small, specific treatment effects. Findings are consistent with sub-group data available in a previous ‘global’ systematic review. The low or unclear overall quality of the evidence prompts caution in interpreting the findings. New high-quality RCT research is necessary to enable more decisive interpretation.
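As a reminder of what the effect metric reported above actually means, here is a minimal sketch of how an odds ratio and its 95% confidence interval are commonly derived from a 2×2 table of trial results; the counts below are invented purely for illustration and have nothing to do with the trials in the review.

```python
import math

# Hypothetical 2x2 table (invented counts, unrelated to the trials in the review):
#                 improved   not improved
# homeopathy          30            20
# placebo             20            30
a, b = 30, 20   # homeopathy: improved / not improved
c, d = 20, 30   # placebo:    improved / not improved

odds_ratio = (a * d) / (b * c)

# Woolf (log) method for an approximate 95% confidence interval
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
# An OR > 1 favours homeopathy -- but only if the underlying trials are unbiased.
```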
Since my team had published an RCT of individualised homeopathy, it seemed only natural that my interest focussed on why the study (even though identified by Mathie et al) had not been included in the meta-analysis. Our study had provided no evidence that adjunctive homeopathic remedies, as prescribed by experienced homeopathic practitioners, are superior to placebo in improving the quality of life of children with mild to moderate asthma in addition to conventional treatment in primary care.
I was convinced that this trial had been rigorous and thus puzzled why, despite receiving ‘full marks’ from the reviewers, they had not included it in their meta-analysis. I thus wrote to Mathie, the lead author of the meta-analysis, and he explained: For your trial (White et al. 2003), under domain V of assessment, we were unable to extract data for meta-analysis, and so it was attributed high risk of bias, as specified by the Cochrane judgmental criteria. Our designated main outcome was the CAQ, for which we needed to know (or could at least estimate) a mean and SD for both the baseline and the end-point of the study. Since your paper reported only the change from baseline in Table 3 or in the main text, it is not possible to derive the necessary end-point for analysis.
It took a while and several further emails until I understood: our study did report both the primary (Table 2, quality of life) and the secondary outcome measure (Table 3, severity of symptoms). The primary outcome measure was reported in full detail, such that a meta-analysis would have been possible. The secondary outcome measure was also reported, but not in full detail, and the data provided by us would not lend themselves to meta-analysis. By electing to use not our primary but our secondary outcome measure, Mathie et al were able to claim that they could not use our study and to exclude it from their meta-analysis.
Why did they do that?
The answer is simple: in their methods section, they specify that they used outcome measures “based on a pre-specified hierarchical list in order of greatest to least importance, recommended by the WHO“. This, I would argue, is deeply flawed: the most important outcome measure of a study is usually the one for which the study was designed, not the one that some guys at the WHO feel might be important (incidentally, the WHO list was never meant to be applied to meta-analyses in that way).
By rigidly following their published protocol, the authors of the meta-analysis managed to exclude our negative trial. Thus they did everything right – or did they?
Well, I think they committed several serious mistakes.
- Firstly, they wrote the protocol, which forced them to exclude our study. Following a protocol is not a virtue in itself; if the protocol is nonsensical, it is even the opposite. Had they proceeded as is normal in such cases and used our primary outcome measure in their meta-analysis, it is most likely that their overall results would not have been in favour of homeopathy.
- Secondly, they awarded our study a malus point for the criterion ‘selective outcome reporting’. This is clearly a wrong decision: we did report the severity outcome, albeit not in sufficient detail for their meta-analysis. Had they not committed this misjudgment, our RCT would have been the only one with an ‘A’ rating. This would have very clearly highlighted the nonsense of excluding the best-rated trial from the meta-analysis.
There are several other oddities as well. For instance, Mathie et al judge our study to be NOT free of vested interest. I asked Mathie why they had done this and was told it is because we accepted free trial medication from a homeopathic pharmacy. I would argue that my team was far less plagued by vested interest than the authors of their three best (and of course positive) trials who, as I happen to know, are consultants for homeopathic manufacturers.
And all of this is just in relation to our own study. Norbert Aust has uncovered similar irregularities with other trials and I take the liberty of quoting his comments posted previously again here:
I have reason to believe that this review and meta-analysis is biased in favor of homeopathy. To check this, I compared two studies: (1) Jacobs 1994 about the treatment of childhood diarrhea in Nicaragua, (2) Walach 1997 about the homeopathic treatment of headaches. The Jacobs study is one of the three that provided ‘reliable evidence’; Walach’s study earned a poor C2.2 rating and was not included in the meta-analyses. Jacobs’ results were in favour of homeopathy, Walach’s not.
For the domains where the rating of Walach’s study was less than that of the Jacobs study, please find citations from the original studies or my short summaries for the point in question.
Domain I: Sequence generation:
Walach:
“The remedy selected was then mailed to a notary public who held a stock of placebos. The notary threw a dice and mailed either the homeopathic remedy or an appropriate placebo. The notary was provided with a blank randomisation list.”
Rating: UNCLEAR (Medium risk of bias)
Jacobs:
“For each of these medications, there was a box of tubes in sequentially numbered order which had been previously randomized into treatment or control medication using a random numbers table in blocks of four”
Rating: YES (Low risk of bias)
Domain IIIb: Blinding of outcome assessor
Walach:
“The notary was provided with a blank randomization list which was an absolutely unique document. It was only handed out after the biometrician (WG) had deposited all coded original data as a printout at the notary’s office. (…) Data entry was performed blindly by personnel not involved in the study. ”
Rating: UNCLEAR (Medium risk of bias)
Jacobs:
“All statistical analyses were done before breaking the randomisation code, using the program …”
Rating: YES (Low risk of bias)
Domain V: Selective outcome reporting
Walach:
Study protocol was published in 1991, prior to enrollment of participants; all primary outcome variables were reported with respect to all participants and the endpoints.
Rating: NO (high risk of bias)
Jacobs:
No prior publication of protocol, but a pilot study exists. However, this was published in 1993, only after the trial was performed in 1991. Primary outcome defined (duration of diarrhea) and reported, but table and graph do not match; secondary outcome (number of unformed stools on day 3) seems defined post hoc, for this is the only point in time at which this outcome yielded a significant result.
Rating: YES (low risk of bias)
Domain VI: Other sources of bias:
Walach:
Rating: NO (high risk of bias), no details given
Jacobs:
Imbalance of group properties (size, weight and age of children) that might have some impact on the course of the disease; high impact of parallel therapy (rehydration), by far exceeding the effect size of homeopathic treatment
Rating: YES (low risk of bias), no details given
In a nutshell: I fail to see the basis for the different ratings in the studies themselves. I assume bias of the authors of the review.
Conclusion
So, what about the question posed in the title of this article? The meta-analysis is clearly not a ‘proof of concept’. But is it proof for misconduct? I asked Mathie and he answered as follows: No, your statement does not reflect the situation at all. As for each and every paper, we selected the main outcome measure for your trial using the objective WHO classification approach (in which quality of life is clearly of lower rank than severity). This is all clearly described in our prospective protocol. Under no circumstances did we approach this matter retrospectively, in the way you are implying.
Some nasty sceptics might have assumed that the handful of rigorous studies with negative results were well-known to most researchers of homeopathy. In this situation, it would have been hugely tempting to write the protocol such that these studies must be excluded. I am thrilled to be told that the authors of the current new meta-analysis (who declared all sorts of vested interests at the end of the article) resisted this temptation.
Dietary supplements (DS) are heavily promoted, usually with the claim that they have stood the test of time and that they are natural and hence harmless. Unsurprisingly, their use has become very widespread. A new study determined the use of DSs, factors associated with DS use, and reasons for use among U.S. college students.
College students (N = 1248) at 5 U.S. universities were surveyed. Survey questions included descriptive demographics, types and frequency of DS used, reasons for use and money spent on supplements. Supplements were classified using standard criteria. Logistic regression analyses examined relationships between demographic and lifestyle factors and DS use.
Sixty-six percent of college students surveyed used DS at least once a week, and 12% consumed 5 or more supplements a week. Forty-two percent used multivitamins/multiminerals, 18% vitamin C, 17% protein/amino acids and 13% calcium at least once a week. Factors associated with supplement use included dietary patterns, exercise, and tobacco use. Students used supplements to promote general health (73%), provide more energy (29%), increase muscle strength (20%), and enhance performance (19%).
The authors of this survey concluded that college students appear more likely to use DS than the general population and many use multiple types of supplements weekly. Habits established at a young age persist throughout life. Therefore, longitudinal research should be conducted to determine whether patterns of DS use established early in adulthood are maintained throughout life. Adequate scientific justification for widespread use of DS in healthy, young populations is lacking.
Another new study investigated the use of DSs in 334 dancers from 53 countries, who completed a digitally based 35-question survey detailing demographic information and the use of DSs. Supplement use was prevalent amongst this international cohort, with 48% reporting regular DSs use. Major motives for supplement use were to improve health, boost immunity, and reduce fatigue. Forty-five percent believed that dancing increased the need for supplementation, whilst 30% recognized that there were risks associated with DSs.
The most frequently consumed DSs were vitamin C (60%), multivitamins (67%), and caffeine (72%). A smaller group of participants declared the use of whey protein (21%) or creatine (14%). Supplements were mainly obtained from pharmacies, supermarkets, and health-food stores. Dancers recognized their lack of knowledge in DSs use and relied on peer recommendations instead of sound evidence-based advice from acknowledged nutrition or health care professionals.
The authors concluded that this study demonstrates that DSs use is internationally prevalent amongst dancers. Continued efforts are warranted with regard to information dissemination.
Finally, a third study investigated the use of DSs in patients in Japan. This survey was completed by 2732 people, including 599 admitted patients, 1154 ambulatory patients, and 979 healthy subjects who attended a seminar about DSs. At the time of the questionnaire, 20.4% of admitted patients, 39.1% of ambulatory patients, and 30.7% of healthy subjects were using DSs, which included vitamin/mineral supplements, herbal extracts, their ingredients, or foods for specified health uses.
The primary purpose for use in all groups was health maintenance, whereas 3.7% of healthy subjects, 10.0% of ambulatory patients, and 13.2% of admitted patients used DSs to treat diseases. In addition, 17.7% of admitted patients and 36.8% of ambulatory patients were using DSs concomitantly with their medications. However, among both admitted patients and ambulatory patients, almost 70% did not mention DS use to their physicians. Overall, 3.3% of all subjects realized adverse effects associated with DSs.
The authors concluded that communication between patients and physicians is important to avoid health problems associated with the use of DSs.
There is little doubt that DSs are popular with all sorts of populations and have grown into a multi-billion-dollar industry. There is also no doubt that only very few DSs are evidence-based (and, if so, only in relatively rare situations). And there can be no doubt that many DSs can do harm. What follows is simple: for the vast majority of DSs, the benefits do not demonstrably outweigh the risks.
If that is true, we have to ask ourselves: Why are they so popular?
The answer, I think, is because of the very phenomenon I am constantly trying to fight on this blog – IRRESPONSIBLE CHARLATANS PULLING THE WOOL OVER CONSUMERS’ EYES.
Guest post by Pete Attkins
Commentator “jm” asked a profound and pertinent question: “What DOES it take for people to get real in this world, practice some common sense, and pay attention to what’s going on with themselves?” This question was asked in the context of asserting that personal experience always trumps the results of large-scale scientific experiments; and asserting that alt-med experts are better able to provide individualized healthcare than 21st Century orthodox medicine.
What does common sense and paying attention lead us to conclude about the following? We test a six-sided die for bias by rolling it 100 times. The number 1 occurs only once and the number 6 occurs many times, never on its own, but in several groups of consecutive sixes.
I think it is reasonable to say that common sense would, and should, lead everyone to conclude that the die is biased and not fit for its purpose as a source of random numbers.
In other words, we have a gut feeling that the die is untrustworthy. Gut instincts and common sense are geared towards maximizing our chances of survival in our complex and unpredictable world — these are innate and learnt behaviours that have enabled humans to survive despite the harshness of our ever changing habitat.
Only very recently in the long history of our species have we developed specialized tools that enable us to better understand our harsh and complex world: science and critical thinking. These tools are difficult to master because they still haven’t been incorporated into our primary and secondary formal education systems.
The vast majority of people do not have these skills; therefore, when a scientific finding flies in the face of our gut instincts and/or common sense, it creates an overwhelming desire to reject the finding and classify the scientist(s) as being irrational and lacking basic common sense. It does not create an intense desire to accept the finding and then painstakingly learn all of the science that went into producing it.
With that in mind, let’s rethink our common sense conclusion that the six-sided die is biased and untrustworthy. What we really mean is that the results have given all of us good reason to be highly suspicious of this die. We aren’t 100% certain that this die is biased, but our gut feeling and common sense are more than adequate to form a reasonable mistrust of it and to avoid using it for anything important to us. Reasons to keep this die rather than discard it might be to provide a source of mild entertainment or to use its bias for the purposes of cheating.
Some readers might be surprised to discover at this point that the results I presented from this apparently heavily-biased die are not only perfectly valid results obtained from a truly random unbiased die, they are to be fully expected. Even if the die had produced 100 sixes in that test, it would not confirm that the die is biased in any way whatsoever. Rolling a truly unbiased die once will produce one of six possible outcomes. Rolling the same die 100 times will produce one unique sequence out of the 6^100 (6.5 x 10^77) possible sequences: all of which are equally valid!
Gut feeling plus common sense rightfully informs us that the probability of a random die producing one hundred consecutive sixes is so incredibly remote that nobody will ever see it occur in reality. This conclusion is also mathematically sound: if there were 6.5 x 10^77 people on Earth, each performing the same test on truly random dice, there is no guarantee that anyone would observe a sequence of one hundred consecutive sixes.
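The arithmetic behind these two paragraphs can be checked in a few lines of Python (a sketch; exact integer fractions are used because the probabilities involved are far too small to handle casually in floating point):

```python
import math
from fractions import Fraction

sequences = 6 ** 100                      # distinct 100-roll sequences
print(f"6^100 ~ {float(sequences):.1e}")  # ~6.5e+77

# Probability that one particular sequence (e.g. one hundred consecutive sixes)
# turns up in a single 100-roll test:
p = Fraction(1, sequences)

# If 6^100 people each ran the test once, the expected number of all-six
# sequences would be exactly 1, yet the chance that at least one person
# actually sees one is only about 1 - e^(-1) ~ 63% (Poisson approximation).
trials = sequences
expected = trials * p                     # exactly 1
p_at_least_one = 1 - math.exp(-float(expected))
print(f"expected all-six sequences: {expected}")
print(f"P(at least one observed)  : {p_at_least_one:.2f}")
```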
When we observe a sequence such as 2 5 1 4 6 3 1 4 3 6 5 2… common sense informs us that the die is very likely random. If we calculate the arithmetic mean to be very close to 3.5 then common sense will lead us to conclude that the die is both random and unbiased enough to use it as a reliable source of random numbers.
Unfortunately, this is a perfect example of our gut feelings and common sense failing us abysmally. They totally failed to warn us that the 2 5 1 4 6 3 1 4 3 6 5 2… sequence we observed had exactly the same (im)probability of occurring as a sequence of one hundred 6s or any other sequence that one can think of that doesn’t look random to a human observer.
The 100-roll die test is nowhere near powerful enough to properly test a six-sided die, but this test is more than adequately powered to reveal some of our cognitive biases and some of the deficits in our personal mastery of science and critical thinking.
To properly test the die we need to provide solid evidence that it is both truly random and that its measured bias tends towards zero as the number of rolls tends towards infinity. We could use the services of one testing lab to conduct billions of test rolls, but this would not exclude errors caused by such things as miscalibrated equipment and experimenter bias. It is better to subdivide the testing across multiple labs then carefully analyse and appropriately aggregate the results: this dramatically reduces errors caused by equipment and humans.
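Here is a minimal sketch of what such a pooled test might look like, using a plain chi-square goodness-of-fit statistic aggregated over several simulated ‘labs’ (illustrative only; a real certification test would be far more extensive):

```python
import random
from collections import Counter

random.seed(42)

def roll_counts(n_rolls):
    """One lab's experiment: tally the faces from n_rolls of a (fair) die."""
    counts = Counter(random.randint(1, 6) for _ in range(n_rolls))
    return [counts[face] for face in range(1, 7)]

def chi_square(observed):
    """Goodness-of-fit statistic against a uniform expectation."""
    expected = sum(observed) / 6
    return sum((o - expected) ** 2 / expected for o in observed)

# Aggregate raw counts from several independent labs before testing, rather
# than trusting any single (possibly miscalibrated) lab.
labs = [roll_counts(100_000) for _ in range(10)]
pooled = [sum(face_counts) for face_counts in zip(*labs)]

stat = chi_square(pooled)
# With 6 - 1 = 5 degrees of freedom the 5% critical value is about 11.07;
# a genuinely fair die should fall below it in roughly 95% of such tests.
print(f"pooled counts: {pooled}")
print(f"chi-square = {stat:.2f} (5% critical value ~ 11.07, df = 5)")
```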
In medicine, this testing process is performed via systematic reviews of multiple, independent, double-blind, placebo-controlled trials — every trial that is insufficiently powered to add meaningfully to the result is rightfully excluded from the aggregation.
Alt-med relies on a diametrically opposed testing process. It performs a plethora of only underpowered tests; presents those that just happen to show a positive result (just as a random die could’ve produced); and sweeps under the carpet the overwhelming number of tests that produced a negative result. It publishes only the ‘successes’, not its failures. By sweeping its failures under the carpet it feels justified in making the very bold claim: Our plethora of collected evidence shows clearly that it mostly ‘works’ and, when it doesn’t, it causes no harm.
One of the most acidic tests for a hypothesis and its supporting data (which is a mandatory test in a few branches of critical engineering) is to substitute the collected data for random data that has been carefully crafted to emulate the probability mass functions of the collected datasets. This test has to be run multiple times for reasons that I’ve attempted to explain in my random die example. If the proposer of the hypothesis is unable to explain the multiple failures resulting from this acid test then it is highly likely that the proposer either does not fully understand their hypothesis or that their hypothesis is indistinguishable from the null hypothesis.
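The idea can be sketched roughly as follows: estimate the probability mass function of the pooled outcome data, generate surrogate datasets from it (so that any apparent effect can only arise by chance), re-run the analysis many times, and see how often it ‘succeeds’. The code below is only a schematic illustration of that general idea, not anyone’s published procedure; analyse() is a hypothetical stand-in for whatever statistic a given hypothesis relies on.

```python
import random
from collections import Counter

random.seed(7)

def empirical_pmf(data):
    """Estimate the probability mass function of the collected data."""
    counts = Counter(data)
    values = sorted(counts)
    weights = [counts[v] / len(data) for v in values]
    return values, weights

def analyse(treatment, control):
    """Hypothetical stand-in for the study's own statistic: difference of means."""
    return sum(treatment) / len(treatment) - sum(control) / len(control)

def surrogate_test(treatment, control, runs=1000):
    """Re-run the analysis on surrogate data drawn from the pooled PMF."""
    values, weights = empirical_pmf(treatment + control)
    observed = analyse(treatment, control)
    hits = 0
    for _ in range(runs):
        fake_treatment = random.choices(values, weights, k=len(treatment))
        fake_control = random.choices(values, weights, k=len(control))
        if analyse(fake_treatment, fake_control) >= observed:
            hits += 1
    return observed, hits / runs

# Invented example data: small ordinal outcome scores for two groups.
treatment = [3, 4, 4, 5, 3, 4, 5, 4]
control = [3, 3, 4, 4, 3, 4, 3, 3]
observed, chance_rate = surrogate_test(treatment, control)
print(f"observed difference: {observed:.2f}")
print(f"fraction of surrogate runs matching or beating it: {chance_rate:.3f}")
```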
Guest post by Jan Oude-Aost
ADHD is a common disorder among children. There are evidence-based pharmacological treatments, the best known being methylphenidate (MPH). MPH has kind of a bad reputation, but is effective and reasonably safe. The market is also full of alternative treatments, pharmacological and others, some of them under investigation, some unproven and many disproven. So I was not surprised to find a study about Ginkgo biloba as a treatment for ADHD. I was surprised, however, to find this study in the German Journal of Child and Adolescent Psychiatry and Psychotherapy, officially published by the “German Society of Child and Adolescent Psychiatry and Psychotherapy“ (Deutsche Gesellschaft für Kinder- und Jugendpsychiatrie und Psychotherapie). The journal’s guidelines state that studies should provide new scientific results.
The study is called “Ginkgo biloba Extract EGb 761® in Children with ADHD“. EGb 761® is the key ingredient in “Tebonin®“, a herbal drug made by “Dr. Wilma Schwabe GmbH“. The abstract states:
“One possible treatment, at least for cognitive problems, might be the administration of Ginkgo biloba, though evidence is rare. This study tests the clinical efficacy of a Ginkgo biloba special extract (EGb 761®) (…) in children with ADHD (…).“
“Eine erfolgversprechende, bislang kaum untersuchte Möglichkeit zur Behandlung kognitiver Aspekte ist die Gabe von Ginkgo biloba. Ziel der vorliegenden Studie war die Prüfung klinischer Wirksamkeit (…) von Ginkgo biloba-Extrakt Egb 761® bei Kindern mit ADHS.“ (Taken from the English and German abstracts; the German roughly translates as: “A promising option for treating cognitive aspects, so far barely investigated, is the administration of Ginkgo biloba. The aim of the present study was to test the clinical efficacy (…) of the Ginkgo biloba extract EGb 761® in children with ADHD.“)
The study sample (20!) was recruited among children who “did not tolerate or were unwilling“ to take MPH. The unwilling part struck me as problematic. There is likely a strong selection bias towards parents who are unwilling to give their children MPH. I guess it is not the children who are unwilling to take MPH, but the parents who are unwilling to administer it. At least some of these parents might be biased against MPH and might already favor CAM modalities.
The authors state three main problems with “herbal therapy“ that require more empirical evidence: First of all the question of adverse reactions, which they claim occur in about 1% of cases with “some CAMs“ (mind you, not “herbal therapy“). Secondly, the question of drug interactions and thirdly, the lack of information physicians have about the CAMs their patients use.
A large part of the study is based on results of an EEG-protocol, which I choose to ignore, because the clinical results are too weak to give the EEG findings any clinical relevance.
Before looking at the study itself, let’s look at what is known about Ginkgo biloba as a drug. Ginkgo is best known for its use in patients with dementia, cognitive impairment and tinnitus. A Cochrane review from 2009 concluded:
“There is no convincing evidence that Ginkgo biloba is efficacious for dementia and cognitive impairment“ [1].
The authors of the current study cite Sarris et al. (2011), a systematic review of complementary treatments for ADHD. Sarris et al. mention Salehi et al. (2010), who tested Ginkgo against MPH. MPH turned out to be much more effective than Ginkgo, but Sarris et al. argue that the duration of treatment (6 weeks) might have been too short to see the full effects of Ginkgo.
Given the above information, it is unclear why Ginkgo is judged a “possible“ treatment – in the German abstract even a “promising“ one – and why the authors state that Ginkgo has been “barely studied“.
In an unblinded, uncontrolled study with a sample likely to be biased toward the tested intervention, anything other than a positive result would be odd. In the treatment of autism there are several examples of implausible treatments that worked as long as parents knew that their children were getting the treatment, but didn’t after proper blinding (e.g. secretin).
This study’s aim was to test clinical efficacy, but the conclusion begins with how well tolerated Ginkgo was. The efficacy is mentioned subsequently: “Following administration, interrelated improvements on behavioral ratings of ADHD symptoms (…) were detected (…).“ But the way they were “detected“ is interesting. The authors used an established questionnaire (FBB-HKS) to let parents rate their children. Only the parents. The children and their teachers were not given the FBB-HKS questionnaires, in spite of this being standard clinical practice (and in spite of giving children questionnaires to determine changes in quality of life, which were not found).
None of the three problems that the authors describe as important (adverse reactions, drug interactions, lack of information) can be answered by this study. I am no expert in statistics, but it seems unlikely to me that one can meaningfully determine adverse effects in just 20 patients, especially when adverse effects occur at a rate of about 1%. The authors claim they found an incidence rate of 0.004% in “700 observation days“. Well, if they say so.
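A back-of-the-envelope calculation illustrates the problem (a sketch in Python; the 1% figure is the rate the authors themselves cite for “some CAMs“, and the ‘rule of three’ is the usual shortcut for zero-event samples):

```python
# If the true adverse-event rate were 1% per patient, how likely is it that a
# sample of only 20 patients would show even a single event?
rate = 0.01
n = 20
p_at_least_one = 1 - (1 - rate) ** n
print(f"P(>=1 adverse event in {n} patients) = {p_at_least_one:.2f}")  # ~0.18

# Conversely, observing zero events in n patients only allows the conclusion
# (by the 'rule of three') that the true rate is probably below about 3/n:
print(f"approx. 95% upper bound after 0 events in {n} patients: {3 / n:.0%}")
```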
The authors conclude:
“Taken together, the current study provides some preliminary evidence that Ginkgo biloba Egb 761® seems to be well tolerated in the short term and may be a clinically useful treatment for children with ADHD. Double-blind randomized trials are required to clarify the value of the presented data.“
Given the available information mentioned earlier, one could have started with that conclusion and conducted a double blind RCT in the first place!
“Clinical Significance
The trends of this preliminary open study may suggest that Ginkgo biloba Egb 761® might be considered as a complementary or alternative medicine for treating children with ADHD.“
So, why do I care, if preliminary evidence “may suggest“ that something “might be considered“ as a treatment? Because I think that this study does not answer any important questions or give us any new or useful knowledge. Following the journal’s guidelines, it should therefore not have been published. I also think it is an example of bad science – bad not just because of the lack of critical thinking, but also because it adds to the misinformation about possible ADHD treatments spreading through the internet. The study was published in September; in November I found a website citing the study and calling it “clinical proof“ when it is not. Child psychiatrists will now have to explain that to many parents, instead of talking about their children’s health.
I somehow got the impression that this study was more about marketing than about science. I wonder if Schwabe will help finance the necessary double-blind randomized trial…
[1] http://summaries.cochrane.org/CD003120/DEMENTIA_there-is-no-convincing-evidence-that-ginkgo-biloba-is-efficacious-for-dementia-and-cognitive-impairment