In my last post, we discussed the “A+B versus B” trial design as a tool to produce false positive results. This method is currently very popular in alternative medicine, yet it is by no means the only approach that can mislead us. Today, let’s look at other popular options with a view of protecting us against trialists who naively or willfully might fool us.
The crucial flaw of the “A+B versus B” design is that it fails to account for non-specific effects. If the patients in the experimental group experience better outcomes than the control group, this difference could well be due to effects that are unrelated to the experimental treatment. There are, of course, several further ways to ignore non-specific effects in clinical research. The simplest option is to include no control group at all. Homeopaths, for instance, are very proud of studies which show that ~70% of their patients experience benefit after taking their remedies. This type of result tends to impress journalists, politicians and other people who fail to realise that such a result might be due to a host of factors, e.g. the placebo-effect, the natural history of the disease, regression towards the mean or treatments which patients self-administered while taking the homeopathic remedies. It is therefore misleading to make causal inferences from such data.
Another easy method to generate false positive results is to omit blinding. The purpose of blinding the patient, the therapist and the evaluator of the outcomes in clinical trials is to make sure that expectation is not the cause of or contributor to the outcome. They say that expectation can move mountains; this might be an exaggeration, but it can certainly influence the result of a clinical trial. Patients who hope for a cure regularly do get better even if the therapy they receive is useless, and therapists as well as evaluators of the outcomes tend to view the results through rose-tinted spectacles, if they have preconceived ideas about the experimental treatment. Similarly, the parents of a child or the owners of an animal can transfer their expectations, and this is one of several reasons why it is incorrect to claim that children and animals are immune to placebo-effects.
Failure to randomise is another source of bias which can make an ineffective therapy look like an effective one when tested in a clinical trial. If we allow patients or trialists to select or choose which patients receive the experimental and which get the control-treatment, it is likely that the two groups differ in a number of variables. Some of these variables might, in turn, impact on the outcome. If, for instance, doctors allocate their patients to the experimental and control groups, they might select those who will respond to the former and those who don’t to the latter. This may not happen with malicious intent but through intuition or instinct: responsible health care professionals want those patients who, in their experience, have the best chances to benefit from a given treatment to receive that treatment. Only randomisation can, when done properly, make sure we are comparing comparable groups of patients, and non-randomisation is likely to produce misleading findings.
While these options for producing false positives are all too obvious, the next possibility is slightly more intriguing. It refers to studies which do not test whether an experimental treatment is superior to another one (often called superiority trials), but to investigations attempting to assess whether it is equivalent to a therapy that is generally accepted to be effective. The idea is that, if both treatments produce the same or similarly positive results, both must be effective. For instance, such a study might compare the effects of acupuncture to a common pain-killer. Such trials are aptly called non-superiority or equivalence trials, and they offer a wide range of possibilities for misleading us. If, for example, such a trial has not enough patients, it might show no difference where, in fact, there is one. Let’s consider a deliberately silly example: someone comes up with the idea to compare antibiotics to acupuncture as treatments of bacterial pneumonia in elderly patients. The researchers recruit 10 patients for each group, and the results reveal that, in one group, 2 patients died, while, in the other, the number was 3. The statistical tests show that the difference of just one patient is not statistically significant, and the authors therefore conclude that acupuncture is just as good for bacterial infections as antibiotics.
Even trickier is the option to under-dose the treatment given to the control group in an equivalence trial. In our hypothetical example, the investigators might subsequently recruit hundreds of patients in an attempt to overcome the criticism of their first study; they then decide to administer a sub-therapeutic dose of the antibiotic in the control group. The results would then apparently confirm the researchers’ initial finding, namely that acupuncture is as good as the antibiotic for pneumonia. Acupuncturists might then claim that their treatment has been proven in a very large randomised clinical trial to be effective for treating this condition, and people who do not happen to know the correct dose of the antibiotic could easily be fooled into believing them.
Obviously, the results would be more impressive, if the control group in an equivalence trial received a therapy which is not just ineffective but actually harmful. In such a scenario, the most useless or even slightly detrimental treatment would appear to be effective simply because it is equivalent to or less harmful than the comparator.
A variation of this theme is the plethora of controlled clinical trials which compare one unproven therapy to another unproven treatment. Perdicatbly, the results indicate that there is no difference in the clinical outcome experienced by the patients in the two groups. Enthusiastic researchers then tend to conclude that this proves both treatments to be equally effective.
Another option for creating misleadingly positive findings is to cherry-pick the results. Most trails have many outcome measures; for instance, a study of acupuncture for pain-control might quantify pain in half a dozen different ways, it might also measure the length of the treatment until pain has subsided, the amount of medication the patients took in addition to receiving acupuncture, the days off work because of pain, the partner’s impression of the patient’s health status, the quality of life of the patient, the frequency of sleep being disrupted by pain etc. If the researchers then evaluate all the results, they are likely to find that one or two of them have changed in the direction they wanted. This can well be a chance finding: with the typical statistical tests, one in 20 outcome measures would produce a significant result purely by chance. In order to mislead us, the researchers only need to “forget” about all the negative results and focus their publication on the ones which by chance have come out as they had hoped.
One fail-proof method for misleading the public is to draw conclusions which are not supported by the data. Imagine you have generated squarely negative data with a trial of homeopathy. As an enthusiast of homeopathy, you are far from happy with your own findings; in addition you might have a sponsor who puts pressure on you. What can you do? The solution is simple: you only need to highlight at least one positive message in the published article. In the case of homeopathy, you could, for instance, make a major issue about the fact that the treatment was remarkably safe and cheap: not a single patient died, most were very pleased with the treatment which was not even very expensive.
And finally, there is always the possibility of overt cheating. Researchers are only human and are thus not immune to temptation. They may have conflicts of interest or may know that positive results are much easier to publish than negative ones. Certainly they want to publish their work – “publish or perish”! So, faced with disappointing results of a study, they might decide to prettify them or even invent new ones which are more pleasing to them, their peers, or their sponsors.
Am I claiming that this sort of thing only happens in alternative medicine? No! Obviously, the way to minimise the risk of such misconduct is to train researchers properly and make sure they are able to think critically. Am I suggesting that investigators of alternative medicine are often not well-trained and almost always uncritical? Yes.