MD, PhD, FMedSci, FRSB, FRCP, FRCPEd

On 25 and 26 May of this year I wrote two posts about an acupuncture trial that, in my view, was dodgy. To refresh your memory, here is the relevant part of the 2nd post:

This new study, designed as a randomized, sham-controlled trial of acupuncture for persistent allergic rhinitis in adults, investigated possible modulation of mucosal immune responses. A total of 151 individuals were randomized into real and sham acupuncture groups (who received twice-weekly treatments for 8 weeks) and a no acupuncture group. Various cytokines, neurotrophins, proinflammatory neuropeptides, and immunoglobulins were measured in saliva or plasma from baseline to 4-week follow-up.

A statistically significant reduction in allergen specific IgE for house dust mite was seen only in the real acupuncture group. A statistically significant down-regulation was also seen in pro-inflammatory neuropeptide substance P (SP) 18 to 24 hours after the first treatment. No significant changes were seen in the other neuropeptides, neurotrophins, or cytokines tested. Nasal obstruction, nasal itch, sneezing, runny nose, eye itch, and unrefreshed sleep improved significantly in the real acupuncture group (post-nasal drip and sinus pain did not) and continued to improve up to 4-week follow-up.

The authors concluded that acupuncture modulated mucosal immune response in the upper airway in adults with persistent allergic rhinitis. This modulation appears to be associated with down-regulation of allergen specific IgE for house dust mite, which this study is the first to report. Improvements in nasal itch, eye itch, and sneezing after acupuncture are suggestive of down-regulation of transient receptor potential vanilloid 1.

…the trial itself raises a number of questions:

  1. Which was the primary outcome measure of this trial?
  2. What was the power of the study, and how was it calculated?
  3. For which outcome measures was the power calculated?
  4. How were the subjective endpoints quantified?
  5. Were validated instruments used for the subjective endpoints?
  6. What type of sham was used?
  7. Are the reported results the findings of comparisons between verum and sham, or verum and no acupuncture, or intra-group changes in the verum group?
  8. What other treatments did each group of patients receive?
  9. Does anyone really think that this trial shows that “acupuncture is a safe, effective and cost-effective treatment for allergic rhinitis”?

In the comments section, the author wrote: “after you have read the full text and answered most of your questions for yourself, it might then be a more appropriate time to engage in any meaningful discussion, if that is in fact your intent”, and I asked him to send me his paper. As he does not seem to have the intention to do so, I will answer the questions myself and encourage everyone to have a close look at the full paper [which I can supply on request].

  1. The myriad of lab tests were defined as primary outcome measures.
  2. Two sentences are offered, but they do not allow me to reconstruct how this was done.
  3. No details are provided.
  4. Most were quantified with a 3 point scale.
  5. Mostly not.
  6. Needle insertion at non-acupoints.
  7. The results are a mixture of inter- and intra-group differences.
  8. Patients were allowed to use conventional treatments and the frequency of this use was reported in patient diaries.
  9. I don’t think so.

So, here is my interpretation of this study:

  • It lacked power for many outcome measures, certainly the clinical ones.
  • There were hardly any differences between the real and the sham acupuncture group.
  • Most of the relevant results were based on intra-group changes rather than on comparisons of sham with real acupuncture, a fact that is obfuscated in the abstract.
  • In a controlled trial fluctuations within one group must never be interpreted as caused by the treatment.
  • There were dozens of tests for statistical significance, and there seems to be no correction for multiple testing.
  • Thus the few significant results that emerged when comparing sham with real acupuncture might easily be false positives.
  • Patient-blinding seems questionable.
  • McDonald as the only therapist of the study might be suspected to have influenced his patients through verbal and non-verbal communications.

I am sure there are many more flaws, particularly in the stats, and I leave it to others to identify them. The ones I found are, however, already serious enough, in my view, to call for a withdrawal of this paper. Essentially, the authors seem to have presented a study with largely negative findings as a trial with positive results showing that acupuncture is an effective therapy for allergic rhinitis. Subsequently, McDonald went on social media to inflate his findings even more. One might easily ask: is this scientific misconduct or just poor science?

END OF QUOTE

This and the previous post created lots of discussion and comments. However, the question whether the study in question amounted to scientific misconduct was never satisfactorily resolved. Therefore, I decided to write to the editor of ‘Ann Allergy Asthma Immunol’, where the trial had been published. He answered by saying I would need to file an official complaint for him to address the issue. On 13 June, I therefore sent him the following email:

Thank you for your letter of 3/6/2016 suggesting I make a formal complaint about the paper entitled ‘EFFECT OF ACUPUNCTURE ON HOUSE DUST MITE…’ [Ann Allergy Asthma Immunol 2016] by McDonald et al. I herewith wish to file such a complaint.

The article in question reports an RCT of acupuncture for persistent allergic rhinitis. It followed a parallel group design with 3 groups receiving the following interventions:

  1. Acupuncture
  2. Sham-acupuncture
  3. No treatment

There was a plethora of outcome measures and time points on which they were measured. A broad range of parameters was defined as primary endpoints.

The conclusion reached by the authors essentially was that acupuncture affected several outcome measures in a positive sense, thus supporting the notion that acupuncture is efficacious [“Symptoms and quality of life improved significantly and were still continuing to improve 4 weeks after treatment ceased.”] This conclusion, however, is misleading and needs correcting.

The main reasons for this are as follows:

  • Despite the fact that the authors did many dozens of statistical tests for significance, they did not correct for this multiplicity of tests. Consequently, some or most of the significant results are likely to be false positive.
  • Many of the positive results of this paper were not obtained by comparing one group to another but by doing before/after comparisons within one group. This approach defies the principle of a controlled clinical trial. For doing intra-group comparisons, we obviously do not need any control group at all. The findings from intra-group comparisons are prominently reported in the paper, for instance in the abstract, giving the impression that they originate from inter-group comparisons. One has to read the paper very carefully to find that, when inter-group comparisons were conducted, their results did NOT confirm the findings from the reported intra-group comparisons. As this is the case for most of the symptomatic endpoints, the impression given is seriously misleading and needs urgent correction.

On the whole, the article is a masterpiece of obfuscation and misrepresentation of the actual data. I urge you to consider the harm that can be done by such a misleading publication. In my view, the best way to address this problem is to withdraw the article.

I look forward to your decision.

Regards

E Ernst

END OF QUOTE

I had to send several reminders, but my most recent one prompted the following response dated 7/11/2016:

Dear Professor Ernst,
Thank you for your patience while we worked through the process of considering your complaints levied on the article entitled ‘EFFECT OF ACUPUNCTURE ON HOUSE DUST MITE…’ [Ann Allergy Asthma Immunol 2016] by McDonald et al. I considered the points that you made in your previous letter, sought input from our editorial team (including our biostatistics editor), our publisher and the authors themselves. I sent the charges (point by point) anonymously to the authors and allowed them time to respond, which they did. I had their responses reviewed by selected editors and, as a result of this process, have decided not to pursue any corrective or punitive action based upon the following:
 
  1. Our editorial team recognizes that this is not the best clinical trial we have published in the Annals of Allergy, Asthma and Immunology. However, neither is is the worst. As in most published research studies, there are always things that could have been done better to make it a stronger paper. Never-the-less, the criticism falls short of any sort of remedy that would include withdrawal of the manuscript.
  2. Regarding your accusation that the multiple positive endpoint resulted in the authors making specific therapeutic claims, our assessment is that no specific therapeutic claim was made but rather the authors maintained that the data support the value of acupuncture in improving symptoms and quality of life in patients with AR. We do not believe there was overreach in those statements.
  3. The authors’ stated intent was to show immune changes associated with clinical markers of improvement in the active acupuncture group compared to controls. The authors maintain (and our editors agree) that their data assessments were primarily based upon three statistical tests not “dozens” (as stated in your original letter of complaint). The power analysis and sample size calculations were presented to us and deemed adequate, making the probability of a type I error quite low.
  4. The authors acknowledge in their paper that there could be limitations to their data interpretation based upon potential disparities between intra- and intergroup comparisons. The editors felt their transparency was adequately disclosed.
In summary, as editor-in-chief of the Annals of Allergy, Asthma and Immunology, I did not find sufficient merit in your charges to initiate any corrective or punitive action for this manuscript. I understand you will strongly disagree with this decision and I regret that. However, in the final analysis, my primary intent is to preserve the objectivity, fairness, and integrity of our journal and its review process. I believe I have accomplished that in this instance.
 
Sincerely
END OF QUOTE
This seems to settle the issue: the study in question does not involve scientific misconduct!
Or does it?
I would be grateful for the view of the readers of this post.
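
To give readers a feel for why the number of tests matters in this dispute, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that every test is independent, that each is run at the conventional 5% significance level, and that there is no true effect at all. None of these assumptions is taken from the paper, and the function names are my own. Under those assumptions it estimates the chance of at least one false positive for 3 tests (the editor's figure) and for 50 tests (a stand-in for "many dozens"), plus the per-test threshold that a simple Bonferroni correction would demand.

```python
# Illustrative sketch only: assumes independent tests, all null hypotheses true,
# and a per-test significance level of 0.05. These are assumptions, not facts
# taken from the paper under discussion.

def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test p-value threshold under a simple Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    # 3 = the editor's figure; 50 = hypothetical stand-in for "many dozens".
    for n in (3, 50):
        fwer = familywise_error_rate(n)
        threshold = bonferroni_threshold(n)
        print(f"{n:>2} tests: P(at least one false positive) = {fwer:.3f}, "
              f"Bonferroni threshold = {threshold:.4f}")
```

With these inputs the chance of at least one spurious "significant" result comes out at about 14% for 3 tests and about 92% for 50. Real outcome measures are correlated, so the true figures would differ, but the direction of the argument stays the same.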

18 Responses to Is this scientific misconduct or not?

  • I well remember this study. It was very poor and should not have seen the light of day.

    However, the response of the editor-in-chief is unsurprising. No editor is ever likely to admit his journal publishes rubbish unless the pressure to do so is extraordinary (think, for example, Lancet, Andrew Wakefield, MMR and autism). To effect ‘corrective or punitive action’ would be an admission of negligence on the part of the referees, one or more members of the editorial board and the editor-in-chief.

    The most interesting line in the response is “Our editorial team recognizes that this is not the best clinical trial we have published in the Annals of Allergy, Asthma and Immunology. However, neither is it the worst.” (Minor typo corrected.) From my perspective this damns the journal as lazy in its criteria for publication: perhaps it runs on admittedly low standards because it has to fill its pages somehow?

    All over the globe, there are ‘journal clubs’ in university departments which meet regularly for (mainly) postgrads and postdocs to discuss a recent publication of interest. The clubs stimulate a competitive emphasis on finding fault with published material. I have rarely seen a paper that is immune from criticism in this setting. Up-and-coming scientists learn from the process how to judge a paper, how to avoid pitfalls in their own research and publications and — dare I say it? — in a few instances how to use weasel words to paper over cracks. Your editor-in-chief is doing this in his point 2.

    The bottom line for most (good) scientists is, I think, to learn to distinguish reasonable studies from poor ones and simply not cite them. Numerically, the poor studies dominate the field, particularly in the present environment of ridiculous bibliometrics. Chasing a journal to admit it’s published a stinker is, as I suspect you already know, like trying to get a believer in one or other superstition to admit there’s nothing rational in their faith.

    PS I have a number of scientific colleagues who refuse to cite work that is not open-access. This criterion alone would render the acupuncture study uncited by many.

    • I meant to add… Of course, acupuncturists will use this paper as ‘evidence’ in their publicity, but they already do that with plenty of other publications. What thinking people will do is to seek the views of scientifically competent individuals — they can see examples in the two blog posts about this particular paper on this site.

      For convinced believers, there’s no amount of drawing attention to the paper’s unstated conflicts of interest, its scientific and statistical inadequacies and its fundamental incompetence that will move their opinions. We’ve seen enough examples of that kind of intransigence to prove that point.

  • This is maybe slightly off topic but then again maybe not. I’ve been in contact with a number of editors regarding this group’s acupuncture papers, but then specifically regarding their conflicts of interests. This is about being associated with a number of “for profit” acupuncture clinics , publishing scientific papers on the services offered by these clinics, receiving donation(s) from these clinics and failing to declare to have a conflict of interest. Not once, it happens quite a lot.

    https://frankvanderkooy.wordpress.com/2016/10/28/the-nicm-and-their-undeclared-conflict-of-interest-an-example-of-scientific-misconduct

    But back to this blog post. McDonald does have a clinic where acupuncture services are being offered and yet they failed to declare to have a conflict of interest – and this is clearly a conflict of interest! Maybe something to get the editor, who doesn’t want any negative publicity for his journal, over the line? Science doesn’t seem to be able to do it, so, maybe if they break the rules with a simple thing such as not declaring to have a conflict of interest something might happen?

  • IANAS but I can’t help feeling the bullet points in your formal complaint should’ve included cites and/or direct references to the relevant portions of the paper, to ensure they can’t just weasel out via equally imprecise “he-said-she-said” arguments (which is what their response sounds like).

    You say they did “many dozens of tests”; they say they made their assessments on “primarily” three. That makes me wonder:

    1. How many tests did they do in total? Cos “many dozens” and “three” are both factual statements and a huge discrepancy which should be straightforwardly testable just by tallying them up, while “primarily” is just the sort of shady-looking hidey-hole for any amount of sins that genuine science should be relentless in eliminating, so needs a good hard dissecting just to confirm it was merely a loose turn of phrase and nothing more.

    2. If they did do dozens of tests and discarded the majority and only used a few, when did they discard those tests, why did they do it, and how do they justify it? There’s a big difference between discarding a test before you run it (e.g. because you realize it’s stupid and unnecessary, your boss refuses to release the funding, and so on) and discarding it after (e.g. because you mucked it up, you didn’t like the results, etc).

    #1 should be straightforward to itemize, at which point it’ll be clear to everyone whether your “many dozens” is hyperbole (in which case you shot yourself in the foot), or if their “[primarily] three” needs good hard questions asked, like:

    1. Why their research methods are either so goddamn awful that they subsequently had to throw away ~90% of their work as irrelevant to the investigation, or produce results that are of too-poor-quality to be used? (In which case why should we assume the remaining ~10% work is any better?)

    2. Why we shouldn’t naturally reach for Occam’s Razor and assume the most promising explanation is that they threw away the ~90% tests that yielded results they didn’t like (e.g. negative) and kept only the ~10% that did? Cos if readers have to ask… At the very least, if they did several dozen tests but then focused on primarily three, there should be a comprehensive explanation in the paper, because you don’t do that much work and not say anything about it unless you’ve a very good reason not to (e.g. it didn’t work as you’d hoped and you’re too embarrassed/dishonest to share this information).

    On the positive side, props to the editor on admitting they’ve published even worse! (Though if that’s the case he might want to check their impact factor as it may have a rounding error in it.)

    • I counted in excess of 50 statistical tests.

      • So how in the world did they get from 50 to 3? Perhaps you should paste together a 50-item bullet list identifying each of those tests in the paper and ask them to clarify the purpose of the other 47, and their reasons for discarding them? You know, for the sake of good clear unambiguous science. Heck I flunked high school stats a few decades ago and haven’t touched math since, but even I know that you decide on the exact tests you’re going to perform up front, then do them and accept their results in full. You don’t get to cherry-pick from those results or go back and try some different tests instead, cos that injects bias which defeats the whole point of doing a statistical analysis in the first place.

        As I say IANAS, but I’m not a complete stranger to basic critical thinking, and neither this numerical discrepancy nor the editor’s response pass even my rudimentary smell test. At the very least it requires an ironclad explanation, which they clearly haven’t got or they’d have already presented it. Gonna call toilet paper.

  • If professional acupuncturists are actively involved in the design and/or implementation of an acupuncture study, are you saying that automatically constitutes a conflict of interest, and that such studies should therefore be ignored?

  • Well, I wouldn’t exactly call this “misconduct,” but how WAS “real” acupuncture defined such that “sham” acupuncture could be determined and compared? Were the subjects who received the “real” acupuncture treated with a needle in Large Intestine 4 (LI-4) while those treated with “sham” acupuncture had a needle stuck in Liver 2 (LV-2) … or some other point that couldn’t POSSIBLY treat allergic rhinitis effectively and predictably? 🙂

    What is REAL acupuncture?

    ~TEO.

    • HERE ARE THE 2 RELEVANT PARAGRAPHS FROM THE ARTICLE:
      Traditional Chinese-style single-use disposable stainless steel needles (0.25 × 40 mm) (C and G acupuncture needles; Helio Supply Co Pty Ltd, Sydney, Australia) were used. The needles were inserted at the indicated points but not manipulated. Yintang, LI 20, and GV 23 were needled obliquely to a depth of 3 to 5 mm, whereas LI 4 and ST 36 were needled perpendicularly to a depth of 10 to 15 mm. Needles were retained for 20 minutes (without manipulation) and then removed.

      Non-channel points used in sham acupuncture protocols have been shown not to be inert [28,29]; however, because there is no sham acupuncture protocol that has been validated as inert [30,31], needling non-channel points was the most appropriate invasive sham protocol available.

      • Thanks, Edzard. That’s exactly what I had in mind. Sounds like they needled many of TCM’s greatest hits, using them as the REAL points. And, keeping within the Meridian Mythology of TCM, they tried to use the right channels for the presentation. But did they?

        Trouble is, among other things TCM, there IS no “allergic rhinitis” any more than there are two livers, just because there’s a Liver Channel on both sides of the body. And there’s no immune system, hence no “allergic.” So, in short, the diagnosis of whatever is causing the runny nose might, in Traditional Chinese Medical terms, be wrong. Yet it seems they’ve used Traditional Chinese Medical points to treat a “western” diagnosis.

        For example, since in TCM, the Spleen (SP) is an organ of digestion (not the stomach as you probably thought), you would use Spleen Points and the Spleen Channel (among others) to treat diagnoses of indigestion or nausea, let’s say, you know, to reverse the direction of Spleen QI in the case of nausea. Or, if you had a bad headache, you might, in TCM terms, treat Liver YANG rising by putting a needle in the web between the big toe and its next door neighbor … you know, to bring HEAT and YANG down from the head of the patient. You get the point.

        So, oddly, I see the biggest hole in the study as treating something that wasn’t properly diagnosed in the same terms as what was used to treat it. I mean, let’s face it, if you choose the wrong medicine to treat, why would you expect a good result or how could you confidently conclude the results were valid? 🙂

  • If a study with no specific outcome is hypothesised, and something statistically significant (acupuncture – dust mite link) pops up, that is grounds to repeat the study to confirm the validity of the previous finding. I would then accept the conclusions if an acupuncture – dust mite link was the ONLY thing the study was trying to establish.
