Fujii was a Tokyo-based Japanese anaesthetist who was unbelievably prolific in publishing clinical trials and other research papers. In 2012 he was effectively proven to have fabricated most of his results by a forensic statistical analysis of his published data, which showed that his subjects could not possibly have been randomly allocated to different treatment groups as he claimed. He holds the current record for the most retracted papers, at around 190. Serious concerns about the authenticity of his data were first raised publicly in 2000 and again in 2001, but he was nonetheless able to continue publishing prolifically for another decade. Many of his trials concerned treatments for post-operative nausea and vomiting, and long before he was officially unmasked as a fraudster his data seem to have been effectively ignored by those devising international guidelines for managing this problem. He used a number of ploys to deflect suspicion: he named co-authors at other institutions, often without their knowledge, to explain his apparently huge throughput of clinical subjects, and he even resorted to forging the signatures of supposed co-authors. In his later years he published many papers in journals that did not specialise in anaesthesiology, reducing expert scrutiny and avoiding referees and editors who might have been aware of his reputation.
Yoshitaka Fujii was born in 1960, graduated in medicine from the Tokai University School of Medicine and obtained his Japanese medical licence in 1987. He did postgraduate studies at the Tokyo Medical and Dental University before taking up an appointment in the Department of Anesthesiology at the Toride Kyodo General Hospital in 1991. He became a registered anaesthesiologist in 1990 and a “Board Certified Anaesthesiologist” in 1994. From 1995 to 1997 he worked in Canada at McGill University in Montreal. From 1997 to 2005 he was based at the University of Tsukuba, Institute of Clinical Medicine, and in 2005 he moved to the Toho University School of Medicine, Tokyo, where he remained until his dismissal in 2012. He was a member of several Japanese learned societies relating to anaesthesiology and also of the American and Canadian societies of anesthesiologists.
From 1991 to 2011 he published prolifically, authoring or co-authoring around 250 articles in scientific and medical journals, an average of more than 12 publications per year over his career. More than half of these publications were randomized, double-blind, placebo-controlled trials (RCTs), the “gold standard of evidence” for medical decision-making. He also published about 40 papers reporting the results of experimental studies performed on dogs. He received public funds to conduct his studies, received grants to speak at two sponsored seminars, and his prolific publication record would have been a key factor in his academic appointments.
Fujii’s main area of interest was the prevention and treatment of post-operative nausea and vomiting (PONV), which affects up to a third of patients undergoing general anaesthesia. It can be a very unpleasant and distressing condition for individual patients and can delay discharge, especially after day surgery. It can sometimes have more serious consequences, such as aspiration of gastric contents. Fujii also wrote papers on other topics; the main areas are listed below.
- Neuromuscular blockade – the use of drugs (like the plant extract curare) that cause paralysis or relaxation of muscles during surgery.
- Cardiovascular responses to airway manipulation – when the airway is mechanically stimulated in anaesthetised patients (e.g. by insertion of a tube) it causes a rise in heart rate and blood pressure unless a local anaesthetic is used.
- Pain on injection of propofol – this is an intravenous anaesthetic that sometimes produces intense pain upon injection.
A decade of suspicion and veiled accusations
In his clinical trials, Fujii claimed to have used almost 14,000 human subjects, which averages out at something like 600 patients per year. In 1998 alone he published more than 30 RCTs involving no fewer than 3000 subjects. His studies with dogs reported data from about 700 mongrel dogs over a six-year period. This is an amazingly high throughput of patients and a very high usage of dogs, and there should have been independent records against which the reported patient enrolment and animal use could be checked.
In 2000, Peter Kranke and two German colleagues from the University of Würzburg wrote a letter to the editor of the journal Anesthesia and Analgesia in response to a paper by Fujii et al about prevention of PONV in women undergoing gynaecological surgery. Kranke et al identified 47 articles authored by Fujii et al over the period 1994-1997 using the drug granisetron to prevent PONV. The most frequently reported side effect was headache, reported in 21 of these papers; in 13 of those articles the frequency of headache was identical in all groups, and it differed by at most one in the others. In 10 of the 18 studies with more than two subject groups there was an identical incidence of headache in all groups, and Kranke et al calculated that the probability of this occurring by chance was p = 0.0000000068, less than 1 in a hundred million. They made no direct accusation of fraud but the implication was clear:
“…We have to conclude that there must be an underlying influence causing such incredibly nice data reported by Fujii et al.” (i.e. we believe this data is fabricated?)
As an analogy, suppose someone reported the results of a series of 100 groups of 20 coin tosses. One would expect the average result over the 100 groups to be close to a 10-10 heads/tails split, but one would also expect considerable variability in the individual group results centred on this 10-10 split. If it were claimed that on 80 of the 100 occasions the split was exactly 10-10, and that in the other 20 groups the split was either 11-9 or 9-11, you would instinctively suspect that the data had been fabricated. Statistical analysis would confirm that this distribution is so improbable as to be effectively impossible.
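The arithmetic behind this intuition is easy to check. The short Python sketch below is illustrative only; the group size of 20 and the 80-of-100 figure are the hypothetical numbers from the coin-toss analogy, not anyone's real data:

```python
from math import comb

# Probability that a single group of 20 fair tosses splits exactly 10-10
p_exact = comb(20, 10) / 2**20          # ≈ 0.176

# Probability that at least 80 of 100 independent groups all split exactly
# 10-10, modelling the count of exact splits as Binomial(100, p_exact)
p_80_or_more = sum(
    comb(100, k) * p_exact**k * (1 - p_exact)**(100 - k)
    for k in range(80, 101)
)

print(f"P(exact 10-10 split in one group)           = {p_exact:.4f}")
print(f"P(>= 80 of 100 groups split exactly 10-10)  = {p_80_or_more:.3e}")
```

Each group has only about an 18% chance of splitting exactly 10-10, so in 100 groups one expects roughly 18 exact splits; observing 80 or more has a probability far below one in 10^30, which is why such "incredibly nice" data invites the fabrication conclusion.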
In the first paragraph of his response to Kranke et al (2000), Fujii simply describes how his group had shown granisetron to be effective in reducing the incidence of PONV after various types of surgery. In the second and final paragraph he reiterates that mild headache was experienced by about 10% of study patients. The only direct response to the implicit “accusations” in the Kranke et al letter comes in the penultimate sentence:
“Consequently, an incidence of headache seems to be identical, but it was true.” (i.e. trust me I’m a doctor?).
In 2001 Kranke et al published a meta-analysis in Acta Anaesthesiologica Scandinavica reviewing the effect of granisetron in preventing PONV. They identified 27 clinical trials with almost 3000 patients that were suitable for inclusion, and overall these studies suggested that granisetron had a significant effect at both high and low doses. They then separated the trials into two categories. About two-thirds of the studies were termed as coming from a “dominant centre”: these were defined not by originating from one institution but by having Fujii as a co-author. The other category comprised trials where Fujii was not a co-author. The two categories produced quite different results when analysed separately. The Fujii papers found granisetron to be more effective overall, but in them a low dose had no significant effect whereas a high dose had a large effect. The “other authors” group reported that the effect of granisetron at high and low doses was almost exactly the same.
Kranke et al concluded that the dominating centre (i.e. the Fujii papers) had significantly altered the outcome of the meta-analysis. They suggested that the dominating centre’s results should be excluded from the analysis or analysed separately to make the difference between the two groups apparent. It is striking that publications by Fujii accounted for two-thirds of all the patients enrolled in granisetron trials around the world. Kranke et al subtly implied that the results of Fujii et al were so out of line with the rest of the world that they were highly suspicious. An editorial published in the same journal issue undermined this interpretation. It was entitled “Meta-analysis: a valuable but easily misused tool”, a title that I would generally have some sympathy with, and it suggested that differences in the initial baseline risks of the two sub-divisions of studies (Fujii vs the rest of the world) might explain why the outcomes were so different. If there had been no other concerns about Fujii’s integrity, it would have been reasonable to look for rational scientific explanations such as this. However, taken together with the earlier evidence from the Würzburg group’s letter to Anesthesia and Analgesia, this meta-analysis should probably have triggered a formal investigation and an end to Fujii’s fraudulent research. In fact Fujii continued to publish fabricated data for more than a decade after these implicit allegations, and he even managed to gain a new position at Toho University in 2005.
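The mechanism by which a dominant subgroup can swing a meta-analysis is simple to illustrate. The sketch below uses the standard fixed-effect (inverse-variance) pooling formula; the effect sizes and variances are purely hypothetical numbers invented for illustration, not Kranke et al’s data:

```python
def pooled_estimate(effects, variances):
    """Fixed-effect (inverse-variance weighted) pooled estimate."""
    weights = [1 / v for v in variances]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)

# Hypothetical relative-risk reductions: the "dominant centre" studies
# report a much larger treatment effect than the remaining studies.
dominant = [0.60, 0.65, 0.62, 0.58, 0.63, 0.61]   # 6 studies
others   = [0.25, 0.30, 0.28]                     # 3 studies
var      = [0.01] * 9                             # equal precision, for simplicity

print(f"dominant centre alone: {pooled_estimate(dominant, var[:6]):.3f}")
print(f"other authors alone:   {pooled_estimate(others, var[:3]):.3f}")
print(f"all studies pooled:    {pooled_estimate(dominant + others, var):.3f}")
```

Because the dominant subgroup contributes most of the weight, the pooled result lands close to its answer rather than the smaller group's, which is exactly why Kranke et al recommended excluding or separately analysing the dominating centre.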
Martin Tramer wrote an editorial about the Fujii affair in the European Journal of Anaesthesiology in 2013, to coincide with the retraction of 12 Fujii papers from that journal. Tramer discusses the letter by Kranke et al and then spends some time analysing the post-2001 events, when hindsight suggests that Fujii’s fraudulent activities should have been stopped.
Tramer notes that Fujii’s rate of publication dropped quite sharply after 2000, when these suspicions were first publicly aired. According to Carlisle (2012), Fujii published fewer than 10 RCTs per year after 2000, and generally fewer than five per year in his last few publishing years. In the two years just prior to the Kranke accusations he had published over 20 RCTs per year, with a peak of no fewer than 32 trials in 1998 involving a total of over 3000 patients. Interestingly, in the period immediately after Kranke et al’s allegations (2000-2005) he published a flurry of experiments with dogs. Tramer also noted that after these implied accusations in two specialised anaesthesiology journals, Fujii started to publish a high proportion of his work in journals whose main focus was in areas like surgery, pharmacology, gynaecology and ophthalmology rather than anaesthesiology per se. Presumably the editors and referees of these journals would be less focused upon anaesthesiology and less aware of the doubts about Fujii’s work amongst anaesthesiologists. Listed below are the percentages of his papers published in specialist anaesthesiology journals between 1990 and 2010, on either side of the 2000 accusations by Kranke et al.
- 1990-1995: 100% of around 20 articles published in anaesthesiology journals
- 1996-2000: 94% of around 80 articles
- 2001-2005: 47% of about 60 articles
- 2006-2010: 34% of 28 articles
Despite this trend, he still managed to publish an average of over 5 papers per year in specialist anaesthesiology journals between 2005 and 2011.
Tramer was a member of a panel of international experts who met in 2002 and 2005 to set international guidelines for the management of PONV, and he explains how these expert groups ignored Fujii’s work when drawing up the guidelines. In 2002, delegates were faced with a situation where almost all the published output on granisetron was associated with Fujii. The report cited just four papers about granisetron, and the mass of papers by Fujii et al was ignored. Neither the drug’s manufacturers nor anyone else subsequently commented upon the absence of Fujii’s work, and Tramer concludes that:
“Fujii had become invisible”
Again, in the 2005 report no reference was made to Fujii’s work, and a new drug called ramosetron was ignored because the only published studies emanated from Fujii; once more this decision to ignore Fujii’s work was accepted without question.
Tramer criticises the authors of three meta-analyses published in 2006 and 2007 that included the work of Fujii in their analyses. He also mentions a 2012 meta-analysis that included Fujii’s work and whose authors openly justified the inclusion of his work because:
- The criticism of it seemed to be coming from one group of authors (bias?)
- Consistency might not necessarily mean fraud
- None of his papers had yet been retracted.
This latter point emphasises again how important it is that the scientific literature should be purged of suspect data by fraudulent authors.
Tramer’s criticism included a meta-analysis co-authored by John Carlisle and Carl Stevenson published in the Cochrane Database of Systematic Reviews (CDSR). (Carlisle ultimately played a decisive role in proving data fabrication and triggering the retraction of most of Fujii’s tainted output.) In May 2013 Carlisle and colleagues responded to Tramer’s criticism in an editorial in the CDSR. They pointed out that the authors and editors had faced a dilemma: they were convinced that most of Fujii’s work was fabricated, but none of his papers had been retracted and no one had officially investigated his integrity. The authors, editors and publishers therefore sought a way of conveying their suspicions to readers whilst minimising the risk of successful litigation. The general conclusion of the review was that the drugs used to treat PONV are relatively inefficient; many patients need to be treated for only a small proportion to benefit. The abstract and the plain language summary make no mention of Fujii or of the discrepancy between his results and those of other authors. In the full report, however, Carlisle and Stevenson did include a section with the heading:
“Post-hoc interstudy analysis: studies authored by Fujii et al”.
In this they compared Fujii’s results with those from all other authors and made clear that they were doing so because of criticisms of Fujii’s work. The analysis showed that Fujii’s studies were generally more favourable to granisetron than all other studies. They presented several forest plots in which Fujii’s results were highlighted in colour to show how different their distribution was from that recorded in other papers. Carlisle and Stevenson expected that anaesthesiologists reading the whole paper would understand that Fujii’s work was under suspicion and would ask themselves:
- Why was Fujii’s work analysed separately and compared to that of other authors?
- Why is granisetron more effective when used by Fujii than when used by other anaesthesiologists?
- Why is the distribution of Fujii’s data in forest plots so different from that of other authors?
Conclusive evidence of data fabrication is finally produced
In 2012, John Carlisle published a statistical analysis of data in 168 controlled trials published by Fujii et al to test data integrity. Carlisle showed that the distributions of general background data about the subjects in these papers could not have been generated by the random allocation of subjects to groups that was claimed, which is a prerequisite of an RCT. For example, the chance of the patient weight distributions being generated by random allocation was 1 in 10^33 (i.e. a 1 followed by 33 zeros); similarly astronomical values were found for several other background characteristics such as patient height, age, operation duration and anaesthetic time. This analysis was so convincing that it was almost immediately accepted as proof of data fabrication, although Carlisle himself makes no direct accusation. In a follow-up paper, Carlisle (2012a) reports another meta-analysis whose title makes it clear that the aim is to show how the results of Fujii et al for the prevention of PONV differ from those of other authors. The drugs granisetron and ramosetron appear to be much more effective when used by Fujii et al than when used by other anaesthetists. Fujii’s data also indicated synergism when different types of anti-emetics were used together, whereas the data of other authors showed no hint of this synergism and perhaps a suggestion of antagonism.
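The logic behind Carlisle’s test can be sketched with a small Monte Carlo simulation: under genuine random allocation, baseline variables such as body weight should differ somewhat between groups, so group means that agree almost exactly, trial after trial, are the red flag. The population mean, standard deviation and group size below are invented purely for illustration and are not Carlisle’s actual inputs:

```python
import random
import statistics

random.seed(1)

def simulate_trial(n_per_group=25, mean=60.0, sd=10.0):
    """Randomly allocate 2*n patients drawn from one population into two
    groups and return the absolute difference in group mean weight (kg)."""
    patients = [random.gauss(mean, sd) for _ in range(2 * n_per_group)]
    random.shuffle(patients)
    g1, g2 = patients[:n_per_group], patients[n_per_group:]
    return abs(statistics.mean(g1) - statistics.mean(g2))

# Under genuine randomisation, how often do the two group means agree
# almost exactly (within 0.1 kg)? Expect only a few percent of trials.
trials = 100_000
near_identical = sum(simulate_trial() < 0.1 for _ in range(trials))
print(f"{near_identical / trials:.4f} of randomised trials have means within 0.1 kg")
```

A single trial produces near-identical group means only a few percent of the time, so the probability of a long series of trials all showing such agreement multiplies down towards the kind of 1-in-10^33 figure Carlisle reported.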
Armed with Carlisle’s evidence, the editors of 23 journals from around the world in which Fujii had published his fraudulent data sent a formal, signed request (dated April 6th 2012) to 11 senior Japanese professors at the institutions listed on Fujii’s papers, asking them to investigate the authenticity of his work. The letter cites the analysis by Carlisle and also notes that 8 of his papers had already been retracted by Toho University because of a lack of ethical approval. It prompted an investigation and report by the Japanese Society of Anesthesiologists (JSA).
The JSA committee examined Fujii’s original research data, laboratory notebooks and hospital records, and also interviewed Fujii and his co-authors. Of the 212 papers scrutinised, they were able to verify and authenticate the data and subject numbers for only three; these data were not fabricated, but nor had they been collected by Fujii himself. They concluded that 172 papers, including 126 RCTs, contained fabricated data; for the remaining 37 papers they could not make a definite judgement. The numbers of patients enrolled in his studies did not match hospital records at the institutions where the work was claimed to have been conducted. His papers were generally vague about where and when studies had been performed and did not specify which institution had granted ethical approval. He made it appear that many of his studies were multi-site studies to disguise the very high patient turnover, and he used co-authors to add credibility to his papers and to deflect questions about the sheer volume of work being published, often using their names without their agreement or knowledge.
The JSA committee criticised Professor Hidenori Toyooka, a long-time senior associate of Fujii who published extensively with him and was at one time Editor-in-Chief of the Japanese Journal of Anesthesia. The committee concluded that Toyooka clearly acknowledged his association with Fujii and claimed credit for their joint publications for himself and for the University of Tsukuba. He was said to have been aware of the deep suspicions about Fujii’s work raised in 2000 by the Kranke et al letter but took no action. Other co-authors were found not to be complicit in Fujii’s fraudulent activities; some did use the Fujii et al publications to boost their CVs, but some were apparently not even aware of papers published in their names. In many cases, listed co-authors did not sign any authorisation to publish, and on at least one occasion Fujii forged the signatures of co-authors on the authorisation form.
Most of Fujii’s large volume of published work has now been retracted: 183 of his papers had been retracted by May 2013. This figure shatters the previous record for retracted papers, held by the German anaesthetist Joachim Boldt, who had briefly enjoyed this dubious distinction with a mere 88 retracted articles.