Anil Potti, Joseph Nevins and their colleagues at Duke University in Durham, North Carolina, garnered widespread attention in 2006. They reported in the New England Journal of Medicine that they could predict the course of a patient’s lung cancer using devices called expression arrays, which log the activity patterns of thousands of genes in a sample of tissue as a colourful picture. A few months later, they wrote in Nature Medicine that they had developed a similar technique which used gene expression in laboratory cultures of cancer cells, known as cell lines, to predict which chemotherapy would be most effective for an individual patient suffering from lung, breast or ovarian cancer.
At the time, this work looked like a tremendous advance for personalised medicine: the idea that understanding the molecular specifics of an individual’s illness will lead to a tailored treatment. The papers drew adulation from other workers in the field, and many newspapers, including this one, wrote about them. The team then started to organise a set of clinical trials of personalised treatments for lung and breast cancer. Unbeknown to most people in the field, however, within a few weeks of the publication of the Nature Medicine paper a group of biostatisticians at the MD Anderson Cancer Centre in Houston, led by Keith Baggerly and Kevin Coombes, had begun to find serious flaws in the work.
Dr Baggerly and Dr Coombes had been trying to reproduce Dr Potti’s results at the request of clinical researchers at the Anderson centre who wished to use the new technique. When they first encountered problems, they followed normal procedures by asking Dr Potti, who had been in charge of the day-to-day research, and Dr Nevins, who was Dr Potti’s supervisor, for the raw data on which the published analysis was based—and also for further details about the team’s methods, so that they could try to replicate the original findings.
A can of worms
Dr Potti and Dr Nevins answered the queries and publicly corrected several errors, but Dr Baggerly and Dr Coombes still found the methods’ predictions were little better than chance. Furthermore, the list of problems they uncovered continued to grow. For example, they saw that in one of their papers Dr Potti and his colleagues had mislabelled the cell lines they used to derive their chemotherapy prediction model, describing those that were sensitive as resistant, and vice versa. This meant that even if the predictive method the team at Duke were describing did work, which Dr Baggerly and Dr Coombes now seriously doubted, patients whose doctors relied on this paper would end up being given a drug they were less likely to benefit from instead of more likely.
Another alleged error the researchers at the Anderson centre discovered was a mismatch in a table that compared genes to gene-expression data. The list of genes was shifted with respect to the expression data, so that the one did not correspond with the other. On top of that, the numbers and names of cell lines used to generate the data were not consistent. In one instance, the researchers at Duke even claimed that their work made biological sense based on the presence of a gene, called ERCC1, that is not represented on the expression array used in the team’s experiments.
Even with all these alleged errors, the controversy might have been relegated to an arcane debate in the scientific literature if the team at Duke had not chosen, within a few months of the papers’ publication (and at the time questions were being raised about the data’s quality), to launch three clinical trials based on their work. Dr Potti and his colleagues also planned to use their gene-expression data to guide therapeutic choices in a lung-cancer trial paid for by America’s National Cancer Institute (NCI). That led Lisa McShane, a biostatistician at the NCI who was already concerned about Dr Potti’s results, to try to replicate the work. She had no better luck than Dr Baggerly and Dr Coombes. The more questions she asked, the less concrete the Duke methods appeared.
In light of all this, the NCI expressed its concern about what was going on to Duke University’s administrators. In October 2009, officials from the university arranged for an external review of the work of Dr Potti and Dr Nevins, and temporarily halted the three trials. The review committee, however, had access only to material supplied by the researchers themselves, and was not presented with either the NCI’s exact concerns or the problems discovered by the team at the Anderson centre. The committee found no problems, and the three trials began enrolling patients again in February 2010.
Finally, in July 2010, matters unravelled when the Cancer Letter reported that Dr Potti had lied in numerous documents and grant applications. He falsely claimed to have been a Rhodes Scholar in Australia (a curious claim in any case, since Rhodes scholars only attend Oxford University). Dr Baggerly’s observation at the time was, “I find it ironic that we have been yelling for three years about the science, which has the potential to be very damaging to patients, but that was not what has started things rolling.”
A bigger can?
By the end of 2010, Dr Potti had resigned from Duke, the university had stopped the three trials for good, scientists from elsewhere had claimed that Dr Potti had stolen their data for inclusion in his paper in the New England Journal, and officials at Duke had started the process of retracting three prominent papers, including the one in Nature Medicine. (The paper in the New England Journal, not one of these three, was also retracted, in March of this year.) At this point, the NCI and officials at Duke asked the Institute of Medicine, a board of experts that advises the American government, to investigate. Since then, a committee of the institute, appointed for the task, has been trying to find out what was happening at Duke that allowed the problems to continue undetected for so long, and to recommend minimum standards that must be met before this sort of work can be used to guide clinical trials in the future.
At the committee’s first meeting, in December 2010, Dr McShane stunned observers by revealing her previously unpublished investigation of the Duke work. Subsequently, the committee’s members interviewed Dr Baggerly about the problems he had encountered trying to sort the data. He noted that in addition to a lack of unfettered access to the computer code and consistent raw data on which the work was based, journals that had readily published Dr Potti’s papers were reluctant to publish his letters critical of the work. Nature Medicine published one letter, with a rebuttal from the team at Duke, but rejected further comments when problems continued. Other journals that had carried subsequent high-profile papers from Dr Potti behaved in similar ways. (Dr Baggerly and Dr Coombes did not approach the New England Journal because, they say, they “never could sort that work enough to make critical comments to the journal”.) Eventually, the two researchers resorted to publishing their criticisms in a statistical journal, which would be unlikely to reach the same audience as a medical journal.
Two subsequent sessions of the committee have included Duke’s point of view. At one of these, in March 2011, Dr Nevins admitted that some of the data in the papers had been “corrupted”. He continued, though, to claim ignorance of the problems identified by Dr Baggerly and Dr Coombes until the Rhodes scandal broke, and to support the overall methods used in the papers—though he could not explain why he had not detected the problems even when alerted to anomalies.
At its fourth, and most recent, meeting, on August 22nd, the committee questioned eight scientists and administrators from Duke. Rob Califf, a vice-chancellor in charge of clinical research, asserted that what had happened was a case of the “Swiss-cheese effect” in which 15 different things had to go awry to let the problems slip through unheeded. Asked by The Economist to comment on what was happening, he said, “As we evaluated the issues, we had the chance to review our systems and we believe we have identified, and are implementing, an improved approach.”
The university’s lapses and errors included being slow to deal with potential financial conflicts of interest declared by Dr Potti, Dr Nevins and other investigators, including involvement in Expression Analysis Inc and CancerGuide DX, two firms to which the university also had ties. Moreover, Dr Califf and other senior administrators acknowledged that once questions arose about the work, they gave too much weight to Dr Nevins and his judgment. That led them, for example, to withhold Dr Baggerly’s criticisms from the external-review committee in 2009. They also noted that the internal committees responsible for protecting patients and overseeing clinical trials lacked the expertise to review the complex, statistics-heavy methods and data produced by experiments involving gene expression.
That is a theme the investigating committee has heard repeatedly. The process of peer review relies (as it always has done) on the goodwill of workers in the field, who have jobs of their own and frequently cannot spend the time needed to check other people’s papers in a suitably thorough manner. (Dr McShane estimates she spent 300-400 hours reviewing the Duke work, while Drs Baggerly and Coombes estimate they have spent nearly 2,000 hours.) Moreover, the methods sections of papers are supposed to provide enough information for others to replicate an experiment, but often do not. Dodgy work will out eventually, as it is found not to fit in with other, more reliable discoveries. But that all takes time and money.
The Institute of Medicine expects to complete its report, and its recommendations, in the middle of next year. In the meantime, more retractions are coming, according to Dr Califf. The results of a misconduct investigation are expected in the next few months and legal suits from patients who believe they were recruited into clinical trials under false pretences will probably follow.
The whole thing, then, is a mess. Who will carry the can remains to be seen. But the episode does serve as a timely reminder of one thing that is sometimes forgotten. Scientists are human, too.
Correction: This article originally stated that by the end of 2010 officials at Duke University began the process of retracting five papers. That should have been three papers. This was corrected on September 8th.
Original article: http://www.economist.com/node/21528593