Let's Look Beyond Random Trials When Assessing New Drug Treatments

NPR | By Stuart Kauffman

Published November 19, 2012 at 10:48 AM EST

In this post I report, in outline, a recent publication in PLOS ONE by Margaret Eppstein, Jeffrey Horbar, Jeff Buzas and myself, Stuart Kauffman. All four of us are at the University of Vermont, with Horbar also director of the Vermont Oxford Network of over 900 hospitals. I will refer to the four co-authors as "The Vermont Group." The full paper is entitled "Searching the Clinical Fitness Landscape".

Our work calls into question the adequacy of the criterion upon which the FDA relies for testing drug safety and efficacy: the famous double blind randomized clinical trial, or RCT (also read as "randomized controlled trial"). Our work is meant both to examine the day to day search for good combinations of treatment modalities and to be of use in seeking useful combinations of drugs whose causal effects are multifactorial. In brief, our work shows that when RCTs work, they work well. But they often fail in complex biological-medical situations where causality is multifactorial, as it typically is. For these cases, our group has found a better alternative to RCT, which we call "Team Learning."

The medical and societal implications may be large, as I describe below.

The Vermont Group created a simple mathematical model that I need to explain. We supposed 100 treatment modalities, each with two discrete levels, low and high. Now imagine that all 100 modalities have totally independent modes of action. And arbitrarily, suppose that "high" is best on each of these 100 treatment modalities. Then "all 100 high" is the best that can be done to achieve some desired clinical outcome, say survival. The more "lows" chosen among the 100, the worse the outcome.

Think next of the "goodness" of the outcome as a "height." In the case of 100 independent treatment modalities, one obtains a single-peaked "clinical landscape" rather like Mt. Fuji, with smooth sides.

But suppose the treatment modalities are not independent and interact in positive and negative ways. If this is true, causality is now "multifactorial." No single factor or modality acts alone to affect the outcome. Intuitively, this can and does create a clinical landscape with multiple "local peaks" of good outcomes.
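For readers who like to see such a landscape concretely, here is a minimal Python sketch. It is an illustration under stated assumptions, not the code used in our paper. The additive function follows the description above. The interacting function is an assumed, NK-style construction in which each modality's contribution depends on its own level and on the levels of k other randomly chosen modalities. The names additive_fitness, make_interacting_fitness and count_local_peaks are hypothetical, and peaks are counted on a small 10-modality example because 100 modalities give far too many combinations to enumerate.

```python
import random
from itertools import product

N = 100  # number of treatment modalities, as in the article


def additive_fitness(levels):
    """Independent modalities: outcome is the fraction set to high (1)."""
    return sum(levels) / len(levels)


def make_interacting_fitness(n, k, seed=0):
    """Assumed NK-style landscape: modality i contributes a random value that
    depends on its own level and the levels of k other modalities."""
    rng = random.Random(seed)
    neighbors = [rng.sample([j for j in range(n) if j != i], k) for i in range(n)]
    # One lookup table per modality, one random entry per (k + 1)-bit pattern.
    tables = [[rng.random() for _ in range(2 ** (k + 1))] for _ in range(n)]

    def fitness(levels):
        total = 0.0
        for i in range(n):
            index = levels[i]
            for j in neighbors[i]:
                index = index * 2 + levels[j]
            total += tables[i][index]
        return total / n

    return fitness


def count_local_peaks(fitness, n):
    """Count combinations that no single low/high flip can improve."""
    peaks = 0
    for levels in product((0, 1), repeat=n):
        here = fitness(levels)
        if all(
            fitness(levels[:i] + (1 - levels[i],) + levels[i + 1:]) <= here
            for i in range(n)
        ):
            peaks += 1
    return peaks


if __name__ == "__main__":
    print("all 100 high:", additive_fitness([1] * N))
    print("peaks, independent:", count_local_peaks(additive_fitness, 10))
    rugged = make_interacting_fitness(10, 4, seed=1)
    print("peaks, interacting:", count_local_peaks(rugged, 10))
```

In this sketch, setting k to 0 makes every modality independent, while raising k makes the modalities interact more richly. The independent landscape always has exactly one peak; the interacting one usually has several.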

In our model, we are able to tune how richly the 100 modalities interact. We tested two very different "learning strategies" on these single-peaked and multipeaked clinical landscapes. The first is the statistically rigorous double blind randomized clinical trial, RCT. Here from 400 to 3,200 patients were studied per trial. In RCT, the modality selected for testing is chosen based on which treatments are more consistently used in the top 50 hospitals than in the bottom 50 hospitals, among a total of 100 hospitals. The selected modality is tested once by a random 10 of the 100 hospitals, and adopted by all 100 hospitals if the result of the double blind experiment is statistically significant at the 5 percent level. Thereafter, that modality is not tested again. RCT performs well, with statistically rigorous results, when the 100 treatment modalities are independent.
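To make the adoption rule concrete, here is a minimal Python sketch of a significance check at the 5 percent level. It assumes a two-sided, two-proportion z-test with a normal approximation; the article does not say which test the model uses, so the choice of test and the names two_proportion_p_value and rct_adopts are assumptions for illustration only.

```python
import math


def two_proportion_p_value(survived_a, n_a, survived_b, n_b):
    """Two-sided p-value for a difference in survival proportions
    (pooled two-proportion z-test, normal approximation)."""
    p_a = survived_a / n_a
    p_b = survived_b / n_b
    pooled = (survived_a + survived_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (p_a - p_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def rct_adopts(survived_new, n_new, survived_old, n_old, alpha=0.05):
    """Adopt only if the new arm does better AND the difference is significant."""
    better = survived_new / n_new > survived_old / n_old
    p_value = two_proportion_p_value(survived_new, n_new, survived_old, n_old)
    return better and p_value < alpha


if __name__ == "__main__":
    # Large observed improvement: 180 of 200 survive versus 150 of 200.
    print(rct_adopts(180, 200, 150, 200))
    # Tiny observed improvement: 152 of 200 versus 150 of 200.
    print(rct_adopts(152, 200, 150, 200))
```

Under these assumptions, the large improvement is adopted and the tiny one is not, because a small observed difference cannot be told apart from chance at this sample size.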

But as the 100 treatment modalities interact ever more richly, RCT progressively fails. It does not achieve statistically significant results.

Team learning, emerging now in real teams of hospitals forming quality improvement collaboratives, is radically different.

In our model of real life team learning, we modeled 100 hospitals broken into 10 teams of 10 hospitals. Each hospital acted as a single "learning agent." Here is how a team "learns." Each of the 10 members of the team independently decides which of its 100 treatment modalities to try changing from "high" to "low" or "low" to "high" by observing what appears to be working better among its team members. Specifically, it decides to try the new treatment value that has the highest relative prevalence among team members that have higher survival rates than its own institution, compared to team members that have lower survival rates.
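The selection rule just described can be sketched in a few lines of Python. This is a rough illustration, not our model's code. It assumes that "relative prevalence" means the difference between how common a value is among better-performing teammates and how common it is among worse-performing ones; the function name choose_modality_to_flip and the data layout are hypothetical.

```python
def choose_modality_to_flip(own_levels, own_survival, teammates):
    """Pick the modality whose flipped value is most over-represented among
    teammates with higher survival, relative to teammates with lower survival.

    teammates is a list of (levels, survival) pairs, with levels as 0/1 lists.
    Returns (modality_index, new_level), or None if nothing looks promising.
    """
    better = [levels for levels, survival in teammates if survival > own_survival]
    worse = [levels for levels, survival in teammates if survival < own_survival]
    if not better:
        return None  # nobody on the team is doing better, so nothing to copy

    best_score, best_choice = 0.0, None
    for i, own in enumerate(own_levels):
        new_level = 1 - own  # the value this hospital is NOT currently using
        share_better = sum(lv[i] == new_level for lv in better) / len(better)
        share_worse = (
            sum(lv[i] == new_level for lv in worse) / len(worse) if worse else 0.0
        )
        score = share_better - share_worse  # assumed meaning of relative prevalence
        if score > best_score:
            best_score, best_choice = score, (i, new_level)
    return best_choice


if __name__ == "__main__":
    own = [1, 0, 1, 1]
    team = [
        ([1, 1, 1, 1], 0.90),
        ([1, 1, 0, 1], 0.88),
        ([0, 0, 1, 1], 0.70),
        ([1, 0, 0, 1], 0.75),
    ]
    print(choose_modality_to_flip(own, 0.80, team))
```

In this toy example with four modalities, both better-performing teammates use "high" on the second modality and neither worse-performing teammate does, so the hospital chooses to flip that one.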

Each hospital then tries the new value of its selected modality back at its own institution, in combination with the other 99 treatments already in use there. This is tried on a relatively small number of patients, ranging from 40 per trial to 320 per trial. The new treatment is adopted if any improvement in survival is observed, without regard to statistical significance.
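A small simulation shows what "any improvement" means when patient numbers are small. The Python sketch below assumes that each patient survives independently with a fixed true probability, and that the hospital compares a fresh sample under the old value with a fresh sample under the new value. Both are simplifying assumptions, and the function names and survival rates are hypothetical, chosen only for illustration.

```python
import random


def observed_survival(true_rate, n_patients, rng):
    """Fraction of n_patients who survive, each surviving with true_rate."""
    survivors = sum(rng.random() < true_rate for _ in range(n_patients))
    return survivors / n_patients


def team_learning_adopts(old_rate, new_rate, n_patients, rng):
    """Adopt if observed survival is at all higher; no significance test."""
    new_observed = observed_survival(new_rate, n_patients, rng)
    old_observed = observed_survival(old_rate, n_patients, rng)
    return new_observed > old_observed


def adoption_frequency(old_rate, new_rate, n_patients, repeats=2000, seed=0):
    """Share of repeated small trials in which the change is adopted."""
    rng = random.Random(seed)
    adopted = sum(
        team_learning_adopts(old_rate, new_rate, n_patients, rng)
        for _ in range(repeats)
    )
    return adopted / repeats


if __name__ == "__main__":
    for n in (40, 320):
        helpful = adoption_frequency(0.80, 0.85, n)
        harmful = adoption_frequency(0.85, 0.80, n)
        print(n, "patients: helpful change adopted", helpful,
              "| harmful change adopted", harmful)
```

Under these assumptions, a single small trial is noisy and sometimes adopts a harmful change, but helpful changes are adopted more often than harmful ones, and the gap widens as the number of patients grows.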

Then this search process iterates. During this, some hospital may choose a previously flipped treatment and flip it back, say from low to its former high.

In both RCT and Team Learning these iterations continue for 100 cycles. Stunningly, Team Learning outperforms RCT in virtually all cases. RCT only outperforms Team Learning on single-peaked "Fuji" landscapes when factors are fully independent and very large numbers of patients are enrolled in the trials.

This is only a first study. But do we know that causality in biology is multifactorial? Yes, it typically is. Polygeny in developmental biology has been known for many years. Here many genes contribute to one phenotype or "outcome." Causality here is multifactorial. Finally, I also want to note that surgery improves over time without using RCT; we cannot clone surgeons. So clinical multifactorial learning without RCT is perfectly possible.

If confirmed, what does this portend?

1) Double blind randomized controlled trials will remain the gold standard for assessing the risks and benefits of novel therapies, and when they work, they do work. Nevertheless, it becomes very likely that the reliance of the medical profession and the FDA on such studies is too narrow a basis for clinical learning about good combinations of drugs or of other treatment modalities. Team Learning is better at learning complex combinations of treatment modalities. Cells and organisms are causal networks. Effective control may require inputs at many points simultaneously, not single silver bullets. Thus over-reliance on single-factor RCT is probably throwing away clinically relevant information, to the loss of us all.

2) That which is FDA approved strongly defines "best practice" medicine by which well trained doctors practice and are even constrained to practice, for fear of unscientific, even charlatan, medicine. But if "best practice" is based on too narrow an evidential basis, we all suffer.

3) We need to learn multifactorial medicine, some new combination of systems biology, complexity theory and medicine, an imposing task for the near future.

4) We may need to broaden our thinking and rethink rules which may be overly constrictive.