N-of-1 Trials Take on Challenges in Health Care

For a few months in the first half of 2019, Chris Payze started each morning at home in Queensland, Australia, by jotting down answers to a series of questions. What time did I go to bed? How many times did I wake up? Speaking to The Scientist this April, 71-year-old Payze said she’d gotten “really into the groove” of this daily routine. “It only takes me about five minutes.”

She recorded the information for a trial of melatonin, a hormone that regulates sleep-wake cycles and is often taken orally as a sleep aid, although it’s not clear how well it works. Payze has Parkinson’s disease, and for the last couple of years, she, like many people with the condition, has been dealing with insomnia. “I just have awful trouble sleeping at night,” she explains. While she doesn’t feel sleep-deprived, the interrupted sleep “is just...

Payze has participated in clinical trials for other medications, and volunteered for this one after reading about it in a local newsletter for people with Parkinson’s. But this trial was different from the others in one immediately obvious way: Payze would be the only participant.

That’s because this particular 12-week study was what’s known as an N-of-1 trial. Focused on the collection of treatment-response data in a single patient, this relatively little-used trial design represents the ultimate form of patient-centered medicine. Researchers design a mini-investigation of a treatment’s effectiveness entirely around an individual, with the goal of determining whether or not a particular treatment works for her. (See infographic below.)

The trial was organized in a randomized sequence of two-week periods. In each period, Payze took either melatonin or placebo, never knowing which she was taking. Having just finished the treatments, Payze will soon receive a report with her results, which she and her doctor can use to help decide whether or not she will include melatonin in her medications. (Participants in traditional clinical trials, by contrast, typically have to wait for a study to be published before they receive any information about the results, if they hear anything at all.)

The potential applications of this trial design go beyond the personalization of clinical care. Researchers at the University of Queensland who are helping to manage the trial have plans to aggregate Payze’s results with those from similar N-of-1 trials in which Parkinson’s patients have taken courses of both melatonin and placebo to evaluate the supplement’s effects. Although the details of the trial procedure might not be exactly the same for each patient, an aggregated series of N-of-1 trials can reveal population-level trends while capturing how responses to treatments may vary among and within patients.

A growing appreciation of this heterogeneity is helping to attract new interest in the N-of-1 approach. Formally developed as a concept in the 1980s, these trials struggled to take off in the health care and research communities due to the effort involved in personalizing research to this extreme degree. Instead, most clinical studies measure a treatment’s efficacy with a randomized controlled trial (RCT) design. Typically set up to collect population-level data, many RCTs lack the resolution to allow in-depth analyses of variability in treatment responses—a limitation that’s now widely appreciated among researchers in precision medicine.

“You can’t ignore individuality. It just doesn’t make sense,” says Nicholas Schork, director of quantitative medicine and systems biology at the Translational Genomics Research Institute (TGen) in Arizona and an advocate for the greater use of N-of-1 trials across clinical care and research. “If you want to get at the biology, what actually contributes to the variation that exists, and identify more clearly who might or might not respond to a drug . . . then you have to study individuals.”

N-of-1 trials in clinical care

N-of-1 trials made their first big impression on the medical scene in the mid-1980s. A 65-year-old man in Ontario had visited his doctor complaining of severe shortness of breath and was prescribed a cocktail of therapies for asthma: the oral steroid prednisone; the bronchodilators albuterol and ipratropium bromide; and theophylline, an anti-inflammatory compound similar to caffeine. But months later, he was still struggling to control his symptoms.

The case caught the eye of Gordon Guyatt, a McMaster University physician who would later coin the term “evidence-based medicine.” Guyatt was interested in applying a new clinical approach he’d heard about from psychotherapists in which patient and doctor rigorously tested whether a treatment was working over weeks or months. The asthma patient was the perfect case: the condition was chronic and relatively stable, so it wouldn’t clear up before such a test could be completed, and the treatments were fast-acting, so their effects should be easily observable.

Focusing on theophylline (which, along with ipratropium bromide, the doctor had expressed the most uncertainty about), Guyatt’s team designed a multi-crossover protocol—that is, the patient would take either the drug or a placebo for 10 days, then take the other for the following 10 days, and then launch into a new 20-day block of two 10-day periods, and so on, until doctor and patient agreed to stop. A pharmacist made up both the drug and the placebo into identical capsules for the patient, who rated his symptoms on a questionnaire while continuing his other three treatments normally.
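The multi-crossover design described above can be sketched in a few lines of Python. This is a minimal illustration rather than the McMaster team's actual procedure: the function name `make_schedule`, the seed, and the number of blocks are assumptions made for the example, as is the choice to draw the order within each pair at random. Only the 10-day periods and the drug/placebo pairing come from the account above.

```python
import random

def make_schedule(n_blocks, period_days=10, arms=("drug", "placebo"), seed=None):
    """Return a list of (start_day, end_day, arm) tuples.

    Each block contains every arm exactly once, in random order,
    so the patient gets equal exposure to each arm.
    """
    rng = random.Random(seed)  # seeded generator so the schedule is reproducible
    schedule = []
    day = 1
    for _ in range(n_blocks):
        order = list(arms)
        rng.shuffle(order)  # randomize the order within this block
        for arm in order:
            schedule.append((day, day + period_days - 1, arm))
            day += period_days
    return schedule

# Hypothetical example: two 20-day blocks, i.e. four 10-day periods.
schedule = make_schedule(n_blocks=2, seed=1986)
for start, end, arm in schedule:
    print(f"days {start:>2}-{end:>2}: {arm}")
```

In a blinded trial, a schedule like this would be held by the pharmacist, with patient and doctor seeing only identical capsules.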

After the team analyzed four periods (two with the placebo and two with theophylline), the results seemed clear: theophylline was doing more harm than good, with the patient reporting more difficulty breathing while taking the drug. Speaking to other clinicians years later, Guyatt recalled: “There was such a dramatic difference between one period and another that I said, ‘Look, there’s just no question. We know what’s going on here.’” The team halted theophylline (but not ipratropium bromide, which a subsequent drug-placebo sequence suggested was helping), and the patient’s symptoms improved.

The researchers described the case in an influential 1986 paper and argued for using N-of-1 trials in clinical care to help identify commercially available treatments that could benefit a particular patient and avoid prescribing those that didn’t.¹ They also established an N-of-1 service at McMaster University where physicians in the local medical community could try the approach with their own patients, or assign patients to trials there. Guyatt and other McMaster clinicians trained in the N-of-1 procedure oversaw the trials, helped design protocols, worked with pharmacists to source drugs and appropriate placebos, and collected reports from the participants about their symptoms.

By the end of the decade, the center had completed 57 trials, 15 of which resulted in a substantial change to a patient’s course of treatment.² Others followed Guyatt’s lead. Researchers at the University of Washington in Seattle launched their own N-of-1 service and oversaw more than 30 trials in the 1980s and 1990s, about half of which provided a definitive answer as to whether a patient should or shouldn’t take a particular treatment, before the program ran out of funding.

A couple of other institutions kept the approach going. In 1998, clinical researcher Jane Nikles of the University of Queensland in Australia helped launch an N-of-1 unit that has trialed available treatments on individual patients with conditions including ADHD and chronic pain. The program is planning several more projects for the coming years. And back in Canada, pediatrician Sunita Vohra, who did her graduate studies at McMaster, headed up a clinic at the University of Alberta that facilitated N-of-1 trials focused on patients’ use of so-called complementary and alternative medicine, an umbrella term for interventions ranging from probiotics to what Vohra’s webpage calls “pediatric integrative medicine (acupuncture, Reiki, massage therapy).” Trials arose from situations where “the patient wanted to take [a particular medicine], or was curious to take it, or was planning to take it,” Vohra says, “and they and their health care provider were interested to build knowledge around that.” That clinic has wound down activities, according to Vohra.

When carried out as an extension of standard care, N-of-1 trials that focus on treatments already on the market are free from the regulation surrounding clinical research trials, allowing them to more dynamically cater to individual patients’ needs. And the benefits go beyond assessing a particular treatment’s effectiveness. The individualized design allows patients to define their own treatment goals and specify trial outcomes that are meaningful to them—an improvement in quality of life, for example, rather than a decrease in a particular disease biomarker. What’s more, receiving personalized results could positively affect patient behavior after the trial—encouraging adherence when a therapy is found to be effective, and motivating a patient to seek other options when it isn’t.

Nikles and Queensland colleague Geoff Mitchell have found some evidence for this: up to 85 percent of patients who completed N-of-1 trials for ADHD treatment stuck with the recommended treatment strategy for at least a year.³ Although the team didn’t study a control group, adherence rates can be as low as 50 percent among people with chronic conditions receiving standard clinical care, notes Nikles, who says that boosting adherence could help make clinical practice more cost-effective in the long run.

Advocates of N-of-1 trials are using results such as these to try to improve perceptions of the approach. Previous research suggests they have some work to do: according to one 2009 survey, some doctors view the trials as an impractical amount of work, and even “a threatening paradigm shift in the doctor-patient relationship.”⁴ What’s more, although the concept makes intuitive sense, attempts to demonstrate that it provides clinical benefit have been few and far between—with some studies concluding it does not.

Richard Kravitz, who researches health policy and internal medicine at the University of California, Davis, and colleagues recently explored this issue. Between 2014 and 2017, the team monitored more than 200 people with chronic musculoskeletal pain participating in an N-of-1 trial or receiving standard clinical care in northern California. Findings that the researchers published last fall show that, although patients in the N-of-1 group reported greater involvement in medication-related decisions, there was no significant difference between the groups in terms of how effectively patients were able to manage their pain by the end of the study period.⁵

While the project demonstrated N-of-1 trials’ feasibility, “the results were disappointing,” says Kravitz. Guyatt himself coauthored a commentary accompanying the study stating that the N-of-1 concept “has failed to demonstrate improved clinical outcomes” and may represent “another instance of a beautiful idea being vanquished by cruel and ugly evidence.”⁶

Many researchers don’t share Guyatt’s view. Kravitz, Vohra, and others have instead argued that N-of-1 trials might only benefit some patients with certain conditions.⁷ University of Exeter medical sociologist Nicky Britten is investigating this possibility. After noticing an uneven distribution of N-of-1 trials across medicine, she and her students started scouring the literature, finding that conditions such as mental and behavioral disorders “seem to attract N-of-1 trials” more than heart and kidney conditions, she says. The team wants to determine whether that’s because “some conditions lend themselves to N-of-1 trials better than others, or [because] clinicians in those fields know about them and are keen to do them.” Doing so could allow researchers to “zoom in to where an N-of-1 trial can really help.”

All for One

The N-of-1 trial design aims to provide a definitive answer as to whether a treatment works in a particular patient. As such, the entire process of testing a treatment is personalized to that patient—from the selection of measurable outcomes to the use of data once the trial is over. The approach therefore differs from most randomized controlled trials (RCTs), which are usually geared toward answering a particular research question. Yet despite their individualized design, N-of-1 trials can also be useful in clinical research. Data collected from multiple N-of-1 trials can be aggregated and—provided that the correct statistical tools are applied—analyzed to generate population-level data about drug response, while capturing far more information about intra- and interindividual heterogeneity than most RCT designs.

TYPICAL N-OF-1 TRIAL	STANDARD RANDOMIZED CONTROLLED TRIAL (RCT)
A patient works with her physician to develop a study design with the primary goal of finding the best course of treatment for her.	A patient signs on to a clinical trial with protocols and outcome measures determined by clinicians with the primary goal of answering a research question about the intervention.
She alternates between taking a drug and a placebo during the course of the trial. The drug and placebo are made into identical capsules, which she takes in treatment blocks of several days or weeks at a time in a randomized sequence.	Although some designs—for example, crossover trials—involve multiple treatment periods, a patient in an RCT will often take either a drug or a placebo for the entire duration of the trial.
The patient receives detailed feedback at the end of the study about her results, plus a recommended treatment plan based on that information.	Patients typically do not receive individualized feedback about the trial, and only in some cases are they notified when study results are published—if they are published at all.
Provided patients consent to be involved in research, clinicians can use special analytical techniques such as Bayesian statistics to aggregate data from multiple N-of-1 trials in order to make inferences about a particular therapy’s efficacy at the population level. These techniques account for the fact that trials were carried out individually, and capture information about inter- and intraindividual variation in drug response.	Researchers analyze group-level data from the trial to make general statements about the drug’s safety and efficacy for the population tested. Only limited information about interindividual variation is available.

Aggregating N-of-1 trials for clinical research

Matthieu Roustit, a clinical pharmacologist at the University of Grenoble in France, became familiar with the N-of-1 approach a few years ago, when he heard Vohra give a talk on the topic. Roustit was studying treatments for Raynaud syndrome, a condition in which arterial spasms reduce blood flow to a person’s extremities, causing numbness and skin discoloration. One medication that clinicians use to treat the condition is the artery-dilating drug sildenafil (sold by Pfizer as Viagra). The drug isn’t specifically approved for the purpose—it’s more widely used for erectile dysfunction and pulmonary arterial hypertension—and although a handful of RCTs have tested the drug’s efficacy for Raynaud’s, their results have been equivocal.

After hearing about Vohra’s experience with the N-of-1 approach, Roustit and his colleagues decided that, rather than carry out another RCT, they’d launch a series of randomized, double-blind N-of-1 trials. Partly funded by Pfizer, the team designed a protocol that had participants self-administer a treatment up to twice a day when they felt symptoms coming on. Each trial was organized into treatment blocks of three one-week periods: every week, the patient would be provided with a supply of a placebo, 40 mg of sildenafil, or 80 mg of sildenafil. Neither the patient nor the doctor knew the order of treatments.

At the end of the study, rather than simply computing results per individual, the researchers also aggregated data for the 38 patients who completed at least two rounds of each treatment. The team found that, while the drug showed only moderate efficacy in treating symptoms at the population level, “there is major heterogeneity,” says Roustit.⁸ “In some patients we can see efficacy, and in other patients there is absolutely no superiority versus placebo. . . . That’s very important information—usually in trials you just get [an overall] p-value, and based on this p-value, which is kind of arbitrary, you decide whether a drug works or doesn’t work.”

Motivated by these results, Roustit says his team is now on the search for other treatments for Raynaud’s that may help more of the patient population. The use of N-of-1 trials as a tool for clinical research, as opposed to purely for patient care, is attracting new interest as the scientific community grapples with disappointments in the field of precision medicine. Promises that approaches such as genome screening would identify the right drugs for the right people have been fulfilled in only a handful of cases, and much of the problem seems to stem from unseen sources of variability in treatment responses—both between patients, and, perhaps less appreciated, within the same patient over time.

In an RCT, “you might find that, by some definition or other, 70 percent of people respond and 30 percent don’t, and then you jump to the conclusion: What we need to do is identify who these 70 percent are and why they’re responders,” explains Stephen Senn, a pharma consultant and statistician. The problem is that “if you measure the same person on different occasions, even if they’re given the same treatment, you might get a different answer.”
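Senn's point can be illustrated with a small simulation. The sketch below is a toy model under stated assumptions, not an analysis of any real trial: every simulated patient has exactly the same true benefit, and the only source of variation is occasion-to-occasion noise. The function name and all numbers (`true_effect=1.0`, `within_sd=2.0`, the zero threshold) are hypothetical and were chosen so that roughly 70 percent of patients look like "responders" after a single measurement.

```python
import random

def apparent_response_rates(n_patients=1000, true_effect=1.0, within_sd=2.0,
                            threshold=0.0, seed=0):
    """Simulate patients who all share the SAME true treatment effect.

    Each observed improvement = true effect + random occasion-to-occasion noise.
    Returns (share labelled responders on occasion 1,
             share given the same label on both occasions).
    """
    rng = random.Random(seed)
    responders_first = 0
    consistent = 0
    for _ in range(n_patients):
        first = true_effect + rng.gauss(0, within_sd)
        second = true_effect + rng.gauss(0, within_sd)
        resp1 = first > threshold
        resp2 = second > threshold
        responders_first += resp1
        consistent += (resp1 == resp2)
    return responders_first / n_patients, consistent / n_patients

share_responders, share_consistent = apparent_response_rates()
print(f"labelled responders after one measurement: {share_responders:.0%}")
print(f"same label on a repeat measurement:        {share_consistent:.0%}")
```

Even though no patient in this toy model differs from any other, a single measurement splits them into apparent responders and non-responders, and many switch labels when measured again. Repeated measurements within each person, as in an N-of-1 design, are what allow the two sources of variation to be separated.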

Although originally framed as a way to move away from population-level averages, N-of-1 trials could help tackle both intra- and interindividual variation, as researchers collect multiple, detailed measures on each person. Until recently, “people had this silly misunderstanding that if you’re studying a single individual, you’re casting a blind eye to the rest of the world,” says Schork. “That doesn’t have to be the case. You can use . . . N-of-1 studies to draw inferences about what’s going on in the population, [and] do that with greater rigor and sophistication.”

In theory, then, N-of-1 trials could help collect individual-level data on responses to experimental drugs that haven’t hit the market yet, providing a more rigorous test of a new treatment’s efficacy. The concept has yet to catch on in the pharmaceutical industry, but several academic research groups, including the University of Queensland team overseeing Payze’s insomnia study, are now pursuing series of N-of-1 trials (which, when carried out for research purposes, are subject to the same regulation as other research trials). Nikles notes that, provided effort is made to include a representative sample of patients, an N-of-1 series could provide more information for less effort compared to an RCT, making it an attractive approach for studying drug responses in rare diseases. “You need a lot fewer people to do an aggregated N-of-1” than an equivalent RCT, she says, “because you have more [statistical] power” from the multiple measures per person.

One group in the Netherlands recently demonstrated this principle with a study of mexiletine, a sodium-channel blocker that has for decades been used as a local anesthetic and antiarrhythmic agent. A 2012 RCT of around 60 people found that the drug was also effective for treating muscle stiffness in a disorder known as nondystrophic myotonia; last year, the Netherlands team found a comparable level of statistical support for efficacy after analyzing aggregated data from just 11 N-of-1 trials.⁹

Researchers are still working out how to properly combine data from trials specifically designed to be run independently. Aggregating data consisting of repeated measures per individual, potentially taken over different durations and with an eye toward different patient-specified goals, necessitates a careful statistical approach.

One method that’s come to prominence is Bayesian statistics, notes Senn, who strongly criticized some of the earlier methods proposed by Guyatt’s team to analyze N-of-1 data as failing to distinguish between various sources of within- and between-patient heterogeneity. Bayesian techniques weight data from each new patient against existing data from previous patients and provide results in terms of probabilities of efficacy, rather than a more-limited conclusion based on p-values. Roustit, whose team incorporated Bayesian techniques in its analysis of responses to sildenafil, says the method takes some getting used to for clinicians more familiar with RCT data. “It’s a totally new paradigm in clinical research.”
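The idea of weighting a new patient's data against what earlier patients have shown, and reporting a probability of efficacy, can be sketched with the simplest Bayesian building block: a conjugate normal update. This is an illustrative assumption, not the model used by Roustit's team or any other group mentioned here; a real aggregated analysis would typically fit a hierarchical model to all patients at once. The function names and every number below are hypothetical.

```python
import math

def update_normal(prior_mean, prior_var, data_mean, data_var):
    """Conjugate normal update: combine a prior with one patient's estimate.

    data_var is the variance of the patient's mean effect estimate
    (roughly: within-patient variance / number of period pairs).
    """
    prior_precision = 1.0 / prior_var
    data_precision = 1.0 / data_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * data_mean)
    return post_mean, post_var

def prob_benefit(mean, var):
    """P(effect > 0) under a normal distribution with the given mean and variance."""
    z = mean / math.sqrt(var)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical numbers, for illustration only (not from any trial in the article).
prior_mean, prior_var = 0.5, 1.0      # what earlier patients suggest about the effect
patient_mean, patient_var = 2.0, 1.0  # new patient's drug-minus-placebo estimate

post_mean, post_var = update_normal(prior_mean, prior_var, patient_mean, patient_var)
print(f"posterior mean effect: {post_mean:.2f}")
print(f"probability of benefit: {prob_benefit(post_mean, post_var):.2f}")
```

The posterior mean falls between the prior and the patient's own estimate, pulled toward whichever is more precise, and the output is a probability that the treatment helps this patient rather than a yes-or-no verdict from a p-value.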

A patient-centered approach

Whatever the N-of-1 trial’s eventual place in medicine, there are several current trends that may boost the approach’s popularity in both clinical care and research. The first is the development of technology such as cloud-connected activity monitors that could minimize the effort needed to carry out the trials, says Schork. “As we reduce the costs associated with collecting data on individuals over a long enough period of time to determine unequivocally if they are responding to a particular drug, [these trials] are going to become mainstream.”

There’s also the fact that members of the public are taking a more active role in tracking their own health than ever before. More than 300,000 health-related mobile apps are available worldwide, while Fitbit, a company offering activity-monitoring wristbands, reported that its number of active users exceeded 27 million at the end of last year.

Vohra says that, in her experience, “patients are pretty savvy—they are curious and interested in their own health. . . . I think it’s perfectly OK to work with the patient as a partner, and say, ‘This is what is known, [these are] things that aren’t, and together we can try and sort out what might be the most effective treatment for the treatment goals that you have.’ ” Indeed, surveys by Nikles and others have found that people report feeling empowered by getting involved in their own care, and that N-of-1 trials increase their understanding of their condition and how to manage symptoms.¹⁰

Back in Queensland, Payze describes her experience in the melatonin trial in a similar way, noting that she enjoyed being proactive about her sleep problems and happily used a sleep-monitoring watch for part of the study. With the trial over, she says she’s looking forward to finding out whether or not melatonin is helping with her insomnia. “Normally with research programs, they just tell you you’re not going to know anything about what’s been involved or what you’ve put your hand up for,” she tells The Scientist. “But in this one, they’ll give us some results, so that’s good.”

Correction (August 21): The original version of this article stated that the University of Queensland’s N-of-1 program is less active now than it was a decade ago. In fact, the program is conducting and planning several further projects. The Scientist regrets the error.

References

1. G. Guyatt et al., “Determining optimal therapy—Randomized trials in individual patients,” New Engl J Med, 314:889–92, 1986.
2. G.H. Guyatt et al., “The n-of-1 randomized controlled trial: Clinical usefulness: Our three-year experience,” Ann Intern Med, 112:292–99, 1990.
3. C.J. Nikles et al., “Long-term changes in management following n-of-1 trials of stimulants in attention-deficit/hyperactivity disorder,” Eur J Clin Pharmacol, 63:985–89, 2007.
4. R.L. Kravitz et al., “Marketing therapeutic precision: Potential facilitators and barriers to adoption of n-of-1 trials,” Contemp Clin Trials, 30:436–45, 2009.
5. R.L. Kravitz et al., “Effect of mobile device-supported single-patient multi-crossover trials on treatment of chronic musculoskeletal pain,” JAMA Intern Med, 178:1368–77, 2018.
6. R.D. Mirza, G.H. Guyatt, “A randomized clinical trial of n-of-1 trials—Tribulations of a trial,” JAMA Intern Med, 178:1378–79, 2018.
7. R.L. Kravitz et al., “A case for the n-of-1 trials–Reply,” JAMA Intern Med, 179:453, 2019.
8. M. Roustit et al., “On-demand sildenafil as a treatment for Raynaud phenomenon: A series of n-of-1 trials,” Ann Intern Med, 169:694–703, 2018.
9. B.C. Stunnenberg et al., “Effect of mexiletine on muscle stiffness in patients with nondystrophic myotonia evaluated using aggregated N-of-1 trials,” JAMA, 320:2344–53, 2018.
10. C.J. Nikles et al., “Using n-of-1 trials as a clinical tool to improve prescribing,” Br J Gen Pract, 55:175–80, 2005.

From personalizing clinical practice to aiding biomedical research, individualized clinical trials have been heralded as a solution to several problems plaguing modern medicine. But they’ve yet to be widely adopted.
