Collecting Cancer Data

The Broad Institute and Sanger Institute announced yesterday (March 28) details from their separate cancer cell line databases, the largest such repositories of genomic and drug profiling data to date. With preliminary results published in two Nature papers, the databases should help researchers identify which drugs to use against which cancers to streamline drug development efforts.

“This continues to move us towards cancer being understood as a molecular disease instead of an anatomical disease,” said Eileen Dolan, who studies pharmacogenomics at the University of Chicago and was not involved in either study. “It will help us understand our existing drugs, as well as new drugs, to make more informed decisions in phase I and phase II trials.”

In recent years, researchers have become increasingly aware that whether a tumor will respond to a given drug treatment depends on its genomic profile. But because of the...

“For any variety of cancer drugs that are being developed, we can’t necessarily know in advance which cancers are going to be vulnerable,” said Levi Garraway, a cancer biologist at the Dana-Farber Cancer Institute who spearheaded the Broad project. “If you have a large collection of cell lines that are deeply annotated genetically and molecularly, you can probe the biology linked to many types of genetic alterations of interest.”

Four years ago, Garraway and his colleagues began a massive screen of 947 cancer cell lines, sequencing cancer-associated genes, profiling drugs, collecting RNA expression data using microarrays, and combing the cancer genomes for repeated regions. And they weren’t too far along when they learned of a parallel project at the Sanger Institute, led by genomicist Mathew Garnett.

The projects aren’t identical; they screen different genes and different drugs using slightly different methods. For this reason, Garnett views the two databases as “complementary.” “There was sufficient non-overlap that it was possible to make different observations,” agreed Garraway. (See table for details.)

Plus, having two separate databases rather than pooling the data, as previous databases have done, could lend more weight to certain findings. “I think having two independent resources is a good thing,” said Jian Ma, a computational genomicist at the University of Illinois, who did not participate in the research. “If two different groups have the same result for one cell line, it would be more reliable.”

The two Nature papers, submitted as a pair, describe how the data for each project were collected, and include confirmatory experiments to demonstrate how the databases could enhance cancer drug development. Garnett’s project, called the Genomics of Drug Sensitivity in Cancer project, identified a mutation that makes Ewing’s sarcoma cells highly sensitive to PARP inhibitors, for example. Meanwhile, Garraway’s database, the Cancer Cell Line Encyclopedia, includes data suggesting that MEK inhibitors, a class of cancer drugs that target the RAS signaling pathway, may have increased efficacy in cancers with a mutation in another gene, AHR.

The ultimate hope is that the databases will be used to help people with cancer by better matching a cancer type to a drug, and identifying which patients to enroll in clinical trials based on their genetic flavor of cancer. “Often, drugs fail [in clinical trials] simply because they’re not tested in the right people,” said Garnett. A better understanding of how cancers with particular genetic mutations respond to drugs, helped by the databases, could help clinicians single out “what populations are most likely to respond.”

	Cancer Cell Line Encyclopedia	Genomics of Drug Sensitivity in Cancer
Institute	Broad	Sanger
Lead scientist	Levi Garraway	Mathew Garnett
# cancer cell lines	947	639
# genes sequenced	1,600	64
# drugs	24	130
Genomic data (some yet unpublished)	100 Terabytes	20 Terabytes

Importantly, all the information in the databases is in the public domain, available for all researchers. “What will be immediately useful is that a lot of bioinformatics-savvy groups will be able to download the data,” said Scott Powers, a cancer genomicist at Cold Spring Harbor Laboratory who was not involved in the projects. “There’s no question that there’s going to be value in relating cancer genotypes to potential compounds and cancer drugs.”

“It’s a resource that, once it’s out, people will be using to publish more studies,” agreed Ma.

However, the databases are not very user-friendly right now, lamented Powers. “They’ve got a long way to go to produce a really useful website,” he said. The sites are a bit buggy and difficult to navigate at this point, especially in comparison to the “gold standard” for cancer databases, Oncomine from the University of Michigan, which contains genomic and gene expression data but lacks the extensive drug profiling data the Broad and Sanger databases aim to provide.

Both projects aim to clean up the user interface, and will continue to deposit new data over time, adding more cell lines and drugs, as well as metabolic profiles, epigenetic data, and more genomic sequences.

“I’m optimistic this will be a very useful resource for people to test hypotheses,” said Garnett. Garraway agreed: “We need this kind of resource and hopefully the field will use it to make many more discoveries.”

J. Barretina et al., “The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity,” Nature, 483:603-7, 2012.

M. Garnett et al., “Systematic identification of genomic markers of drug sensitivity in cancer cells,” Nature, 483:570-5, 2012.
