McMaster University

Faculty of Health Sciences

Microsoft Excel-based algorithm predicts cancer prognosis

Published: August 9, 2010
John Hassell, professor in the Department of Biochemistry and Biomedical Sciences and director of the Centre for Functional Genomics

Predicting breast cancer prognosis no longer requires specialized software or extensive bioinformatics training. Researchers from McMaster University have developed a new system that uses Microsoft Excel to forecast the long-term outcomes of patients with malignant breast tumours.

Robin Hallett, a PhD student working under the supervision of McMaster molecular biologist John Hassell, has developed an algorithm to identify genes that are useful in predicting breast cancer outcomes.

The finding, published in the Journal of Experimental & Clinical Cancer Research, will enable researchers to input their gene expression profiling data into the common spreadsheet program Excel to predict specific characteristics of disease, including the likelihood of surviving breast cancer.

Currently, there are two predictive tests in the United States (one experimental and the other approved by the FDA) that determine prognosis for breast cancer patients. The tests can be used to identify patients who will or will not benefit from chemotherapy. Findings from the National Surgical Adjuvant Breast and Bowel Project have shown that roughly 80 per cent of breast cancer patients would derive little benefit from adjuvant chemotherapy after surgery, yet still suffer the side effects of the treatment.

"Most biologists are limited by their abilities at running these complicated software programs," said Hallett, a molecular biologist with an interest in bioinformatics. "The point here was to make something a lot more user friendly, but still produce the same kind of predictive accuracy as the more sophisticated and computer-intensive techniques."

The researchers used gene expression profiles from a total of 295 patients with stage I or stage II breast cancer, as well as clinical data relating to their survival. They then split the patients into two groups: one to "train" the Excel-based algorithm and the other to validate it.

Based on results from the first group of 144 patients, the researchers selected the 20 genes whose expression levels correlated most strongly with patient survival or patient death. Using those genes, they established a model (also known as a gene expression signature) that predicted good or poor prognosis. The algorithm was then validated using the second group of 151 patients.
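
For readers who want a feel for the train-and-validate procedure described above, here is a minimal Python sketch. It is not the researchers' Excel workbook. The article does not say how genes were ranked or how the signature was scored, so the Pearson correlation ranking, the signed-sum risk score, the median cut-off, and the synthetic patient data are all assumptions made for illustration. Only the group sizes (144 and 151) and the signature size (20 genes) come from the article.

```python
import random
from statistics import mean, median, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def select_signature(expr, died, top_n=20):
    """Rank genes by |correlation with death| and keep the top_n.

    Assumed ranking rule, not taken from the article.
    Returns a list of (gene_index, sign) pairs, where sign is +1 if higher
    expression goes with death and -1 if it goes with survival.
    """
    ranked = []
    for g in range(len(expr[0])):
        column = [row[g] for row in expr]
        ranked.append((g, pearson(column, died)))
    ranked.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return [(g, 1 if r > 0 else -1) for g, r in ranked[:top_n]]


def risk_score(row, signature):
    """Signed sum of the signature genes' expression (assumed scoring rule)."""
    return sum(sign * row[g] for g, sign in signature)


def make_cohort(n_patients, n_genes, n_informative, rng):
    """Synthetic stand-in for real data: the first n_informative genes are
    shifted upward in patients who died; every other gene is pure noise."""
    expr, died = [], []
    for _ in range(n_patients):
        outcome = rng.randint(0, 1)  # 1 = died, 0 = survived
        row = [rng.gauss(0, 1) for _ in range(n_genes)]
        for g in range(n_informative):
            row[g] += 1.5 * outcome
        expr.append(row)
        died.append(outcome)
    return expr, died


rng = random.Random(0)
expr, died = make_cohort(295, 200, 20, rng)

# Split as in the article: 144 patients for training, 151 for validation.
train_x, train_y = expr[:144], died[:144]
valid_x, valid_y = expr[144:], died[144:]

signature = select_signature(train_x, train_y, top_n=20)

# Cut-off between "good" and "poor" prognosis: median training score (assumption).
threshold = median([risk_score(row, signature) for row in train_x])

predicted = [int(risk_score(row, signature) > threshold) for row in valid_x]
accuracy = mean([int(p == y) for p, y in zip(predicted, valid_y)])
print(f"validation accuracy on synthetic data: {accuracy:.2f}")
```

The signature and the cut-off are fixed using the training group alone, and the validation group is touched only when accuracy is measured. The accuracy printed here reflects the artificial data and says nothing about the published results.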

The researchers found the Excel-based algorithm had slightly higher overall predictive accuracy than other predictive tests currently available.

"Our algorithm produces prediction models with comparable accuracy to other feature selection techniques while having generally better accessibility and usability for biological and biomedical research scientists," said Hassell, a professor in the Department of Biochemistry and Biomedical Sciences and director of the Centre for Functional Genomics at McMaster.

"This algorithm derives a very simple gene signature that predicts whether breast cancer patients have a good or bad prognosis. That, in turn, can indicate whether these patients receive chemotherapy or not."

Hallett said the algorithm can also be applied to other fields where gene expression profiling data is available.

"Most biologists know how to use Excel," he said. "And if you can come up with a gene signature that’s predictive, it’s really useful — not just in the cancer field, but in any disease field."