Researchers at the University of California, Los Angeles have developed a computer program that can, from a single blood sample, both detect cancer and identify where in the body it is located. The program, which the scientists call CancerLocator, works by measuring tumor DNA circulating in the blood.
The program looks for specific molecular patterns in tumor DNA that circulates freely in a patient's blood and compares those patterns against a database of tumor epigenetics, collated by the authors from different cancer types. DNA from tumor cells is known to enter the bloodstream in the earliest stages of cancer, so it offers a unique target for early detection of the disease.
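The comparison step can be pictured with a toy example. The sketch below is not the authors' algorithm. It assumes that each profile is a list of per-site marker levels between 0 and 1 (methylation levels, for instance), that a blood sample is a blend of a normal profile and one tumor profile, and that the best-fitting blend identifies the tissue. All names, reference values and the `min_fraction` threshold are hypothetical.

```python
from typing import Dict, List, Tuple


def fit_tumor_fraction(sample: List[float], normal: List[float],
                       tumor: List[float]) -> Tuple[float, float]:
    """Least-squares fit of sample ~ (1 - theta) * normal + theta * tumor.

    Returns (theta, squared_residual), with theta clipped to [0, 1].
    """
    num = sum((s - n) * (t - n) for s, n, t in zip(sample, normal, tumor))
    den = sum((t - n) ** 2 for n, t in zip(normal, tumor))
    theta = num / den if den > 0 else 0.0
    theta = min(1.0, max(0.0, theta))
    residual = sum((s - (n + theta * (t - n))) ** 2
                   for s, n, t in zip(sample, normal, tumor))
    return theta, residual


def locate(sample: List[float], normal: List[float],
           tumor_profiles: Dict[str, List[float]],
           min_fraction: float = 0.01) -> Tuple[str, float]:
    """Pick the tissue whose blend with the normal profile fits best.

    Falls back to "non-cancer" when no tissue explains the sample with a
    tumor fraction of at least min_fraction (a hypothetical cutoff).
    """
    best_label, best_theta = "non-cancer", 0.0
    # Baseline: how well the normal profile alone explains the sample.
    best_residual = sum((s - n) ** 2 for s, n in zip(sample, normal))
    for label, profile in tumor_profiles.items():
        theta, residual = fit_tumor_fraction(sample, normal, profile)
        if theta >= min_fraction and residual < best_residual:
            best_label, best_theta, best_residual = label, theta, residual
    return best_label, best_theta


# Made-up reference profiles over four marker sites.
normal_profile = [0.1, 0.8, 0.5, 0.2]
reference = {
    "liver": [0.9, 0.1, 0.5, 0.2],
    "lung": [0.1, 0.8, 0.9, 0.9],
}
# A simulated sample in which 10 percent of the DNA comes from a liver tumor.
blood_sample = [0.9 * n + 0.1 * t
                for n, t in zip(normal_profile, reference["liver"])]
label, fraction = locate(blood_sample, normal_profile, reference)
```

In this toy setup, a larger tumor fraction moves the sample further from the normal profile, which makes the best-fitting tissue easier to tell apart from the alternatives.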
“In general, the higher the fraction of tumor DNAs in blood, the more accurate the program was at producing a diagnostic result,” said UCLA Professor Jasmine Zhou, co-lead author of the paper. “Therefore, tumors in well-circulated organs, such as the liver or lungs, are easier to diagnose early using this approach than in less-circulated organs such as the breast.”
In the study, the new computer program and two other methods, Random Forest and Support Vector Machine, were tested with blood samples from 29 liver cancer patients, 12 lung cancer patients and 5 breast cancer patients. Tests were run 10 times on each sample to validate the results. The Random Forest and Support Vector Machine methods had overall error rates (the chance that the test produces a false positive) of 0.646 and 0.604, respectively, while the new program obtained a lower error rate of 0.265.
Twenty-five of the 29 liver cancer patients and 5 of the 12 lung cancer patients tested in this study had early-stage cancers, which the program was able to detect in 80 percent of cases. Although the level of tumor DNA present in the blood is much lower during the early stages of these cancers, the program was still able to make a diagnosis, demonstrating the potential of this method for the early detection of cancer, according to the researchers.
Zhou and her colleagues tested CancerLocator on real data from breast, liver, and lung cancer patients, although their model was developed to distinguish among those cancers, colon and kidney tumors, and non-cancer samples. Compared with the random forest and support vector machine approaches, it had the lowest error rate, at 0.265.
The research results were published in Genome Biology.