By Suzette Norris

Imagine if the same sort of round-the-clock patient care and monitoring found in an intensive care unit (ICU) happened in every hospital room for every single patient.

As machine learning techniques get better at detecting patient deterioration earlier and more accurately, the ICU as we know it may become a relic.

Medical personnel in today’s ICUs carefully monitor a hospital’s most fragile patients. But advances in machine learning could eventually extend that level of care to all patients by continuously monitoring them and accurately detecting even the most subtle signs of deterioration, said Vaibhav Rajan, a senior research scientist at the Xerox Research Centre India (XRCI). Vaibhav is developing novel statistical models and algorithms for disease risk prediction.

Put 4,000 Brains on the Problem

Vaibhav’s work inspired a hackathon that drew more than 4,000 university computer science students. The challenge: How can machine learning predict patient mortality in an ICU?


Hackathon winners: Team Deep Dreamers from IIT Kanpur (Kundan Kumar, Peeyush Agarwal, Sanil Jain). Their machine learning model achieved nearly 95 percent accuracy and identified patients at high risk of complications and mortality many hours before their actual death.*

Student teams from several Indian campuses, such as the Indian Institute of Technology and the Indian Institute of Science, took up the gauntlet. The hackathon also attracted students from around the world, including Stanford, the University of California, Berkeley, Johns Hopkins and Imperial College London. (Winning team, photo, right)

“HackerRank, which hosted the event, told us this was by far the highest registration for any machine learning-based challenge it has hosted,” Vaibhav said. “The difficulty of the challenge in terms of the data analysis required to build the models, and its potential to make a real impact on the world, I’m sure attracted the students.”

Why is it Difficult to Predict Death and Disease?

A labyrinth of issues makes it difficult to directly apply well-known mathematical models to make predictions about death and disease. In addition to privacy restrictions, healthcare data sets contain a lot of “noise” — inconsistencies or uncertainties — such as:

  • Sparse information that is difficult to interpret.
  • Lab investigations, such as blood glucose levels, that are not measured in all patients.
  • Errors caused by faulty equipment or by people incorrectly entering data into electronic medical records.
  • Information whose significance varies across patients and medical conditions.
  • Delays in information gathering. Initial entries that are paper-based, for instance, may not be entered into a patient’s electronic medical record for several hours.

So when will machine learning start affecting the everyday treatment of patients? Vaibhav estimates that it is some years away for the more “digitally progressive” hospitals, and perhaps a decade or more from becoming pervasive.*

“The barriers are not in the machine learning – which has progressed far enough to be useful in clinical decision support tools,” he said. “A key barrier is the availability of real-time, digital data in hospitals.” Another, he added, is the time it takes hospitals to adopt such tools in routine clinical practice.

Winning Students Pushed to Take on Tougher Challenge

Winners of XRCI’s data science challenge achieved close to 95 percent accuracy and were able to identify patients at high risk of mortality many hours, on average, before their actual death.

After the challenge, Xerox invited 25 India-based students from the nine highest-scoring teams to participate in a “Winter School” on machine learning, run by XRCI researchers along with invited academic speakers from India and the United States. The students also began work on phase two of the technical challenge, which required them to build models that could predict the risk of 16 chosen complications, such as stroke, pneumonia and heart failure.

“They also made use of data, such as medical ontologies (exact descriptions of things and their relationships) to classify drug names, and incorporated that information into their predictive models to boost accuracy levels,” Vaibhav said.*

By the end of the challenge, the students had created models that predicted some complications with more than 95 percent accuracy.

*The photo in this article was updated to show the 1st-place team. The previous photo was of team “decodejps,” which placed third — still, a most worthy accomplishment! This article was updated on March 8, 2016, in order to clarify the facts about medical data.