
How Big Data Improves Care at Children’s Healthcare of Atlanta

Posted on: Feb 05, 2015

What began as a limited use of Hadoop at Children’s Healthcare of Atlanta is becoming a full-fledged big data initiative that is helping the organization provide better care for patients and deliver information that could potentially help citizens of Georgia avoid health problems in the future.

Children’s Healthcare provides a variety of healthcare services for children throughout the state, operating three hospitals and more than 20 neighborhood locations including five urgent care centers.

The institution’s foray into Hadoop began in 2013. A clinical research project it was working on with Georgia Institute of Technology needed bedside vital-monitor data (including heart rate, blood pressure, respiratory rate and oxygen saturation) from Children’s Pediatric Intensive Care Unit (PICU). Georgia Tech wanted to use historical, granular data from the monitors to understand what effect, if any, the environment of care (such as noise and light) had on patient vital signs. With that information, caregivers could improve the environment of care by instituting quiet hours, moving noisy machines or redesigning care areas.

“Their timeframe was short, and we needed a solution quickly” to gather and analyze data, says Tod Davis, manager of business intelligence at Children’s Healthcare. “After investigating the data volume, flow and processing needs, it was clear that our current systems, already stretched to their max, could not handle the workload. Enter Hadoop.” Apache Hadoop is an open-source software framework for distributed storage and processing of large data sets on computer clusters.

Davis worked with an Oracle contractor to devise a plan to assemble a workstation-based cluster over a weekend. The first proof-of-concept -- a six-node cluster built out of 20 scavenged PCs from a hardware refresh -- was created in October 2013. “We nicknamed it ‘Frankendoop,’” Davis says. “Since then, we’ve migrated to an eight-node HP cluster and are currently implementing a 23-node Cisco cluster.”

The IT staff then began collecting data and sending it to Georgia Tech.

“About three months later I got a call from a nurse who said [the unit needed] to be able to understand what’s happening to babies when they have stressful procedures,” Davis says. “I said, ‘What if I told you we [collected] all this data….’ She couldn’t believe that we had the data.”

With the Hadoop tools already in place, Davis and his team created a new project for gathering and analyzing bedside vitals in real time.

The data analysis conducted as part of the project resulted in improved patient outcomes. It showed that the vital signs deviated from the baseline for much longer than anyone knew, Davis says. “Erratic or elevated vitals is an indicator of patient stress,” he says. “The clinicians are now aware of these extended stressful periods and can stay with the patient to comfort and assist them in recovery from the procedure.”
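
The kind of finding described above, vitals staying away from baseline longer than expected, can be illustrated with a small sketch. This is not the hospital's actual analysis: the function name, the fixed baseline and tolerance, and the sample readings are all assumptions made up for illustration.

```python
def deviation_periods(samples, baseline, tolerance):
    """Find runs of readings that fall outside baseline +/- tolerance.

    samples: list of (time_in_seconds, value) pairs, ordered by time.
    Returns a list of (start, end) pairs. A run starts at the first
    out-of-range reading and ends at the first in-range reading after it
    (or at the last reading if the run never ends).
    """
    periods = []
    start = None
    last_time = None
    for time, value in samples:
        outside = abs(value - baseline) > tolerance
        if outside and start is None:
            start = time                      # a deviation run begins
        elif not outside and start is not None:
            periods.append((start, time))     # the run ends here
            start = None
        last_time = time
    if start is not None:
        periods.append((start, last_time))    # run still open at end of data
    return periods


# Hypothetical heart-rate readings taken once a minute (not real patient data).
readings = [(0, 120), (60, 122), (120, 150), (180, 155),
            (240, 148), (300, 124), (360, 121)]

for start, end in deviation_periods(readings, baseline=121, tolerance=15):
    print(f"Deviation from {start}s to {end}s, lasting {end - start}s")
```

In practice the baseline would likely be derived per patient from earlier readings rather than passed in as a constant; a fixed value keeps the sketch short.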

The results led to a retraining of hospital staff to equip them to better assess and understand neonatal pain, agitation and sedation scale (N-PASS) scores, providing information to clinicians so they can improve pain management in premature babies.

“What really struck me the most is we took something that we had no idea what it might be and turned it into this really powerful story about technology and helping people, and specifically helping babies,” Davis says. “Much of this data was previously being thrown away. We thought it wasn’t useful or we didn’t have the storage capacity.”

In many cases the vital sign data wasn’t being saved beyond three days because of the cost of storage. “There is so much data [being gathered] that we only keep a tiny percentage in our electronic medical record and associated data warehouses,” Davis says. “We now have bedside vital data from October 2013 to the present. The word is out among the physician researchers at Children’s and demand for analysis is high and growing.”

The bedside vital data project was fortuitous from an IT standpoint, because Children’s Healthcare technology leaders had been eager to launch data analytics efforts and this proved to be a good starting point.

“We needed to get into Hadoop and a Hadoop-ready project landed in our lap,” Davis says. “What started as a minor technical solution to a temporary need has opened many doors to develop technical solutions to help take better care of our patients.”

The next Hadoop initiative the organization launched, which is ongoing, involves an asthma research study that combines 20 years of air quality data from the Environmental Protection Agency (EPA) with the hospital’s own asthma research. The goal: reduce emergency room visits and inpatient readmissions for asthma-related issues prevalent in pediatric populations.

The EPA has sensors located throughout Georgia that collect a variety of data related to weather conditions such as temperature, humidity, wind direction, particulates and pollutants.

Children’s Healthcare received permission from the EPA to pull all of that data, dating from 1985 to the present, into its Hadoop platform so it could analyze the information and graph the data across patient visits related to asthma or other respiratory conditions.

The study, which Children’s is also conducting with Georgia Tech, is aimed at helping to find the causes of readmissions for asthma.

“Part of the study is air quality data that can be correlated with patient admissions to the emergency room and the return rate of asthma patients who had been discharged,” Davis says. “Asthma is a big focus of pediatric healthcare.”
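
As a rough illustration of what correlating air quality with admissions can mean, the sketch below joins two daily series by date and computes a Pearson correlation coefficient. The article does not say which statistical method the study uses, so Pearson correlation is an assumption here, as are the function names, the date keys and every number in the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length number lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlate_by_date(air_quality, visits):
    """Correlate two {date: value} dicts over the dates they share."""
    shared_dates = sorted(set(air_quality) & set(visits))
    xs = [air_quality[d] for d in shared_dates]
    ys = [visits[d] for d in shared_dates]
    return pearson(xs, ys)


# Made-up daily particulate readings and asthma-related ER visit counts.
particulates = {"2014-07-01": 12.0, "2014-07-02": 30.5,
                "2014-07-03": 18.2, "2014-07-04": 41.0}
er_visits = {"2014-07-01": 3, "2014-07-02": 7,
             "2014-07-03": 4, "2014-07-04": 9}

print(f"r = {correlate_by_date(particulates, er_visits):.2f}")
```

A coefficient near 1 or -1 only indicates that the two series move together; it does not by itself show that air quality causes the visits, which is why a study of this kind would also need to account for factors such as season and patient history.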


Now that the organization has tested the Hadoop waters, big data and analytics have become a more strategic part of its IT operations, Davis says. Among the technology components supporting big data are Cloudera’s CDH5 Hadoop distribution platform; Cloudera Manager, a management application for Apache Hadoop; an HP eight-node cluster; an enterprise analytics and visualization tool from QlikView; Cloudera Hadoop CDH 5.3 Enterprise Data Hub; a Cisco six-node development cluster; and a Cisco 17-node production cluster.