Anonymized EMR-based Data Analysis – The Next “Big Thing” in Healthcare

One of the most important things to come out of the effort to digitize medical records is the ability to look at health data and understand relationships that might not have been previously evident. Advances in our understanding of disease, and of how best to treat it, can now unfold much faster than ever before.

Digitization of medical records is rapidly becoming mainstream. National policy, boosted by the CMS EHR Incentive Program, has helped accelerate this transition. New technologies, such as web-based Electronic Health Record (EHR) systems ideally suited to solo and small group practices, now give clinicians in all settings access to an EHR.

Data generated by these systems can be thought of in two ways: (1) protected health information (PHI), used by treating clinicians so that all important information about a patient is at hand at the point of care, including best-practice prompts for wellness and chronic disease management; and (2) anonymized population data, which can be used to study patterns and correlations.

Recognizing the tremendous value that can come from analyzing anonymized data, some very large-scale projects have emerged. The federal Department of Health and Human Services (HHS) has undertaken the Open Government Plan, which makes health data gathered by government agencies (such as the Centers for Disease Control and Prevention, and the Veterans Administration Hospital system) freely available to the public.

Also, in collaboration with Microsoft’s Windows Azure MarketPlace, Practice Fusion is offering a set of de-identified health information to help with this kind of research. Based on this data, we are sponsoring a developer’s challenge – “Analyze This” – as part of the upcoming Health 2.0 Conference in San Diego, hoping to encourage coders and analysts to use the data that is becoming available and identify important relationships and observations.

What kind of relationships might be found? As a rudimentary example, here is a quick study of anonymized health data from the Practice Fusion experience that relates the prevalence of diabetes to body mass index (BMI). BMI (weight in kilograms divided by the square of height in meters) is a commonly used marker of obesity, and its measurement and reporting are part of the Meaningful Use criteria. That diabetes risk increases with obesity is already well established in the medical literature, so this example yields no new, earthshaking findings, but it does show that real-world EHR data reproduce the known relationship.

In the following chart, a large anonymized sample is segmented into BMI bands, and the prevalence of diabetes (the percentage of patients in each band who carry a diagnosis of diabetes) is calculated for each band. Only patients 18 years and older are counted, since BMI itself is valid for adults, whereas pediatric measures use BMI percentile. These are “real-world” patients seen in physician practices across the entire country, particularly small group and solo primary care practices not necessarily affiliated with any hospital or institution.
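For readers who want to try this kind of banding on the de-identified data themselves, the calculation can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis behind the chart: the record field names (`age`, `weight_kg`, `height_m`, `has_diabetes`) are hypothetical, and the band edges are assumed to follow the standard adult BMI categories, since the exact bands used for the chart are not stated here.

```python
from bisect import bisect_right
from collections import defaultdict

# ASSUMPTION: band edges follow the standard adult BMI categories.
# The bands actually used for the chart are not specified in the article.
BAND_EDGES = [18.5, 25.0, 30.0, 35.0, 40.0]
BAND_LABELS = ["<18.5", "18.5-24.9", "25-29.9", "30-34.9", "35-39.9", ">=40"]


def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / (height_m ** 2)


def band_for(bmi_value):
    """Return the label of the band containing bmi_value (lower edge inclusive)."""
    return BAND_LABELS[bisect_right(BAND_EDGES, bmi_value)]


def diabetes_rate_by_band(patients, min_age=18):
    """Percentage of adult patients in each BMI band who have a diabetes diagnosis.

    `patients` is an iterable of dicts with the hypothetical keys
    age, weight_kg, height_m, and has_diabetes. Bands with no patients
    are left out of the result.
    """
    totals = defaultdict(int)
    diabetic = defaultdict(int)
    for p in patients:
        if p["age"] < min_age:
            continue  # pediatric patients use BMI percentile, so skip them
        label = band_for(bmi(p["weight_kg"], p["height_m"]))
        totals[label] += 1
        if p["has_diabetes"]:
            diabetic[label] += 1
    return {
        label: 100.0 * diabetic[label] / totals[label]
        for label in BAND_LABELS
        if totals[label]
    }
```

Calling `diabetes_rate_by_band(records)` on a list of such records returns a dictionary keyed by band label, which can be fed directly into any charting tool. Bands with very few patients will produce noisy percentages, so a real analysis would also want to report the count in each band.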

Of course, more detailed analysis would provide useful new insights: looking at risks by age, by geography, by the simultaneous occurrence of other diagnoses, or by correlation to external data elements (such as census-based socioeconomic status, or the density of fast-food restaurants in each geography). This is where interesting findings from the “Analyze This” developer’s challenge can emerge.

Anonymized health data, freely and publicly available, is emerging as a new resource in 2011, on a scale never before seen. We will be able to identify new risks, correlations, treatment efficacy, and all other elements of Comparative Effectiveness Research (now referred to as “patient-centered outcomes research”) more rapidly than ever before. The potential is staggering.

Robert Rowley, MD
Chief Medical Officer
Practice Fusion EMR