A new EHR dataset becomes available for an ambitious data challenge

Today, Practice Fusion is releasing a new HIPAA-compliant research dataset of 10,000 de-identified medical records at the Health Data Initiative conference in Washington, DC. The dataset includes information on lab results, diagnoses, medications, allergies, immunizations, smoking status, visits to the doctor, and vital signs.

We’re partnering with Kaggle, a platform for predictive data modeling competitions, to challenge developers, designers, data scientists and researchers to use this dataset to improve public health. We have two challenges:

  • Open Challenge: Competitors are invited to combine Practice Fusion’s clinical dataset posted on Kaggle with one or more public datasets available at www.data.gov. The mash-up can be used to create virtually anything. For example: a map of chronic disease across the country, a personal health app or a tool for running clinical trials. Submissions close on August 30, 2012.
  • Prediction Challenge: In the second challenge, Practice Fusion is soliciting ideas for prediction problems based on the dataset provided. Competitors can submit their prediction problems on Kaggle, using a newly launched tool for community voting. For example: a model to predict diagnosis of diabetes or cholesterol changes. Submissions for ideas close on June 30, 2012. One prediction problem will be selected from the top-voted submissions, and the resulting contest, open to the entire research community, will run from July 5 to September 10.

Practice Fusion will be giving away a total of $20,000 in cash prizes as well as beta access to our API, Dell computers, consultations with Practice Fusion’s founders, recognition on this site, and access to larger datasets.

At the Health Data Initiative conference, there is palpable excitement about the potential of publicly available health data to catalyze a new generation of apps, analyses and businesses that transform healthcare in the United States. The government has been releasing data in droves: everything from air quality reports to Medicare claims. Practice Fusion has long believed that big EMR data can lead to big improvements in public health. With a new, richer dataset and the Kaggle community of 40,000 data scientists, this competition is part of the growing movement to make healthcare better through data and innovation.