Reaches 3,000 Datasets: A Look Back and What’s Ahead

The site just passed 3,000 datasets! What a fantastic way to begin 2017 and to end an amazing year for this online data sharing platform.

As a matter of fact, the last several years have been extraordinary in health innovation for the U.S. Department of Health & Human Services (HHS) and across the ecosystem of innovators. I’m taking this opportunity, before I leave HHS, to reflect on a few of my colleagues’ accomplishments, share some uses of the data and cool ways to access more, and look ahead at the horizon for open data.

The site is the central catalogue and communications vehicle for the Health Data Initiative (HDI), offering access to HHS and other sources of health and social services data. These are high-value structured data holdings whose availability has become both a leadership tool for expanded knowledge and understanding across the Department and fuel for entrepreneurs to deploy amazing innovations. Success of the site is driven by broad-based enthusiasm for and commitment to the HDI in providing a diverse array of data, from disease surveillance and mortality data, to health care cost and quality, to substance abuse and mental health, and so much more.

It all began in 2010 when HHS, in collaboration with the Institute of Medicine and other external partners, held a one-day meeting featuring the work of 21 developers who were granted access to only 10 datasets with which they created new applications. Today, the Health Datapalooza gathers thousands of health innovators and data geeks to meet the data’s curators in a forum that gives birth to new ideas, helps connect individuals and companies that have similar experiences, and launches new businesses on strong foundations of how the data may be used more effectively. Our model at HHS has been to free the data and allow innovation to flourish.

For decades, before my time here, our health and social services systems strove to move away from high-cost, inefficient methods, navigating their way toward better care, smarter spending, and healthier people. Today, modern information technology, more intelligent health policy, and increased data availability are helping to re-engineer the health system. Our evolved health systems will deliver better care, where health professionals efficiently coordinate and have a full understanding of an individual’s needs; be smarter, paying for what works and spending our money more wisely; and help each of us be healthier, with engaged, empowered, IT-enabled patients, families, and communities. To my mind, data is the current propelling everything, and an important ingredient is government data that are openly available, generated to be usable, and intelligently applied to help solve some of our most vexing challenges in healthcare.

At HHS, the HDI was one of the first seismic culture changes that cracked open calcified government processes. By opening up HHS data to people who think differently about the data than we do, to innovators who see those data through a different lens than the one for which they were originally collected and curated, we’ve discovered even greater value from our data resources.

Accessing and Innovative Use of HHS Data

Now, over 3,000 datasets are available on the site, underscoring federal, state, and local government commitments to making huge volumes of information available to catalyze health innovation nationwide. HHS has made remarkable progress in providing more data and in using its own data effectively. A few examples include:

  • Centers for Medicare & Medicaid Services (CMS) announced the Virtual Research Data Center (VRDC), a new, secure way of accessing its program data. The CMS VRDC is a virtual research environment that provides timelier access to Medicare and Medicaid program data in a more efficient and cost-effective manner. Open data are also available through the Data Navigator &
  • The Department of Health and Human Services Office of Inspector General, along with our state and federal law enforcement partners, participated in the National Health Care Fraud Takedown in June 2016, the largest health care fraud takedown in history. Drawing on data from CMS and working in collaboration with the Department of Justice, the effort led to charges against approximately 300 defendants in 36 judicial districts for participating in fraud schemes involving about $900 million in false billings to Medicare and Medicaid.
  • Centers for Disease Control & Prevention (CDC) provides data about everything from outbreaks like Zika, Chikungunya, Flu, or Ebola to Chronic Disease & Health Promotion Data & Indicators and injury and mortality data. The statistical arm of CDC, the National Center for Health Statistics (NCHS), also grants researchers access to restricted data via its Research Data Center.
  • Food & Drug Administration (FDA) released openFDA which delivers high-value APIs (Application Programming Interfaces), making it easy for developers and researchers to query and build upon data related to drug adverse events, drug labeling, medical devices (including adverse events and device registration), and food recalls.
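
To give a flavor of what querying openFDA can look like, here is a minimal Python sketch that only builds request URLs and reads a count-style response; it does not call the service. Everything specific in it is an assumption rather than something stated in this post: the base URL, the `search`, `count`, and `limit` parameters, the example field names, and the helper names `build_openfda_url` and `summarize_counts`, which are made up for illustration. Check the openFDA documentation before relying on any of it.

```python
from urllib.parse import urlencode

# Assumption: the public openFDA base URL (not given in this post).
OPENFDA_BASE = "https://api.fda.gov"


def build_openfda_url(endpoint, search=None, count=None, limit=None):
    """Build a query URL for an openFDA endpoint such as 'drug/event'.

    Hypothetical helper: 'search', 'count', and 'limit' are assumed to be
    the query parameters the service accepts.
    """
    params = {}
    if search:
        params["search"] = search
    if count:
        params["count"] = count
    if limit is not None:
        params["limit"] = limit
    url = f"{OPENFDA_BASE}/{endpoint}.json"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


def summarize_counts(payload, top=3):
    """Return up to 'top' (term, count) pairs from a count-style response."""
    return [(row["term"], row["count"]) for row in payload.get("results", [])[:top]]


if __name__ == "__main__":
    # Adverse event reports mentioning a drug, tallied by reported reaction.
    print(build_openfda_url(
        "drug/event",
        search='patient.drug.medicinalproduct:"aspirin"',
        count="patient.reaction.reactionmeddrapt.exact",
    ))

    # Made-up payload in the assumed response shape, for illustration only.
    sample = {"results": [{"term": "NAUSEA", "count": 12},
                          {"term": "HEADACHE", "count": 7}]}
    print(summarize_counts(sample))
```

Keeping URL construction separate from the network call makes the query easy to inspect and test before sending it with whatever HTTP client you prefer.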

Data Takes Work

It’s important to acknowledge that government open data requires planning, cultivation, and hard work. High-quality, openly available data resources require information technology funding, skilled data-savvy staff, strong governance, and supportive leadership. Resource allocations have been a challenge for open data projects at every level of government. In the past, the availability of program data for public use was not a planned output of our programs; it was an afterthought. Government agencies have begun to adopt approaches that plan for the data’s alternative uses from the program’s inception. We’re working toward a government where data will be proactively planned for during the program planning stages. This new approach is already in the works through policy changes like those articulated in White House Circular A-130 regarding management of federal information resources. It requires, among other things, that open data be a budgeted component of many IT procurement requests.

HHS is working on stronger strategies for well-managed data in several ways. We’ve appointed Chief Data Officers (CDOs) because we recognize the need to truly manage our data resources as assets. Our CDOs are surrounding themselves with personnel who have an intense curiosity and passion for playing in the data as well as the ability to manipulate it to solve problems and tell stories. Many data management offices have dramatically updated their infrastructure to support a higher demand for faster, more effective participation in the growing health data innovation ecosystem.

I hope that in the near future CDOs and prolific data science teams will be more prevalent throughout HHS and at all levels of public service. I’m also hopeful the Department will appoint a Chief Data Officer who will strategize across the myriad initiatives, such as the DATA Act, Increasing Access to Federally Funded Scientific Research, the Open Data Policy, and Administrative Data for Statistical Purposes, to formulate an integrated strategy that the Department can execute to meet its goals.

During my tenure at HHS, I’ve learned that the public needs those of us in public service to provide timely, well-managed data resources to support the incredibly talented pool of health tech innovators, legions of researchers, data-driven media reporters, and so many more. We’re pushing to be more efficient, creative, and modern in how we deliver data resources and services while trying to get every barrier that we can out of your way so that you can do amazing things in health care. In turn, we need you to be on the cutting edge of innovation, pushing the data’s use to the limits and helping us prioritize where and how to release more. Use the data and tell us where it’s impactful. Those stories of the data’s use are part of the currency we use to sell open data’s far-reaching benefits to this country.

It has been my honor to serve here at HHS. I’ve learned way more than I ever imagined possible from the talented cadre of HHS Health Data Leads, program managers, and their staff who represent the immense energy behind the data’s production. It’s been my privilege to be a representative of their work, time, and dedication in validating the data’s quality and privacy while responding to public inquiries about the data resources. There’s much more to come from HHS, and I’ll continue to be energetically cheering for the Department’s progress in making data available and spurring innovation.