The Little Guy vs Big Data: Is ICD-10 Coding Still Valuable in Health Care?

This article originally appeared here.

In October 2015, physicians across the United States anxiously awaited the long-hyped transition from the International Classification of Diseases (ICD), 9th revision to the 10th revision (ICD-10). The Centers for Medicare and Medicaid Services (CMS), as well as insurance payers, warned physicians to have at least 3 months of revenue on hand to cover operating costs while the bugs were worked out.

Although ICD-10 was a new concept for physicians in the United States, the international variant has been available since the early to mid-1990s. Early adopters of the World Health Organization's (WHO) ICD-10 classification system include Brazil, the Czech Republic, the Netherlands, Russia, Sweden, and the United Kingdom, to name a few. However, one of the main differences between the US version of ICD-10 (ICD-10-CM) and the WHO version (ICD-10) is the sheer number of diagnostic codes included in the American classification system. That is, ICD-10 has approximately 14,400 codes, whereas ICD-10-CM has more than 144,000 codes.1

Our version of ICD-10 is so granular that it has been described as needlessly specific, absurd, and unnecessarily detailed.1 For example, ICD-10-CM includes codes for burn caused by water skis on fire (V91.07XA), prolonged stay in weightless environment (X52.XXXA or X52.XXXD), and being sucked into a jet engine for both initial and subsequent encounters (V97.33XA and V97.33XD, respectively).1 ICD-10-CM also differentiates between being bitten by a cow (W55.21), a sea lion (W56.11), or a parrot (W61.01).1 And if you ever find yourself in an argument with your in-laws, just remember, there's a code for that too (Z63.1).1

Given our unparalleled dependence on ICD-10 coding for reimbursement, it is no surprise that, even before its implementation, feverish debate raged regarding the utility, cost, and maintenance of such a complicated and possibly unnecessary coding system.1 In fact, some authors have gone so far as to argue that ICD-10-CM is by design a threat to small independent practices.1

This conclusion is not surprising when you consider that the major cooperating players in managing ICD-10-CM are an alliance of non-physician groups: CMS, the Centers for Disease Control and Prevention, the American Hospital Association, the American Health Information Management Association, 3M, and Blue Cross Blue Shield.1 Some of these groups are profit-driven behemoths in the world of medical billing. In contrast, of the 138 countries that use the WHO version of ICD-10, only 10 include the coding system in their reimbursement process; 6 of those 10 countries have universal health care systems.1


The debate is further complicated by advancements in big data, predictive analytics, and machine learning. Technologic progress over the last decade has allowed for cheaper and faster data storage, more powerful computation, and greater automation through artificial intelligence. In the corporate world, these breakthroughs have allowed marketing companies to unlock answers about consumer behaviors by collecting data on every accessible aspect of individual lives. It seems that corporate America might have something to teach us about using technology to predict and improve clinical outcomes efficiently.

Some argue that this is precisely the point of ICD-10 implementation. Its granular nature allows, in theory, the accurate labeling of diagnoses required for both billing and observational analysis.2 However, the value and utility of those data are highly dependent on how accurately and effectively the classification system is used.3 As many opponents of ICD-10 point out, in clinical practice, coding is often “inconsistent, inaccurate, and incomplete.”2 ICD-10-CM classification also does not allow clinicians to express “clinical concern” when there is insufficient, incomplete, or inconclusive evidence to support a firm diagnosis.3 Moreover, because the codes are collected for billing purposes, some argue that their use in research is “intrinsically flawed.”1-3

If, for example, a clinically relevant ICD-10 code is not useful for billing purposes, it is likely to be left out of the coding altogether. On the other hand, a useful billing code, despite being clinically irrelevant, may prompt coders to ask physicians to add the code and amend the clinical documentation.3 Along those lines, physicians are often interrupted and forced, by the electronic health record, to select a code that may not accurately represent the medical issue at hand just to move on with their work. These interruptions are further exacerbated by other impediments to efficient workflows, including expired code warnings, lack of coverage warnings, specificity prompts, retrospective prompts, or the simple inability to find the right code.3

In an article published in Chest highlighting the continued value of ICD-10 in the big data era,2 Mark G. Weiner, MD, Assistant Dean of Informatics and professor of clinical sciences and medicine at the Temple University Lewis Katz School of Medicine in Philadelphia, Pennsylvania, points out that “much of the hate, misdirected at the coding itself, is more appropriately directed at the billing requirements tightly linked to coding, and the work of doing the coding.”2 In a way, he may be conceding that modern analytics can offer an improvement to healthcare operations and ultimately patient care. Over the past decade, advancements in natural language processing and machine learning have yielded software that can automate the process of selecting an ICD, procedural, or diagnosis-related group code by analyzing the clinical documentation.3 Autocoding can allow for even more granular data than are currently available in ICD-10. Those data are free of billing bias or user coding errors that might confound the conclusions drawn from them. The improved accuracy and reliability of software-driven solutions to medical coding could improve patient care in much the same way autonomous vehicles may someday improve safety over human drivers.3
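To make the idea of autocoding concrete, here is a deliberately simplified sketch in Python. It is not how commercial autocoders work; those rely on natural language processing and machine learning, whereas this toy example only matches fixed phrases. The phrase table and the function name `suggest_codes` are hypothetical, and the codes are the ones quoted earlier in this article.

```python
import re

# Hypothetical phrase-to-code table for illustration only.
# The codes are those quoted earlier in this article; a real
# autocoder would use a trained model, not a fixed lookup.
PHRASE_TO_CODE = {
    r"\bbitten by (a |the )?cow\b": "W55.21",
    r"\bbitten by (a |the )?sea lion\b": "W56.11",
    r"\bbitten by (a |the )?parrot\b": "W61.01",
    r"\bin-laws\b": "Z63.1",
}


def suggest_codes(note: str) -> list[str]:
    """Return candidate codes whose trigger phrase appears in the note."""
    text = note.lower()
    return sorted(
        code
        for pattern, code in PHRASE_TO_CODE.items()
        if re.search(pattern, text)
    )


if __name__ == "__main__":
    note = "Patient was bitten by a parrot during an argument with in-laws."
    print(suggest_codes(note))
```

Even this crude version shows why the approach is attractive: the suggestion comes from the documentation itself, not from what is convenient to bill.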

Critics of big data argue that they are far from perfect, especially when based on electronic medical records. For example, issues like copying and pasting between clinical notes may result in the perpetuation of clinical findings and diagnoses that are no longer relevant.2 This type of noise in datasets can result in abnormal, unexpected, and inaccurate findings. However, these are human problems with automatable solutions. Software can be written to spot imperfections and identify issues with electronic health record data that humans would not be capable of identifying. Software algorithms could handle the coding more accurately and with less bias than humans, but they could also make ICD coding obsolete.3 Currently, we use clinical data to generate databases of ICD-10 codes, but machine learning could replace coding altogether by using those underlying data (notes and laboratory results) as the database. This is a far more granular, bias-free approach than having to recode that information into ICD-10-CM codes.
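As one illustration of an automatable solution to the copy-and-paste problem, the sketch below flags sentences in a later note that closely match sentences in an earlier one. It uses Python's standard `difflib` module. The function names, the naive period-based sentence splitting, and the 0.9 similarity threshold are assumptions made for this example, not a validated method.

```python
from difflib import SequenceMatcher


def split_sentences(note):
    """Naive sentence splitter: breaks on periods (an assumption for this sketch)."""
    return [s.strip() for s in note.split(".") if s.strip()]


def copied_sentences(earlier_note, later_note, threshold=0.9):
    """Return sentences in later_note that closely match any sentence in earlier_note.

    The 0.9 threshold is an arbitrary illustrative value.
    """
    earlier = split_sentences(earlier_note)
    flagged = []
    for sentence in split_sentences(later_note):
        for old in earlier:
            ratio = SequenceMatcher(None, sentence.lower(), old.lower()).ratio()
            if ratio >= threshold:
                flagged.append(sentence)
                break  # one match is enough to flag this sentence
    return flagged


if __name__ == "__main__":
    day1 = "Patient reports fever and cough. Lungs clear on exam."
    day5 = "Patient reports fever and cough. Fever resolved, started physical therapy."
    print(copied_sentences(day1, day5))
```

A flagged sentence is not necessarily wrong, since some findings do persist between visits, but it marks text that a reviewer or downstream model may want to weigh less heavily.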

Maybe we are asking the wrong question. Asking whether ICD-10 remains important in the era of big data is like asking whether a bicycle is still useful in the era of autonomous cars. Sure, there may be some uses for ICD-10, but there is nothing novel about it. The CMS ICD-10 implementation was yet another example of the long-standing tradition in medicine of slowly rolling out old technology as a novel tool for improving health care.

Big data are rapidly evolving, and we are underusing them. Rather than focusing on an outdated process of classification and coding, we should be asking ourselves how to improve healthcare operations in this era of automation and machine learning. We ought to seek to disrupt the status quo. New and innovative strategies aimed at a more efficient and less expensive healthcare system can help power better clinical predictions and improve clinical outcomes. In some ways, ICD-10-CM may be an improvement on an old process, but the revolution that will forever change health care is big data.


  1. Kaye AD, Singh V, Boswell MV, Manchikanti L. The tragedy of the implementation of ICD-10-CM as ICD-10: is the cart before the horse or is there a tragic paradox of misinformation and ignorance? Pain Physician. 2015;18(4):E485-E495.
  2. Weiner MG. POINT: Is International Statistical Classification of Diseases and Related Health Problems, 10th Revision diagnosis coding important in the era of big data? Yes [published online February 1, 2018]. Chest. doi:10.1016/j.chest.2018.01.025
  3. Liebovitz DM, Fahrenbach J. COUNTERPOINT: Is the International Statistical Classification of Diseases and Related Health Problems, 10th Revision diagnosis coding important in the era of big data? No [published online February 5, 2018]. Chest. doi:10.1016/j.chest.2018.01.034
