Big Data is a general term for very large volumes of structured or unstructured data. It’s that simple.
Unstructured data is a generic term used to describe data not contained in a database or some other type of prescribed data container. Examples of unstructured data are claim adjuster and medical case manager notes. It can also include emails, videos, social media, instant messaging and other free-form types of input. Gaining reliable information from unstructured data is significantly more difficult than from structured data.
Structured data is housed in a predefined container, such as a database, with a specified format for organizing and storing it. It is organized for a specific purpose so that it can be accessed, manipulated, and mined for information.
To evolve ordinary data into Big Data in Workers’ Compensation, data from multiple silos must first be integrated. The industry uniquely maintains claim-related data in separate places such as bill review, claims systems, utilization review (UR), medical case management, and pharmacy or pharmacy benefit management (PBM). While integrating data is an achievable task, other issues remain.
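As a rough illustration of what integrating silos involves, the sketch below combines extracts from three systems into one record per claim. It assumes every silo carries a shared claim identifier; the field names (claim_id, billed_amount, drug_name) and all values are hypothetical, and real systems will often need identifier matching that is far less tidy.

```python
from collections import defaultdict

# Hypothetical extracts from three silos; every field name and value is invented.
claims_system = [
    {"claim_id": "C100", "injury_date": "2014-03-02", "body_part": "lower back"},
    {"claim_id": "C200", "injury_date": "2014-05-17", "body_part": "knee"},
]
bill_review = [
    {"claim_id": "C100", "billed_amount": 1250.00},
    {"claim_id": "C100", "billed_amount": 310.50},
    {"claim_id": "C300", "billed_amount": 95.00},  # no matching claim record
]
pharmacy = [
    {"claim_id": "C200", "drug_name": "ibuprofen"},
]


def integrate(claims, bills, scripts):
    """Combine silo extracts into one record per claim_id."""
    merged = defaultdict(lambda: {"claim": None, "bills": [], "prescriptions": []})
    for row in claims:
        merged[row["claim_id"]]["claim"] = row
    for row in bills:
        merged[row["claim_id"]]["bills"].append(row)
    for row in scripts:
        merged[row["claim_id"]]["prescriptions"].append(row)
    return dict(merged)


integrated = integrate(claims_system, bill_review, pharmacy)

# Bills or prescriptions that point at a claim the claims system does not know.
orphans = sorted(cid for cid, rec in integrated.items() if rec["claim"] is None)
print("claims after integration:", len(integrated))
print("records with no claim on file:", orphans)
```

Even in this toy case, integration surfaces a bill tied to a claim that does not exist in the claims system, which is the kind of issue that remains after the silos are joined.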
Poor quality data
Unfortunately, much of the existing data in this industry has quality issues. Data entry errors, omissions, and duplications occur frequently, and if left uncorrected, they will naturally become part of Big Data. Poor data quality is amplified when it is promoted to Big Data.
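The three problems named above can each be checked for before data is combined. The following is a minimal sketch, not a complete data quality program: the bill records, the field names, and the rule that a repeated claim, date, and amount counts as a duplicate are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical bill records; field names and values are invented for illustration.
bills = [
    {"bill_id": "B1", "claim_id": "C100", "service_date": "2014-03-05", "amount": 120.0},
    {"bill_id": "B2", "claim_id": "C100", "service_date": "2014-03-05", "amount": 120.0},
    {"bill_id": "B3", "claim_id": "", "service_date": "2014-04-01", "amount": 80.0},
    {"bill_id": "B4", "claim_id": "C200", "service_date": "2014-13-40", "amount": 60.0},
]

REQUIRED = ("claim_id", "service_date", "amount")


def find_omissions(rows):
    """Return bill_ids with any required field missing or empty."""
    return [r["bill_id"] for r in rows
            if any(r.get(f) in (None, "") for f in REQUIRED)]


def find_duplicates(rows):
    """Return bill_ids that repeat an earlier claim/date/amount combination."""
    seen = set()
    dupes = []
    for r in rows:
        key = (r["claim_id"], r["service_date"], r["amount"])
        if key in seen:
            dupes.append(r["bill_id"])
        seen.add(key)
    return dupes


def find_bad_dates(rows):
    """Return bill_ids whose service_date is not a valid ISO calendar date."""
    bad = []
    for r in rows:
        try:
            date.fromisoformat(r["service_date"])
        except ValueError:
            bad.append(r["bill_id"])
    return bad


print("omissions:", find_omissions(bills))
print("duplications:", find_duplicates(bills))
print("entry errors:", find_bad_dates(bills))
```

Checks like these are cheap to run on one system's data. Run after integration, the same problems have to be traced back through every silo that contributed a record.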
Value of Big Data
The reason Big Data is so attractive is that it provides the quantity of data necessary for reliable analytics and predictive modeling. More data is better because analysis is statistically more valid when it is informed by more occurrences. Nevertheless, greater volumes of data cannot produce the desired information if the data is wrong.
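One way to see both halves of that argument is the standard error of an average, which equals the standard deviation divided by the square root of the number of records. The sketch below uses that formula with invented figures: a spread of 400 dollars in bill amounts and a systematic overstatement of 50 dollars caused by entry errors. Both numbers are hypothetical.

```python
import math


def standard_error(std_dev, n):
    """Standard error of a sample mean: std_dev / sqrt(n)."""
    return std_dev / math.sqrt(n)


STD_DEV = 400.0  # hypothetical spread of bill amounts, in dollars
BIAS = 50.0      # hypothetical systematic error from bad data entry, in dollars

for n in (100, 10_000, 1_000_000):
    se = standard_error(STD_DEV, n)
    # Random uncertainty shrinks as n grows; the systematic error does not.
    print(f"records={n:>9,}  random uncertainty={se:7.2f}  systematic error={BIAS:.2f}")
```

The random uncertainty falls as the record count rises, while the systematic error stays the same size regardless of volume. With enough records, the bad data becomes the dominant source of error in the result.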
Predicting that a devastating earthquake will occur in the next twenty-five years does not generate urgency. Likewise, knowing “clean” Big Data will be needed to remain competitive and viable in the future does not inspire aggressive corrective action now. But it should.
Correcting smaller data sets is easier than trying to fix huge data sets. It may not even be possible to adequately cleanse Big Data. Moreover, preventing erroneous data before it occurs is an even better approach. Data quality should be valued. Those responsible for collecting data, whether manually or electronically, should be held accountable for its accuracy. Existing data should be evaluated and corrected now to create complete and accurate data. Doing so will speed migration to Big Data without drowning in Big Bad Data.
Karen Wolfe is the founder and President of MedMetrics®, LLC, a Workers’ Compensation medical analytics and technical services company. MedMetrics offers online apps that super-charge medical management by linking analytics to operations to make them actionable. email@example.com