Unlocking the power of dark data

Published: October 10, 2019

What is dark data?

Dark data, in layman's terms, comprises every click and move made by an organization while conducting business. However, organizations rarely use this data beyond the immediate requirement. Companies often collect it as a precaution for regulatory and compliance purposes, after which it lies idle in storage. Since storing dark data is cheaper than analyzing it, organizations find it convenient to keep it indefinitely, and valuable insights are lost over time.

The age-old quest for exploration of unknown realms forms the fundamental basis of scientific research and development. Research primarily involves three key steps: observation, recording, and inference, with inference being the most significant element of the entire process.

With the exponential growth of the research community, scientists have observed and recorded events over the decades, eventually leading to a data explosion. This enormous volume of data can be classified into three forms: structured, semi-structured, and unstructured. Unstructured data is the centerpiece of the puzzle when it comes to unlocking invaluable insights.

The advent of data analysis has helped unearth the root causes of numerous business challenges, from customer pain points to linchpins in a supply chain. While data has immense power to transform organizations, are we able to harness its full potential?

Putting the spotlight on dark data

Experts estimate that about 85% of the data recorded is unstructured in nature, comprising logs, large volumes of text, and unlabeled images. According to Gartner, unstructured information assets that organizations collect, process, and store during regular business activities, but generally fail to use for other purposes, are referred to as dark data.

For a long time, dark data remained unexplored territory, often used only as a backup for compliance teams. This was primarily due to the way the data is stored and handled. While structured data is represented in a tabular format in relational databases (such as Oracle and other SQL databases), unstructured data is typically stored in non-relational databases (such as MongoDB), which have become popular only recently. One size doesn't fit all, and dark data certainly requires a different treatment if significant inroads are to be made into the volumes collected.
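
To make the difference in treatment concrete, here is a minimal Python sketch. The order record, the log line, and the `parse_log` helper are all made up for illustration and are not taken from any particular system. The structured record can be queried by field name right away, while the free-text line needs a parsing step before it has any fields at all.

```python
import re

# Structured: a fixed set of named columns, like one row of a relational table.
# (Hypothetical example record.)
structured_row = {"order_id": 1001, "customer": "ACME", "amount": 250.0}

# Unstructured: a free-text log line with no declared schema.
# (Hypothetical example line.)
log_line = "2019-10-10 12:03:11 WARN checkout failed for customer ACME after 3 retries"


def parse_log(line: str) -> dict:
    """Split a log line into timestamp, level, and message.

    Assumes the layout '<date> <time> <LEVEL> <message>'. Lines that do not
    match are returned as a bare message so that nothing is silently dropped.
    """
    match = re.match(r"(\S+ \S+) (\w+) (.*)", line)
    if match is None:
        return {"message": line}
    timestamp, level, message = match.groups()
    return {"timestamp": timestamp, "level": level, "message": message}


if __name__ == "__main__":
    # The structured record is usable as-is.
    print(structured_row["amount"])
    # The log line only becomes queryable after parsing.
    print(parse_log(log_line)["level"])
```

Even this toy parser depends on an assumed line layout. Real logs, documents, and images vary far more, which is why unstructured data calls for extra skill and tooling.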

Mining unstructured data needs more attention than traditional data mining techniques. What does it take to work with this kind of data?

  • Sophisticated skill at a user level
  • Robust infrastructure to handle the volumes
  • Domain expertise to tap the sweet spots
  • Advanced Machine Learning (ML) algorithms that can read between the lines

Dark data in life sciences

Every domain, organization, and department has its own scale and method of storing dark data. In the life sciences industry, it is common practice to lock away a doctor's prescription once the patient is cured. Minimal time and effort are invested in collecting clinical notes and analyzing the physician's recordings for hidden patterns. As a single prescription, the document might not offer much value, but analysis of similar notes from a population group can reveal a wealth of insights about that group. These insights can offer critical clues on various aspects, such as:

  • Comorbidity analysis
  • Common medications
  • Symptom analysis
  • Treatment pathways
  • Patient outcomes
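
As a rough illustration of the first item, the sketch below counts how often pairs of conditions are mentioned together across a set of free-text notes. It is a simplified example under stated assumptions: the condition list and the sample notes are invented, and plain substring matching stands in for the clinical terminology and natural language processing a real system would need. For instance, it would wrongly count "no diabetes" as a mention.

```python
from collections import Counter
from itertools import combinations

# Hypothetical vocabulary of conditions to look for in the notes.
CONDITIONS = ["diabetes", "hypertension", "asthma", "obesity"]


def conditions_in_note(note: str) -> set[str]:
    """Return the known conditions mentioned in one free-text note."""
    text = note.lower()
    return {condition for condition in CONDITIONS if condition in text}


def comorbidity_counts(notes: list[str]) -> Counter:
    """Count how often each pair of conditions appears in the same note."""
    pairs = Counter()
    for note in notes:
        # Sort so each pair is always counted under the same key.
        found = sorted(conditions_in_note(note))
        pairs.update(combinations(found, 2))
    return pairs


# Made-up notes, for illustration only.
SAMPLE_NOTES = [
    "Patient with type 2 diabetes and hypertension, advised diet changes.",
    "History of asthma. No other complaints.",
    "Hypertension and obesity noted; diabetes under control.",
]

if __name__ == "__main__":
    for pair, count in comorbidity_counts(SAMPLE_NOTES).most_common():
        print(pair, count)
```

A single note contributes very little here. The pair counts only become informative once many notes from the same population are pooled.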

Convergence of IoT and dark data

While IoT is unlocking new possibilities for the life sciences industry, it will also add to the existing pile of data, as sensors and devices flood storage systems across the world with information about both man and machine.

A report suggests that IoT devices may add 269 times more data than is currently available. Of this, 80% will be dark data, which no organization intending to make an impact in today's competitive world can afford to ignore.

IoT devices will continue to play a significant role in healthcare advancements by turning dark data into useful insights.


By tapping into non-clinical dark data (available on IoT devices) such as patient location, healthcare spending, and social media activity, providers can deliver better health outcomes. Insights uncovered from dark data enable them to offer the right treatment to the right patient at the right time. Personalized services not only generate new revenue streams for providers but also improve the efficacy of treatment for patients.

Digging the gold mine of dark data

With the help of artificial intelligence and machine learning, dark data that is lying dormant can be turned into insights for the healthcare and life sciences industries far more quickly and easily than before. These insights can pave the way for innovation, drug discovery, and research and development. In a nutshell, dark data is a potential gold mine of healthcare insights just waiting to be explored!
