Getting a handle on data sprawl through classification and discovery - By Prashanth GJ, CEO, Technobind

Data has been awarded many titles by now: the new oil, the biggest business asset and so on. Whatever we like to call it, the power of data to drive transformational change for businesses and for society as a whole is undisputed.

While data is at the core of almost all the business, technology and customer initiatives that enterprises are embarking on, it is also one of the hardest beasts to tame because of the overwhelming volumes we are forced to deal with.

As Mary Meeker (publisher of the widely consumed annual Internet Trends Report and often referred to as the Queen of the Internet) famously said, “If it feels like we’re all drinking from a data firehose, it’s because we are.”

Data growth leading to data sprawl

Quite interestingly, data growth has been shattering predictions and shooting up at unprecedented rates. According to IDC estimates, newly created data volumes in 2020 were predicted to see 44X growth from 2010 and reach 35 zettabytes. By 2018, we had already touched the 33 zettabyte milestone, following which IDC predicted that 175 zettabytes of new data would be created by 2025.

This is both good news and bad news. For the optimists, this means even more data available to analyse and draw insights from, driving better customer experiences, new revenue models and much more.

The flip side is that a lot of this data is unstructured, raw, and too complex for traditional data processing software to act upon. Most importantly, it consumes outrageous amounts of expensive storage. It is estimated that close to 80 percent of any organization’s data remains unstructured, fragmented, and unused.

Today’s hybrid multi-cloud environments and distributed workforces are only adding further complexity to this conundrum. Imagine this: organizations, on average, use close to 30 different cloud services today! IT teams are having a hard time knowing where sensitive data is stored and processed across such disparate environments.

As enterprises continue to capture and generate increasing volumes of data, stored across silos, data sprawl is emerging as the biggest concern.

Data classification to tackle detrimental data sprawl

Data sprawl is not just about increased cost, management complexity and difficulty in leveraging the data. It is even more about security and compliance, as data becomes much harder to keep track of. IT departments are grappling with the challenge of developing visibility across data silos. Most organizations struggle to achieve that visibility and hence lack the ability to protect critical data. How do they protect something that can barely be located?

It becomes all the more challenging for IT teams to eliminate unauthorized access to sensitive data, which puts the organization at higher risk of non-compliance with a number of regulations, be it GDPR, HIPAA or anything else. If you do not have complete visibility into your data, both on-premises and in the cloud, you are certainly putting yourself at risk from a compliance perspective.

Complexity around data sets is cited as the biggest challenge for CIOs and CISOs in deploying effective security and compliance practices within their organizations.

This is where Data Discovery and Classification solutions are increasingly becoming relevant. Classification technologies are helping enterprise IT and security teams to understand what data is stored where, offering complete visibility into sensitive data across heterogeneous data stores.
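
To make the idea concrete, here is a minimal sketch in Python of what "understanding what data is stored where" can look like at its simplest. It is an illustration only, not how any particular product works: the two detection rules, the function names and the sample data stores are all assumptions made for this example, and commercial tools rely on far richer techniques.

```python
import re

# Hypothetical detection rules, chosen only for illustration.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def luhn_valid(candidate: str) -> bool:
    """Apply the Luhn checksum to weed out digit runs that are not card numbers."""
    digits = [int(c) for c in candidate if c.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return len(digits) >= 13 and checksum % 10 == 0


def classify(text: str) -> set[str]:
    """Return the sensitivity labels detected in one piece of content."""
    labels = set()
    if PATTERNS["email"].search(text):
        labels.add("email")
    for match in PATTERNS["card_number"].finditer(text):
        if luhn_valid(match.group()):
            labels.add("card_number")
    return labels


def discover(stores: dict[str, dict[str, str]]) -> dict[str, set[str]]:
    """Scan every item in every store and report where sensitive data lives."""
    report = {}
    for store_name, items in stores.items():
        for item_name, content in items.items():
            labels = classify(content)
            if labels:
                report[f"{store_name}/{item_name}"] = labels
    return report


# Made-up stores standing in for a cloud bucket and an on-premises file share.
sample_stores = {
    "cloud_bucket": {"leads.txt": "Reach jane@example.com for the renewal."},
    "file_share": {
        "payments.txt": "Card on file: 4111 1111 1111 1111",
        "notes.txt": "Order id 1234 5678 9012 3456 shipped.",
    },
}
sensitive_locations = discover(sample_stores)
```

In this sketch, `sensitive_locations` ends up mapping each store and item to the labels found in it, which is the kind of inventory that gives security teams a starting point for access control and compliance reviews. The checksum step shows why classification is more than pattern matching: the order id in `notes.txt` looks like a card number but fails the check, so it is not flagged.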

From a compliance point of view, data classification helps organizations understand risks, uncover gaps, and make better decisions about aspects such as third-party data sharing and cloud usage.

For the partner community, this is indeed a great opportunity. Multicloud usage is exploding and organizations today have highly distributed workforces. Even traditional organizations are aspiring to become digital businesses, which in turn exacerbates the compliance problem.

At the same time, the regulatory landscape is getting highly complex. Non-compliance can cost organizations dearly and every single organization is looking at quality, compliance, and security at scale.

The time is ripe for IT solution providers to take the data conversation beyond storage and data management. Compliance, privacy, and security have clearly become critical priorities for enterprises as they navigate the global disruption. Data classification thus has the potential to become a business-critical conversation in the coming days.