The sheer volume of structured and unstructured data in the world has increased manifold over the past couple of decades, hence the term “Big Data”. However, a recent survey of business executives found that up to 78% of them struggle to make data-driven decisions. So is Big Data really useful for business decision-making? Big Data is like a beast that must be tamed before it can be of any use.
Dimensions of Big Data:
The three dimensions (or three Vs) of Big Data demonstrate the need for managing data:
- Volume: With the advent of so many new technologies, businesses now hold large volumes of data that could provide them with opportunities to grow. They need to find a way to tap into this “dark” data.
- Velocity: Data is now generated very quickly. If it cannot be processed quickly enough to respond to a business opportunity, its relevance is sometimes lost.
- Variety: Data now comes in many different forms. For example, a social media post can be a photo, a video or text, and it can relate to any subject.
In order to make use of any of the data available, appropriate management of the data lifecycle is needed. Otherwise, an organisation could incur unnecessary and huge costs. A generic lifecycle could be described in the following stages:
- Creation: Data is created or acquired in any form such as a social media post, or a sales transaction leading to the generation of a sales invoice.
- Storing and Sharing: The information contained in the data created in the first stage is subject to numerous considerations, including security and privacy concerns. Therefore, appropriate backups should be made and the data should be distributed to relevant users only.
- Processing and Using: Most data is unstructured or semi-structured. An appropriate digital transformation strategy helps a company give structure to its data. Users are then able to analyse the data and use it for their needs.
- Archiving: Data does not remain relevant forever. Much of it becomes old and does not need to be readily accessible after a certain period of time. Sometimes, regulatory considerations may also apply such as a company being required to maintain its accounting source documents for 10 years.
- Destruction: Eventually, the storage costs of such large volumes of data would become too high, not only in monetary but also environmental terms. Therefore, this data should be purged when no longer needed.
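The archiving and destruction stages above come down to comparing the age of a piece of data with a retention policy. The following is a minimal sketch of that idea in Python, not a definitive implementation. The names `DataRecord`, `Stage`, `add_years` and `lifecycle_stage` are hypothetical, the one-year "active" window is an assumption made for illustration, and the 10-year retention period is taken from the accounting example above.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Stage(Enum):
    ACTIVE = "active"    # still being stored, shared, processed and used
    ARCHIVE = "archive"  # kept, but no longer readily accessible
    DESTROY = "destroy"  # retention period is over, so it can be purged


@dataclass(frozen=True)
class DataRecord:
    """Hypothetical record; the default periods are illustrative only."""
    name: str
    created: date
    active_years: int = 1      # assumption: data is in active use for one year
    retention_years: int = 10  # e.g. accounting source documents kept 10 years


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def lifecycle_stage(record: DataRecord, today: date) -> Stage:
    """Classify a record by its age relative to the retention policy."""
    if today >= add_years(record.created, record.retention_years):
        return Stage.DESTROY
    if today >= add_years(record.created, record.active_years):
        return Stage.ARCHIVE
    return Stage.ACTIVE


# Usage: a sales invoice created on 1 March 2015
invoice = DataRecord("sales_invoice_0001", date(2015, 3, 1))
print(lifecycle_stage(invoice, date(2015, 9, 1)))  # Stage.ACTIVE
print(lifecycle_stage(invoice, date(2020, 3, 1)))  # Stage.ARCHIVE
print(lifecycle_stage(invoice, date(2025, 3, 1)))  # Stage.DESTROY
```

In practice the periods would come from the organisation's own policy and the regulations that apply to it, rather than from fixed defaults like these.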
Mismanagement of this lifecycle can pose various risks to an organisation. These include regulatory and reputational risks in case of a data breach, and financial risks in case of data loss or disproportionate backup and recovery costs. Timely processing of data is also important in order to make effective use of it.
The role of Data Science:
A data science project is quite similar to a classic project: it is kicked off to address a business need and follows roughly the same project management logic as any other project. A team of data analysts and data scientists can add real value to the processing and analysis of the Big Data used in a company. A typical data science project life cycle starts by identifying a business pain point that needs a solution. Team members with expertise in different domains then come together with their business and data understanding, while a data scientist plays an integral role in steering the project in the right direction.
Challenges for Executives:
Many organisations are not yet prepared for the changes that the Fourth Industrial Revolution will bring. Business needs are evolving with new technologies, and real-time insight is now needed. About two-thirds of business and IT leaders surveyed in 2020 expect the quantity of their data to increase fivefold by 2025. Yet 86% of organisations do not expect to be prepared in time for the new Data Age. Lack of data is no longer the problem. Rather, the primary concerns are the timeliness and quality of the data, without which it is of little value to a decision-maker.
This is where data science will be of prime importance in the coming years, since Big Data will have to be analysed in a way that makes it useful for a company's survival. Data will become one of the most valuable assets for any company. The question is not whether Big Data is useful, but how well an organisation will be able to use it.
The author, Aamina Khan, who is also the editor of Ed-watch, is an international polyglot citizen who likes to explore the world differently. A Chartered Accountant by profession, she likes to read and write in various languages as an amateur.