Data Manager
What is big data?
The term Big Data refers to data sets so large that traditional database and information management tools cannot handle them effectively.
This information comes from a multitude of sources: the messages we send each other, the videos we post, weather information, GPS signals, online purchase transaction records, and much more. This data is Big Data.
Big Data is increasingly an integral part of company strategy for data management and usage. Below, we look at the issues that emerge from these massive volumes of data.
The 4 fundamental Vs
Big Data specialists, especially at IBM, define Big Data by the following four Vs: Volume, Variety, Velocity and Veracity. These four dimensions characterize and distinguish big data from ordinary data.
#1 Volume
The main characteristic of big data is volume. The term is indeed taken directly from the immense mass of data generated on a daily basis.
According to IBM, an average of 2.5 quintillion bytes of data is created each day, or about 2.5 billion gigabytes. This volume grows day by day as new data sources are added; the rise of the Internet of Things (IoT) is proof of this.
Year after year, the amount of data increases dramatically. Over the whole of 2020, 40 zettabytes of data will be created, or about 40 trillion gigabytes.
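As a quick check on these orders of magnitude, the conversion can be sketched as follows. This is only an illustration and assumes decimal (SI) units, where 1 gigabyte is 10^9 bytes and 1 zettabyte is 10^21 bytes; the helper name `to_gigabytes` is made up for the example.

```python
# Decimal (SI) unit sizes in bytes (assumption: 1 GB = 10**9 bytes).
GIGABYTE = 10**9
ZETTABYTE = 10**21


def to_gigabytes(n_bytes: int) -> float:
    """Convert a byte count to decimal gigabytes."""
    return n_bytes / GIGABYTE


daily_bytes = 25 * 10**17        # 2.5 quintillion bytes per day
yearly_bytes = 40 * ZETTABYTE    # 40 zettabytes

daily_gb = to_gigabytes(daily_bytes)
yearly_gb = to_gigabytes(yearly_bytes)

print(f"{daily_gb:.1e} GB per day")   # on the order of billions of GB
print(f"{yearly_gb:.1e} GB in total") # on the order of tens of trillions of GB
```

With these units, 2.5 quintillion bytes comes to 2.5 billion gigabytes, and 40 zettabytes to 40 trillion gigabytes. Binary units (1 GiB = 2^30 bytes) would give slightly different figures.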
This data must be stored somewhere and the cloud is one of the available solutions.
#2 Variety
Beyond sheer quantity, this data is also more diverse than ever. This phenomenon is linked to the diversification of the uses of the Internet and digital media. The sources of data, its formats, and the fields to which it relates are all more varied than ever before.
New types of data originating from social media, machine-to-machine communication and smartphones add new dimensions to traditional transactional data. Such data no longer fits into neat, easy-to-use structures, which calls for different data organization models (see Key-Value, Columnar, Document, Graph).
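To make the four models named above more concrete, here is a rough sketch of how the same two purchase records might be laid out under each of them. It uses plain Python structures only; the sample records, field names and the `BOUGHT` edge label are invented for illustration and do not reflect any particular database product.

```python
import json

# Hypothetical sample: two purchase records, as rows of a classic table.
rows = [
    {"order_id": 1, "customer": "alice", "item": "book", "price": 12.5},
    {"order_id": 2, "customer": "bob", "item": "lamp", "price": 30.0},
]

# Key-value: one opaque value per key, retrieved by key only.
key_value = {f"order:{r['order_id']}": json.dumps(r) for r in rows}

# Columnar: values grouped per column, which suits scans and aggregates.
columnar = {col: [r[col] for r in rows] for col in rows[0]}

# Document: one self-describing, possibly nested record per entity.
documents = [
    {
        "_id": r["order_id"],
        "customer": {"name": r["customer"]},
        "lines": [{"item": r["item"], "price": r["price"]}],
    }
    for r in rows
]

# Graph: nodes plus labelled edges describing relationships.
nodes = {r["customer"] for r in rows} | {r["item"] for r in rows}
edges = [(r["customer"], "BOUGHT", r["item"]) for r in rows]

# An aggregate only needs to read one column in the columnar layout.
total = sum(columnar["price"])
print(total)
```

Each layout favours a different access pattern: lookup by key, aggregation over a column, retrieval of a whole nested record, or traversal of relationships.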
#3 Velocity
The speed at which data arrives in the enterprise, and the number of directions it arrives from, are increasing thanks to interconnection and advances in network technology, so data sometimes arrives faster than we can make sense of it. The faster the data arrives and the more varied its sources, the harder it is to derive value from it. Traditional processing methods are reaching their limits and, in some cases, no longer work on data arriving at today's speeds.
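One common response to fast-arriving data is to update results incrementally as each record arrives, instead of storing everything and recomputing from scratch. The following is a minimal sketch of that idea, not a description of any specific tool; the function name `running_mean` and the sample readings are assumptions made for the example.

```python
from typing import Iterable, Iterator


def running_mean(stream: Iterable[float]) -> Iterator[float]:
    """Yield the mean of all values seen so far, one value at a time."""
    count = 0
    mean = 0.0
    for value in stream:
        count += 1
        # Incremental update: earlier values do not need to be kept in memory.
        mean += (value - mean) / count
        yield mean


# Hypothetical sensor readings arriving one after another.
readings = [20.0, 22.0, 21.0, 25.0]
means = list(running_mean(readings))
print(means)
```

Because the function keeps only a counter and the current mean, its memory use stays constant however many values flow through it.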
#4 Veracity
Veracity concerns how far the data can be trusted. Data drawn from many different sources arrives with varying levels of quality, accuracy and completeness, and decisions based on unreliable data are themselves unreliable.
To a lesser extent, there are other Vs also related to Big Data:
- Value: for all the information that can be extracted from Big Data
- Variability: for the stability of data models and links that can be made in these mountains of data.
With the development of the Internet of Things (IoT) and the ongoing digitalization of many areas of society, science and business, the amount of data is not diminishing. Big Data, as a generic term, describes the large volume of data, both structured and unstructured, that floods businesses and, more broadly, the world on a daily basis.
💡 The typical concerns of Big Data, such as storage, accessibility, speed of execution, processing and analysis, represent both a challenge and an opportunity for many years to come.
It is also worth remembering that in many cases what matters is not the amount of data, but what organizations do with it. Big Data needs to be analyzed to gain insights that drive better decisions and support business strategy.
Sources: