Big data is a term for data sets so large that traditional software cannot deal with them. In a business sense, the term also refers to extremely large data sets that can be analysed to reveal patterns, trends and other insights. The rise of big data in business is being driven by the exponential growth and availability of data, both structured and unstructured.
Working with big data poses a number of challenges, including capture, storage, analysis, curation, search, sharing, transfer, visualisation, querying, updating and information privacy.
There are many misconceptions about big data, such as the belief that a business must collect and store large amounts of its own data. Even so, it cannot be denied that this buzzword has great significance for businesses and the world.
In 2001, industry analyst Doug Laney articulated what has since become the mainstream definition of big data as three Vs: volume, velocity and variety.
What are the three Vs in big data?
Volume simply means that there is more data. It may come from transactional data that has been stored over the years, unstructured data from sources such as social media, or sensor and machine data that is being collected.
Velocity refers to the speed at which data streams in, for example from sensors, smart metering and RFID tags. This has driven a growing need to analyse data in real time.
The third V is variety, which concerns the different formats that data comes in. These include structured data, numeric data, data created by line-of-business applications, unstructured data and financial transaction data, among others.
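Of the three Vs, velocity is the easiest to put a number on. As a rough illustration, the sketch below counts how many events arrived within the most recent time window and reports that as an events-per-second rate. The RateMeter class and its parameter names are hypothetical, invented for this example, and the timestamps are supplied by hand rather than read from a real sensor feed.

```python
from collections import deque


class RateMeter:
    """Tracks how many events arrived in the last `window_seconds` seconds.

    Hypothetical helper for illustration only; not part of any library.
    """

    def __init__(self, window_seconds=1.0):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp):
        """Register one event and return the current events-per-second rate."""
        self._timestamps.append(timestamp)
        # Discard events that have fallen out of the sliding window.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) / self.window_seconds


if __name__ == "__main__":
    meter = RateMeter(window_seconds=1.0)
    # Made-up arrival times, in seconds, standing in for sensor readings.
    for t in [0.0, 0.5, 1.2, 3.0]:
        print(t, meter.record(t))
```

A production system would more likely rely on a stream-processing platform than on a hand-rolled counter, but the underlying idea of measuring arrivals per unit of time is the same.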
Some companies, such as SAS, consider there to be two additional Vs: variability and veracity.
Variability concerns the inconsistency of data flows, such as a surge in social media activity in response to an event.
Veracity refers to the fact that data comes from many different sources, which can make it difficult to keep consistent.
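Variability can also be made concrete with a simple measure. One possible approach, sketched below, compares the busiest interval in a data flow with a typical interval. The function name and the hourly figures are assumptions made up for this example, and the ratio is only one of many ways to describe an uneven flow.

```python
from statistics import median


def peak_to_typical_ratio(counts):
    """Return the busiest interval's count divided by the median count.

    Hypothetical helper: a ratio near 1 suggests a steady flow, while a
    large ratio suggests a spike such as a reaction to an event.
    """
    typical = median(counts)
    if typical == 0:
        raise ValueError("median count is zero, so the ratio is undefined")
    return max(counts) / typical


if __name__ == "__main__":
    # Made-up posts-per-hour figures; the fifth hour stands in for an event.
    hourly_posts = [100, 110, 90, 105, 900, 95]
    print(peak_to_typical_ratio(hourly_posts))
```

The median is used as the baseline rather than the mean so that the spike itself does not inflate the figure it is compared against.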