The 5 V’s of Big Data: Velocity, Volume, Value, Variety, and Veracity
One of the greatest innovations of the technological age has been the ability for individuals and businesses to collect large amounts of data about themselves and their organizations. This data can be as simple as the number of active visitors on a site or as complex as thousands of IoT sensors tracking the supply chain of a production line.
Big data has five core characteristics, known as the “5 V’s of Big Data”, which help us better understand its essential elements. In this article we will outline what Big Data is and review the 5 V’s to help you determine how Big Data may be better implemented in your organization.
What is Big Data?
Big Data can be defined as data that is so large that it cannot be processed using conventional methods. No fixed size threshold defines Big Data, and the amount that qualifies grows each year as computational power and data analytics become cheaper and more accessible. Generally speaking, it is a data source that would be impractical or infeasible for humans to analyze unaided.
Big Data is collected by a variety of mechanisms, including software, sensors, IoT devices, and other hardware, and is usually fed into data analytics software such as SAP or Tableau. The analytics software sifts through the data and presents it to people so that they can make informed decisions.
SAP Analytics software can be used to analyze e-commerce data and determine how one aspect of data impacts another.
Why is Big Data Important for business?
Big Data has a variety of implications for business. At its core, Big Data helps business owners, employees, and executives better understand what is going on in various aspects of the company. It can help answer questions like:
- When a customer visits our site, what pages do they visit, how do they interact with them, and what causes the visitor to abandon a purchase?
- During production of our product, what circumstances in the supply chain lead to product defects?
- In the lifecycle of a customer, which events lead to a customer cancelling the service, and how can we proactively avoid these events?
When Big Data is effectively and purposely captured, it helps people make better decisions about how to improve operational efficiency, increase profitability, and decrease customer churn.
Note the keywords effectively and purposely: when Big Data is collected improperly, it can lead businesses to make the wrong decisions, or inundate them with so much data that they can’t make any decision at all.
What are the 5 V’s of Big Data?
The 5 V’s of big data are Velocity, Volume, Value, Variety, and Veracity. We will discuss each point in detail below.
Velocity
Velocity is the speed at which Big Data is collected. This speed tends to increase every year as network technology and hardware become more powerful and allow businesses to capture more data points simultaneously.
Example: Google receives over 63,000 searches per second on any given day.
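To put a per-second rate like this in perspective, it can help to convert it into a daily total. The snippet below is only a rough sketch of that arithmetic; the function name `events_per_day` is a hypothetical helper, and it assumes the rate stays constant over a full 24 hours, which real traffic does not.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def events_per_day(rate_per_second: int) -> int:
    """Convert a constant per-second event rate into a daily total.

    Assumes the rate is flat all day, which is a simplification.
    """
    return rate_per_second * SECONDS_PER_DAY


# Figure quoted in the example above: 63,000 searches per second.
searches_per_day = events_per_day(63_000)
print(f"{searches_per_day:,} searches per day")  # 5,443,200,000 searches per day
```

At that rate, a system would need to ingest roughly 5.4 billion events a day, which is why velocity tends to drive infrastructure decisions.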
Volume
Volume refers to the amount of data being collected. This is where Big Data largely gets its name, due to the sheer size of the data involved. The actual size will vary based on what is being collected. For example, the user analytics in Netflix’s database will be astronomical compared to the e-commerce data of a small business, but both could be considered Big Data because both involve collecting a large amount of data.
Example: Netflix has over 86 million members globally, streaming over 125 million hours of content per day. This results in a data warehouse which is over 60 petabytes in size.
Value
Value is the worth of the data being collected. Some Big Data that a business stores may have little or no value for decision making or improving operations. A company may be required for compliance reasons to capture and store large amounts of data that have no value to it. However, for Big Data that is voluntarily collected, a business should review exactly what data is being collected and how it can be valuable to the business. If the data has no value now or in the near future, it may be best to simply stop collecting it. Data that has no value often serves as a distraction and only hinders the data analysis process.
Variety
Variety refers to the different types of data that are captured. This could be structured data, such as a first name or an email address. It can also be unstructured data, such as a product review. Unstructured data must be processed before it can be analyzed. For a product review, this could mean performing a sentiment analysis to determine whether the review is positive or negative. From there, a result such as “percent of positive reviews” could be generated.
Unstructured Data | Structured Data |
Review sentiment | Email address |
Free-form comments | Phone number |
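The review example above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a production approach: the word lists, the function names `is_positive` and `percent_positive`, and the sample reviews are all made up for this sketch, and a real system would more likely use a trained sentiment model than keyword matching.

```python
import re

# Hypothetical keyword lists, chosen only for this illustration.
POSITIVE_WORDS = {"great", "love", "excellent", "good", "fast"}
NEGATIVE_WORDS = {"bad", "broken", "slow", "terrible", "refund"}


def is_positive(review: str) -> bool:
    """Classify a review as positive if it has more positive than negative words."""
    words = re.findall(r"[a-z']+", review.lower())
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    return positives > negatives


def percent_positive(reviews: list[str]) -> float:
    """Turn unstructured review text into one structured metric."""
    if not reviews:
        return 0.0
    positive_count = sum(is_positive(review) for review in reviews)
    return 100.0 * positive_count / len(reviews)


# Made-up sample data.
sample_reviews = [
    "Great product, fast shipping",
    "Arrived broken and support was slow",
    "I love it",
    "Terrible, I want a refund",
]
print(f"{percent_positive(sample_reviews):.1f}% positive")
```

The point of the sketch is the shape of the pipeline: free-form text goes in, and a single number that can be charted or compared over time comes out.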
Veracity
Veracity is the quality or trustworthiness of the data. There is little point in collecting Big Data if you are not confident that the resulting analysis can be trusted.
For example, if you are piping all order data in but also including fraudulent or cancelled orders, you can’t trust the analysis of the e-commerce conversion rate because it will be artificially inflated.
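As a rough sketch of that order example, the code below compares a conversion rate computed from every order with one computed only from valid orders. The `Order` class, the status values, and the sample numbers are assumptions invented for this illustration and are not taken from any particular e-commerce platform.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    status: str  # hypothetical values: "completed", "cancelled", "fraudulent"


def conversion_rate(orders, visits, valid_statuses=("completed",)):
    """Return valid orders divided by visits, ignoring untrustworthy orders."""
    if visits <= 0:
        raise ValueError("visits must be a positive number")
    valid_orders = [order for order in orders if order.status in valid_statuses]
    return len(valid_orders) / visits


# Made-up sample data: four orders recorded against 100 site visits.
orders = [
    Order(1, "completed"),
    Order(2, "cancelled"),
    Order(3, "fraudulent"),
    Order(4, "completed"),
]
visits = 100

naive_rate = len(orders) / visits            # counts every order, good or bad
cleaned_rate = conversion_rate(orders, visits)  # counts completed orders only

print(f"Naive conversion rate:   {naive_rate:.0%}")
print(f"Cleaned conversion rate: {cleaned_rate:.0%}")
```

In this made-up data set, the naive figure is double the cleaned one, which shows how quickly a metric can drift when low-quality records are left in the pipeline.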
Further Reading
If you’re interested in learning more about Big Data, take a look at our data storage solutions page, which outlines some critical decisions in choosing a Big Data storage server.
If you’re looking for servers to help power your Big Data needs, consider booking an expert server consultation. Our team will review your business goals and help design a custom server package that is tailored for your needs and budget.