There is no single factor that determines whether data is big or traditional. The fundamental differences include the value it yields, as noted above, and whether it can be analyzed effectively with older tools. Traditional data is structured, typically stored in relational databases, and is analyzed with statistical methods and established querying tools such as SQL. Big data is fast-moving and spans vast datasets in disparate formats: structured, semi-structured and unstructured. Traditional analysis tools cannot handle the scale or complexity of big data, which instead requires distributed systems and advanced techniques such as machine learning.
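To make the contrast concrete, here is a minimal sketch of the "traditional" side: structured rows in a relational table, answered by a single declarative SQL query. The table and column names (`sales`, `region`, `amount`) are invented for illustration.

```python
import sqlite3

# Structured data in a relational table, queried with plain SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 45.5)],
)

# One declarative query answers the question; at this scale no
# distributed system or specialized tooling is needed.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('north', 165.5), ('south', 80.0)]
```

The key point is not the database engine but the workload: a fixed schema, a bounded dataset, and a query whose answer fits comfortably on one machine.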
Traditional data analytics handles a manageable volume of information, like running an end-of-day sales report from a single, structured financial database in predictable batches. Conversely, big data analytics solutions become necessary when dealing with a massive volume of streaming data, such as a global ride-sharing app monitoring millions of vehicles. There, data must be ingested and processed at high velocity (in milliseconds) to calculate real-time estimated arrival times and dynamic pricing.
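The batch-versus-streaming distinction can be sketched in a few lines: instead of aggregating a full dataset at the end of the day, a streaming pipeline updates its answer as each event arrives. This is a toy sketch, not a production streaming system; the event format (a stream of vehicle speeds) and the window size are assumptions.

```python
from collections import deque

def stream_window_avg(events, window=3):
    """Yield a running average over the last `window` events,
    updated per event rather than in an end-of-day batch."""
    recent = deque(maxlen=window)  # old events fall out automatically
    for speed in events:
        recent.append(speed)
        yield sum(recent) / len(recent)

# Batch style: one answer, after all the data is in.
pings = [30.0, 32.0, 28.0, 35.0]
print(sum(pings) / len(pings))            # 31.25

# Streaming style: an updated answer on every incoming ping,
# the shape of computation a real-time ETA estimate needs.
print(list(stream_window_avg(pings)))
```

Real systems run this kind of windowed computation on distributed stream processors rather than a single generator, but the shape of the computation, state updated per event under latency constraints, is the same.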
Here, big data must also handle massive variety, integrating structured GPS coordinates with unstructured driver feedback text and images. Sophisticated techniques are required to manage veracity (trustworthiness) and to ensure that business value is ultimately extracted, a complexity that traditional systems are simply not built for.
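As a hedged illustration of that variety, a single trip event might carry structured, semi-structured, and unstructured pieces side by side; all field names here are invented, and real pipelines would do far more validation than this sketch.

```python
import json

# One hypothetical trip event mixing the three data shapes.
trip_event = {
    "gps": {"lat": 40.7128, "lon": -74.0060},        # structured
    "meta": '{"surge": 1.4, "zone": "downtown"}',    # semi-structured JSON string
    "feedback": "Driver was friendly, pickup slow",  # unstructured text
}

def extract_features(event):
    """Flatten the mixed-format event into one analyzable record."""
    meta = json.loads(event["meta"])  # parse the semi-structured part
    return {
        "lat": event["gps"]["lat"],
        "lon": event["gps"]["lon"],
        "surge": meta["surge"],
        # Unstructured text typically needs NLP; a raw length is a
        # placeholder for that downstream processing.
        "feedback_len": len(event["feedback"]),
    }

print(extract_features(trip_event))
```

Even this toy normalization step shows why a fixed relational schema struggles here: each field may demand its own parser, and malformed or untrustworthy inputs (the veracity problem) have to be handled before any value can be extracted.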