A major part of Big Data is often lost in the discussion, yet it is exactly this part that brings the most benefits, and not only when the data is big.
The handling of Big Data consists of two fundamental parts:
- The architecture and software (algorithms, tools) for data storage, processing and retrieval, which deal with the three V’s (volume, velocity, variety) when an enormous amount of data is involved.
- The analytical part, which makes predictions, finds relationships and trends, detects anomalies, and draws inferences.
The often forgotten big side of Big Data
Often when people speak of Big Data, they speak of it incompletely and mean only the first part above. But the second part, Big Data analytics, is critical: without it, even an enormous amount of data is just a useless pile. The second part can also be applied to data sets of any size.
It is this second part that directly generates benefits in many areas: health care, medicine, marketing, sales, manufacturing, public administration, law enforcement, insurance, financial planning and analysis, the power industry, human resources, banking (fraud detection, including fraudulent credit card and account transactions; risk management; loan application and negotiation; credit scoring), logistics and production, and stock market, commodity and energy trading.
The analytical part also has many levels and phases, most of them iterative: situation analysis, data collection, data cleaning, principal component analysis, variable selection, dataset subsetting, aggregation and reshaping, missing data handling, graphical analysis, resampling, data modeling, machine learning model selection, model fitting and testing, performance testing, and dynamic report generation. This is where Big Data can bring Big results and a Big return on investment.
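To make a few of these phases concrete, here is a minimal sketch in Python that chains data cleaning, missing data handling, a train/test split, model fitting and testing on a tiny data set. Everything in it is an assumption made for illustration: the toy records, the field names `x` and `y`, the helper names, and the choice of mean imputation and a straight-line least-squares fit. A real project would pick its methods phase by phase, and would usually iterate over them.

```python
import math
import random
import statistics

# Hypothetical toy records: y = 2x + 1, with one missing x and one missing y.
rows = [{"x": float(i), "y": 2.0 * i + 1.0} for i in range(10)]
rows.append({"x": None, "y": 10.0})   # missing predictor value
rows.append({"x": 10.0, "y": None})   # missing target value

def clean(records):
    """Data cleaning: drop records that have no target value."""
    return [r for r in records if r["y"] is not None]

def impute_mean(records, key):
    """Missing data handling: replace missing values with the column mean."""
    observed = [r[key] for r in records if r[key] is not None]
    mean = statistics.fmean(observed)
    return [{**r, key: mean if r[key] is None else r[key]} for r in records]

def split(records, test_fraction, seed):
    """Hold out a random share of the records for testing."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def fit_line(records):
    """Model fit: ordinary least squares for y = slope * x + intercept."""
    xs = [r["x"] for r in records]
    ys = [r["y"] for r in records]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def rmse(records, slope, intercept):
    """Model testing: root mean squared error on the given records."""
    errors = [(r["y"] - (slope * r["x"] + intercept)) ** 2 for r in records]
    return math.sqrt(statistics.fmean(errors))

prepared = impute_mean(clean(rows), "x")
train, test = split(prepared, test_fraction=0.3, seed=42)
slope, intercept = fit_line(train)
error = rmse(test, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} test RMSE={error:.3f}")
```

The toy data is deliberately noise-free, so the fitted line should recover a slope close to 2 and an intercept close to 1. With real data, the held-out error is what tells you whether to go back and revisit an earlier phase.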