Some reports state that Big Data has no clear definition and see Big Data as Big Hype.
Yet Big Data, supposedly an illusion and a hype according to these reports, looks very real and successful. A search today on the job site Indeed for positions with the keyword Hadoop or Spark (we will write a post about their differences), considered the operating systems of Big Data, returns over 4,200 open positions in California. The same search with the keyword Tableau (a popular data visualization tool, often used in the Business Intelligence area) returns about 3,200 open positions in California.
The numbers speak for themselves. Hadoop/Spark is now adopted in many settings. But Big Data requires expertise in a rather large number of tools and languages that are typically unfamiliar to those who aren’t experts in the area: Hadoop, Spark, Scala, Python, Hive and Cassandra, among others. This may look scary to someone who is still holding on to their Business Intelligence (BI) skills. But it shouldn’t be.
Some People See Hype?
Some complain that they can’t find a clear definition of Big Data. The field does lack clear definitions, but this often happens when new technologies are born and are still in full development. Here we offer a clear one: when the amount of digital data processed in an analytical project requires, for performance reasons among others, the use of a distributed computing architecture such as the one offered by Hadoop/Spark, we are dealing with Big Data.
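To make the idea of a distributed architecture a bit more concrete, here is a minimal sketch in plain Python of the split, process and combine pattern that such systems are built around. It is not Hadoop or Spark code: the threads are only a single-machine stand-in for cluster nodes, and the function names (partition, partial_stats, distributed_mean) are hypothetical, chosen for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(values, n_parts):
    """Split a non-empty list into at most n_parts roughly equal chunks."""
    size = -(-len(values) // n_parts)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def partial_stats(chunk):
    """Map step: each worker summarizes only its own chunk."""
    return sum(chunk), len(chunk)


def distributed_mean(values, n_parts=4):
    """Reduce step: combine the partial results into a single mean."""
    chunks = partition(values, n_parts)
    # Threads stand in for cluster nodes here (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(partial_stats, chunks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


readings = list(range(1, 101))
print(distributed_mean(readings))  # prints 50.5
```

No worker ever needs to see the whole data set, only its own chunk. That is what allows the same pattern to scale out to many machines when the data no longer fits on one.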
One typical example is real-time streaming data, such as readings from sensors installed in machines and components in a manufacturing setting. These data can be used for predictive maintenance, potentially bringing large reductions in production stops and greater efficiency.
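As a rough illustration of how sensor readings might feed predictive maintenance, the sketch below flags readings that stray far from the recent rolling average. This is only one simple technique (a rolling z-score), not a description of any particular product. The function name flag_anomalies, the window size, the threshold and the vibration values are all assumptions made up for the example.

```python
from collections import deque
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean."""
    recent = deque(maxlen=window)  # keeps only the last `window` readings
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # Flag values more than `threshold` standard deviations away.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
        recent.append(value)


# Hypothetical vibration readings with one obvious spike.
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 1.0, 0.95, 4.0, 1.0, 1.1]
print(list(flag_anomalies(vibration)))  # prints [(7, 4.0)]
```

In a real plant the readings would arrive continuously from many machines at once, which is where a distributed streaming platform comes in. The logic applied to each sensor can stay this simple.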
The great promise of the new technologies is to do well what we don’t do well. Data visualization is fine when a few dimensions are involved, but we humans can’t think in more than 3 or 4 dimensions. Even when only a few dimensions are involved, we humans can’t easily find hidden patterns, trends and relationships.
The new data analytics offers a lot more than the old BI, where the author is at home. New technologies are usually disruptive, and Data Science, including Big Data, is no exception: it is leaving old ways behind. As the title of a best-selling business book asks, Who Moved My Cheese? Some people don’t want to move on as new and better technologies arise. This can be very harmful not only to the business, but also to these people themselves.