
Big Data is one of the biggest buzzwords ever coined, but the question of its usefulness is more controversial than one might expect. “The Mirage of Data” is an enlightening article I came across while researching the objectivity of big data. Its author, Gerardo Dada, details seven hazards that turn the useful tools of big data into an idol upon which companies sacrifice their business plans.

The danger of empirical data does not lie in the data itself. It lies in the fact that data can be misunderstood, incomplete, or unrelated to the question it was collected to answer.

Social marketers are quick to point out the higher customer value of those who follow a company on social networks, implying they buy more because they follow them. However, it is more plausible that they follow the retailer on social networks because they are loyal customers. (Dada)

Big Data has been used to find many helpful correlations, and the example above shows both their value and their limits. Why do customers pick a specific store? No business can answer this definitively. If one could, it would dominate the market without question. Dada points out that data analysts can link repeat purchases to membership in a mailing list or to following a company on social media, but this only helps us rephrase the question, not answer it. Did pre-existing loyalty lead the customer to follow a social media page, or did the page pull unsuspecting customers into purchases they didn’t plan? Perhaps both are symptoms of a deeper cause that brings customers back. Or they may simply be coincidences resulting from a larger, unrelated movement. When we force causation onto correlation, we are simply misunderstanding the data we have.
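A small simulation can make the "deeper cause" possibility concrete. The sketch below is purely illustrative: the loyalty rate, the follow probabilities, and the spending figures are invented assumptions, not numbers from Dada's article. In this made-up world, following a page has no effect on spending at all; hidden loyalty drives both.

```python
import random
from statistics import mean


def simulate_customers(n=20_000, seed=0):
    """Return (is_loyal, follows, spend) tuples for n made-up customers."""
    rng = random.Random(seed)
    customers = []
    for _ in range(n):
        # Hidden trait (assumed): 30% of customers are loyal.
        is_loyal = rng.random() < 0.3
        # Loyal customers are more likely to follow the page (assumed rates).
        follows = rng.random() < (0.6 if is_loyal else 0.1)
        # Spending depends only on loyalty, never on following.
        spend = rng.gauss(100 if is_loyal else 40, 10)
        customers.append((is_loyal, follows, spend))
    return customers


def mean_spend(customers, follows, is_loyal=None):
    """Average spend for followers or non-followers, optionally within one loyalty group."""
    return mean(
        spend
        for loyal, follower, spend in customers
        if follower == follows and (is_loyal is None or loyal == is_loyal)
    )


customers = simulate_customers()

# Naive comparison: followers versus non-followers, ignoring loyalty.
naive_gap = mean_spend(customers, True) - mean_spend(customers, False)

# The same comparison inside each loyalty group.
gap_loyal = mean_spend(customers, True, is_loyal=True) - mean_spend(customers, False, is_loyal=True)
gap_casual = mean_spend(customers, True, is_loyal=False) - mean_spend(customers, False, is_loyal=False)

print(f"Naive follower gap:       {naive_gap:.1f}")
print(f"Gap among loyal customers:  {gap_loyal:.1f}")
print(f"Gap among casual customers: {gap_casual:.1f}")
```

The naive comparison shows followers spending far more, yet the gap all but disappears once customers are compared within the same loyalty group. A real business rarely gets to observe the hidden trait so cleanly, which is exactly why the correlation alone cannot settle the question.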

You can survey thousands of people and ask them who invented the light bulb, which will result in a false sense of security in perfectly inaccurate data. (Dada)

If your survey receives 1,000 responses and mine receives 2,000, are my results more accurate than yours? Not because of the count alone! A larger sample shrinks random error, but it does nothing to correct a flawed question or a skewed pool of respondents. Yet this is exactly the assumption we make with big data. The data we have may be useful for our studies, but it may also lack important information without our knowing it.
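To see why more responses cannot rescue a skewed survey, consider a toy simulation. Everything in it is a hypothetical assumption chosen for illustration: a "true" average of 5.0, a spread of 2.0, and a survey method that shifts every answer upward by 1.5.

```python
import random
from statistics import mean

TRUE_MEAN = 5.0  # hypothetical true average in the whole population


def run_survey(n, bias, seed):
    """Average of n simulated responses whose answers are shifted by `bias`."""
    rng = random.Random(seed)
    return mean(rng.gauss(TRUE_MEAN + bias, 2.0) for _ in range(n))


small_unbiased = run_survey(1_000, bias=0.0, seed=1)
large_biased = run_survey(2_000, bias=1.5, seed=2)
huge_biased = run_survey(200_000, bias=1.5, seed=3)

for label, estimate in [
    ("1,000 unbiased responses", small_unbiased),
    ("2,000 biased responses", large_biased),
    ("200,000 biased responses", huge_biased),
]:
    print(f"{label}: estimate {estimate:.2f}, error {abs(estimate - TRUE_MEAN):.2f}")
```

In this sketch the smaller, unbiased survey lands close to the true value, while the biased surveys settle confidently on the wrong one no matter how many responses they gather. Adding rows only makes the wrong answer look more precise.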

“It is so easy to confuse information with evidence” (Dada). The data analysts find may help them understand one facet of the process they study, but assuming they have found the single “right” answer is more dangerous than giving up entirely.

Data can show us what customers purchased, how they paid, and how often they visited. It cannot tell us why. (Dada)


The weight of a television set has nothing at all to do with the clarity of its picture. Even if you measure to a tenth of a gram, this precise data is useless. (Dada)

Dada argues that the big data organizations are beginning to collect is often quite unrelated to the problems they are attempting to solve. In fact, the real cause of most decisions, human emotion, is almost impossible to gauge with any accuracy.

It is possible to be drowning in data and still none the wiser. (Paul Laughlin)

The misuse of Big Data, and data in general, has massive effects on the value of the results. The information generated from unrelated or incomplete data sets detracts from the objectivity of the science. No complete picture, much less an actionable business model, can be generated from data sets that simply don’t record the information needed.

While Big Data would seem to be as objective as any other data set, the problems scientists have long had with data sets reach a whole new level when the data grows by orders of magnitude. To handle these new issues, common sense will need to play a bigger role than ever.

“A flood of data should never be allowed to wash away your common sense” (Jack Trout)
