Sunday, April 10, 2016

The Little Prince and Anscombe's quartet: when the essential is not seen in business



Anscombe's quartet and The Little Prince: when you lose sight of the essential and make wrong decisions



The Little Prince, a children's novel by the French pilot Antoine de Saint-Exupéry, published on April 6, 1943, contains lessons and messages that children and adults alike can grasp and apply in real life to keep their spirit young and vital. These lessons can also be applied in other areas of life. It is enough to regain the naivety and open-mindedness that characterize a child to gain access to the "secret of life". Paradoxically, understanding what is most complex does not mean taking the position of the wise or the omniscient, but that of the one who is supposed to know least and who is wrongly considered unfit to understand the complexity of the world: the child.

On ne voit bien qu'avec le cœur. L'essentiel est invisible pour les yeux. (Le Petit Prince)
(One sees clearly only with the heart. What is essential is invisible to the eye.)

The message for daily life is clear and has multiple interpretations. "Do not judge a person by appearance, but by their personal values." "Do not evaluate a situation by the apparent facts, look for the deeper reasons". And much more.

But does this message also apply in the business world, in everyday business practice? Yes, in all areas: from matters that involve only people to situations where technology, impersonal, cold and seemingly objective, is imposed as the accurate way to solve or help "solve business problems" or to improve the "decision-making" process. These are impressive phrases, but they often reveal the weaknesses or deficiencies of those who use or apply them, because they see the appearance and not the essence of the situation.

In some cases, in studies or in practical life, certain formulas derived from logical associations and common sense become a kind of god: idols to which magical powers are attributed and whose relevance or veracity must not be doubted. (Note 1)

In other cases, it is a matter of procedures, methods or, more generally, mathematical algorithms that are internally consistent and tested, but whose application requires judgment and a weighing of the circumstances. They are tools and should be used as such: a hammer drives nails, a screwdriver turns screws; swapping the tools is ineffective and may make a problem worse.

Consider the case of linear regression, a statistical technique that projects a dependent variable (Y) given a value of the independent variable (X), once the functional mathematical relationship between them has been found from a set of known values. The algorithm exists; it can be carried out by hand or with the aid of computers, powerful allies for calculation.
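As a reminder of what that algorithm computes, here is a sketch of the usual least-squares solution for a single independent variable. The notation is assumed for illustration and does not come from the discussion above: n observations (x_i, y_i), their means x-bar and y-bar, slope b and intercept a.

```latex
\hat{Y} = a + bX, \qquad
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
a = \bar{y} - b\,\bar{x}
```

Every observation enters the sums, so a single point that lies far from the rest can pull both b and a.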

Imagine this situation. Data on total sales and the number of salespeople are available for 24 consecutive months. You want to estimate or project sales over the next six months. The person running the algorithm, after the calculations, triumphantly presents the linear regression equation and hands it to whoever will make the decisions and give the necessary orders.

If the person who did the calculation is also the user, and is conscientious, responsible and cautious, and also takes into account the message of The Little Prince (what is essential is invisible to the eye), they can stop and work backwards through the procedure: the formula, the calculations and finally the data. They might discover that something is wrong and so avoid falling into the trap of Anscombe's quartet, along with the losses and costs of bad decisions.

If they are two different people, say a systems engineer or someone similar and a general manager, catastrophe is almost certain. The user (the general manager) may order an expansion of production because the regression equation suggests a gradual increase in sales, when in fact the data showed a downward trend and a single outlier distorted it.
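The manager's trap can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the sales figures are invented for illustration (a steady decline plus one anomalous month), the month number stands in for X, and `fit_line` is a hypothetical helper, not part of any library.

```python
# Illustrative only: the figures below are invented, not real sales data.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

months = list(range(1, 25))        # 24 consecutive months
sales = [100 - m for m in months]  # hypothetical steady decline
sales[-1] = 400                    # one anomalous month (the outlier)

# Fit once on all 24 months, and once leaving out the anomalous month.
slope_all, intercept_all = fit_line(months, sales)
slope_clean, intercept_clean = fit_line(months[:-1], sales[:-1])

print(f"slope with the outlier:    {slope_all:+.2f}")
print(f"slope without the outlier: {slope_clean:+.2f}")
```

With these invented numbers the fitted slope is positive when the outlier is included and negative when it is left out. The equation alone would suggest growth; only a look at the data shows the decline.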

 


(Image: a manager in despair because sales are falling)

And what is Anscombe's quartet? In 1973 the statistician Francis Anscombe exposed the trap hidden in what is now called "Anscombe's quartet": four datasets that have nearly identical basic statistics (mean, variance, correlation coefficient and the same regression equation) but look strikingly different when plotted on the Cartesian plane. Each set has 11 data points, and the example was designed to demonstrate the importance of graphing data before analyzing it and of taking into account the impact of extreme or anomalous values. Note that not all four scatter plots suggest a straight line, so linear regression is not the right solution for all of them (Note 2).
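The claim can be checked numerically. The sketch below uses the values usually quoted for the quartet (they match the table in the Wikipedia article listed in the references). They are not part of this post, so verify them against that source before reusing them. The `summarize` helper is a hypothetical name introduced here for illustration.

```python
# Anscombe's four datasets as commonly tabulated; x is shared by sets I-III.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def summarize(xs, ys):
    """Means, sample variances, correlation and least-squares line for one dataset."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),
        "var_y": syy / (n - 1),
        "r": sxy / (sxx * syy) ** 0.5,
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }

summaries = {name: summarize(xs, ys) for name, (xs, ys) in quartet.items()}
for name, stats in summaries.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Rounded to two decimals, the four summaries come out essentially the same, with a regression line close to Y = 3 + 0.5X in every case. Nothing in these numbers tells the datasets apart; only the scatter plots do.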

 



 

 
It can be said that when there are many "unknown unknowns", things we do not know that we do not know, it is important to rely on human judgment, supported by data, to make decisions. Data visualization, a clever way of applying judgment or common sense, is still an uncommon practice for which many executives lack proper training, yet it can reveal patterns and shapes that could not have been discovered otherwise. Once again The Little Prince "intrudes".

One could argue that Anscombe's lesson, proposed more than 40 years ago, is no longer valid today, when technology makes calculations and processes easy. This claim is easily refuted: precisely now that data are available by the millions, at all times and in different formats, forming the context of Big Data, the need for visualization is very real.

According to Jewett (2014: 5), Big Data must be brought to eye level. He cites a 2013 report by the Aberdeen Group indicating that in organizations that use visual discovery tools, 48% of Business Intelligence users are able to find the information they need without help from the IT team. Often these are the "geniuses" who are not available when they are needed. Without the help of visualization, the independence rate falls to only 23%. In addition, executives who use data visualization were 28% more likely to find the required information on time, and those who interacted with the data were 33% more productive, compared with 15% for those who did not. That is, they obtained the value of Big Data with greater intensity, speed and relevance.

Moral. Technology, algorithms, mathematical models and procedures are not gods or fetishes; they are just tools, and before using them we must assess the context to avoid falling into the trap of "Anscombe's quartet". Take a breath, take a little time and try to see what is essential; if you succeed, the rest will be easier and will be done correctly. Incidentally, if you have read The Little Prince, read it again, and if you have not, it is time to do so.

Note 1. I remember that in my Master's classes there was the formula for the accounting break-even point, used to calculate the quantity produced at which revenues equal costs, which could be deduced quickly with a few calculations. Some classmates, I do not know whether out of laziness or because of gaps in their professional training, resisted doing that and, for exams, preferred to memorize "the sacred formula". Others, less honest, resorted to their means of cheating (strips of paper, writing in the folder, writing on clothing or arms); some were even willing to tattoo the formula on themselves. At that time there were no iPods, but all of them would have chosen that medium. These people certainly were and are victims of "Anscombe's quartet".

Note 2. Plotting the data first makes it easier to choose the mathematical model. The regression model cannot be applied to everything. A hammer is used to drive nails, not to cut wood, unless one is very skilled at giving the precise blow.

References
Dan Jewett, VP Product Management (2014). 7 Tips to Succeed with Big Data in 2014.

Wil M. P. van der Aalst (2014). Data Scientist: The Engineer of the Future.

Anscombe's quartet
https://en.wikipedia.org/wiki/Anscombe%27s_quartet