Wednesday, 23 July 2014

How to fit data to a normal distribution using MLE and Python

MLE, distribution fitting and model calibration are certainly fascinating topics. From the outside, though, they can look like rocket science. As far as I'm concerned, back when I did not know what MLE was or what you actually do when you fit data to a distribution, all these techniques did look exactly like rocket science.
They are not that complicated, though. MLE (maximum likelihood estimation) is a technique that lets you estimate the parameters of a random variable's distribution given only a sample, by choosing the parameter values that make the observed data the most likely to have occurred. Distribution fitting, as far as I know, is the process of actually calibrating those parameters so that the distribution fits a series of observed data.
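To make the idea a bit more concrete, here is a minimal sketch of MLE for a normal distribution using only numpy. For the normal case the maximum likelihood estimates have a closed form: the sample mean and the standard deviation computed with a divisor of n (not n - 1). The function names, the seed and the "true" parameters (mean 5, standard deviation 2) are my own assumptions, chosen only for illustration.

```python
import numpy as np


def normal_log_likelihood(data, mu, sigma):
    """Log-likelihood of i.i.d. samples under a normal(mu, sigma) model."""
    data = np.asarray(data, dtype=float)
    n = data.size
    return (-0.5 * n * np.log(2.0 * np.pi * sigma ** 2)
            - np.sum((data - mu) ** 2) / (2.0 * sigma ** 2))


def normal_mle(data):
    """Closed-form maximum likelihood estimates for a normal distribution."""
    data = np.asarray(data, dtype=float)
    mu_hat = data.mean()
    sigma_hat = data.std()  # ddof=0 (divide by n) is the MLE of sigma
    return mu_hat, sigma_hat


# Synthetic sample; the true parameters here are illustrative assumptions.
rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=1000)

mu_hat, sigma_hat = normal_mle(sample)
best = normal_log_likelihood(sample, mu_hat, sigma_hat)

print("estimated mean:", mu_hat)
print("estimated standard deviation:", sigma_hat)
print("log-likelihood at the estimates:", best)
```

Moving either parameter away from the estimates, for example `normal_log_likelihood(sample, mu_hat + 0.1, sigma_hat)`, gives a lower log-likelihood, which is exactly what "most likely" means here.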

Let's see an example of MLE and distribution fitting with Python. You need to have scipy, numpy and matplotlib installed to follow along, although I believe this is not the only possible way. For some reason I do not know, the methods in scipy.stats related to the normal distribution use loc to indicate the mean and scale to indicate the standard deviation. I can maybe grasp why "scale" is used for the standard deviation, but I really do not get "loc". If you know why, please leave a comment.
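A minimal sketch of such a fit could look like the following. It is not necessarily the only or the best way: it assumes synthetic data drawn with mean 10 and standard deviation 3 (my own choice for illustration), and the helper names fit_normal and plot_fit are hypothetical. The scipy calls used are norm.fit, which returns the maximum likelihood estimates as (loc, scale), and norm.pdf.

```python
import numpy as np
from scipy.stats import norm


def fit_normal(data):
    """Fit a normal distribution by MLE; returns (mean, standard deviation)."""
    mu, sigma = norm.fit(data)  # scipy returns the pair (loc, scale)
    return mu, sigma


def plot_fit(data, mu, sigma, bins=30):
    """Plot a normalized histogram of the data with the fitted pdf on top."""
    import matplotlib.pyplot as plt  # imported here so fitting works without it

    x = np.linspace(np.min(data), np.max(data), 200)
    plt.hist(data, bins=bins, density=True, alpha=0.5, label="observed data")
    # loc is the mean and scale is the standard deviation
    plt.plot(x, norm.pdf(x, loc=mu, scale=sigma), label="fitted normal pdf")
    plt.title("Normal fit: mean = %.2f, std = %.2f" % (mu, sigma))
    plt.legend()
    plt.show()


# Synthetic observations; the true mean and std are illustrative assumptions.
rng = np.random.default_rng(42)
samples = rng.normal(loc=10.0, scale=3.0, size=500)

mu_hat, sigma_hat = fit_normal(samples)
print("fitted mean (loc):", mu_hat)
print("fitted standard deviation (scale):", sigma_hat)
```

Calling `plot_fit(samples, mu_hat, sigma_hat)` afterwards draws the histogram of the data with the fitted density curve over it.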

The result should look somewhat like this:

Hope this was useful.