Friday 30 January 2015

How to estimate a probability density function from sample data with Python

Suppose you have a sample of your data, maybe even a large sample, and you want to draw some conclusions based on its probability density function. Assuming the data is normally distributed, a basic thing to do is to estimate the mean and the standard deviation, since those are the only two parameters you need to fit a normal distribution.
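As a minimal sketch of that parametric step, assuming a synthetic normal sample in place of real data (the array `data` and its parameters are made up for illustration), `scipy.stats.norm.fit` returns the maximum likelihood estimates of the two parameters:

```python
import numpy as np
from scipy import stats

# Hypothetical sample: 1000 draws from a normal distribution.
rng = np.random.default_rng(0)
data = rng.normal(loc=0.5, scale=2.0, size=1000)

# norm.fit returns (loc, scale), i.e. the estimated mean and standard deviation.
mu_hat, sigma_hat = stats.norm.fit(data)

print(f"estimated mean: {mu_hat:.3f}, estimated std: {sigma_hat:.3f}")
```

The estimates coincide with `data.mean()` and `data.std()` (the population formula, `ddof=0`).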

However, sometimes you might not be that happy with the fitting results. That might be because your sample does not look exactly bell-shaped, and you are wondering what would have happened if the simulation you ran had taken this fact into account.

For instance, take stock returns: we know they are not normally distributed, and there is also the “fat tails” problem to take into account. At the very least, it would be interesting to estimate a probability density function and then compare it to the parametric pdf you used before.

Below is a simple example of how I estimated the pdf of a random variable using gaussian_kde from scipy.
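A minimal sketch of such an estimate follows. It assumes a synthetic heavy-tailed sample (scaled Student's t draws) as a stand-in for real data such as stock returns; the variable names and the choice of distribution are illustrative assumptions, and the bandwidth is left at the gaussian_kde default.

```python
import numpy as np
from scipy import stats

# Hypothetical stand-in for the data: heavy-tailed "returns"
# drawn from a scaled Student's t distribution with 3 degrees of freedom.
rng = np.random.default_rng(42)
sample = 0.01 * rng.standard_t(df=3, size=2000)

# Parametric fit: mean and standard deviation of a normal distribution.
mu, sigma = stats.norm.fit(sample)

# Non-parametric fit: Gaussian kernel density estimate (default bandwidth).
kde = stats.gaussian_kde(sample)

# Evaluate both densities on a common grid spanning the sample.
grid = np.linspace(sample.min(), sample.max(), 500)
kde_pdf = kde(grid)
normal_pdf = stats.norm.pdf(grid, loc=mu, scale=sigma)

# Sanity check: probability mass of the KDE over the sample range (close to 1).
kde_area = kde.integrate_box_1d(sample.min(), sample.max())
print(f"KDE mass on the sample range: {kde_area:.4f}")
```

The arrays `grid`, `kde_pdf` and `normal_pdf` can be passed to any plotting library (for example matplotlib's `plot`) to draw the two curves on the same axes.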

and here is the plot that we get


What we can see is that the estimated pdf is “more dense” around the mean and has more density in the tails than the parametric fit.

Let’s use the estimated pdf to simulate a sample.
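One way to do this, sketched here under the same assumptions as before (a synthetic heavy-tailed sample standing in for the real data, with hypothetical variable names), is the `resample` method of gaussian_kde, which draws new points from the estimated density:

```python
import numpy as np
from scipy import stats

# Hypothetical stand-in for the observed sample (heavy-tailed, synthetic).
rng = np.random.default_rng(0)
observed = 0.01 * rng.standard_t(df=3, size=2000)

kde = stats.gaussian_kde(observed)

# resample returns an array of shape (d, size); d = 1 for one-dimensional data,
# so the first row holds the simulated values.
simulated = kde.resample(size=5000, seed=1)[0]

# Compare a few quantiles of the observed and the simulated samples.
probs = [0.01, 0.5, 0.99]
observed_q = np.quantile(observed, probs)
simulated_q = np.quantile(simulated, probs)
for p, o, s in zip(probs, observed_q, simulated_q):
    print(f"quantile {p:.2f}: observed {o:+.4f}, simulated {s:+.4f}")
```

Comparing the tail quantiles is a quick check of whether the simulated sample reproduces the extreme values seen in the data.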

Sure enough, this randomly generated sample contains some extreme values, and it looks close to the actual sample.


Hope this was interesting and useful.


This article is for educational purposes only. The author is not responsible for any consequence or loss due to inappropriate use. The article may well contain mistakes and errors. The numbers used might not be accurate. You should never use this article for anything other than educational purposes.


  1. Thanks. It's a nice post about probability density function. I really like it :). It's really helpful. Good job.

  2. Can a probability density function take values greater than 1?

  3. What's the format of the data file?
