Sunday, 20 July 2014

Simulating random points with Python

Randomness and order are fascinating topics. Even though randomness might seem like a relatively easy idea, it is actually much more complicated than one might think.
This video is really interesting, and if you are interested in randomness and random variables you should definitely check it out: https://www.youtube.com/watch?v=nAxEzxHkqyY


Computers can simulate random values if ordered to do so. Although they appear random, these values are not really random: they are generated through a deterministic procedure and can be replicated. This feature can be useful if you run a simulation with random numbers and later want to replicate it with the very same "random" numbers. Remember the line set.seed(a) from R? The Python equivalent is random.seed(a). If you run this line before the rest of your script, Python generates its random numbers in a way determined by the parameter a, which is typically an integer (floats and strings are accepted too). The next time you run the script, if you want to get the same "random" numbers, you just need to call random.seed(a) again with the same value of a you used the last time.
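Here is a minimal sketch of that idea using random.seed from Python's standard library. The helper name draw and the seed values 42 and 43 are arbitrary choices made only for this illustration.

```python
import random

def draw(seed, n=5):
    """Seed the generator, then return n uniform draws from [0, 1)."""
    random.seed(seed)
    return [random.random() for _ in range(n)]

first = draw(42)
second = draw(42)   # same seed, so the same sequence comes back
other = draw(43)    # a different seed gives a different sequence

print(first == second)
print(first == other)
```

The first comparison holds because seeding resets the generator to the same internal state, so the draws that follow are identical.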

Simulating values can be very useful. Say, for instance, that you know the distribution of a random variable: for example, you know that each number on a die has the same probability of showing. You can then simulate that random variable. Python and R have many interesting functions to generate random numbers. In the future I am going to write a post comparing Python and R. Today I am going to show you just some basic random number generating functions in Python.
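The die example can be sketched in a few lines with random.randint, which includes both endpoints. The number of rolls and the seed below are arbitrary assumptions; with a fair die each relative frequency should come out close to 1/6.

```python
import random
from collections import Counter

random.seed(1)  # arbitrary seed, only so the run can be repeated

n_rolls = 60000
# Each roll is an integer from 1 to 6, all equally likely
rolls = [random.randint(1, 6) for _ in range(n_rolls)]
counts = Counter(rolls)

# Print the relative frequency of each face
for face in range(1, 7):
    print(face, counts[face] / n_rolls)
```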

Random uniform distribution:
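The probability density function of the uniform distribution is the following: f(x) = 1/(b - a) for a <= x <= b, and f(x) = 0 otherwise.
The probability mass is uniformly distributed over the interval [a,b]. In Python the function random.uniform(a,b) generates a random value x with a <= x <= b (whether the endpoint b can actually be returned depends on floating-point rounding). Let's see an example where a = 0 and b = 1: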

import random
from matplotlib import pyplot as plt

# Random uniform sample: two independent vectors of 10000 draws from [0, 1]
sample1 = []
sample2 = []

i = 0
while i < 10000:
    sample1.append(random.uniform(0,1))
    sample2.append(random.uniform(0,1))
    i += 1

# Scatter plot of one sample against the other ("bo" = blue circles)
plot = plt.plot(sample1,sample2, "bo")
plt.show()

Here is the result if we plot the two random vectors we have generated above:
In fact, it could be fun to combine the random number generating functions with one another and plot the results. Here are some examples:

sample1 = []
sample2 = []

i = 0
while i < 2000:
    sample1.append(random.gauss(0,1))
    sample2.append(random.uniform(1,3))
    i += 1

plot = plt.plot(sample1,sample2, "bo")
plt.show()
and the result:
In this example we combine a uniform distribution with a Gaussian one. Let's see a final example that combines a gamma distribution (random.gammavariate) with a Gaussian one:
sample1 = []
sample2 = []

i = 0
while i < 2000:
    sample1.append(random.gauss(0,1))
    sample2.append(random.gammavariate(2,1))
    i += 1

plot = plt.plot(sample1,sample2, "bo")
plt.show()
and the result:


With this method you can more or less simulate different behaviours of random variables by combining different distributions. Perhaps I'll post more examples when I write about the random module in Python and R.
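The repeated loops above could be wrapped in a small helper. This is only a sketch: the name paired_samples and its signature are made up for this post and are not part of the random module. It takes two zero-argument functions, one per axis, and draws n values from each.

```python
import random
import statistics

def paired_samples(draw_x, draw_y, n):
    """Draw n values from each generator and return them as two lists."""
    xs = [draw_x() for _ in range(n)]
    ys = [draw_y() for _ in range(n)]
    return xs, ys

random.seed(0)  # arbitrary seed, only so the run can be repeated

# Gaussian on one axis, gamma (shape 2, scale 1) on the other
xs, ys = paired_samples(
    lambda: random.gauss(0, 1),
    lambda: random.gammavariate(2, 1),
    2000,
)

# Sample means should be near the theoretical ones: 0 and shape * scale = 2
print(statistics.mean(xs), statistics.mean(ys))
```

The two lists can be passed straight to plt.plot(xs, ys, "bo") to get the same kind of scatter plot as before. Note that the two coordinates are drawn independently, so this shows how the marginals combine but not any dependence between them.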

Another useful tool for studying random variables and their joint behaviour is the copula. A copula is a function which joins together several marginal CDFs (Cumulative Distribution Functions) and returns the joint CDF. It enables you to express the joint cumulative distribution of two or more random variables as a function of their marginals. As far as I know, only R has some functions for copulas; perhaps I'll make a post on it in the future.
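To make the idea a bit more concrete, here is a rough hand-rolled sketch of sampling from a Gaussian copula using only the standard library (statistics.NormalDist needs Python 3.8 or later). It is not a copula library: the function name gaussian_copula_pair, the correlation 0.8, and the two marginals are all assumptions picked for illustration. The recipe is to draw two correlated standard normals, push each through the normal CDF to get dependent uniforms, then apply the inverse CDF of whichever marginal you want.

```python
import math
import random
from statistics import NormalDist

def gaussian_copula_pair(rho):
    """Return (u, v), each uniform on [0, 1], linked by a Gaussian copula."""
    z1 = random.gauss(0, 1)
    # z2 is standard normal with correlation rho to z1
    z2 = rho * z1 + math.sqrt(1 - rho ** 2) * random.gauss(0, 1)
    std_normal = NormalDist()
    return std_normal.cdf(z1), std_normal.cdf(z2)

random.seed(0)  # arbitrary seed, only so the run can be repeated
pairs = [gaussian_copula_pair(0.8) for _ in range(2000)]

# Give each coordinate its own marginal through an inverse CDF:
# x is exponential with rate 1, y is uniform on [1, 3].
xs = [-math.log(1 - u) for u, _ in pairs]
ys = [1 + 2 * v for _, v in pairs]
```

Plotting xs against ys with plt.plot(xs, ys, "bo") should show the points clustering along an increasing trend, unlike the independent samples in the earlier examples.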

You can check out this Wikipedia page for more information on copulas: http://en.wikipedia.org/wiki/Copula_%28probability_theory%29
