Monday, 21 July 2014

Copula functions in R

A copula is a function which "couples" (joins) a multivariate distribution to its univariate margins (marginal distributions).
Copulas can be really helpful for building multivariate distributions from given marginals. Here is a quick introduction to copulas.


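A copula C (in the bivariate case) can be defined as follows:

$$C : I^2 \to I, \qquad C(u, 0) = C(0, v) = 0, \qquad C(u, 1) = u, \qquad C(1, v) = v,$$

where I is the interval [0,1] and C is 2-increasing, that is, C assigns a non-negative probability to every rectangle inside I^2.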


Archimedean copulas are a particular class of copulas that can be built from a function phi, known as the copula generator, through the following relation:

$$C(u, v) = \phi^{-1}\left(\phi(u) + \phi(v)\right)$$
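As a rough illustration of how a generator produces a copula, here is a minimal base-R sketch. The helper name archimedean_copula is made up for this example and does not come from any package. The generator used, phi(t) = -log(t), is the simplest choice and yields the independence copula.

```r
# Build a bivariate Archimedean copula from a generator phi and its inverse:
# C(u, v) = phi_inv(phi(u) + phi(v))
# (archimedean_copula is a hypothetical helper name, not a package function)
archimedean_copula <- function(phi, phi_inv) {
  function(u, v) phi_inv(phi(u) + phi(v))
}

# Example generator: phi(t) = -log(t), with inverse exp(-s).
# This choice gives the independence copula C(u, v) = u * v.
indep_copula <- archimedean_copula(
  phi     = function(t) -log(t),
  phi_inv = function(s) exp(-s)
)

indep_copula(0.3, 0.7)  # equals 0.3 * 0.7 up to floating-point error
```

Swapping in a different generator and its inverse gives a different member of the Archimedean family.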



In this post, we are going to see the main formulas for using a particular Archimedean copula in R: the Gumbel copula.
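The Gumbel copula is built using the generator function below

$$\phi(t) = (-\ln t)^{\theta}, \qquad \theta \ge 1,$$

and has the following expression

$$C(u, v) = \exp\left(-\left[(-\ln u)^{\theta} + (-\ln v)^{\theta}\right]^{1/\theta}\right)$$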
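To make the construction concrete, here is a minimal base-R sketch that evaluates the Gumbel copula directly from its generator, assuming the usual form phi(t) = (-ln t)^theta with theta >= 1. The name gumbel_cdf is hypothetical and chosen only for this illustration.

```r
# Gumbel copula CDF built from its generator (base R only).
# gumbel_cdf is a hypothetical name used for illustration; theta >= 1.
gumbel_cdf <- function(u, v, theta) {
  stopifnot(theta >= 1)
  phi     <- function(t) (-log(t))^theta      # generator
  phi_inv <- function(s) exp(-s^(1 / theta))  # inverse generator
  phi_inv(phi(u) + phi(v))
}

gumbel_cdf(0.3, 0.7, theta = 1)  # theta = 1 reduces to u * v
gumbel_cdf(0.3, 0.7, theta = 2)  # larger than u * v: positive dependence
gumbel_cdf(0.3, 1.0, theta = 2)  # margin check: C(u, 1) = u
```

The last call is a quick sanity check that the margins are uniform, which any copula must satisfy.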

The R package that provides some useful functions for the Gumbel copula is called gumbel. You need to install it and then load it like this:
library(gumbel)

Once it is loaded, here are some basic functions.
Here is the density function:

#plot the density
x <- seq(.01, .99, length = 50)
y <- x
z <- outer(x, y, dgumbel, alpha=2)
persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = "lightgreen",
      ltheta = 100, ticktype = "detailed",
      xlab = "x", ylab = "y", zlab = "Density of the Gumbel copula")
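For readers curious about what is being plotted, the copula density is the mixed partial derivative of C(u, v) with respect to u and v. Below is a base-R sketch of the resulting closed form for the bivariate Gumbel copula. The name gumbel_density is hypothetical, and this is my own write-up of the formula, not the code used inside the gumbel package.

```r
# Closed-form density of the bivariate Gumbel copula (base R only).
# gumbel_density is a hypothetical name, not the package's dgumbel.
gumbel_density <- function(u, v, theta) {
  x <- -log(u)
  y <- -log(v)
  A <- x^theta + y^theta
  C <- exp(-A^(1 / theta))  # the copula CDF itself
  C * (x * y)^(theta - 1) / (u * v) *
    A^(1 / theta - 2) * (A^(1 / theta) + theta - 1)
}

gumbel_density(0.3, 0.7, theta = 1)  # independence: the density is 1
gumbel_density(0.9, 0.9, theta = 2)  # above 1: mass concentrates near the diagonal
gumbel_density(0.9, 0.1, theta = 2)  # below 1: mass thins out away from it
```

Comparing the last two calls shows why the surface rises along the diagonal, and especially in the upper corner.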


And here is the cumulative distribution function (CDF):

#plot the CDF (x and y are the grids defined above)
z <- outer(x, y, pgumbel, alpha=2)
persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = "lightgreen",
      ltheta = 100, ticktype = "detailed",
      xlab = "u", ylab = "v", zlab = "Cumulative distribution function")


Finally, we are going to take a look at the random number generating function. The strength of the dependence simulated by the Gumbel copula depends only on the parameter theta (the alpha argument in the functions used above). As theta increases, so does the dependence between observations. For the Gumbel copula theta lies in [1,Inf), and when theta is equal to 1 we fall back to the independence case, shown here below:

#we simulate 2000 observations with theta = 1
#rgumbel returns one observation per row, so we transpose to get one per column
r_matrix <- t(rgumbel(2000, 1))
plot(r_matrix[1,], r_matrix[2,], col="blue", main="Gumbel, independence case")
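One way to put a number on how the dependence grows with theta is Kendall's tau, which for the Gumbel copula equals 1 - 1/theta. The small base-R sketch below tabulates it. The helper name gumbel_tau is hypothetical and used only for illustration.

```r
# Kendall's tau implied by a Gumbel copula with parameter theta >= 1.
# gumbel_tau is a hypothetical helper name used for illustration.
gumbel_tau <- function(theta) {
  stopifnot(all(theta >= 1))
  1 - 1 / theta
}

gumbel_tau(c(1, 2, 3))  # 0 at independence, then increasing towards 1
```

On simulated data, the empirical counterpart can be obtained with cor(x, y, method = "kendall") and should land close to this value for a large enough sample.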


When we increase theta we obtain a different result, as expected:

#we simulate 2000 observations with theta = 2
r_matrix <- t(rgumbel(2000, 2))
plot(r_matrix[1,], r_matrix[2,], col="blue", main="Gumbel, positive dependence")
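The way points tend to cluster in the upper-right corner of such a plot can also be quantified. For the Gumbel copula the upper tail dependence coefficient is 2 - 2^(1/theta), while the lower tail dependence coefficient is 0. The base-R sketch below checks both numerically along the diagonal, where C(q, q) = q^(2^(1/theta)). The names gumbel_upper_tail and C_diag are hypothetical helpers introduced for this example.

```r
# Upper tail dependence coefficient of the Gumbel copula (theta >= 1).
# gumbel_upper_tail and C_diag are hypothetical helper names.
gumbel_upper_tail <- function(theta) {
  stopifnot(all(theta >= 1))
  2 - 2^(1 / theta)
}

gumbel_upper_tail(c(1, 2, 3))  # 0 at independence, growing with theta

# Numerical check along the diagonal for theta = 2
theta  <- 2
C_diag <- function(q) q^(2^(1 / theta))  # C(q, q) for the Gumbel copula

q <- 0.999
upper <- (1 - 2 * q + C_diag(q)) / (1 - q)  # P(V > q | U > q), near the coefficient
p <- 0.001
lower <- C_diag(p) / p                      # P(V < p | U < p), shrinks to 0 as p -> 0
c(upper = upper, lower = lower)
```

The gap between the two conditional probabilities is what "asymmetric dependence" means here: joint large values are much more likely than joint small ones.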

You can see that the Gumbel copula can be used to simulate positive and asymmetric dependence: the correlation seems to be stronger for larger values. Below is an example for theta = 3.



On YouTube, I uploaded a simple animation created with R and Windows Movie Maker. You can watch it in the embedded video below.



Hope this was useful.
