The Beginner Programmer: February 2015

Sunday, 22 February 2015

A plus: 2 pistons engine, get the timing right!

This is a nice add-on to my previous project: below is a video where I simulate two pistons working together (driving a common crankshaft, for instance). To do this effectively, one has to work out the right timing for each piston. In this example it is easily done by shifting the crank angle of piston 2 by 180 degrees (pi radians). A nice exercise would be to add more pistons.
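The phase shift can be sketched in a few lines of Python. This is only a minimal sketch and not the code behind the video: the function names and the values of the crank radius a and the rod length b are made up for illustration, and spacing the cranks evenly over one revolution is an assumption (real engines pick the offsets according to the firing order).

```python
import math

def piston_position(alpha, a=1.0, b=3.0):
    """Piston position along the cylinder axis for crank angle alpha (radians).

    a is the crank radius and b the connecting rod length (hypothetical values, b > a).
    """
    return a * math.cos(alpha) + math.sqrt(b**2 - (a * math.sin(alpha))**2)

def engine_positions(alpha, n_pistons=2, a=1.0, b=3.0):
    """Positions of n pistons whose cranks are evenly spaced over one revolution."""
    offsets = [2.0 * math.pi * k / n_pistons for k in range(n_pistons)]
    return [piston_position(alpha + offset, a, b) for offset in offsets]

# With two pistons the second crank is shifted by pi radians (180 degrees):
# when piston 1 is at the top of its stroke, piston 2 is at the bottom.
p1, p2 = engine_positions(0.0)
print(p1, p2)
```

Calling engine_positions with n_pistons=4 would be one way to start on the "more pistons" exercise.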

Here is the animation on YouTube:

Crankshaft connecting rod and piston mechanism simulation with Python

What is a crankshaft, connecting rod and piston mechanism? It is basically a mechanical linkage which converts rotational motion into reciprocating motion, and vice versa. The applications are pretty vast; car engines are one of the most obvious I can recall right now.
It turns out the physics behind this mechanism is pretty interesting and easy to code, therefore I thought I could give it a go and try to make a simulation.
First of all we need to define the problem and solve it:

These are the basics from which we start: the crankshaft (rod ‘a’), the connecting rod (rod ‘b’) and the piston, whose position is denoted by c. The motion of rod ‘a’ is a pure rotation, while the motion of rod ‘b’ is somewhat more complex. The piston performs a linear motion since it is constrained to the real axis.
Now you might ask yourself why I am using the complex plane… It turns out that you can represent each vector (a, b and c) with a complex number, and this simplifies our problem into a more manageable system of two equations.
By using vector properties we can easily write:

The vector equation above states that the position c is the sum of a and b, as can easily be seen in the picture. Now, we can get our real and imaginary coordinates in the plane by using Euler’s formula:
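For reference, the system can be written out in LaTeX as follows. This is a sketch under my reading of the text: a and b denote the (constant) rod lengths, alpha and beta the angles of the two rods with the real axis, and c is real because the piston stays on the real axis.

```latex
% Vector (complex) form: the piston position is the sum of the two rods
c = a\,e^{i\alpha} + b\,e^{i\beta}

% Euler's formula, e^{i\theta} = \cos\theta + i\sin\theta, gives two real equations
\begin{cases}
c = a\cos\alpha + b\cos\beta & \text{(real part)} \\
0 = a\sin\alpha + b\sin\beta & \text{(imaginary part)}
\end{cases}
```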

Note that every one of these parameters is a function of time and, assuming that alpha(t) and the lengths of the rods a and b are known, the system can be solved for c and beta. Furthermore, by differentiating the original equation we can obtain velocity and acceleration at each time t: if alpha(t) is known, then its derivatives are known too (assuming alpha is twice differentiable with respect to t).
Here are velocity and acceleration (vectors), respectively:
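As a sketch, assuming the position equation has the form c = a e^{i alpha} + b e^{i beta} with constant rod lengths a and b, differentiating once and twice with respect to time gives:

```latex
% Velocity: first time derivative of c = a e^{i\alpha} + b e^{i\beta}
\dot{c} = i\,a\,\dot{\alpha}\,e^{i\alpha} + i\,b\,\dot{\beta}\,e^{i\beta}

% Acceleration: second time derivative (product rule on each term)
\ddot{c} = a\left(i\,\ddot{\alpha} - \dot{\alpha}^{2}\right)e^{i\alpha}
         + b\left(i\,\ddot{\beta} - \dot{\beta}^{2}\right)e^{i\beta}
```

With a constant angular velocity the term in \ddot{\alpha} vanishes.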

Now, for the sake of this example, I am assuming a constant angular velocity for the crankshaft; however, that is not necessary. For instance, one could try a constant angular acceleration instead.
Given my assumption, alpha(t) is known and the initial angle can be chosen later. Now we can solve our first system for c and beta. The solution looks something like this:
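In code, one way to solve the system is sketched below. It assumes the real and imaginary equations c = a cos(alpha) + b cos(beta) and 0 = a sin(alpha) + b sin(beta), picks the branch where the rod points towards the piston (cos(beta) > 0), and uses made-up values for a, b, the angular velocity omega and the initial angle alpha0. The function name solve_mechanism is hypothetical.

```python
import cmath
import math

def solve_mechanism(alpha, a, b):
    """Return (beta, c) for crank angle alpha, crank length a and rod length b (b > a)."""
    # Imaginary part: a*sin(alpha) + b*sin(beta) = 0
    beta = math.asin(-a * math.sin(alpha) / b)
    # Real part: c = a*cos(alpha) + b*cos(beta)
    c = a * math.cos(alpha) + b * math.cos(beta)
    return beta, c

# Hypothetical values: constant angular velocity omega, initial angle alpha0
a, b = 1.0, 3.0
omega, alpha0 = 2.0 * math.pi, 0.0  # one revolution per unit of time

for t in (0.0, 0.125, 0.25, 0.5):
    alpha = omega * t + alpha0
    beta, c = solve_mechanism(alpha, a, b)
    # Sanity check: the complex sum of the two rods should land on the real axis at c
    closure = a * cmath.exp(1j * alpha) + b * cmath.exp(1j * beta)
    print(f"t={t:.3f}  beta={beta:+.4f}  c={c:.4f}  error={abs(closure - c):.1e}")
```

The closure error printed in the last column is a cheap way to verify that beta and c really satisfy the vector equation.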

OK, so far so good. Now that we have the position at each time, we just need to translate everything into a language that Python can understand. I used a class; however, you can easily avoid it since it is not really necessary.

Once we have coded all of the above, we can create a My_mechanism instance and call its methods. Be sure not to call all the methods at once, since it will not run them all: call one method at a time.

Here below are the videos I made using the animations of the mechanism:

Hope this was interesting.

Another Excel spreadsheet: Savings with LED lights

I recently uploaded a new Excel spreadsheet where I made a calculation of how much can be saved by replacing neon tubes with LED tubes.

As you might have noticed, I do like using Excel for calculations too, and I set up a page where some of my little projects and calculations with Excel are shared. Here it is: Excel page.

Tuesday, 17 February 2015

Some exercises with plots and matplotlib on currencies

EDIT: Apparently, some of the prices in the .csv files I used were missing, and this caused some problems with the pandas DataFrame, since pandas represents missing values as NaN. Should you encounter the same problem, you could check every line with an if statement or use a method to replace the NaN values. You can check the pandas documentation here.
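A minimal sketch of the two usual ways to deal with missing prices in pandas is shown here. The prices and column names are made up, and whether dropping or forward-filling is appropriate depends on what you do with the data afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical daily prices with one missing value
prices = pd.DataFrame(
    {"EUR": [1.14, np.nan, 1.13, 1.12], "GBP": [1.54, 1.53, 1.54, 1.55]},
    index=pd.date_range("2015-02-09", periods=4, freq="D"),
)

print(prices.isna().sum())  # number of missing values per column

dropped = prices.dropna()   # option 1: discard the rows with missing values
filled = prices.ffill()     # option 2: carry the last known price forward
```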

Yesterday I was really bored and had some time which could be put to good use, therefore I decided to write a quick script to plot percentage change of currencies of some countries and compare them. The code I ended up with is a bit sloppy I guess, but that’s fine since I was primarily interested in improving my (limited, as of now) use of pandas and having fun.

First of all I gathered data from Quandl, namely the prices of the selected currencies in terms of dollars, that’s to say the value per day of every selected currency with respect to the dollar:

XYZ/USD (daily)

Gathering data from Quandl is really easy and fast using the Quandl API for Python. By the way, an API for R is available too.

I then computed the percentage change for 2 years and defined some plotting functions. Here is the main result the plotting function produces given a reference currency: it plots the percentage change for every currency (with respect to the dollar) against the percentage change of the reference currency. Some of the plotted data looks definitely weird, I wonder if I did something wrong or lost some information during the process.
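The percentage change itself can be computed in a couple of lines. This is a sketch with made-up prices; I show both the cumulative change with respect to the first observation and the change between consecutive observations, since either could be what you want to plot.

```python
import pandas as pd

# Hypothetical XYZ/USD prices (three observations only, for illustration)
prices = pd.DataFrame(
    {"EUR": [1.25, 1.20, 1.10], "GBP": [1.60, 1.64, 1.52]},
    index=pd.to_datetime(["2013-02-15", "2014-02-14", "2015-02-13"]),
)

# Cumulative percentage change with respect to the first observation
cumulative_pct = (prices / prices.iloc[0] - 1.0) * 100.0

# Percentage change between consecutive observations (first row is NaN)
step_pct = prices.pct_change() * 100.0

print(cumulative_pct)
print(step_pct)
```

Note that pct_change leaves a NaN in the first row, which is worth remembering before plotting.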

Here is the code I used:

Disclaimer
This article is for educational purposes only. The author is not responsible for any consequence or loss due to inappropriate use. The article may well contain mistakes and errors. The data used might not be accurate. You should never use this article for purposes different from the educational one.

Saturday, 14 February 2015

CopulaClass a Python class for using copulas: a fitting example

As I have already said in my previous post, Copulalib is really user-friendly and it is difficult to write something easier; however, I thought I might give it a try.

This class is built around Copulalib and since it is to be used with 2-dimensional copulas, it implements plots for data visualization and some other functions such as:
-showAvailableCopulas() a method to show visually what copulas are included in the package
-generateCopula() a method to generate the copula
-printCorrelation() this method prints out Spearman’s rho, Kendall’s tau and the fitted parameter
-getSimulatedData() this method retrieves your simulated data from the copula assuming your original data is normally distributed. It would be nice to implement some tool which could figure out the most likely distribution of your data and then use it to get the simulated observations. Perhaps in the future I’ll do it.
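The idea behind getSimulatedData() can be sketched as follows. This is not the class code, just an illustration of the normality assumption: the pseudo-observations coming from the copula live in (0, 1), and they are mapped back to the scale of the original data with the inverse CDF of a normal distribution fitted to that data. The function name and the numbers are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def to_data_scale(pseudo_obs, original):
    """Map pseudo-observations in (0, 1) back to the scale of `original`,
    assuming `original` is normally distributed."""
    marginal = NormalDist(mean(original), stdev(original))
    # inv_cdf raises an error for values that are not strictly between 0 and 1
    return [marginal.inv_cdf(u) for u in pseudo_obs]

original_x = [0.8, 1.1, 0.9, 1.3, 1.0, 0.7, 1.2]  # hypothetical sample
simulated_u = [0.1, 0.5, 0.9]                      # hypothetical copula draws
simulated_x = to_data_scale(simulated_u, original_x)
print(simulated_x)
```

Swapping NormalDist for another fitted distribution is where the "figure out the most likely distribution" idea would plug in.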

Furthermore, the class does not mind whether you feed in Python lists or numpy arrays, as it turns x and y into numpy arrays. Be careful, however: it does not check whether your lists/arrays are of the same length.
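If you want the missing length check, a small helper along these lines could be called before building the copula. This is a sketch, and as_paired_arrays is a hypothetical name, not a method of the class.

```python
import numpy as np

def as_paired_arrays(x, y):
    """Convert x and y to 1-D float arrays and check that they have equal length."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    if x.size != y.size:
        raise ValueError(
            f"x and y must have the same length, got {x.size} and {y.size}"
        )
    return x, y

# Lists and arrays can be mixed freely
x, y = as_paired_arrays([1, 2, 3], np.array([4.0, 5.0, 6.0]))
```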

Anyway, as for the result of the testing script below, the fitting of the Frank copula to the data seems to have been successful: our simulated data seems to fit the real data quite nicely:

Originally our data was (very) approximately normally distributed, with some sort of positive correlation as you can clearly see from the plots below

Here below you can see 1000 simulated pseudo-observations from the Frank copula
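For the curious, pseudo-observations from a Frank copula can also be simulated by hand with the conditional-inversion method. The sketch below is not what Copulalib does internally (I have not checked), and theta=5.0 is a made-up value rather than the fitted parameter.

```python
import math
import random

def sample_frank(n, theta, seed=None):
    """Draw n pairs (u, v) from a Frank copula with parameter theta != 0
    using the conditional-inversion method."""
    rng = random.Random(seed)
    d = math.exp(-theta) - 1.0
    pairs = []
    for _ in range(n):
        u, w = rng.random(), rng.random()
        # Solve dC(u, v)/du = w for v, where C is the Frank copula
        v = -math.log1p(w * d / (w + (1.0 - w) * math.exp(-theta * u))) / theta
        pairs.append((u, v))
    return pairs

pairs = sample_frank(1000, theta=5.0, seed=0)  # theta=5.0 is a hypothetical value
```

A positive theta gives positive dependence between u and v, a negative one gives negative dependence.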

Below you can find the code I used to generate this simple model:

And here are the correlation details

Copulalib: How to use copulas in Python

When dealing with copulas, R is a better option in my opinion. However, what can you do if you wish to use Python instead? There’s a good starting package called Copulalib which you can easily download here.

The package is really simple to use and very user-friendly: it basically handles everything (pseudo-observations etc.) once you have fed in the raw data. There is a simple example of implementation on the download page. Once the data has been fed into the function, the fitting is done automatically and the following parameters are generated:
-Spearman’s rho
-Kendall’s tau
-Theta (the parameter of the copula)
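To make the first two quantities concrete, here is a sketch of the textbook sample definitions of Spearman’s rho and Kendall’s tau in plain Python. It assumes there are no ties in the data and it is not necessarily how Copulalib computes them; the function names and the sample are made up.

```python
from itertools import combinations

def ranks(values):
    """Ranks from 1 to n (assumes no ties)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rho from the squared rank differences (no ties)."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n**2 - 1))

def kendall_tau(x, y):
    """Kendall's tau: (concordant pairs - discordant pairs) / total pairs (no ties)."""
    score = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        score += (s > 0) - (s < 0)
    n = len(x)
    return score / (n * (n - 1) / 2)

x = [1.0, 2.0, 3.0, 4.0]  # hypothetical sample
y = [1.0, 3.0, 2.0, 4.0]
print(spearman_rho(x, y), kendall_tau(x, y))
```

The pairwise loop in kendall_tau is quadratic in the sample size, which is fine for a few hundred observations.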

As far as I understand the package, only Frank, Gumbel and Clayton copulas are available. This could of course be a limitation, but it is for sure a good start. Another problematic point is that multidimensional copulas do not seem to be supported.

Now for the real complaints: for some reason once the sample size is larger than 300 observations per variable (say 300 x and 300 y) the script raises an error saying that x and y must be of the same dimensions which is strange since they are already of the same size. Anyway maybe I did something incorrect.

Here below is the short piece of code which generated the plots of the data and of the available copulas

Next I’m going to post a class for copulas.

Tuesday, 10 February 2015

How to fit a copula model in R

I have been working on this topic for a long time and, to be honest, I find the R documentation not as user-friendly as the documentation for most Python modules. The fact that copulas are not the easiest model to grasp has contributed to further delays too. But the biggest obstacle was mainly the lack of examples and users of these models. Then again, I might have looked in the wrong places; if you have any good resource to suggest, please feel free to leave a comment. At the bottom of this page I’ll post some links that I found very useful.

If you are new to copulas, perhaps you’d like to start with an introduction to the Gumbel copula in R here.

The package I am going to be using is the copula package, a great tool for using copulas in R. You can easily install it through RStudio.

The dataset
For the purpose of this example I used a simple dataset of returns for stock x and y (x.txt and y.txt). You can download the dataset by clicking here. The dataset is given merely for the purpose of this example.

First of all we need to load the data and convert it into a matrix format. Optionally one can plot the data. Remember to load the copula package with library(copula)

The plot of the data

Now we have our data loaded, we can clearly see that there is some kind of positive correlation.

The next step is the fitting. In order to fit the data we need to choose a copula model. The model should be chosen based on the structure of the data and other factors. As a first approximation, we may say that our data shows a mild positive correlation, therefore a copula which can replicate such mild correlation should be fine. Be aware that it is easy to make mistakes with copula models and that this visual approach is not always the best option. Anyway, I chose to use a normal copula from the package. The fitting process is identical for the other types of copula.

Let’s fit the data

Note that the data must be fed through the function pobs(), which converts the real observations into pseudo-observations in the unit square [0,1]^2.
Note also that we are using the “ml” method (maximum likelihood method) however other methods are available such as “itau”.
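To see what pobs() does to each variable, here is a small Python sketch of the same transformation: every observation is replaced by its rank divided by n + 1, so all values fall strictly between 0 and 1. The Python function is a made-up stand-in that assumes no ties in the data, and the returns are hypothetical.

```python
def pobs(sample):
    """Pseudo-observations: rank / (n + 1), with ranks from 1 to n (assumes no ties)."""
    n = len(sample)
    order = sorted(range(n), key=sample.__getitem__)
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u

returns_x = [0.012, -0.004, 0.021, 0.003]  # hypothetical returns
print(pobs(returns_x))  # → [0.6, 0.2, 0.8, 0.4]
```

Dividing by n + 1 rather than n keeps the largest observation away from 1, which matters when evaluating the copula density at the boundary.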

The parameter of the fitted copula, rho, in our case is equal to 0.7387409. Let’s simulate some pseudo observations
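The simulation step can also be sketched outside R. The snippet below is a plain Python illustration of how a bivariate normal copula with parameter rho generates pseudo-observations: draw two correlated standard normals and push each through the standard normal CDF. It is not the copula package, and the function name, sample size and seed are made up.

```python
import math
import random
from statistics import NormalDist

def sample_normal_copula(n, rho, seed=None):
    """Draw n pairs (u, v) from a bivariate normal copula with parameter rho."""
    rng = random.Random(seed)
    phi = NormalDist().cdf  # standard normal CDF
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Correlate the second normal with the first, then map both to (0, 1)
        z2 = rho * z1 + math.sqrt(1.0 - rho**2) * z2
        pairs.append((phi(z1), phi(z2)))
    return pairs

simulated = sample_normal_copula(500, rho=0.7387409, seed=1)
```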

By plotting the pseudo and simulated observations we can see how the simulation with the copula matches the pseudo observations

This particular copula might not be the best since it shows a heavy tail correlation which is not that strong in our data, however it’s a start.

Optionally, at the beginning we could have plotted the data together with the distribution of each random variable, as below

And get this beautiful representation of our original dataset

Now for the useful documentation:

Copula package official documentation:
http://cran.r-project.org/web/packages/copula/copula.pdf

R blogger article on copulas
http://www.r-bloggers.com/copulas-made-easy/

An interesting question on CrossValidated
http://stats.stackexchange.com/questions/90729/generating-values-from-copula-using-copula-package-in-r

A paper on copulas and the copula package
http://www.jstatsoft.org/v21/i04/paper

That’s all for now.
