Monday, 24 August 2015

RandomForestClassifier on the cars dataset

Since the beginning of this summer I have been practicing a lot with Scikit-learn to improve my knowledge of Machine Learning in both theory and practice. Last week I also tried to tackle some of the Kaggle competitions, although they are really tough if you wish to get into the top 50 scores. Perhaps I will post something about the experience in the future.

Scikit-learn is an (almost) ready-to-use package for Machine Learning in Python. It is very user friendly, and in some cases little coding is needed to achieve interesting results, as with the cars dataset.

The cars dataset, from the UCI Machine Learning Repository, is a collection of about 1700 entries of cars, each with 6 features that can be easily recognized by name (buying, maint, doors, persons, lug_boot, safety). Check the dataset description for more detailed information. The feature to be predicted is “class” and the possible values are unacc, acc, good, vgood. Most of the features are categorical, therefore they need to be encoded into numbers. Pandas is great for quick feature encoding.
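As a rough illustration of the encoding step, here is a minimal sketch using pandas. The four rows are a hypothetical sample in the dataset's layout (not the real file), and the integer order chosen for the class labels is an assumption made for this example.

```python
import pandas as pd

# Hypothetical sample rows in the cars dataset layout (the real file has ~1700 rows).
rows = [
    ("vhigh", "vhigh", "2", "2", "small", "low", "unacc"),
    ("low", "med", "4", "4", "big", "high", "vgood"),
    ("med", "low", "5more", "more", "med", "med", "acc"),
    ("low", "low", "3", "4", "med", "high", "good"),
]
columns = ["buying", "maint", "doors", "persons", "lug_boot", "safety", "class"]
cars = pd.DataFrame(rows, columns=columns)

# One-hot encode the six categorical features: one indicator column per value.
X = pd.get_dummies(cars.drop(columns="class"))

# Map the target labels to integers (the order is an assumption for this sketch).
class_order = ["unacc", "acc", "good", "vgood"]
y = cars["class"].map({label: code for code, label in enumerate(class_order)})

print(X.shape)
print(y.tolist())
```

The resulting X and y are purely numeric, so they could then be handed to a classifier such as scikit-learn's RandomForestClassifier.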

Thursday, 20 August 2015

Using Arduino to measure friction coefficient

As a side project I decided to design a simple experiment and use an Arduino to measure the friction coefficient of an object sliding on a given material.

Ideally we would like our first object to slide (not roll) on a sheet of a given material as below:


If we know the angle (we can easily set it) and the mass of the wooden block, then the only unknown variable is the friction coefficient, and we can estimate it by measuring how long the block takes to cover a certain distance x.
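The estimate can be sketched in a few lines of Python. This assumes the block starts from rest and slides with constant kinetic friction, so that x = a t^2 / 2 and a = g (sin(theta) - mu cos(theta)), which gives mu = tan(theta) - 2x / (g t^2 cos(theta)). Under these assumptions the mass cancels out of the result. The function name and the sample measurement are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def friction_coefficient(angle_deg, distance_m, time_s):
    """Kinetic friction coefficient for a block sliding from rest down an incline."""
    theta = math.radians(angle_deg)
    # Uniform acceleration from rest: x = a * t^2 / 2, so a = 2x / t^2.
    acceleration = 2.0 * distance_m / time_s ** 2
    # Newton's second law along the incline: a = g * (sin(theta) - mu * cos(theta)).
    return math.tan(theta) - acceleration / (G * math.cos(theta))

# Hypothetical measurement: 30 degree incline, 1 m covered in 1 s.
mu = friction_coefficient(30.0, 1.0, 1.0)
print(round(mu, 3))
```

In a real run the time would come from the Arduino's sensors, and averaging several runs should reduce the effect of timing noise.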

European Option Pricing with Python, Java and C++

Please make sure to read the disclaimer at the bottom of the page before reading on.

Plain vanilla European call and put options are among the simplest financial derivatives. A European call (put) option is essentially a contract which, in exchange for a fee, gives you the right to buy (sell) a stock at a predetermined price, the strike price, at a future date. This kind of contract was originally intended as an insurance tool for companies to fix the selling price of their goods and hedge the price risk.

Options have some interesting features. For starters, if you are the buyer of a call option, the payoff is potentially unlimited and the loss is limited to the option price, as you can see in the payoff diagram below.


Of course, the reverse is true for the seller (huge downside and limited upside).
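The asymmetry between buyer and seller can be checked with a small sketch of the profit at expiry. The function names, the strike of 100 and the premium of 5 are hypothetical values chosen for illustration, and the fee is treated as a plain upfront cost with no discounting.

```python
def call_profit(spot_at_expiry, strike, premium):
    """Buyer's profit at expiry for a European call: payoff minus the fee paid."""
    return max(spot_at_expiry - strike, 0.0) - premium

def put_profit(spot_at_expiry, strike, premium):
    """Buyer's profit at expiry for a European put: payoff minus the fee paid."""
    return max(strike - spot_at_expiry, 0.0) - premium

# Hypothetical contract: strike 100, premium 5.
for spot in (80.0, 100.0, 105.0, 130.0):
    buyer = call_profit(spot, 100.0, 5.0)
    seller = -buyer  # the seller's profit mirrors the buyer's
    print(spot, buyer, seller)
```

The buyer of the call never loses more than the premium, while the profit keeps growing with the stock price; the seller's figures are the same with the sign flipped.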

Controlling lights according to sunlight using only a few electronic components

At the beginning of this summer, I was asked to provide a simple (and possibly cost-effective) solution to a simple problem: how can I make my garden LED lights turn themselves on and off according to the sunlight?

There are plenty of ready-to-use circuits and tools that one can use to answer this question; however, I decided to try and design something “new”, empowered by what I recently learned about NPN transistors, relays and LTspice IV. Let me walk you through my workflow:

The problem and the setting:


The LED lights which provide illumination in the garden are powered by a solar panel which charges the battery during the day. At night, the stored energy is used to power the lamp. The solar charge controller ensures that the battery charges and discharges smoothly and safely. Since the solar charge controller is very minimal, it does not have a timer or a switch to turn the lights on and off. A manual switch is therefore used instead (not shown in the picture and to be replaced by this project). The lamp is a 12V 4.5W LED lamp.

The solution

Friday, 14 August 2015

Basic Hidden Markov model

A hidden Markov model is a statistical model which builds upon the concept of a Markov chain.

The idea behind the model is simple: imagine your system can be modeled as a Markov chain and the signals emitted by the system depend only on the current state of the system. If the states of the system are not visible and all you can observe are the emitted signals, then this is a hidden Markov model.
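To make the idea concrete, here is a minimal simulation sketch. The two hidden states, the two signals and all the probabilities are made up for illustration; they do not come from any real system.

```python
import random

# Hypothetical two-state model; every probability here is an illustrative assumption.
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}
EMISSIONS = {
    "sunny": {"walk": 0.7, "read": 0.3},
    "rainy": {"walk": 0.1, "read": 0.9},
}

def draw(distribution, rng):
    """Sample one outcome from a {outcome: probability} dictionary."""
    outcomes = list(distribution)
    weights = [distribution[outcome] for outcome in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]

def simulate(n_steps, start_state, seed=0):
    """Return the hidden state sequence and the emitted signal sequence."""
    rng = random.Random(seed)
    state = start_state
    states, signals = [], []
    for _ in range(n_steps):
        states.append(state)
        signals.append(draw(EMISSIONS[state], rng))  # depends only on the current state
        state = draw(TRANSITIONS[state], rng)        # Markov chain step
    return states, signals

hidden, observed = simulate(10, "sunny")
print(observed)  # an outside observer would see only this sequence
```

The list named hidden is what the model keeps out of sight; typical tasks then consist of inferring it from the observed signals alone.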

Saturday, 1 August 2015

Simple regression models in R

Linear regression models are among the simplest and yet most powerful models you can use in R to fit observed data and try to predict quantitative phenomena.

Say you know that a certain variable y is somewhat correlated with a certain variable x and you can reasonably get an idea of what y would be given x. A classic example is the price of houses (y) and square meters (x). It is reasonable to assume that, within a certain range and given the same location, square meters are correlated with the price of the house. As a first rough approximation one could try out the hypothesis that price is directly proportional to square meters.

Now what if you would like to predict the possible price of a flat of 80 square meters? Well, an initial approach could be to gather data on houses in the same location and with similar characteristics and then fit a model.

A linear model with one predicting variable might be the following:

$$y = \alpha + \beta x + \varepsilon$$


Alpha and Beta can be found by looking for the two values that minimise the error of the model relative to the observed data. Basically, the procedure of finding the two coefficients is equivalent to finding the “closest” line to our data. Of course this will still be an estimate and it will probably not match any of the observed values.
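For a single predictor, the two values that minimise the sum of squared errors have a closed form. Below is a minimal sketch in plain Python rather than R; the function name and the house data are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares estimates (alpha, beta) for the line y = alpha + beta * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # beta = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta = sxy / sxx
    # The fitted line always passes through the point of means.
    alpha = mean_y - beta * mean_x
    return alpha, beta

# Made-up observations: size in square meters and price in thousands.
square_meters = [50, 65, 80, 100, 120]
prices = [150, 190, 245, 290, 360]

alpha, beta = fit_line(square_meters, prices)
print(alpha + beta * 80)  # predicted price for an 80 square meter flat
```

In R, the built-in lm function performs the same kind of fit and also reports diagnostics such as standard errors.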