Here I am again with Machine Learning! This time I’ve achieved a great result though (for me at least!). Using another great dataset from the UCI Machine Learning Repository, I was able to write a decent ML script which scored 95% in testing! I am really satisfied with the result.
Here is a sample of what the script should be able to read (in this example, the number 9):
Some digits, like the one above, were clear; others were not so clear, since they were handwritten and then somehow (I do not know how) converted into digital images.
I had a hard time figuring out how the attributes in the dataset were encoded, but in the end I managed to work it out! I guess putting together such a dataset was really long and boring work.
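For anyone else puzzling over the encoding: going by the dataset description, each row of the Semeion file seems to hold 256 values (a 16x16 black-and-white image, read row by row) followed by 10 values that flag the digit with a single 1. Here is a minimal sketch that decodes one row under that assumption. The names `parse_semeion_row` and `render` are made up for illustration and are not taken from the script in this post.

```python
def parse_semeion_row(line):
    """Split one text row into a 16x16 pixel grid and its digit label.

    Assumes 256 pixel values followed by 10 one-hot label values,
    all separated by whitespace.
    """
    # Pixel values may be written as "1.0000", so go through float first.
    values = [int(float(token)) for token in line.split()]
    if len(values) != 266:
        raise ValueError("expected 266 values, got %d" % len(values))

    pixels, one_hot = values[:256], values[256:]
    # Cut the flat pixel list into 16 rows of 16 pixels each.
    grid = [pixels[r * 16:(r + 1) * 16] for r in range(16)]
    # The position of the single 1 in the last ten values is the digit.
    label = one_hot.index(1)
    return grid, label


def render(grid):
    """Draw the grid as text: '#' for ink, '.' for background."""
    return "\n".join("".join("#" if p else "." for p in row) for row in grid)


if __name__ == "__main__":
    # Made-up row (not real data): top line inked, labelled as a 9.
    fake_row = " ".join(["1.0000"] * 16 + ["0.0000"] * 240 + ["0"] * 9 + ["1"])
    grid, label = parse_semeion_row(fake_row)
    print("label:", label)
    print(render(grid))
```

Printing a few rows this way is a quick check that the columns are being read in the right order before training anything on them.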
Anyway, here is my script, and below you can find the result of the test on the last 50 numbers or so.
This time I got an 89% success rate! Pretty good I guess! I wonder whether I could train Python to recognize other things, maybe faces or something else! Well, first of all I have to figure out how to convert a picture into numpy arrays that Python can read. If you have any suggestions, please do leave a comment!
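One possible starting point for the picture question, as a rough sketch only: once an image has been loaded as rows of grayscale values from 0 (black) to 255 (white), it can be thresholded into the same kind of flat 0/1 vector the Semeion rows appear to use. Loading and resizing the image is left out here; a library such as Pillow could supply the grayscale values, and `numpy.asarray` could turn the result into an array. The function name `binarize`, the threshold of 128, and the "dark means ink" convention are all assumptions for illustration.

```python
def binarize(gray_rows, threshold=128, dark_is_ink=True):
    """Turn rows of grayscale values (0-255) into a flat list of 0/1.

    With dark_is_ink=True, pixels darker than the threshold become 1,
    which suits dark handwriting on a light background.
    """
    flat = []
    for row in gray_rows:
        for value in row:
            if dark_is_ink:
                is_ink = value < threshold
            else:
                is_ink = value >= threshold
            flat.append(1 if is_ink else 0)
    return flat


if __name__ == "__main__":
    # Made-up 2x3 grayscale patch: dark stroke on the left, light paper on the right.
    patch = [
        [10, 200, 255],
        [30, 127, 128],
    ]
    print(binarize(patch))
```

For a classifier trained on 16x16 digits, the picture would also need to be cropped and scaled to 16x16 before thresholding, so that the vector has the same 256 entries as the training rows.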
Below is the citation for the source where I found the dataset, the “Semeion Handwritten Digits Data Set”:
Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
and
Semeion Research Center of Sciences of Communication, via Sersale 117, 00128 Rome, Italy
Tattile, Via Gaetano Donizetti, 1-3-5, 25030 Mairano (Brescia), Italy.
Hope this was interesting!