After a short hiatus - holidays and finishing up a paper - I am back writing and decided to write about my current after-work project, namely generative models for machine learning. I've been interested in machine learning for quite some time now. However, I always found myself in the predicament that finding or generating high-quality training data can be quite the challenge. This applies in particular if you try to learn agent behavior in a complex setting.
One typical machine learning problem is the recognition of handwritten text. Here you get an image of text and want to recover the characters it contains. As an example, consider turning this very noisy picture of a three into the information that this is a three.
In a more abstract fashion, you want to map some kind of data (here the pixels of the image) to a label ("3"). The combination of data and label is called labeled data. The problem with labeled data is that it is typically hard to come by. While there is an abundance of images of threes, you are usually not provided with the information that they show a three.
One solution to this problem is to split the problem of recognizing characters into two parts:
- Let the computer learn which features are typical of the data you want to recognize. For example, most characters are composed of curved lines. If you could get a computer to recognize this fact, the problem of identifying numbers should become much easier.
- Now that the computer knows what the typical features of the data are, you train it to map these features to labels. In our example, the idea is that the computer no longer has to work on the raw pixels, but only has to piece together that two specific curves make up a three.
So now the problem remains how to perform the first step, i.e., how to let the computer learn what the typical features are. The idea of generative models is that we no longer ask the computer to recognize data, but only that it is able to reproduce variations of the data itself. This task requires exactly that the computer learn what makes the data you present it with special.
As a physicist I am particularly drawn to a model called the Restricted Boltzmann Machine (RBM), popularized by Geoffrey Hinton. If you start to combine multiple RBMs, you end up with a simple deep model. In the coming weeks I will detail my quest to build such a generative model (in JavaScript, for additional fun) to recognize digits from noisy images.
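To give a taste of what is coming, here is a minimal sketch of an RBM in JavaScript: binary visible and hidden units, trained with a single step of contrastive divergence (CD-1), which is the standard cheap approximation Hinton proposed for RBM training. All names here (`RBM`, `reconstruct`, and so on) are my own placeholders, not the project's actual code.

```javascript
// A tiny Restricted Boltzmann Machine with binary units.
// Hypothetical sketch, not production code.

function sigmoid(x) { return 1 / (1 + Math.exp(-x)); }
function sampleBernoulli(p) { return Math.random() < p ? 1 : 0; }

class RBM {
  constructor(nVisible, nHidden, learningRate = 0.1) {
    this.nVisible = nVisible;
    this.nHidden = nHidden;
    this.lr = learningRate;
    // Small random weights, zero biases.
    this.W = Array.from({ length: nHidden }, () =>
      Array.from({ length: nVisible }, () => 0.1 * (Math.random() - 0.5)));
    this.hBias = new Array(nHidden).fill(0);
    this.vBias = new Array(nVisible).fill(0);
  }

  // P(h_j = 1 | v) for every hidden unit.
  hiddenProbs(v) {
    return this.W.map((row, j) =>
      sigmoid(this.hBias[j] + row.reduce((s, w, i) => s + w * v[i], 0)));
  }

  // P(v_i = 1 | h) for every visible unit.
  visibleProbs(h) {
    return this.vBias.map((b, i) =>
      sigmoid(b + this.W.reduce((s, row, j) => s + row[i] * h[j], 0)));
  }

  // One CD-1 update on a single binary training vector.
  train(v0) {
    const ph0 = this.hiddenProbs(v0);          // positive phase
    const h0 = ph0.map(sampleBernoulli);
    const v1 = this.visibleProbs(h0).map(sampleBernoulli); // reconstruction
    const ph1 = this.hiddenProbs(v1);          // negative phase
    for (let j = 0; j < this.nHidden; j++) {
      for (let i = 0; i < this.nVisible; i++) {
        this.W[j][i] += this.lr * (ph0[j] * v0[i] - ph1[j] * v1[i]);
      }
      this.hBias[j] += this.lr * (ph0[j] - ph1[j]);
    }
    for (let i = 0; i < this.nVisible; i++) {
      this.vBias[i] += this.lr * (v0[i] - v1[i]);
    }
  }

  // "Dream up" a variation of the data: one Gibbs step from a seed vector,
  // returning visible-unit probabilities.
  reconstruct(v) {
    return this.visibleProbs(this.hiddenProbs(v).map(sampleBernoulli));
  }
}
```

After enough `train` calls on, say, tiny binarized digit images, `reconstruct` should turn a noisy input into something resembling a typical training pattern; that "reproducing variations of the data" is exactly the generative task described above.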