2005-12-13

Pattern recognition

The question is: how do you recognize patterns?

Let's approach it systematically.

What is a pattern? If you look at a string [AAAAAAA...] it is clear that there is a simple pattern here: A is always followed by an A. Expressing this as a pattern, however, is a bit problematic. Which pattern should we choose? [AA]? Or [AAA]? Or [A]? All of them describe the string equally well.

Then another pattern: [ABABABAB...]. This is simpler. The pattern is [AB]. Translate [AB] to [X] and you have an output [XXXX...], which is the same as the problematic pattern above.

Then you can go to [ABCABCABCABC...]. With the pattern being [ABC].
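
One way to sidestep the [A] versus [AA] versus [AAA] question is to always pick the shortest unit that reproduces the string when repeated. The sketch below is only an illustration of that convention, not the algorithm used in the project; the function name shortest_unit is made up for this example.

```python
def shortest_unit(s):
    """Return the shortest prefix of s that, repeated, reproduces s.

    The last repetition may be cut off, so "ABABA" still yields "AB".
    Choosing the shortest unit is an assumed convention for this sketch.
    """
    for n in range(1, len(s) + 1):
        unit = s[:n]
        # Repeat the candidate enough times to cover the whole string.
        repeated = unit * (len(s) // n + 1)
        if repeated.startswith(s):
            return unit
    return s  # only reached for the empty string


print(shortest_unit("AAAAAAA"))       # A
print(shortest_unit("ABABABAB"))      # AB
print(shortest_unit("ABCABCABCABC"))  # ABC
```

This only works for strings that are perfectly periodic from the first character on, which is exactly why the next cases are harder.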

Then what about [ABABXYABXYABXYXYABABXYXY], with two patterns ([AB] and [XY]) repeating seemingly at random? Here we should learn the [AB] and [XY] patterns, translate them to [T] and [U], output [TTUTUTUUTTUU...], and let the next level find the higher-order patterns.
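
The translation step itself can be sketched as follows. This assumes the patterns have already been learned and are handed in as a lookup table; learning them is the hard part and is not shown. The names translate and codebook are hypothetical, chosen just for this example.

```python
def translate(s, codebook):
    """Replace each known pattern in s with its one-letter symbol.

    codebook maps pattern -> symbol, e.g. {"AB": "T", "XY": "U"}.
    It is assumed to be given; nothing is learned here.
    Characters that match no pattern are passed through unchanged.
    """
    out = []
    i = 0
    while i < len(s):
        for pattern, symbol in codebook.items():
            # Skip empty patterns so the scan always moves forward.
            if pattern and s.startswith(pattern, i):
                out.append(symbol)
                i += len(pattern)
                break
        else:
            # No pattern starts at position i: copy the character as is.
            out.append(s[i])
            i += 1
    return "".join(out)


print(translate("ABABABAB", {"AB": "X"}))
# XXXX
print(translate("ABABXYABXYABXYXYABABXYXY", {"AB": "T", "XY": "U"}))
# TTUTUTUUTTUU
```

The output string can then be fed to the next level as its input, which is the layering idea in miniature.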

Then there is the problem of noise. What if you have [ABABABXABABXABABXABAB], a regular [AB] occasionally interrupted by an [X]?
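
A naive way to treat the noisy case, assuming the regular pattern [AB] is already known, is to scan the input, count the places where the pattern matches, and set aside everything else as noise. This is a sketch of the problem rather than a solution to it, and the function name split_noise is invented for the example.

```python
def split_noise(s, unit):
    """Count occurrences of unit in s and collect the positions that do not fit.

    unit is assumed to be known in advance and non-empty.
    Returns (number of matches, list of noise positions).
    """
    matches = 0
    noise = []
    i = 0
    while i < len(s):
        if s.startswith(unit, i):
            matches += 1
            i += len(unit)
        else:
            # This character breaks the pattern: record it and move on.
            noise.append(i)
            i += 1
    return matches, noise


print(split_noise("ABABABXABABXABABXABAB", "AB"))
# (9, [6, 11, 16])
```

The catch is that the pattern has to be known before the noise can be separated from it, while learning the pattern from the raw input requires already tolerating the noise.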

Based on these cases I started to play with different algorithms. Some of them work well in simple cases, others work well in longer contexts, and others perform equally poorly on any input pattern.

Seems like a long way ahead...

I'll come back to keep you informed.

Sudden change of course

A long time has passed. Here is an update on what happened:

First: I created a project on SourceForge for the thinking machine. I called it brain-game. You'll find it here.

Second: I ran some interesting experiments (programming-wise), which resulted in solving the layering problem. This predecessor/successor thing sometimes gets confusing.

Third: after I set up the layering and ran some tests I was deeply disappointed. The results were far worse than the single-layer prediction.

This failure to get results prompted me to start looking into the learning part of the Predictor.

And I found out that there are serious issues with it.

So I started to explore the core algorithms.

This threw me off my original track and brought me to a simple but essential problem: how do you do pattern recognition?

This is where I stand today.