Monday, January 30, 2012

Recognising Fingertips

Pulling an all-nighter is always rewarding, so Friday night I made my first attempt at detecting the fingertips of the hand. Most people use the "convexity defects" method. I really don't like that method: it seems sloppy and doesn't detect all the fingers. I prefer the k-cosine method as described in "Vision-Based Finger Action Recognition by Angle Detection and Contour Analysis" (a rough sketch of the idea is at the end of this post).

These are the results of my first attempt.

This week I'll concentrate on making tracking and contour extraction more robust, because, as you can see, at some points the contours break up. I guess choosing my wooden office desk as a testing area proved to be quite a challenge.
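For reference, here is a rough sketch of the k-cosine idea applied to an OpenCV contour. The value of k and the angle threshold are illustrative placeholders, not tuned values, and a real detector also has to reject the valleys between the fingers (which are just as sharp as the tips), for example by checking convexity.

    #include <opencv2/opencv.hpp>
    #include <vector>
    #include <cmath>

    // k-cosine at contour point i: the cosine of the angle between the vectors
    // running from p[i] back to p[i-k] and forward to p[i+k] (contour is circular).
    static double kCosine(const std::vector<cv::Point>& c, int i, int k)
    {
        int n = (int)c.size();
        cv::Point a = c[(i - k + n) % n] - c[i];
        cv::Point b = c[(i + k) % n] - c[i];
        double norms = std::sqrt((double)a.dot(a) * (double)b.dot(b));
        return norms > 0.0 ? a.dot(b) / norms : -1.0;   // in [-1, 1]
    }

    // Candidate fingertips: points where the angle is sharp (cosine above a
    // threshold) and which are local peaks of the k-cosine along the contour.
    // Valleys between fingers still need to be rejected separately.
    std::vector<cv::Point> fingertipCandidates(const std::vector<cv::Point>& contour,
                                               int k = 25, double minCos = 0.5)
    {
        std::vector<cv::Point> tips;
        int n = (int)contour.size();
        if (n <= 2 * k) return tips;
        for (int i = 0; i < n; ++i) {
            double c = kCosine(contour, i, k);
            if (c < minCos) continue;                         // angle not sharp enough
            if (c >= kCosine(contour, (i - 1 + n) % n, k) &&
                c >= kCosine(contour, (i + 1) % n, k))        // keep only local maxima
                tips.push_back(contour[i]);
        }
        return tips;
    }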

Wednesday, January 25, 2012

Extracting and stabilizing contours


It's been a busy week. I'm now at the stage of contour extraction. Using my adaptive skin classifier and the samples gathered from the detector, I build a histogram model of the hand and extract the contours around it. As you can see, it is quite robust and works under different illumination. All of this relies on the assumption that the detector doesn't return a false positive. While Haar cascades are fairly good at the job, their false-positive rate is not 0%, so I intend to add a Fourier hand validator.
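This isn't the adaptive part of the classifier, but the basic plumbing looks roughly like this; a minimal sketch where the bin counts, threshold and morphology kernel are illustrative guesses rather than the values I actually use.

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Rough outline: build a Hue-Saturation histogram from the pixels inside the
    // detector's bounding box, back-project it onto the whole frame and pull the
    // contours out of the thresholded probability map.
    std::vector<std::vector<cv::Point> > extractHandContours(const cv::Mat& frameBGR,
                                                             const cv::Rect& handBox)
    {
        cv::Mat hsv;
        cv::cvtColor(frameBGR, hsv, cv::COLOR_BGR2HSV);

        int channels[]  = { 0, 1 };                  // hue and saturation
        int histSize[]  = { 30, 32 };                // bin counts need experimentation
        float hRange[]  = { 0, 180 }, sRange[] = { 0, 256 };
        const float* ranges[] = { hRange, sRange };

        // Histogram model of the hand, taken from the detection window.
        cv::Mat roi = hsv(handBox);
        cv::Mat hist;
        cv::calcHist(&roi, 1, channels, cv::Mat(), hist, 2, histSize, ranges);
        cv::normalize(hist, hist, 0, 255, cv::NORM_MINMAX);

        // Skin probability map over the whole frame.
        cv::Mat backproj;
        cv::calcBackProject(&hsv, 1, channels, hist, backproj, ranges);

        // Threshold, clean up a little, then extract the contours.
        cv::Mat mask;
        cv::threshold(backproj, mask, 50, 255, cv::THRESH_BINARY);
        cv::morphologyEx(mask, mask, cv::MORPH_CLOSE,
                         cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(5, 5)));

        std::vector<std::vector<cv::Point> > contours;
        cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
        return contours;
    }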

Thursday, January 12, 2012

Using train cascades

The detector I trained back in November doesn't work for me anymore, so I'm attempting to train another one using the traincascade executable provided with the OpenCV framework.

I gathered 70k negatives and 30k positives, ran exactly the same code, and boom, crash, burn. All I got was the message "Train dataset for temp stage can not be filled. Branch training terminated."

After two hours of screwing around with all possible combinations and reading about the same problems on the net, I decided to look at the code (damn messy C++ code). After that I reduced the number of negatives to match the number of positives and, voilà, it worked.

I've come to believe that OpenCV was originally written by a couple of uber programmers and then left to be wrecked by drunken monkeys.

Wednesday, January 11, 2012

Building an adaptive skin classifier

Building an adaptive skin classifier is quite some work.
I've seen some examples on the net. The good ones are not real-time, and the ones that are simply don't cut it.
A small lighting variation or a slightly complex background, and the classifier is lost.

Many examples I've seen use hard-coded variables, which is obviously wrong.
I've also seen many that use the RGB color space, which is also wrong.
I strongly recommend the HSV color space, or at least normalized RGB, because they are somewhat more invariant to lighting.
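To make that concrete, here is a minimal sketch of both options, converting a BGR frame to HSV and to normalized rg chromaticity (the helper name is just for illustration):

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Convert a BGR frame to HSV and to normalized rg chromaticity, so that the
    // skin model depends as little as possible on overall brightness.
    void toChromaticity(const cv::Mat& bgr, cv::Mat& hsv, cv::Mat& normRG)
    {
        cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);

        // Normalized RGB: divide each channel by the pixel's intensity sum.
        // Only r and g are kept, since r + g + b = 1 makes b redundant.
        cv::Mat f;
        bgr.convertTo(f, CV_32FC3);
        std::vector<cv::Mat> ch;
        cv::split(f, ch);                                        // ch[0]=B, ch[1]=G, ch[2]=R
        cv::Mat sum = ch[0] + ch[1] + ch[2] + cv::Scalar(1e-6);  // avoid division by zero

        std::vector<cv::Mat> rg(2);
        cv::divide(ch[2], sum, rg[0]);                           // r = R / (R + G + B)
        cv::divide(ch[1], sum, rg[1]);                           // g = G / (R + G + B)
        cv::merge(rg, normRG);                                   // 2-channel float image
    }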

I also recommend using more features than just the color channels, and you have to experiment with different bin sizes to get real-time performance.

I recently found out that arithmetic accuracy is also important because of all the normalization operations.

During segmentation it is important to postpone thresholding, because it throws away a lot of information. I prefer to keep the probability map during tracking and threshold only later, to extract the contours.
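As a sketch of what I mean, with CamShift standing in for whatever histogram tracker is running and made-up thresholds; the skin histogram and the search window are assumed to come from the earlier detection and modelling steps:

    #include <opencv2/opencv.hpp>
    #include <vector>

    // "Threshold late": track on the raw probability map and binarize only at
    // the end, purely to pull out the contours of the tracked blob.
    cv::RotatedRect trackAndExtract(const cv::Mat& frameBGR, const cv::Mat& skinHist,
                                    cv::Rect& searchWindow,
                                    std::vector<std::vector<cv::Point> >& contours)
    {
        cv::Mat hsv;
        cv::cvtColor(frameBGR, hsv, cv::COLOR_BGR2HSV);

        int channels[] = { 0, 1 };                   // must match the model's bin layout
        float hRange[] = { 0, 180 }, sRange[] = { 0, 256 };
        const float* ranges[] = { hRange, sRange };

        // Full probability map: every pixel keeps its skin likelihood.
        cv::Mat prob;
        cv::calcBackProject(&hsv, 1, channels, skinHist, prob, ranges);

        // Track directly on the probability map; nothing has been thrown away yet.
        cv::RotatedRect pose = cv::CamShift(prob, searchWindow,
            cv::TermCriteria(cv::TermCriteria::EPS | cv::TermCriteria::COUNT, 10, 1));

        // Only now threshold, and only to extract the contours.
        cv::Mat mask;
        cv::threshold(prob, mask, 40, 255, cv::THRESH_BINARY);
        cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
        return pose;
    }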






Saturday, January 7, 2012

Well I'm getting closer...

 

I'm usually very critical of the stuff I make, but today I feel quite satisfied with the results.

Tuesday, January 3, 2012

The constraints of histogram tracking

Over the last few months I have implemented and tested a couple of histogram-based tracking algorithms, but I only recently realized their inherent constraints. If you use them for hand tracking and your arms are bare or your face is exposed, it is very easy for the tracker to get confused because of the coarse quantization of the histograms.
The worst part is that there is little you can do:
  • I tried using a different color space, such as HSV. While it is far better than RGB for tracking skin and coping with small lighting variations, it is still not enough, and the tracker often gets confused, especially when the hand goes out of view.
  • I tried incorporating different features, such as edge magnitude (a bad idea) and edge orientation (much better), and it had the effect of better localizing the detector (see the sketch at the end of this post).
Integral histograms are still an option, but they are far too slow for a real-time application like the one I'm working on.
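For reference, here is roughly what I mean by adding edge orientation as a feature: a sketch that builds a joint hue / gradient-orientation histogram. The bin counts are guesses and my actual implementation differs.

    #include <opencv2/opencv.hpp>

    // Each pixel votes into a joint (hue, gradient-orientation) bin instead of a
    // pure colour bin, which localizes the hand better than colour alone.
    // 'mask' is an 8-bit mask selecting the hand pixels (or empty for the whole frame).
    cv::Mat hueOrientationHist(const cv::Mat& bgr, const cv::Mat& mask)
    {
        cv::Mat hsv, gray, gx, gy, mag, angle;
        cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);
        cv::cvtColor(bgr, gray, cv::COLOR_BGR2GRAY);

        // Gradient orientation in degrees (0..360) from Sobel derivatives.
        cv::Sobel(gray, gx, CV_32F, 1, 0);
        cv::Sobel(gray, gy, CV_32F, 0, 1);
        cv::cartToPolar(gx, gy, mag, angle, true);

        // The two histogram channels: hue and gradient orientation, both as floats.
        cv::Mat hue;
        cv::extractChannel(hsv, hue, 0);
        hue.convertTo(hue, CV_32F);
        cv::Mat planes[] = { hue, angle };

        int channels[]  = { 0, 1 };           // channel 0 of planes[0], channel 0 of planes[1]
        int histSize[]  = { 30, 16 };
        float hRange[]  = { 0, 180 }, oRange[] = { 0, 360 };
        const float* ranges[] = { hRange, oRange };

        cv::Mat hist;
        cv::calcHist(planes, 2, channels, mask, hist, 2, histSize, ranges);
        cv::normalize(hist, hist, 0, 255, cv::NORM_MINMAX);
        return hist;
    }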