Sunday, July 29, 2012

Qt + OpenCV part 2

About a year ago I posted here about a Qt+OpenCV project in Code::Blocks I was working on. Since then I have switched to Qt Creator, which simplified the whole thing, and moved the code to Google Code. It is a very, very simple piece of software for developing computer vision algorithms, and should be seen as a helper for beginners who wish to use Qt and OpenCV together. It comes as a motion detector with a sleep/auto-rearm function. That is where the name of the Qt .pro project, PoolWatcher, comes from: switch it on ("surveiller", French for "watch") and an alarm will ring in case of intrusion, the goal being to prevent young children from drowning. It can easily be modified to suit your needs: since it is based on a strategy pattern, all you have to take care of is your own vision class, which will be a subclass of AlgoVision, and you can easily modify the UI as well using Qt Designer.

The PoolWatcher vision algorithm (detecKidAlgo) is at an early stage and needs big improvements. So far it only uses image (background) subtraction with a mixture of Gaussians, and then cvBlob for the tracking part. Being involved in several other projects, I don't have much time to work on it, so let me know if you are interested and wish to contribute to it or to other parts of the project. You will find the code here.
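
To make the strategy pattern mentioned above a bit more concrete, here is a minimal sketch in Python rather than in the project's C++/Qt. Only the name AlgoVision comes from the project; the `process` method, the `FrameDifferenceAlgo` and `Watcher` classes, and the toy frame format (a flat list of pixel values) are all assumptions made for illustration, not the actual PoolWatcher code.

```python
from abc import ABC, abstractmethod


class AlgoVision(ABC):
    """Strategy interface: one vision algorithm applied to one frame."""

    @abstractmethod
    def process(self, frame):
        """Return True when the frame should raise the alarm."""


class FrameDifferenceAlgo(AlgoVision):
    """Toy strategy (hypothetical): flags a large mean change between frames."""

    def __init__(self, threshold=10.0):
        self.threshold = threshold
        self.previous = None

    def process(self, frame):
        previous, self.previous = self.previous, list(frame)
        if previous is None:
            return False  # nothing to compare the first frame with
        change = sum(abs(a - b) for a, b in zip(frame, previous)) / len(frame)
        return change > self.threshold


class Watcher:
    """Context: owns a strategy and knows nothing about its internals."""

    def __init__(self, algo):
        self.algo = algo

    def on_new_frame(self, frame):
        return "ALARM" if self.algo.process(frame) else "ok"


watcher = Watcher(FrameDifferenceAlgo(threshold=10.0))
frames = [[0, 0, 0, 0], [0, 1, 0, 0], [200, 200, 0, 0]]
results = [watcher.on_new_frame(f) for f in frames]
print(results)  # prints ['ok', 'ok', 'ALARM']
```

Swapping in another algorithm then only means writing another subclass and handing it to the context; the rest of the application is untouched.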

If you are not interested in the development and just want to use PoolWatcher as video surveillance software without having to compile it, an executable with its dependencies is available here. It has been tested under 64-bit Windows 7. It is probably buggy, though, so let me know if you have any problems or suggestions and I'll do my best to improve it.

Below are two videos showing the program running with and without debug mode.


Wednesday, April 25, 2012

Kohonen neural networks applied to computer vision

This past semester I was enrolled in a master's-level course in computer vision at ETS. Each student had to design their own computer vision project. I like project-oriented courses. They have many advantages, one of which is that, through my colleagues' projects, I got to know many areas of image understanding that would otherwise probably have remained unknown to me.

The subject I chose for my project was "Word recognition in unconstrained environments". Computer vision algorithms can usually be roughly decomposed into two parts: first preprocessing/segmentation, using basic digital image processing algorithms, and then recognition, using machine learning algorithms. I see machine learning as statistics applied to computer science; one accepted definition is the "field of study that gives computers the ability to learn without being explicitly programmed" (Arthur Samuel, 1959).

The recognition part is the subject of this post. During my second internship, I had the chance to learn a few machine learning algorithms. Among them was a neural network known as the Kohonen network, or Self-Organizing Map (SOM). I used it to cluster biomedical signals. It did well, but I kept wondering how such a neural network would perform in computer vision. At that time I had never used machine learning in my computer vision algorithms, so I was really looking forward to it. This master's course was the perfect occasion to mix, at last, digital image processing with some machine learning.

Today I will present the theory behind Kohonen networks.

Kohonen networks assume a topological structure among the cluster units, a property observed in the brain (1). This property, which allows us to visualize the state of a trained network, is unique among neural networks.

A SOM is composed of m units (neurons), each with an n-dimensional weight vector w, stored in a 1- or 2-dimensional array. The input vectors are n-dimensional as well.
Neural network architecture (http://www.globalspec.com/reference/41143/203279/8-5-an-unsupervised-ann-for-static-security-classification)
This is a competitive network: the neurons compete with each other to be named the winner for a given input vector. The winner is the neuron whose weight vector is closest to the input in terms of distance (usually the squared Euclidean distance). The input vector is then assigned to the winning neuron's cluster, or class.
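
The competition step can be sketched in a few lines of plain Python. The function names `squared_distance` and `find_winner`, and the small example map, are mine and chosen for illustration only.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def find_winner(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)),
               key=lambda j: squared_distance(weights[j], x))


# A tiny map: m = 3 units, n = 2 dimensional inputs (made-up values).
weights = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
x = [0.9, 0.8]

winner = find_winner(weights, x)
print(winner)  # prints 1: unit 1 is at squared distance 0.05 from x
```

Using the squared distance avoids a square root and does not change which unit wins.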

Training such a network starts out much as for any other: the neurons' weights are first randomly initialized, then the training input vectors are presented to the network iteratively until convergence is reached. During training, not only the winning neuron's weights are updated but also those of its neighborhood, according to a decreasing function of distance: the closer a neighbor is to the winner, the more its weights change (adapt).

Neighborhood function: the winning neuron is at the center (maximum excitation)
The video below illustrates the training of a Kohonen network. The task here is to classify colors according to their R, G, and B channels (Red, Green, Blue), which gives 3-dimensional input vectors. Classifying colors is a trivial task, but it is very useful for understanding how SOM training actually works:
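
For readers who prefer code to video, here is a rough sketch of such a training loop on RGB vectors, using a 1-dimensional map. The Gaussian neighborhood, the linear decay of the learning rate and radius, and every parameter value (`m`, `epochs`, `lr0`, `radius0`) are assumptions picked for the example; they are not the settings used in the video or in the SOM toolbox.

```python
import math
import random


def find_winner(weights, x):
    """Index of the unit closest to x (squared Euclidean distance)."""
    return min(range(len(weights)),
               key=lambda j: sum((w - xi) ** 2 for w, xi in zip(weights[j], x)))


def train_som(data, m=10, epochs=200, lr0=0.5, radius0=3.0, seed=0):
    """Train a 1-D map of m units on the vectors in data."""
    rng = random.Random(seed)
    n = len(data[0])
    # Random initial weights, one n-dimensional vector per unit.
    weights = [[rng.random() for _ in range(n)] for _ in range(m)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs          # goes from 1 towards 0
        lr = lr0 * decay                      # shrinking learning rate
        radius = max(radius0 * decay, 0.5)    # shrinking neighborhood
        for x in rng.sample(data, len(data)):  # shuffled presentation order
            winner = find_winner(weights, x)
            for j in range(m):
                # Gaussian neighborhood: 1 at the winner, smaller further away.
                h = math.exp(-((j - winner) ** 2) / (2.0 * radius ** 2))
                weights[j] = [w + lr * h * (xi - w)
                              for w, xi in zip(weights[j], x)]
    return weights


colors = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # pure red, green, blue
weights = train_som(colors)
clusters = [find_winner(weights, c) for c in colors]
print(clusters)
```

After training one would expect each color to be assigned to a different unit, with the units in between holding blended colors, which is what gives the trained map its smooth appearance.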




In my next post I will present my results, and some steps of the preprocessing/segmentation algorithms.

In the meantime, you will find here Teuvo Kohonen's original publication, along with a very well designed "SOM toolbox" for Matlab.

[EDIT] In the video below you will see my first results. Today is the 16th of July and I still have not had time to work on this any further! But a few other good things in computer vision are coming, so stay tuned!



(1) Laurene Fausett, Fundamentals of Neural Networks: Architectures, Algorithms, and Applications. Prentice Hall, 1994.

Monday, January 31, 2011

Target tracking with Kalman filter



In this short video, I use a Kalman filter to predict a target's path, or more precisely its next position.

The predicted position is represented by the red circle.

After pre-processing, I use OpenCV's polygon approximation, then take the center of the biggest polygon detected in the image and use it as the measurement that updates the Kalman filter (the x and y coordinates of the target center).
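
As an illustration of the "center of the biggest polygon" step, here is a small sketch in plain Python. It assumes that "biggest" means largest area and that "center" means the area centroid, both computed with the shoelace formula; my actual code relies on OpenCV and may define the center differently. The function names and the two example polygons are made up.

```python
def polygon_area_and_centroid(points):
    """Shoelace formula for a simple polygon given as [(x, y), ...].

    Assumes a non-degenerate polygon (non-zero area).
    """
    a = cx = cy = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the polygon
        cross = x0 * y1 - x1 * y0
        a += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    a *= 0.5  # signed area; the sign depends on the vertex order
    return abs(a), (cx / (6.0 * a), cy / (6.0 * a))


def biggest_polygon_center(polygons):
    """Centroid of the polygon with the largest area."""
    best = max(polygons, key=lambda p: polygon_area_and_centroid(p)[0])
    return polygon_area_and_centroid(best)[1]


polygons = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],          # 2 x 2 square, area 4
    [(10, 10), (16, 10), (16, 14), (10, 14)],  # 6 x 4 rectangle, area 24
]
center = biggest_polygon_center(polygons)
print(center)  # prints (13.0, 12.0)
```

The resulting (x, y) pair is what would be handed to the filter as the measurement for the current frame.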

All of the steps, pre-processing and polygon approximation, are done using OpenCV. The Kalman filter was a bit tricky at first, but as soon as you dig a little into its math it turns out to be quite straightforward to implement.
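
To give an idea of how little math is involved, here is a minimal sketch of a Kalman filter written out by hand in plain Python. It assumes a constant-velocity motion model and treats the x and y axes as two independent filters, each with state [position, velocity]; the noise values `q` and `r`, the simplified process noise (q on the diagonal), and the class names are assumptions for the example and not necessarily what my code does.

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one coordinate."""

    def __init__(self, position, q=1e-3, r=1.0, dt=1.0):
        self.p, self.v = float(position), 0.0  # state: position, velocity
        self.P = [[1.0, 0.0], [0.0, 1.0]]      # state covariance
        self.q, self.r, self.dt = q, r, dt     # process / measurement noise

    def predict(self):
        """Advance the state by one frame and return the predicted position."""
        dt = self.dt
        (a, b), (c, d) = self.P
        self.p += dt * self.v
        # P = F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.p

    def update(self, z):
        """Correct the state with a measured position z (H = [1, 0])."""
        (a, b), (c, d) = self.P
        s = a + self.r               # innovation covariance
        k0, k1 = a / s, c / s        # Kalman gain
        y = z - self.p               # innovation
        self.p += k0 * y
        self.v += k1 * y
        # P = (I - K H) P
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


class TargetTracker:
    """Two independent axis filters tracking a 2-D target center."""

    def __init__(self, x, y):
        self.kx, self.ky = AxisKalman(x), AxisKalman(y)

    def step(self, measured_center):
        predicted = (self.kx.predict(), self.ky.predict())
        self.kx.update(measured_center[0])
        self.ky.update(measured_center[1])
        return predicted  # prediction made before seeing the measurement


tracker = TargetTracker(0.0, 0.0)
for t in range(1, 31):
    # Synthetic target moving 2 px/frame in x and 1 px/frame in y.
    tracker.step((2.0 * t, 1.0 * t))
next_position = (tracker.kx.predict(), tracker.ky.predict())
print(next_position)  # close to (62, 31), the true position at frame 31
```

With noise-free measurements on a straight line the filter quickly learns the velocity, so the predicted position for the next frame lands very near the true one; with real, noisy detections the ratio between `q` and `r` decides how much the filter trusts its model versus the measurements.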

However, the way I use the Kalman filter in this footage is not very good; it lacks consistency. In my code, the Kalman predictions are not yet used to search for a particular target and stick to it until some conditions are met. That should be the next step, if I find the time to work on this little project again.