Personal Trainer

When training a custom cascade classifier with opencv, we need to provide a log (or info) file for the positive entries, the images in which the object we want the classifier to identify actually appears, along with one for the negative entries. The challenge with generating the first file is that it needs to say where the object is placed in each image: the rect that bounds the object. Each line holds the image path, the number of objects in that image, and then the x, y, width and height of each one.

So if you want to write a classifier that learns to tell whether someone is eating an ice cream in an image (or whether there is an ice cream in the picture), you'll first need a database of images of ice cream and of people eating ice cream, and then you'll need to save an entry for each file in your collection. Obviously, since I am more concerned with cats than with ice cream, I chose to download this cat image database: more than 2 GB of pure cat images, designed for image training. The great thing about this database is that it also comes with a map file for each cat that defines the shape of its face. But let's ignore that for now and go back to the original problem: what if we want to create a new database for our cats?

(Image: the cat's face area)

So I wrote a simple C++ utility that I call Personal Trainer. You'll see why when you use it: it gets very personal after the 400th image or so. The app lets you pick an images directory and then steps through the files inside it one by one, letting you mark the area that contains the object and store it in a log file. You can grab the full code from this git page –

Here is what the trainer is based on. First we read the content of the directory using the tinydir lib, which makes it very easy to handle directories in C++ (for me at least).

Then we load the next image (the first one, to start with) and assign the mouse callbacks for it, which will draw the rect on the image.

The cat and mouse game 

This is where opencv makes it fun to handle images, with built-in events for all of the mouse gestures. We'll define 3 variables to handle the drawing and clicking.

drawing_box – a bool that will notify us if we are currently drawing or not.

mouseBox – a rect which we’ll use to hold the mouse drawing position

train_box_on  – a bool which will tell us if there is a rectangle already drawn on the image

We'll switch between 3 different mouse events: down, move and up.

CV_EVENT_LBUTTONDOWN – when the mouse is down we’ll either reset the rectangle or if it is clicked inside an existing rectangle we’ll use that as an indication to save the current position. 

CV_EVENT_MOUSEMOVE – when the mouse is moving, if we are currently in drawing mode, we’ll add the mouse pos to the mouseBox rect.

CV_EVENT_LBUTTONUP – when the mouse is up, we’ll find out where our rect ends and draw it over the image, if we had only a click, we’ll use that as an indication to reset any existing rectangle.
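The three events above boil down to a small state machine. Here is a minimal sketch of it, assuming stand-in types so it compiles without OpenCV: MouseEvent mirrors the three CV_EVENT_* constants, Box stands in for the opencv rect, and onMouse and TrainerState are hypothetical names, not the trainer's real callback.

```cpp
// Stand-ins (assumptions) for the OpenCV event constants and rect type.
enum class MouseEvent { LeftDown, Move, LeftUp };

struct Box {
    int x;
    int y;
    int width;
    int height;
    bool contains(int px, int py) const {
        return px >= x && px < x + width && py >= y && py < y + height;
    }
};

struct TrainerState {
    bool drawing_box = false;   // are we dragging right now?
    bool train_box_on = false;  // is a finished rectangle on the image?
    Box mouseBox{0, 0, 0, 0};   // the rectangle being drawn or shown
};

// Returns true when the click means "save the current rectangle".
bool onMouse(TrainerState& s, MouseEvent event, int x, int y) {
    switch (event) {
    case MouseEvent::LeftDown:
        if (s.train_box_on && s.mouseBox.contains(x, y)) {
            return true;  // click inside the existing rect: save it
        }
        s.train_box_on = false;  // otherwise reset and start a new rect
        s.drawing_box = true;
        s.mouseBox = Box{x, y, 0, 0};
        break;
    case MouseEvent::Move:
        if (s.drawing_box) {  // stretch the rect to the mouse position
            s.mouseBox.width = x - s.mouseBox.x;
            s.mouseBox.height = y - s.mouseBox.y;
        }
        break;
    case MouseEvent::LeftUp:
        if (!s.drawing_box) break;
        s.drawing_box = false;
        if (s.mouseBox.width < 0) {  // dragged leftwards: flip the rect
            s.mouseBox.x += s.mouseBox.width;
            s.mouseBox.width = -s.mouseBox.width;
        }
        if (s.mouseBox.height < 0) {  // dragged upwards: flip the rect
            s.mouseBox.y += s.mouseBox.height;
            s.mouseBox.height = -s.mouseBox.height;
        }
        // A plain click leaves a zero-sized rect, which counts as a reset.
        s.train_box_on = s.mouseBox.width > 0 && s.mouseBox.height > 0;
        break;
    }
    return false;
}
```

In the real app the same logic would live in the callback registered with opencv, with the drawing done after each state change.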

When we highlight the object, we'll also add a call to action: click inside the rect to save the current position.

Let's save the rect into our log file and we are close to done.

Done? Well, now we have time to go through another 1000 cat images and build our data. Here are some of the highlights, taken from the cat image database.
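A minimal sketch of that save step, assuming the usual opencv_createsamples info layout (image path, object count, then x y width height for each object). ObjectRect stands in for the opencv rect so the snippet compiles on its own, and formatInfoLine and appendInfoLine are hypothetical names.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Stand-in (assumption) for the opencv rect type.
struct ObjectRect {
    int x;
    int y;
    int width;
    int height;
};

// Build one info-file line: "<path> <count> <x y w h> <x y w h> ...".
std::string formatInfoLine(const std::string& imagePath,
                           const std::vector<ObjectRect>& rects) {
    std::ostringstream line;
    line << imagePath << ' ' << rects.size();
    for (const ObjectRect& r : rects) {
        line << ' ' << r.x << ' ' << r.y << ' ' << r.width << ' ' << r.height;
    }
    return line.str();
}

// Append the line to the log file, keeping whatever was saved earlier.
// Returns false if the file could not be opened.
bool appendInfoLine(const std::string& logPath,
                    const std::string& imagePath,
                    const std::vector<ObjectRect>& rects) {
    std::ofstream log(logPath, std::ios::app);
    if (!log) return false;
    log << formatInfoLine(imagePath, rects) << '\n';
    return true;
}
```

Opening the file in append mode means a long labeling session can be stopped and resumed without losing the earlier entries.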

(Image: personal trainer cats)

Cascade Training

Now you should have a log file with a structure similar to the one outlined above. You'll also need a log file for the negative samples, the images that don't contain your object. This can be a plain file that lists the images in the directory where you stored them, one path per line.

Now you'll need to generate a .vec file, which packs the positive samples listed in the first file; the negative list is handed to the trainer directly. We'll use opencv_createsamples.exe to build the .vec file and opencv_traincascade.exe to train the classifier (you can find both in the bin dir, C:\opencv\release\bin). You can see the full documentation and specification of both tools at the opencv cascade training page.

createsamples.exe will take the following arguments

 running the following command will generate the cats.vec file

Now we'll use the data in the vec file to train our classifier and save an xml file with the results. If you had a chance to check out the opencv docs, you'll have seen that there are two types of features you can train with. Haar features, the approach based on the Viola-Jones algorithm, are much slower to train but tend to be more accurate. LBP (Local Binary Patterns) uses integers as features for faster processing. Note that you can get great results with LBP; it mainly depends on the quality of the training you do, the number of sources you use and how well they are mapped. If you run opencv_traincascade.exe on its own in your cmd/terminal window, you'll see the following specs

 The following will generate cascade training xml file using the data we collected and LBP trainer.

 The structure of the LBP cascade xml file will look somewhat similar to this

Finally you are ready to test the cascade training file and see whether your object appears in an image or not. First we'll define a CascadeClassifier and load the cascade xml file we just trained into it.

Next we'll load the image we'd like to test (or a camera feed), resize it and convert it into a gray mat. Working with smaller images and the minimal number of channels is better for performance.

We'll also equalize the histogram of the image. This doesn't make the calculation faster by itself; it evens out brightness and contrast so the detector gets more consistent input. That matters less when we deal with one image, but on a live camera feed, where the lighting keeps changing, it makes a significant difference. I apply a single equalizeHist filter here, though when you have a cropped version of an image it is recommended to divide the image into two halves and equalize each one separately, since light often hits from a specific direction and leaves the image with a bright side and a dark side.
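To show what histogram equalization does to the pixels, here is a textbook version on a flat buffer of gray values. It is a stand-in to illustrate the idea, not the code behind opencv's equalizeHist, and equalizeGray is a hypothetical name.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Spread the gray levels of an 8-bit image over the full 0..255 range
// by remapping each level through the cumulative histogram.
std::vector<std::uint8_t> equalizeGray(const std::vector<std::uint8_t>& pixels) {
    // Count how many pixels use each gray level.
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t p : pixels) ++hist[p];

    // Cumulative counts; cdfMin is the count at the darkest level in use.
    std::array<std::size_t, 256> cdf{};
    std::size_t running = 0;
    std::size_t cdfMin = 0;
    for (int i = 0; i < 256; ++i) {
        running += hist[i];
        cdf[i] = running;
        if (cdfMin == 0 && running > 0) cdfMin = running;
    }

    const std::size_t total = pixels.size();
    if (total == cdfMin) return pixels;  // empty or single-level image

    const double scale = 255.0 / static_cast<double>(total - cdfMin);
    std::vector<std::uint8_t> out;
    out.reserve(total);
    for (std::uint8_t p : pixels) {
        const double mapped = static_cast<double>(cdf[p] - cdfMin) * scale;
        out.push_back(static_cast<std::uint8_t>(std::lround(mapped)));
    }
    return out;
}
```

The darkest level in use maps to 0 and the brightest to 255, which is why a dim photo and a bright photo of the same cat end up looking more alike to the detector.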

We are ready to call catsCascade.detectMultiScale and find out whether there are any cats in the image.

Let's scale the results back to the original size of the image and verify that each one stays within the image frame.

Now, if we have found an object, let's frame it and show the image.

Here is a test of me and my cat Chachi, who was not a part of the original training set.
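A sketch of that scale-back step, assuming the detection ran on a copy of the image shrunk by a known factor (for example 0.5 for half size). Region stands in for the opencv rect, and toOriginal is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cmath>

// Stand-in (assumption) for the opencv rect type.
struct Region {
    int x;
    int y;
    int width;
    int height;
};

// Map a detection found on the shrunken image back onto the original one
// and clip it to the frame. `scale` is shrunken size / original size and
// must be greater than zero.
Region toOriginal(const Region& r, double scale, int imageWidth, int imageHeight) {
    // Scale the two corners rather than x and width separately,
    // so rounding cannot push the far edge out of step.
    int left = static_cast<int>(std::lround(r.x / scale));
    int top = static_cast<int>(std::lround(r.y / scale));
    int right = static_cast<int>(std::lround((r.x + r.width) / scale));
    int bottom = static_cast<int>(std::lround((r.y + r.height) / scale));

    // Clip every edge to the image frame.
    left = std::max(0, std::min(left, imageWidth));
    right = std::max(0, std::min(right, imageWidth));
    top = std::max(0, std::min(top, imageHeight));
    bottom = std::max(0, std::min(bottom, imageHeight));

    return Region{left, top, right - left, bottom - top};
}
```

A result with zero width or height means the detection fell entirely outside the frame and can be skipped.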


I hope you find this useful in detecting your own objects with opencv. Here are some of the resources I came across or mentioned while learning how to detect objects.

personal trainer –
tinydir file utils –
haar training tutorial –
opencv train cascade documentation –
cascade classifiers –
cats database –
viola jones –
Local binary patterns (LBP) –