Thursday, July 5, 2012

SURF Source Code (Part-2)

Many of you might have already figured out, how to strip the code. The below is the stripped version. Here, we can just supply the gallery image and probe image, the same way as the previous post as arg1 and arg2. But here, no GUI utilities are being used. Just the output needed is printed to standard output. This drastically reduces the time for the program execution. Also, I have removed the flann method of matching, since any one (findpairs and flannfindpairs - functions) can be used and both the methods given have the same recognition performance.

/*
 * A Demo to OpenCV Implementation of SURF
 * Further Information Refer to "SURF: Speed-Up Robust Feature"
 * Author: Liu Liu
 * liuliu.1987+opencv@gmail.com
 * Modified by : http://www.opencvuser.blogspot.com
 * Modifying Author : Dileep Kumar Kotha
 */

#include <cv.h>
#include <highgui.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

#include <iostream>
#include <vector>

using namespace std;

static double dis=0;//Dileep:For calculating the distance




IplImage *image = 0;

double
compareSURFDescriptors( const float* d1, const float* d2, double best, int length )
{
    double total_cost = 0;
    assert( length % 4 == 0 );
    for( int i = 0; i < length; i += 4 )
    {
        double t0 = d1[i] - d2[i];
        double t1 = d1[i+1] - d2[i+1];
        double t2 = d1[i+2] - d2[i+2];
        double t3 = d1[i+3] - d2[i+3];
        total_cost += t0*t0 + t1*t1 + t2*t2 + t3*t3;
/*
We are sending a total cost, that's slightly greater or smaller than the best 

  */
      if( total_cost > best )
            break;
    }
    return total_cost;
}


int
naiveNearestNeighbor( const float* vec, int laplacian,
                      const CvSeq* model_keypoints,
                      const CvSeq* model_descriptors )
{
    int length = (int)(model_descriptors->elem_size/sizeof(float));
    int i, neighbor = -1;
    double d, dist1 = 1e6, dist2 = 1e6;
    CvSeqReader reader, kreader;
    cvStartReadSeq( model_keypoints, &kreader, 0 );
    cvStartReadSeq( model_descriptors, &reader, 0 );

    for( i = 0; i < model_descriptors->total; i++ )
    {
        const CvSURFPoint* kp = (const CvSURFPoint*)kreader.ptr;
        const float* mvec = (const float*)reader.ptr;
     CV_NEXT_SEQ_ELEM( kreader.seq->elem_size, kreader );
        CV_NEXT_SEQ_ELEM( reader.seq->elem_size, reader );
        if( laplacian != kp->laplacian )
            continue;
        d = compareSURFDescriptors( vec, mvec, dist2, length );

        if( d < dist1 )
        {
            dist2 = dist1;
            dist1 = d;
            neighbor = i;
        }
        else if ( d < dist2 )
            dist2 = d;
    }
dis=dis+dist1;

/*Dileep:We are finding the distance from every descriptor of probe image to every descriptor of the galley image. Finally in the findpairs function, we divide this distance with the total no. of descriptors to get the average of all the distances
*/
    if ( dist1 < 0.6*dist2 )
        return neighbor;
    return -1;
}

void
findPairs( const CvSeq* objectKeypoints, const CvSeq* objectDescriptors,
           const CvSeq* imageKeypoints, const CvSeq* imageDescriptors, vector<int>& ptpairs )
{ 
    int i;
    CvSeqReader reader, kreader;
    cvStartReadSeq( objectKeypoints, &kreader );
    cvStartReadSeq( objectDescriptors, &reader );
    ptpairs.clear();

    for( i = 0; i < objectDescriptors->total; i++ )
    {
        const CvSURFPoint* kp = (const CvSURFPoint*)kreader.ptr;
        const float* descriptor = (const float*)reader.ptr;
        CV_NEXT_SEQ_ELEM( kreader.seq->elem_size, kreader );
        CV_NEXT_SEQ_ELEM( reader.seq->elem_size, reader );
        int nearest_neighbor = naiveNearestNeighbor( descriptor, kp->laplacian, imageKeypoints, imageDescriptors);
//Dileep:For every descriptor, we are trying to find it's nearest neighbour in the probe image
        if( nearest_neighbor >= 0 )
        {
            ptpairs.push_back(i);
            ptpairs.push_back(nearest_neighbor);
        }
    }

printf("\n%lf\n",(dis/objectDescriptors->total));////Dileep:Here's where I am outputting the distance between the images
/*Dileep: If you are using this for recognition, write this distance to a file along with the name of the image you are matching against. After doing this for several images, you can then sort them in ascending order to find the best possible match - the one with the smallest distance. Here, I am outputting the distance to stdout
*/
}

int main(int argc, char** argv)
{
    const char* object_filename = argc == 3 ? argv[1] : "box.png";
    const char* scene_filename = argc == 3 ? argv[2] : "box_in_scene.png";
//Dileep:When you are excuting the object file, please write Command:./objectfile probe_image Gallery_image
/*Dileep:
Probe_image - This is the image for which you need to find the match
Gallery_image - This is one of the set of images, you use for matching

You keep the same probe image same, repeatedly changing the gallery image and outputting the distance in the format
<Gallery_name distance> into a file
Finally you can sort the distances in ascending order. And the one with the shortest distance - You can output it's name as the best possible match

It may become tedious to continually write the same command multiple times, changing the gallery file name. Try to use shell script with a for loop
*/
    CvMemStorage* storage = cvCreateMemStorage(0);

    

   

    IplImage* object = cvLoadImage( object_filename, CV_LOAD_IMAGE_GRAYSCALE );
    IplImage* image = cvLoadImage( scene_filename, CV_LOAD_IMAGE_GRAYSCALE );
    if( !object || !image )
    {
        fprintf( stderr, "Can not load %s and/or %s\n"
            "Usage: find_obj [<object_filename> <scene_filename>]\n",
            object_filename, scene_filename );
        exit(-1);
    }

    CvSeq *objectKeypoints = 0, *objectDescriptors = 0;
    CvSeq *imageKeypoints = 0, *imageDescriptors = 0;
    int i;
    CvSURFParams params = cvSURFParams(500, 1);

    double tt = (double)cvGetTickCount();
    cvExtractSURF( object, 0, &objectKeypoints, &objectDescriptors, storage, params );
    printf("Object Descriptors: %d\n", objectDescriptors->total);
    cvExtractSURF( image, 0, &imageKeypoints, &imageDescriptors, storage, params );
    printf("Image Descriptors: %d\n", imageDescriptors->total);
    tt = (double)cvGetTickCount() - tt;
    printf( "Extraction time = %gms\n", tt/(cvGetTickFrequency()*1000.));




    vector<int> ptpairs;

    findPairs( objectKeypoints, objectDescriptors, imageKeypoints, imageDescriptors, ptpairs );

  
    return 0;
}


SURF Souce Code (Part-1)

When you install openCV, openCV samples folder also gets installed. The samples will be present in your InstallationDirectory/samples/C. For me it's present in OpenCV-2.1.0/samples/c.This folder contains the sample codes of many good openCV programmes that can be used for a wide variety of purposes. One more thing is that, it also contains the compiled object files along with the source code for each programme. The programme , we will be looking is find_obj.cpp and the compiled code will be with the name find_obj. This programme uses SURF to do an object detection. The original code, written by Liu, is modified by me to give the below code. Note that the comments made by me, start with Dileep:
/*
 * A Demo to OpenCV Implementation of SURF
 * Further Information Refer to "SURF: Speed-Up Robust Feature"
 * Author: Liu Liu
 * liuliu.1987+opencv@gmail.com
 * Modified by : http://www.opencvuser.blogspot.com
 * Modifying Author : Dileep Kumar Kotha
 */

#include <cv.h>
#include <highgui.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

#include <iostream>
#include <vector>

using namespace std;

static double dis=0;//For calculating the average descriptor distance
// define whether to use approximate nearest-neighbor search
//#define USE_FLANN


IplImage *image = 0;

double
compareSURFDescriptors( const float* d1, const float* d2, double best, int length )
{
    double total_cost = 0;
    assert( length % 4 == 0 );
    for( int i = 0; i < length; i += 4 )
    {
        double t0 = d1[i] - d2[i];
        double t1 = d1[i+1] - d2[i+1];
        double t2 = d1[i+2] - d2[i+2];
        double t3 = d1[i+3] - d2[i+3];
        total_cost += t0*t0 + t1*t1 + t2*t2 + t3*t3;
/*
We are sending a total cost, that's slightly greater or smaller than the best 

  */
      if( total_cost > best )
            break;
    }
    return total_cost;
}


int
naiveNearestNeighbor( const float* vec, int laplacian,
                      const CvSeq* model_keypoints,
                      const CvSeq* model_descriptors )
{
    int length = (int)(model_descriptors->elem_size/sizeof(float));
    int i, neighbor = -1;
    double d, dist1 = 1e6, dist2 = 1e6;
    CvSeqReader reader, kreader;
    cvStartReadSeq( model_keypoints, &kreader, 0 );
    cvStartReadSeq( model_descriptors, &reader, 0 );

    for( i = 0; i < model_descriptors->total; i++ )
    {
        const CvSURFPoint* kp = (const CvSURFPoint*)kreader.ptr;
        const float* mvec = (const float*)reader.ptr;
     CV_NEXT_SEQ_ELEM( kreader.seq->elem_size, kreader );
        CV_NEXT_SEQ_ELEM( reader.seq->elem_size, reader );
        if( laplacian != kp->laplacian )
            continue;
        d = compareSURFDescriptors( vec, mvec, dist2, length );

        if( d < dist1 )
        {
            dist2 = dist1;
            dist1 = d;
            neighbor = i;
        }
        else if ( d < dist2 )
            dist2 = d;
    }
dis=dis+dist1;

    if ( dist1 < 0.6*dist2 )
        return neighbor;
    return -1;
}

void
findPairs( const CvSeq* objectKeypoints, const CvSeq* objectDescriptors,
           const CvSeq* imageKeypoints, const CvSeq* imageDescriptors, vector<int>& ptpairs )
{ 
    int i;
    CvSeqReader reader, kreader;
    cvStartReadSeq( objectKeypoints, &kreader );
    cvStartReadSeq( objectDescriptors, &reader );
    ptpairs.clear();

    for( i = 0; i < objectDescriptors->total; i++ )
    {
        const CvSURFPoint* kp = (const CvSURFPoint*)kreader.ptr;
        const float* descriptor = (const float*)reader.ptr;
        CV_NEXT_SEQ_ELEM( kreader.seq->elem_size, kreader );
        CV_NEXT_SEQ_ELEM( reader.seq->elem_size, reader );
        int nearest_neighbor = naiveNearestNeighbor( descriptor, kp->laplacian, imageKeypoints, imageDescriptors);
        if( nearest_neighbor >= 0 )
        {
            ptpairs.push_back(i);
            ptpairs.push_back(nearest_neighbor);
        }
    }

printf("\n%lf\n",(dis/objectDescriptors->total));////Dileep:Here's where I am outputting the distance between the images
/*Dileep: If you are using this for recognition, write this distance to a file along with the name of the image you are matching against. After doing this for several images, you can then sort them in ascending order to find the best possible match - the one with the smallest distance. Here, I am outputting the distance to stdout
*/
}
////Dileep:I am commenting the below one since it's of flann method and what ever the method we choose to find distance, the result is the same
/*
void
flannFindPairs( const CvSeq*, const CvSeq* objectDescriptors,
           const CvSeq*, const CvSeq* imageDescriptors, vector<int>& ptpairs )
{
 int length = (int)(objectDescriptors->elem_size/sizeof(float));

    cv::Mat m_object(objectDescriptors->total, length, CV_32F);
 cv::Mat m_image(imageDescriptors->total, length, CV_32F);


 // copy descriptors
    CvSeqReader obj_reader;
 float* obj_ptr = m_object.ptr<float>(0);
    cvStartReadSeq( objectDescriptors, &obj_reader );
    for(int i = 0; i < objectDescriptors->total; i++ )
    {
        const float* descriptor = (const float*)obj_reader.ptr;
        CV_NEXT_SEQ_ELEM( obj_reader.seq->elem_size, obj_reader );
        memcpy(obj_ptr, descriptor, length*sizeof(float));
        obj_ptr += length;
    }
    CvSeqReader img_reader;
 float* img_ptr = m_image.ptr<float>(0);
    cvStartReadSeq( imageDescriptors, &img_reader );
    for(int i = 0; i < imageDescriptors->total; i++ )
    {
        const float* descriptor = (const float*)img_reader.ptr;
        CV_NEXT_SEQ_ELEM( img_reader.seq->elem_size, img_reader );
        memcpy(img_ptr, descriptor, length*sizeof(float));
        img_ptr += length;
    }

    // find nearest neighbors using FLANN
    cv::Mat m_indices(objectDescriptors->total, 2, CV_32S);
    cv::Mat m_dists(objectDescriptors->total, 2, CV_32F);
    cv::flann::Index flann_index(m_image, cv::flann::KDTreeIndexParams(4));  // using 4 randomized kdtrees
    flann_index.knnSearch(m_object, m_indices, m_dists, 2, cv::flann::SearchParams(64) ); // maximum number of leafs checked

    int* indices_ptr = m_indices.ptr<int>(0);
    float* dists_ptr = m_dists.ptr<float>(0);
    for (int i=0;i<m_indices.rows;++i) {
     if (dists_ptr[2*i]<0.6*dists_ptr[2*i+1]) {
      ptpairs.push_back(i);
      ptpairs.push_back(indices_ptr[2*i]);
     }
    }
}


// a rough implementation for object location 
int
locatePlanarObject( const CvSeq* objectKeypoints, const CvSeq* objectDescriptors,
                    const CvSeq* imageKeypoints, const CvSeq* imageDescriptors,
                    const CvPoint src_corners[4], CvPoint dst_corners[4] )
{
    double h[9];
    CvMat _h = cvMat(3, 3, CV_64F, h);
    vector<int> ptpairs;
    vector<CvPoint2D32f> pt1, pt2;
    CvMat _pt1, _pt2;
    int i, n;

#ifdef USE_FLANN
    flannFindPairs( objectKeypoints, objectDescriptors, imageKeypoints, imageDescriptors, ptpairs );
#else
    findPairs( objectKeypoints, objectDescriptors, imageKeypoints, imageDescriptors, ptpairs );
#endif

    n = ptpairs.size()/2;
    if( n < 4 )
        return 0;

    pt1.resize(n);
    pt2.resize(n);
    for( i = 0; i < n; i++ )
    {
        pt1[i] = ((CvSURFPoint*)cvGetSeqElem(objectKeypoints,ptpairs[i*2]))->pt;
        pt2[i] = ((CvSURFPoint*)cvGetSeqElem(imageKeypoints,ptpairs[i*2+1]))->pt;
    }

    _pt1 = cvMat(1, n, CV_32FC2, &pt1[0] );
    _pt2 = cvMat(1, n, CV_32FC2, &pt2[0] );
    if( !cvFindHomography( &_pt1, &_pt2, &_h, CV_RANSAC, 5 ))
        return 0;

    for( i = 0; i < 4; i++ )
    {
        double x = src_corners[i].x, y = src_corners[i].y;
        double Z = 1./(h[6]*x + h[7]*y + h[8]);
        double X = (h[0]*x + h[1]*y + h[2])*Z;
        double Y = (h[3]*x + h[4]*y + h[5])*Z;
        dst_corners[i] = cvPoint(cvRound(X), cvRound(Y));
    }

    return 1;
}
*/

int main(int argc, char** argv)
{
    const char* object_filename = argc == 3 ? argv[1] : "box.png";
    const char* scene_filename = argc == 3 ? argv[2] : "box_in_scene.png";

    CvMemStorage* storage = cvCreateMemStorage(0);

    cvNamedWindow("Object", 1);
    cvNamedWindow("Object Correspond", 1);

    static CvScalar colors[] = 
    {
        {{0,0,255}},
        {{0,128,255}},
        {{0,255,255}},
        {{0,255,0}},
        {{255,128,0}},
        {{255,255,0}},
        {{255,0,0}},
        {{255,0,255}},
        {{255,255,255}}
    };

    IplImage* object = cvLoadImage( object_filename, CV_LOAD_IMAGE_GRAYSCALE );
    IplImage* image = cvLoadImage( scene_filename, CV_LOAD_IMAGE_GRAYSCALE );
    if( !object || !image )
    {
        fprintf( stderr, "Can not load %s and/or %s\n"
            "Usage: find_obj [<object_filename> <scene_filename>]\n",
            object_filename, scene_filename );
        exit(-1);
    }
    IplImage* object_color = cvCreateImage(cvGetSize(object), 8, 3);
    cvCvtColor( object, object_color, CV_GRAY2BGR );
    
    CvSeq *objectKeypoints = 0, *objectDescriptors = 0;
    CvSeq *imageKeypoints = 0, *imageDescriptors = 0;
    int i;
    CvSURFParams params = cvSURFParams(500, 1);

    double tt = (double)cvGetTickCount();
    cvExtractSURF( object, 0, &objectKeypoints, &objectDescriptors, storage, params );
    printf("Object Descriptors: %d\n", objectDescriptors->total);
    cvExtractSURF( image, 0, &imageKeypoints, &imageDescriptors, storage, params );
    printf("Image Descriptors: %d\n", imageDescriptors->total);
    tt = (double)cvGetTickCount() - tt;
    printf( "Extraction time = %gms\n", tt/(cvGetTickFrequency()*1000.));
    CvPoint src_corners[4] = {{0,0}, {object->width,0}, {object->width, object->height}, {0, object->height}};
    CvPoint dst_corners[4];
    IplImage* correspond = cvCreateImage( cvSize(image->width, object->height+image->height), 8, 1 );
    cvSetImageROI( correspond, cvRect( 0, 0, object->width, object->height ) );
    cvCopy( object, correspond );
    cvSetImageROI( correspond, cvRect( 0, object->height, correspond->width, correspond->height ) );
    cvCopy( image, correspond );
    cvResetImageROI( correspond );

#ifdef USE_FLANN
    printf("Using approximate nearest neighbor search\n");
#endif
/*
    if( locatePlanarObject( objectKeypoints, objectDescriptors, imageKeypoints,
        imageDescriptors, src_corners, dst_corners ))
    {
        for( i = 0; i < 4; i++ )
        {
            CvPoint r1 = dst_corners[i%4];
            CvPoint r2 = dst_corners[(i+1)%4];
            cvLine( correspond, cvPoint(r1.x, r1.y+object->height ),
                cvPoint(r2.x, r2.y+object->height ), colors[8] );
        }
    }
*/


    vector<int> ptpairs;
#ifdef USE_FLANN
    flannFindPairs( objectKeypoints, objectDescriptors, imageKeypoints, imageDescriptors, ptpairs );
#else
    findPairs( objectKeypoints, objectDescriptors, imageKeypoints, imageDescriptors, ptpairs );
#endif

////Dileep:It is used to draw the the lines and rectangles in the windows that display images. If you want just the distance, you can comment the whole 
////below code
    for( i = 0; i < (int)ptpairs.size(); i += 2 )
    {
        CvSURFPoint* r1 = (CvSURFPoint*)cvGetSeqElem( objectKeypoints, ptpairs[i] );
        CvSURFPoint* r2 = (CvSURFPoint*)cvGetSeqElem( imageKeypoints, ptpairs[i+1] );
        cvLine( correspond, cvPointFrom32f(r1->pt),
            cvPoint(cvRound(r2->pt.x), cvRound(r2->pt.y+object->height)), colors[8] );
    }

    cvShowImage( "Object Correspond", correspond );
    for( i = 0; i < objectKeypoints->total; i++ )
    {
        CvSURFPoint* r = (CvSURFPoint*)cvGetSeqElem( objectKeypoints, i );
        CvPoint center;
        int radius;
        center.x = cvRound(r->pt.x);
        center.y = cvRound(r->pt.y);
        radius = cvRound(r->size*1.2/9.*2);
        cvCircle( object_color, center, radius, colors[0], 1, 8, 0 );
    }
    cvShowImage( "Object", object_color );

    cvWaitKey(0);
//cvSaveImage("Object_1.png",correspond);
//cvSaveImage("Object_2.png",object_color);

    cvDestroyWindow("Object");
    cvDestroyWindow("Object SURF");
    cvDestroyWindow("Object Correspond");

    return 0;
}

There are enough comments in it, explaining what each part does. compile it. If you compile it to the object file find_obj, the command below works.

./find_obj

Otherwise, replace it with a.out. Also, copy the two images in the samples folder named box.png and box_in_scene.png into the folder where you run the above command.

Now for ./find_obj Since we are not supplying arguements, it will take the default box.png and box_in_scene.png and tries to find the first object inside the second object. The below figure appears.

SURF example


What's actually happening here is object detection. But I included a "dis" variable and also printing it to standard ouput, so that you will also know the average distance of all the descriptors. If you want, you can keep two photographs of the same person with different facial expressions in the same folder and supply them as arg1 and arg2 (in which case you will get distance as zero, ofcourse :P)

./findobj arg1 arg2 

If we keep on changing the second arguement(the image we are supplying as arg2), we can actually use it for recognition. Then the one with the least distance is the best possible match for the first argument (the image we are supplying as arg1). In the next post, I will strip all the unnecessary components of this code and make this usable for recognition. You can try to do it yourself.

Saturday, March 3, 2012

SURF Source code

When you install openCV, openCV samples folder also gets installed. The samples will be present in your InstallationDirectory/samples/C. For me it's present in OpenCV-2.1.0/samples/c.This folder contains the sample codes of many good openCV programmes that can be used for a wide variety of purposes. One more thing is that, it also contains the compiled object files along with the source code for each programme. The programme , we will be looking is find_obj.cpp and the compiled code will be with the name find_obj. This programme uses SURF to do an object detection.

I have modified the code to make it work for recognition. I did my summer project at IIT Kharagpur, and there we performed the experiments on face recognition with SURF. For the first time in face recognition history, we worked with color FERET database. We wrote a paper, describing our approach and writing the results of our work. The Research paper was sent to the proceedings of an international conference. Till that gets published, I cannot share my work. But it's so simple and intuitive, when you look at the code I mentioned. Best of luck with your work.

Update (29/06/12) :
I will release the modified source code on or before 8th July

Thursday, March 1, 2012

SURF - Algorithm for matching

Many people talk about SURF and using it for recognition. But no one actually tells you, how it is used or what might be the algorithm for doing it. This series of posts, will detail you of using it practically. If you followed my previous posts, understanding this would be a lot more easier. You can skip the next paragraph. For noobs they have to read the next paragraph.

This post is made, assuming that you have openCV installed in your computer. If you haven't, go to my previous post for Installing openCV. To practice some openCV programming go to the assignments . If you don't know anything of openCV and are a complete absolute beginner, then go to the Easy navigation and start reading all the posts in the serial order. Most of them are practicals though to get your hands dirty. Also, if you have an open source OS, then this post will be more straight forward; although it isn't necessary. I know that many people use it in matlab. The code can be easily ported if you are well versed with C.

Let me start with giving a quick overview of SURF, so that understanding certain parts of the code we are going to write, will be easier to you. SURF stands for Speeded up Robust Features. In a given image, SURF tries to find the interest points - The points where the variance is maximum. It then constructs a 64-variable vector around it to extract the features (A 128-variable one is also possible, but it would consume more time for matching). In this way, there will be as many descriptors in an image as there are interest points. Now, when we use SURF for matching, we follow the below steps for every probe image (The image we need to match against)

1. Take a descriptor of the probe image
2. Compare this with all the descriptors of the gallery image (One of the set of possible matching images)
3. Find the nearest descriptor which is the one with the lowest distance to our descriptor compared to all the other descriptors
4. The nearest descriptor distance is stored
5. Take a descrptor other than the one's already taken from probe image and go to step-2
6. Now all the nearest descriptor distances are added, and the sum is divided with the total number of probe descriptors. This gives us the average distance
7. This average distance, along with the name of the gallery image we just matched, can be outputted to a file
8. For every gallery image, go to step-1
9. When all the gallery images get over, sort the distances in the outputted file and the one with the lowest distance is the best match for our probe image

There is already a function in openCV called cvExtractSURF to extract the SURF features of images. But there is no function to directly compare two images using SURF and give their distance. In the next post, we will be talking about doing this