Wednesday, March 28, 2012

I was too ambitious last week. I planned to implement HOG on the GPU, but I am still working on uploading the image to the device and computing the gradient image that HOG is built from. I am still not familiar with the CUDA APIs, even though we did Homework 2 for practice. I also had two midterm exams this week, so only now can I concentrate on the final project. Overall, I made little progress last week. I need to catch up and be well prepared for next week's presentation.

Wednesday, March 21, 2012

This week I got familiar with the OpenCV library. It is a library of programming functions for computer vision with C++/C/Python interfaces. Although I have some previous experience in computer vision, I have always worked in MATLAB, so this is my first time working with OpenCV. It seems quite convenient to use: there are APIs for image operations such as reading an image into memory and writing images to file, and the matrix operations are very similar to those in MATLAB. In the coming week, I plan to implement HOG (hopefully on the GPU), which is the feature descriptor for my detection system.

Monday, March 12, 2012


CIS 565 – Final Project Proposal                                             

GPU-Accelerated Logo Detection


Object detection is an important task in the computer vision community. Its goal is to enable computers to automatically detect semantic objects, such as human faces, cars, or barriers, in digital images or videos. Object detection has many applications, such as digital image retrieval, visual tracking, and automated vehicle systems. Many image operations in computer vision are inherently parallel, like applying a filter to each patch of an image, so we can use the GPU to improve performance, as in the GPU-accelerated face detection and pedestrian detection included in OpenCV.

For my final project I propose to implement a GPU-accelerated logo detection system that uses Histogram of Oriented Gradients (HOG) as the feature descriptor. Given a target logo, the system will find the possible locations of that logo in an image. The logos in an image may differ in scale or orientation from the target logo. To match the target logo in the image, I will use HOG, which is a very common feature descriptor in computer vision. A given image is divided into overlapping blocks of the same size. HOG then builds a histogram of gradient orientations for each block, which captures the local appearance of the logo.
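
The per-block histogram step can be sketched roughly as follows. This is a simplified CPU illustration, not the planned GPU implementation: the function names, the 8-pixel block size, the 4-pixel stride, the 9 orientation bins, and the L2 normalization are all assumptions chosen for the example, and the full HOG descriptor also uses cells inside each block and interpolated voting.

```python
import math

def gradients(img):
    """Central-difference gradient magnitude and unsigned orientation (0-180 degrees).

    img is a list of rows of numbers; border pixels reuse their nearest neighbor.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang

def block_histogram(mag, ang, top, left, size, bins):
    """Magnitude-weighted orientation histogram for one block, L2-normalized."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(top, top + size):
        for x in range(left, left + size):
            b = int(ang[y][x] / bin_width) % bins  # wrap 180 degrees back to bin 0
            hist[b] += mag[y][x]
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0  # avoid dividing by zero
    return [v / norm for v in hist]

def hog_blocks(img, size=8, stride=4, bins=9):
    """Return {(top, left): histogram} for every overlapping block in the image."""
    mag, ang = gradients(img)
    h, w = len(img), len(img[0])
    feats = {}
    for top in range(0, h - size + 1, stride):
        for left in range(0, w - size + 1, stride):
            feats[(top, left)] = block_histogram(mag, ang, top, left, size, bins)
    return feats
```

Each block depends only on the pixels inside it, which is why this step is a natural candidate for one GPU thread block per image block.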

To find the target logo at different scales and orientations, I will create an image stack for the target logo and build a codebook for each scale and orientation. The codebook will store, for each block in the target logo, its HOG feature and its offset to the logo center. For each candidate scale or orientation, I will compute HOG for the input image and match each block to the codebook entry with the minimum distance. Each image block then casts a weighted vote for a possible center according to the matched entry's offset. The true logo centers in the input image should collect many votes at the right scale and orientation. Running this algorithm on the GPU should accelerate both the per-block voting and the search over scales and orientations, especially for large input images.
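
The codebook matching and voting for a single scale and orientation might look like the sketch below. The function names are hypothetical, Euclidean distance is assumed for the nearest-entry match, and the vote weight 1 / (1 + distance) is only an assumed weighting, since the proposal does not fix one. Features are passed in as plain dictionaries mapping a block's (row, column) position to its histogram.

```python
import math
from collections import defaultdict

def build_codebook(target_feats, center):
    """Pair each target block's feature with its (row, col) offset to the logo center."""
    return [
        (hist, (center[0] - top, center[1] - left))
        for (top, left), hist in target_feats.items()
    ]

def vote_for_centers(image_feats, codebook):
    """Each image block votes for a center through its nearest codebook entry."""
    votes = defaultdict(float)
    for (top, left), hist in image_feats.items():
        # Nearest codebook entry by Euclidean distance (assumed metric).
        best_dist, best_offset = min(
            (math.dist(hist, cb_hist), offset) for cb_hist, offset in codebook
        )
        voted_center = (top + best_offset[0], left + best_offset[1])
        # Assumed weighting: closer matches cast stronger votes.
        votes[voted_center] += 1.0 / (1.0 + best_dist)
    return votes

def best_center(votes):
    """Return the position with the highest accumulated vote."""
    return max(votes, key=votes.get)
```

To cover the whole image stack, the same voting would be repeated once per codebook, keeping the scale and orientation whose peak vote is highest. On the GPU, the accumulation into `votes` would need atomic additions because many blocks can vote for the same center.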