I was too ambitious last week. I had planned to implement HOG on the GPU, but I am still working on uploading the image to the device and computing the gradient image that HOG is built from. I am still not familiar with the CUDA APIs, even though we did homework 2 for practice. I also had two midterm exams this week, so only now can I concentrate on the final project. Overall, I made little progress last week. I need to catch up and get well prepared for next week's presentation.
This week I got familiar with the OpenCV library. It is a library of programming functions for computer vision with C++, C, and Python interfaces. Although I have some previous experience in computer vision, I have always worked in MATLAB, so this is my first time working with OpenCV. It seems quite convenient to use: there are APIs for image operations such as reading an image into memory and writing images to file, and the matrix operations are very similar to those in MATLAB. In the coming week, I plan to implement HOG (hopefully on the GPU), which is the feature descriptor for my detection system.
Object detection is an important task in
the computer vision community. Its goal is to enable computers to
automatically detect semantic objects, such as human faces, cars, or barriers, in
digital images or videos. Object detection has many applications,
such as digital image retrieval, visual tracking, and automated vehicle
systems. Many image operations in computer vision are inherently
parallel, such as applying a filter to each patch of an
image, so we can use the GPU to improve performance, as in the GPU-accelerated
face detection and pedestrian detection included in OpenCV.
For my final project I propose to implement
a GPU-accelerated logo detection system that uses Histogram of Oriented Gradients
(HOG) as the feature descriptor. Given a target logo, the logo detection system
will find the possible locations of the target in an image. The logos in the image
may differ in scale or orientation from the target logo. To match the target
logo in the image, I will use HOG, which is a very common feature descriptor in
computer vision. A given image is divided into a set of
overlapping blocks of the same size. HOG builds a histogram of gradient orientations
for each block of the image, which captures the local appearance of the logo.
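The block histograms described above can be sketched on the CPU as a reference before moving to the GPU. This is only a minimal sketch under stated assumptions, not the project's implementation: the function names (gradients, block_histogram, hog_blocks), the block size of 8, the stride of 4, and the 9 unsigned orientation bins are all hypothetical choices, and the cell subdivision and bin interpolation used by full HOG implementations are left out.

```python
import math

def gradients(img):
    """Central-difference gradients of a 2D grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp at the borders so every pixel has two neighbours.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)
            # Unsigned orientation folded into [0, 180) degrees.
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang

def block_histogram(mag, ang, top, left, size, bins=9):
    """Magnitude-weighted orientation histogram of one block, L2-normalised."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(top, top + size):
        for x in range(left, left + size):
            # The modulo guards against an angle that rounds up to 180.
            hist[int(ang[y][x] / bin_width) % bins] += mag[y][x]
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]

def hog_blocks(img, size=8, stride=4, bins=9):
    """Histograms of overlapping blocks, keyed by each block's (top, left)."""
    mag, ang = gradients(img)
    h, w = len(img), len(img[0])
    return {
        (top, left): block_histogram(mag, ang, top, left, size, bins)
        for top in range(0, h - size + 1, stride)
        for left in range(0, w - size + 1, stride)
    }
```

Each pixel's gradient and each block's histogram depend only on a small neighbourhood, which is why these loops look like natural candidates for one GPU thread per pixel or per block.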
To find the target logo at different
scales and orientations, I will create an image stack for the target logo and build
a codebook for each scale and orientation. The codebook will store the HOG features
and the offset to the logo center for each block in the target logo. For each possible
scale or orientation, I will compute HOG for the input image and match each
block to the codebook entry with the minimum distance. Each image block will then cast
a weighted vote for the possible center according to the stored offset. The logo
centers in the input image should hopefully receive many votes at the right scales
and orientations. Running this logo detection algorithm on the GPU should
accelerate the per-block voting and the search over scales
and orientations, especially for large input images.
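The codebook matching and center voting described above might look roughly like the following for a single scale and orientation. This is a sketch under assumptions rather than the planned implementation: the names build_codebook and vote_for_centers are hypothetical, Euclidean distance is assumed as the matching distance, and the vote weight 1 / (1 + distance) is an illustrative choice because the weighting is not specified here. Block features are assumed to be dictionaries mapping a block's (top, left) position to its descriptor.

```python
from collections import defaultdict

def l2_distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

def build_codebook(block_features, center):
    """Store each logo block's descriptor with its offset to the logo center.

    block_features: {(top, left): descriptor}; center: (row, col).
    """
    cy, cx = center
    return [(desc, (cy - top, cx - left))
            for (top, left), desc in block_features.items()]

def vote_for_centers(image_features, codebook):
    """Each image block votes for one center via its nearest codebook entry."""
    votes = defaultdict(float)
    for (top, left), desc in image_features.items():
        dist, offset = min(
            ((l2_distance(desc, entry), off) for entry, off in codebook),
            key=lambda pair: pair[0],
        )
        # Assumed weighting: closer matches contribute larger votes.
        votes[(top + offset[0], left + offset[1])] += 1.0 / (1.0 + dist)
    return votes
```

The position with the largest accumulated vote is the candidate logo center, for example max(votes, key=votes.get). Repeating this once per codebook in the image stack gives the search over scales and orientations, and since each block votes independently, the per-block loop is the part that would be mapped to GPU threads.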