CIS 565 – Final Project Proposal
GPU-Accelerated Logo Detection
Object detection is an important task in computer vision. Its goal is to enable
computers to automatically detect semantic objects, such as human faces, cars,
or barriers, in digital images or videos. Object detection has many
applications, including digital image retrieval, visual tracking, and automated
vehicle systems. Many image operations in computer vision are inherently
parallel, such as applying a filter to each patch of an image, so the GPU can
be used to improve performance. Examples include the GPU-accelerated face
detection and pedestrian detection included in OpenCV.
For my final project, I propose to implement a GPU-accelerated logo detection
system that uses Histogram of Oriented Gradients (HOG) as its feature
descriptor. Given a target logo, the system will find the possible locations of
that logo in an input image. Logos in the image may differ in scale or
orientation from the target logo. To match the target logo in the image, I will
use HOG, a very common feature descriptor in computer vision. An image is
divided into overlapping blocks of the same size, and HOG builds a histogram of
gradient orientations for each block, which captures the local appearance of
the logo.
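The per-block histogram described above can be sketched on the CPU as follows.
This is only an illustration, not the proposed GPU implementation: the function
names, the 9-bin unsigned orientation range, the central-difference gradients,
and the L2 normalization are all assumptions chosen for brevity, and the sketch
omits the cell subdivision and bin interpolation used by fuller HOG variants.

```python
import math


def block_hog(image, top, left, size, bins=9):
    """Histogram of gradient orientations for one square block.

    image is a list of rows of grayscale values. The bin count, unsigned
    orientations, and L2 normalization are illustrative assumptions.
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    for y in range(top, top + size):
        for x in range(left, left + size):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            # Unsigned orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for one bin, weighted by gradient magnitude.
            hist[int(angle / (180.0 / bins)) % bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


def overlapping_blocks(height, width, size, stride):
    """Top-left corners of equally sized blocks; stride < size gives overlap."""
    return [(y, x)
            for y in range(0, height - size + 1, stride)
            for x in range(0, width - size + 1, stride)]
```

Each pixel contributes to its block's histogram independently of every other
block, which is the kind of per-block work that could plausibly be mapped to
GPU threads.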
To find the target logo at different scales and orientations, I will create an
image stack for the target logo and build a codebook for each scale and
orientation. The codebook will store, for each block in the target logo, its
HOG feature and its offset to the logo center. For each candidate scale and
orientation, I will compute HOG for the input image and match each image block
to the codebook entry at minimum distance. Each image block will then cast a
weighted vote for a possible logo center according to the matched entry's
offset. The true logo centers in the input image should accumulate many votes
at the correct scale and orientation. Running this algorithm on the GPU should
accelerate both the per-block voting and the search over scales and
orientations, particularly for large input images.