Posts in the 'Paper Reading/CVPR' category

2011. 7. 20. 17:44 Paper Reading/CVPR

Discriminatively Trained Deformable Part Models

http://people.cs.uchicago.edu/~pff/latent/

Version 4. Updated on April 21, 2010.

Over the past few years we have developed a complete learning-based system for detecting and localizing objects in images. Our system represents objects using mixtures of deformable part models. These models are trained using a discriminative method that only requires bounding boxes for the objects in an image. The approach leads to efficient object detectors that achieve state of the art results on the PASCAL and INRIA person datasets.

At a high level our system can be characterized by the combination of
1) Strong low-level features based on histograms of oriented gradients (HOG).
2) Efficient matching algorithms for deformable part-based models (pictorial structures).
3) Discriminative learning with latent variables (latent SVM).
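
Item 2 above refers to matching deformable part-based models. A minimal Python sketch of the underlying idea (a root score plus, for each part, the best placement score minus a deformation penalty) might look like the following. It is a 1-D brute-force toy under stated assumptions: the function name, the single quadratic `deform_weight`, and the list-based responses are hypothetical and are not taken from the released Matlab code, which relies on efficient matching rather than this exhaustive search.

```python
def part_model_score(root_scores, part_responses, anchors, deform_weight, root_pos):
    """Score one root placement of a toy 1-D star-shaped part model.

    root_scores    : list of root filter responses, one per position
    part_responses : list of lists, one response list per part
    anchors        : ideal offset of each part relative to the root (assumed)
    deform_weight  : quadratic penalty for moving a part off its anchor (assumed)
    """
    total = root_scores[root_pos]
    for responses, anchor in zip(part_responses, anchors):
        ideal = root_pos + anchor
        # Brute force over every placement of the part: appearance score
        # minus a quadratic deformation cost.
        total += max(
            responses[p] - deform_weight * (p - ideal) ** 2
            for p in range(len(responses))
        )
    return total


# One part whose best response sits one cell away from its anchor.
score = part_model_score(
    root_scores=[0.0, 5.0, 0.0],
    part_responses=[[0.0, 0.0, 3.0, 0.0]],
    anchors=[0],
    deform_weight=1.0,
    root_pos=1,
)
print(score)
```

The part prefers position 2 (response 3.0) even though its anchor is position 1, because the response gain outweighs the deformation cost of 1.0.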

PASCAL VOC "Lifetime Achievement" Prize

Here you can download a complete implementation of our system. The current implementation extends the system in [2] as described in [3]. The models in this implementation are structured using the grammar formalism presented in [4]. Previous releases are available below.

The distribution contains object detection and model learning code, as well as models trained on the PASCAL and INRIA Person datasets. This release also includes code for rescoring detections based on contextual information.

Also available (as a separate package) is the source code for a cascade version of the object detection system, which is described in [5].

The system is implemented in Matlab, with a few helper functions written in C/C++ for efficiency reasons. The software was tested on several versions of Linux and Mac OS X using Matlab versions R2009b and R2010a. There may be compatibility issues with other versions of Matlab.

For questions regarding the source code please contact Ross Girshick at r...@cs.uchicago.edu.

Source code and model download: voc-release4.tgz (updated on 04/21/10).
Warning: fconvblas.cc does not work with matlab 2010b. You should use fconv.cc or fconvMT.cc (see compile.m).
Cascade detection code: here

This project has been supported by the National Science Foundation under Grant Nos. 0534820, 0746569 and 0811340.

References

Slides from a presentation given at the 2009 Chicago Machine Learning Summer School and Workshop pdf.

[1] P. Felzenszwalb, D. McAllester, D. Ramanan.
A Discriminatively Trained, Multiscale, Deformable Part Model.
Proceedings of the IEEE CVPR 2008.

[2] P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan.
Object Detection with Discriminatively Trained Part Based Models.
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, No. 9, September 2010
pdf

[3] R. Girshick, P. Felzenszwalb, D. McAllester.
release4-notes.pdf -- also included in the download.

[4] P. Felzenszwalb, D. McAllester.
Object Detection Grammars.
University of Chicago, Computer Science TR-2010-02, February 2010.
pdf

[5] P. Felzenszwalb, R. Girshick, D. McAllester.
Cascade Object Detection with Deformable Part Models.
Proceedings of the IEEE CVPR 2010.
pdf

How to cite

When citing our system, please cite reference [2] and the website for the specific release. The website bibtex reference is below.

@misc{voc-release4,
  author = "Felzenszwalb, P. F. and Girshick, R. B. and McAllester, D.",
  title = "Discriminatively Trained Deformable Part Models, Release 4",
  howpublished = "http://people.cs.uchicago.edu/~pff/latent-release4/"
}

Example detections

Detection results — PASCAL datasets

The models included with the source code were trained on the train+val dataset from each year and evaluated on the corresponding test dataset.
This is exactly the protocol of the "comp3" competition. Below are the average precision scores we obtain in each category.

Table 1. PASCAL VOC 2009 comp3
	aero	bicycle	bird	boat	bottle	bus	car	cat	chair	cow	table	dog	horse	mbike	person	plant	sheep	sofa	train	tv	*mean*
without context	39.5	48.2	11.4	12.3	28.6	42.3	40.4	25.0	17.4	20.5	15.3	14.5	42.1	44.4	41.9	12.7	24.3	16.5	43.3	32.2	28.6
with context	43.6	50.8	15.1	14.1	30.2	45.6	41.8	27.3	18.9	22.1	15.8	18.2	45.7	47.3	43.8	14.3	26.4	18.2	46.8	33.7	31.0

Table 2. PASCAL VOC 2007 comp3
	aero	bicycle	bird	boat	bottle	bus	car	cat	chair	cow	table	dog	horse	mbike	person	plant	sheep	sofa	train	tv	*mean*
without context	28.9	59.5	10.0	15.2	25.5	49.6	57.9	19.3	22.4	25.2	23.3	11.1	56.8	48.7	41.9	12.2	17.8	33.6	45.1	41.6	32.3
with context	31.2	61.5	11.9	17.4	27.0	49.1	59.6	23.1	23.0	26.3	24.9	12.9	60.1	51.0	43.2	13.4	18.8	36.2	49.1	43.0	34.1

Table 3. PASCAL VOC 2006 comp3
	bicycle	bus	car	cat	cow	dog	horse	mbike	person	sheep	*mean*
without context	67.1	65.8	70.7	26.8	47.7	15.8	48.3	66.0	41.0	45.6	49.5
with context	69.2	67.6	71.5	29.0	51.4	19.4	54.0	70.0	44.3	47.4	52.4
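
As a quick sanity check on the tables, the *mean* column can be recomputed from the per-category scores. The Python sketch below does this for Table 3; the numbers are copied from the table, while the helper name `mean_ap` and the dictionary layout are only illustrative.

```python
# Per-category average precision from Table 3 (PASCAL VOC 2006 comp3),
# in the order: bicycle, bus, car, cat, cow, dog, horse, mbike, person, sheep.
table3 = {
    "without context": [67.1, 65.8, 70.7, 26.8, 47.7, 15.8, 48.3, 66.0, 41.0, 45.6],
    "with context": [69.2, 67.6, 71.5, 29.0, 51.4, 19.4, 54.0, 70.0, 44.3, 47.4],
}


def mean_ap(scores):
    """Arithmetic mean of the per-category scores, rounded to one decimal."""
    return round(sum(scores) / len(scores), 1)


for row_name, scores in table3.items():
    print(row_name, mean_ap(scores))
```

Both rows reproduce the reported means (49.5 and 52.4).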

Detection Results — INRIA Person

We also trained and tested a model on the INRIA Person dataset.
We scored the model using the PASCAL evaluation methodology on the complete test dataset, including images without people.

INRIA Person average precision: 88.2
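
For readers unfamiliar with the metric, here is a rough Python sketch of how an average precision score can be computed from a ranked list of detections. It is an illustration only: it uses the area under the monotone precision envelope, and the exact interpolation rule behind the numbers on this page is not stated here, so treat that choice, the function name, and the input format as assumptions.

```python
def average_precision(is_true_positive, n_positives):
    """Area under the interpolated precision/recall curve.

    is_true_positive : list of booleans, detections sorted by decreasing score
    n_positives      : number of ground-truth objects in the test set
    """
    tp = fp = 0
    recalls, precisions = [], []
    for hit in is_true_positive:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / n_positives)
        precisions.append(tp / (tp + fp))

    # Make precision non-increasing in recall (the precision "envelope").
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum precision over each increase in recall.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Three ranked detections (hit, miss, hit) against two ground-truth objects.
print(average_precision([True, False, True], n_positives=2))
```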

Plot of Recall / False positives per image (FPPI):

Previous Releases
voc-release3
voc-release2
voc-release1

Posted by 한효정

2011. 6. 27. 09:42 Paper Reading/CVPR

Automatically Mining Person Models of Celebrities for Visual Search Applications

Methods and systems for automated identification of celebrity face images are provided that generate a name list of prominent celebrities, obtain a set of images and corresponding feature vectors for each name, detect faces within the set of images, and remove non-face images. An analysis of the...

Inventors: David ROSS, Andrew RABINOVICH, Anand PILLAI, Hartwig ADAM
Assignee: Google Inc.

http://www.google.com/patents/about?id=TXjhAQAAEBAJ

12_859_721_Automatically_Mining_Person_M.pdf

Posted by 한효정

2011. 6. 22. 09:27 Paper Reading/CVPR

Scalable Face Image Retrieval with Identity-Based Quantization and Multi-Reference Re-ranking

Zhong Wu (Tsinghua Univ., Ctr Adv Study)
Qifa Ke (Microsoft Research, Silicon Valley Lab)
Jian Sun (Microsoft Research, Asia Lab)
Heung-Yeung Shum (Microsoft Corporation)

http://research.microsoft.com/pubs/122158/cvpr2010.pdf

cvpr2010.pdf

Posted by 한효정

2011. 6. 1. 09:28 Paper Reading/CVPR

A simple object detector with boosting

http://people.csail.mit.edu/torralba/shortCourseRLOC/boosting/boosting.html

ICCV 2005 short courses on Recognizing and Learning Object Categories

Boosting provides a simple framework to develop robust object detection algorithms. This set of functions provides a minimal toolkit for building an object detection algorithm. It is written entirely in Matlab to make it easily accessible as a teaching tool. Therefore, it is not appropriate for building real-time applications.

Setup

Download the code and datasets
Download the LabelMe toolbox

Unzip both files. Modify the paths in initpath.m.
Modify the folder paths in parameters.m to point to the locations of the images and annotations.

Description of the functions

Initialization
initpath.m - Initializes the matlab path. You should run this command when you start the Matlab session.
parameters.m - Contains parameters to configure the classifiers and the database.

Boosting tools
demoGentleBoost.m - simple demo of gentleBoost using stumps on two dimensions

Scripts
createDatabases.m - creates the training and test database using the LabelMe database.
createDictionary.m - creates a dictionary of filtered patches from the target object.
computeFeatures.m - precomputes the features of all images and stores the feature outputs on the center of the target object and on a sparse set of locations from the background.
trainDetector.m - trains the detector using Gentle Boosting.
runDetector.m - runs the detector on test images

Features and weak detectors
convCrossConv.m - Weak detector: computes template matching with a localized patch in object centered coordinates.

Detector
singleScaleBoostedDetector.m - runs the strong classifier on an image at a single scale and outputs bounding boxes and scores.

LabelMe toolbox
LabelMe - Describes the utility functions used to manipulate the database

Examples

Setup
First run initpath.m and modify the folder paths in the script parameters.m

Boosting
First run the Boosting demo demoGentleBoost.m

This demo will first ask for a set of points in 2D to be used as training data (left button = class +1, right button = class -1). The classifier will only be able to perform simple discrimination tasks because it uses stumps as weak classifiers (i.e., only lines parallel to the axes). If you allow the weak classifiers to be lines with any orientation, you will easily get more interesting boundaries. However, stumps are frequently used in object detection because they allow efficient feature selection. This demo will show you the limits of stumps. In object detection, some of these limitations are compensated for by using a very large number of features.
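
To make the idea of boosting with stumps concrete, here is a small Python sketch of Gentle Boosting with regression stumps on 2D points. It is not a port of demoGentleBoost.m: the function names, the midpoint threshold search, and the data layout are all assumptions chosen for readability, and the search is deliberately brute force.

```python
import math


def fit_stump(X, y, w):
    """Weighted least-squares regression stump: f(x) = hi if x[d] > th else lo.

    Assumes at least one dimension has two distinct values.
    """
    best = None
    for d in range(len(X[0])):
        values = sorted(set(x[d] for x in X))
        # Candidate thresholds halfway between consecutive distinct values,
        # so both sides of the split are never empty.
        for th in [(u + v) / 2 for u, v in zip(values, values[1:])]:
            above = [x[d] > th for x in X]
            w_hi = sum(wi for wi, a in zip(w, above) if a)
            w_lo = sum(wi for wi, a in zip(w, above) if not a)
            hi = sum(wi * yi for wi, yi, a in zip(w, y, above) if a) / w_hi
            lo = sum(wi * yi for wi, yi, a in zip(w, y, above) if not a) / w_lo
            err = sum(
                wi * (yi - (hi if a else lo)) ** 2
                for wi, yi, a in zip(w, y, above)
            )
            if best is None or err < best[0]:
                best = (err, d, th, lo, hi)
    return best[1:]


def gentle_boost(X, y, rounds=10):
    """Return a list of stumps (d, th, lo, hi); labels y must be +1 or -1."""
    w = [1.0 / len(X)] * len(X)
    stumps = []
    for _ in range(rounds):
        d, th, lo, hi = fit_stump(X, y, w)
        stumps.append((d, th, lo, hi))
        # Re-weight: examples the new stump gets wrong gain weight.
        w = [
            wi * math.exp(-yi * (hi if xi[d] > th else lo))
            for xi, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return stumps


def predict(stumps, x):
    """Sign of the summed stump outputs (the strong classifier)."""
    score = sum(hi if x[d] > th else lo for d, th, lo, hi in stumps)
    return 1 if score >= 0 else -1


# Points separable by a vertical line, which a single stump can represent.
X = [(-2.0, 0.3), (-1.0, -0.5), (1.0, 0.2), (2.0, -0.1)]
y = [-1, -1, 1, 1]
model = gentle_boost(X, y, rounds=5)
print([predict(model, x) for x in X])
```

Because every stump splits on one coordinate, the resulting decision boundary is built from axis-parallel pieces, which is the limitation the demo is meant to show.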

A look at the database
This is a sample of the images used for this demo. They contain cars (side views) and screens (frontal views), with normalized scale. They are a small subset of the LabelMe dataset. The program createDatabase.m shows how the database used for this demo was created.

If you download the full database, the first thing you have to do is update the folder paths in parameters.m. Then, you have to run the program createDatabase.m, which will read all the annotation files and create a struct that will be used later by the query tools. For more information about how the query tools work, you can check the LabelMe Toolbox.

Running the detector
Before trying to train your own detector, you can try the script runDetector.m. If everything is set up correctly, the output should look like:

Here is an example of the output of the detector when trained to detect side views of cars:

Training a new detector
To train a new detector, first you need to collect a new set of images. If you use the full LabelMe database, then, you will only need to change the object name in the program parameters.m to indicate the object category you want to detect. Also, in parameters.m you can change training parameters such as the number of training images, the size of the patches, the scale of the object, the number of negative examples, etc.

createDictionary.m will create the vocabulary of patches used to compute the features.

computeFeatures.m will precompute all the features for the training images.

trainDetector.m will train the detector using Gentle Boosting [1].

Each of these programs adds information to the 'data' struct, which will contain information such as the precomputed features, the list of images used for training, the dictionary of features, and the parameters of the classifier.

Finally, with runDetector.m you can run the new detector.

Multiscale detector

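To build a multiscale detector, you need to loop over scales. Something like this:
scalingStep = 0.8;   % shrink factor applied between consecutive scales
Nscales = 5;         % number of scales to search (example value, adjust as needed)
for scale = 1:Nscales
    % Shrink the image, then run the single-scale detector on it.
    img = imresize(img, scalingStep, 'bilinear');
    [Score{scale}, boundingBox{scale}, boxScores{scale}] = singleScaleBoostedDetector(img, data);
end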
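
Detections found inside such a loop live in the coordinates of the resized image, so they usually need to be mapped back to the original image before they can be compared across scales. A minimal Python sketch of that bookkeeping follows. The function name and the (x_min, y_min, x_max, y_max) box layout are assumptions (the box format returned by singleScaleBoostedDetector is not described here), and rounding of the resized image size is ignored.

```python
def box_to_original(box, level, scaling_step=0.8):
    """Map a box found after `level` successive resizes back to the original image.

    box          : (x_min, y_min, x_max, y_max) in resized-image pixels (assumed layout)
    level        : how many times the image had been resized by scaling_step
    scaling_step : per-level resize factor, as in the loop above
    """
    factor = scaling_step ** level
    return tuple(v / factor for v in box)


# A box found after two resizes with a step of 0.5 is four times larger
# in the original image.
print(box_to_original((8, 16, 24, 32), level=2, scaling_step=0.5))
```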

References
[1] Friedman, J. H., Hastie, T. and Tibshirani, R., "Additive Logistic Regression: a Statistical View of Boosting." (Aug. 1998)

[2] A. Torralba, K. P. Murphy and W. T. Freeman. (2004). "Sharing features: efficient boosting procedures for multiclass object detection". Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR). pp. 762-769.

Posted by 한효정

2010. 11. 24. 07:10 Paper Reading/CVPR

A well-organized summary of homography

Dubrofsky_Elan.pdf

Posted by 한효정

2010. 7. 30. 14:30 Paper Reading/CVPR

Microsoft Research Street Slide View

Paper

http://research.microsoft.com/en-us/um/people/kopf/street_slide/paper/street_slide.pdf

Posted by 한효정

2010. 7. 28. 16:04 Paper Reading/CVPR

Interest Seam Image

We propose the interest seam image, an efficient visual synopsis for video. To extract an interest seam image, a spatiotemporal energy map is constructed for the target video shot. Then an optimal seam which encompasses the highest energy is identified by an efficient dynamic programming algorithm. The optimal seam is used to extract a seam of pixels from each video frame to form one column of an image, based on which an interest seam image is finally composited. The interest seam image is efficient both in terms of computation and memory cost. Therefore it is able to power a wide variety of web-scale video content analysis applications, such as near duplicate video clip search, video genre recognition and classification, as well as video clustering. The representation capacity of the proposed interest seam image is demonstrated in a large scale video retrieval task. Its advantages are clearly exhibited when compared with previous works, as reported in our experiments.
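
The abstract does not spell out the dynamic program, so the following Python sketch only illustrates the general technique of finding a highest-energy connected seam in a 2D energy map. The top-to-bottom orientation, the rule that the seam moves at most one column per row, and the function name are all assumptions rather than details from the paper.

```python
def max_energy_seam(energy):
    """Find a top-to-bottom seam with maximum total energy.

    energy : list of rows, each a list of non-negative numbers
    Returns (total_energy, seam) where seam[r] is the chosen column in row r.
    The seam may shift by at most one column between rows (assumed rule).
    """
    rows, cols = len(energy), len(energy[0])
    best = [list(energy[0])]  # best[r][c]: best seam total ending at (r, c)
    back = []                 # back[r - 1][c]: column chosen in row r - 1

    for r in range(1, rows):
        row_best, row_back = [], []
        for c in range(cols):
            # Best predecessor among the (up to) three neighbours above.
            score, prev_c = max(
                (best[r - 1][pc], pc)
                for pc in (c - 1, c, c + 1)
                if 0 <= pc < cols
            )
            row_best.append(score + energy[r][c])
            row_back.append(prev_c)
        best.append(row_best)
        back.append(row_back)

    # Trace the seam back from the best cell in the last row.
    c = max(range(cols), key=lambda j: best[-1][j])
    total = best[-1][c]
    seam = [c]
    for r in range(rows - 1, 0, -1):
        c = back[r - 1][c]
        seam.append(c)
    seam.reverse()
    return total, seam


energy_map = [
    [1, 9, 1],
    [1, 1, 9],
    [1, 1, 8],
]
print(max_energy_seam(energy_map))
```

The table has one entry per cell and each entry looks at three neighbours, so the cost grows linearly with the size of the energy map.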

[Reference]

http://videolectures.net/cvpr2010_hua_isi/

CVPR10.pdf

Posted by 한효정

2010. 6. 19. 17:06 Paper Reading/CVPR

SPEC Hashing: Similarity Preserving algorithm for Entropy-based Coding

Ruei-Sung Lin, David Ross, Jay Yagnik. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.

http://www.cs.toronto.edu/~dross/LinRossYagnik_CVPR2010.pdf

Abstract:
Searching approximate nearest neighbors in large scale high dimensional data set has been a challenging problem. This paper presents a novel and fast algorithm for learning binary hash functions for fast nearest neighbor retrieval. The nearest neighbors are defined according to the semantic similarity between the objects. Our method uses the information of these semantic similarities and learns a hash function with binary code such that only objects with high similarity have small Hamming distance. The hash function is incrementally trained one bit at a time, and as bits are added to the hash code Hamming distances between dissimilar objects increase. We further link our method to the idea of maximizing conditional entropy among pair of bits and derive an extremely efficient linear time hash learning algorithm. Experiments on similar image retrieval and celebrity face recognition show that our method produces apparent improvement in performance over some state-of-the-art methods.
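
The retrieval step that such binary codes enable is easy to illustrate, even though the learning algorithm itself is not reproduced here. The Python sketch below ranks database items by Hamming distance to a query code. The codes are made-up integers standing in for the output of some already-learned hash function, and both function names are hypothetical.

```python
def hamming_distance(code_a, code_b):
    """Number of differing bits between two binary codes stored as integers."""
    return bin(code_a ^ code_b).count("1")


def rank_by_hamming(query_code, database_codes):
    """Indices of database items, nearest (smallest Hamming distance) first."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


# Toy 4-bit codes; in practice these would come from the learned hash function.
database = [0b0101, 0b1000, 0b1111]
print(rank_by_hamming(0b1010, database))
```

The comparison is one XOR and a bit count per item, which is why compact binary codes are attractive for large-scale search.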

LinRossYagnik_CVPR2010.pdf

Posted by 한효정

The Power of One

13 posts in 'Paper Reading/CVPR'
