Ruei-Sung Lin, David Ross, Jay Yagnik. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010. 

http://www.cs.toronto.edu/~dross/LinRossYagnik_CVPR2010.pdf 

Abstract: 
Searching approximate nearest neighbors in large scale high dimensional data set has been a challenging problem. This paper presents a novel and fast algorithm for learning binary hash functions for fast nearest neighbor retrieval. The nearest neighbors are defined according to the semantic similarity between the objects. Our method uses the information of these semantic similarities and learns a hash function with binary code such that only objects with high similarity have small Hamming distance. The hash function is incrementally trained one bit at a time, and as bits are added to the hash code Hamming distances between dissimilar objects increase. We further link our method to the idea of maximizing conditional entropy among pair of bits and derive an extremely efficient linear time hash learning algorithm. Experiments on similar image retrieval and celebrity face recognition show that our method produces apparent improvement in performance over some state-of-the-art methods.
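
To make the retrieval step in the abstract concrete, here is a minimal sketch of ranking database items by Hamming distance to a query code. It is not the SPEC Hashing learning algorithm itself; the 8-bit codes and the function name `hamming_distances` are made up for illustration, and the codes are assumed to have been produced already by some learned hash function.

```python
import numpy as np

def hamming_distances(query, codes):
    """Hamming distance from one binary code to each row of `codes`."""
    # Elementwise != marks the differing bits; counting them gives the distance.
    return np.count_nonzero(codes != query, axis=1)

# Hypothetical 8-bit codes for four database items (illustrative values only).
codes = np.array([
    [0, 1, 1, 0, 1, 0, 0, 1],
    [0, 1, 1, 0, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 0, 1],
], dtype=np.uint8)
query = np.array([0, 1, 1, 0, 1, 0, 0, 1], dtype=np.uint8)

dist = hamming_distances(query, codes)
# Stable sort keeps database order among items at the same distance.
ranking = np.argsort(dist, kind="stable")
```

Items that are semantically similar to the query should end up at the front of `ranking` if the hash function does what the abstract claims, since only they are supposed to have a small Hamming distance.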

Posted by 한효정

Borja Peleato and Matt Jones
peleato@stanford.edu, mkjones@cs.stanford.edu
March 14, 2009

We propose the use of generic trees for real-time object search, and improve on the classification time taken by SIFT by approximately a factor of 5. Our approach also supports very fast training, taking no longer than it takes to search for an object in one candidate image.

Section 2 provides the background and an overview of the previous related work. Section 3 explains in detail our proposed scheme, before going into the analysis and results in section 4. Finally, section 5 reviews the main features of our method and gives some possible directions for future work.




Smoothed Local Histogram Filters
Pixar Technical Memo 10-02

Michael Kass
Pixar Animation Studios

Justin Solomon
Pixar Animation Studios and Stanford University

http://graphics.pixar.com/library/SmoothedHistogramsA/paper.pdf










  1. Cascade Object Detection with Deformable Part Models (score: 100)
     Pedro Felzenszwalb, Ross B. Girshick, David McAllester
  2. A generative perspective on MRFs in low-level vision (score: 92)
     Uwe Schmidt, Qi Gao, Stefan Roth
  3. Aggregating local descriptors into a compact image representation (score: 82)
     Hervé Jégou, Matthijs Douze, Cordelia Schmid, Patrick Pérez
  4. Depth from Diffusion (score: 71)
     Changyin Zhou, Oliver Cossairt, Shree Nayar
  5. Context-Aware Saliency Detection (score: 62)
     Stas Goferman, Lihi Zelnik-Manor, Ayellet Tal
  6. Monocular 3D pose estimation and tracking by detection (score: 60)
     Mykhaylo Andriluka, Stefan Roth, Bernt Schiele
  7. Food Recognition Using Statistics of Pairwise Local Features (score: 58)
     Shulin Yang, Mei Chen, Dean Pomerleau, Rahul Sukthankar
  8. Global and Efficient Self-Similarity for Object Classification and Detection (score: 56)
     Thomas Deselaers, Vittorio Ferrari
  9. What is an object? (score: 47)
     Bogdan Alexe, Thomas Deselaers, Vittorio Ferrari
  10. A New Texture Descriptor Using Multifractal Analysis in Multi-orientation Wavelet Pyramid (score: 46)
      Yong Xu, Xiong Yang, Haibin Ling, Hui Ji


2010. 5. 16. 14:54 Paper Reading/CVPR

INRIA



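The interface is clean and fast, and it works well.
The indexing technology is at quite a high level.

However, when images that look identical to a person differ in size, their ranking scores differ substantially.
Size should probably be handled by a different method.

They presumably used local features, so I should look through the papers to see how the indexing was done.
(It finds matches regardless of whether the image is grayscale or in different colors. It also copes well with scale changes and with rotation.)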

Paper
Accurate image search using the contextual dissimilarity measure

Aggregating local descriptors into a compact image representation

Packing bag-of-features

Improving web image search results using query-relative classifiers

Improving bag-of-features for large scale image search



2010. 3. 1. 21:26 Paper Reading

Object Recognition

CS395T: Special Topics in Computer Vision, Spring 2010

Object Recognition





Meets:
Wednesdays 3:30-6:30 pm
ACES 3.408
Unique # 54470
 
Instructor: Kristen Grauman 
Email: grauman@cs
Office: CSA 114
 
TA: Sudheendra Vijayanarasimhan
Email: svnaras@cs
Office: CSA 106

When emailing us, please put CS395 in the subject line.

Announcements:

See the schedule for current reading assignments.

Course overview:


Topics: This is a graduate seminar course in computer vision.   We will survey and discuss current vision papers relating to object recognition, auto-annotation of images, and scene understanding.  The goals of the course will be to understand current approaches to some important problems, to actively analyze their strengths and weaknesses, and to identify interesting open questions and possible directions for future research.

See the syllabus for an outline of the main topics we'll be covering.

Requirements: Students will be responsible for writing paper reviews each week, participating in discussions, completing one programming assignment, presenting once or twice in class (depending on enrollment, and possibly done in teams), and completing a project (done in pairs). 

Note that presentations are due one week before the slot your presentation is scheduled.  This means you will need to read the papers, prepare experiments, make plans with your partner, create slides, etc. more than one week before the date you are signed up for.  The idea is to meet and discuss ahead of time, so that we can iterate as needed the week leading up to your presentation. 

More details on the requirements and grading breakdown are here.

Prereqs:  Courses in computer vision and/or machine learning (378 Computer Vision and/or 391 Machine Learning, or similar); ability to understand and analyze conference papers in this area; programming required for experiment presentations and projects. 

Please talk to me if you are unsure if the course is a good match for your background.  I generally recommend scanning through a few papers on the syllabus to gauge what kind of background is expected.  I don't assume you are already familiar with every single algorithm/tool/image feature a given paper mentions, but you should feel comfortable following the key ideas.


Syllabus overview:

  1. Single-object recognition fundamentals: representation, matching, and classification
    1. Specific objects
    2. Classification and global models
    3. Objects composed of parts
    4. Region-based methods
  2. Beyond single objects: recognizing categories in context and learning their properties
    1. Context
    2. Attributes
    3. Actions and objects/scenes
  3. Scalability issues in category learning, detection, and search
    1. Too many pixels!
    2. Too many categories!
    3. Too many images!
  4. Recognition and "everyday" visual data
    1. Landmarks, locations, and tourists
    2. Alignment with text
    3. Pictures of people

(omitted)

reference
http://www.cs.utexas.edu/~grauman/courses/spring2010/schedule.html


Navneet Dalal and Bill Triggs

RGB colour space with no gamma correction; [-1; 0; 1]

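gamma correction
In video cameras, computer graphics, and similar systems, gamma correction means transforming a light-intensity signal nonlinearly by means of a nonlinear transfer function.
Human vision responds nonlinearly to brightness, following Weber's law. (Other senses, such as hearing, also respond nonlinearly to stimuli.) Because of this, if brightness is recorded linearly within a limited bit depth, for example 8 bits per channel, changes in level look discontinuous rather than smooth to the human eye (posterization). To get the best image quality within a given bit depth, the signal therefore has to be encoded nonlinearly (e.g., with a nonlinear function such as the Rec. 709 transfer function).
In a digital camera, gamma correction is applied when the internally stored data is saved as JPEG or TIFF. Most RAW graphics file formats supported by digital cameras hold data without gamma correction. Nikon's compressed NEF, however, uses a nonlinear curve (transfer function) resembling a gamma curve in the quantization step that reduces the bit depth to 9.4 bits.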
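
The posterization argument above can be checked with a small sketch. It encodes a dark range of linear intensities with the Rec. 709 transfer function (4.5 L below 0.018, 1.099 L^0.45 - 0.099 above) and counts how many distinct 8-bit codes that range receives with and without the encoding. The helper names `rec709_oetf` and `quantize` and the test range 0 to 0.05 are my own choices for illustration.

```python
import numpy as np

def rec709_oetf(linear):
    """Encode linear light in [0, 1] with the Rec. 709 transfer function."""
    linear = np.asarray(linear, dtype=np.float64)
    return np.where(linear < 0.018,
                    4.5 * linear,
                    1.099 * np.power(linear, 0.45) - 0.099)

def quantize(values, bits=8):
    """Round values in [0, 1] to integer codes at the given bit depth."""
    levels = 2 ** bits - 1
    return np.round(np.clip(values, 0.0, 1.0) * levels).astype(np.int64)

# A dark ramp: the region where linear coding wastes the fewest codes on shadows.
dark = np.linspace(0.0, 0.05, 1000)
linear_codes = np.unique(quantize(dark)).size
gamma_codes = np.unique(quantize(rec709_oetf(dark))).size
```

With nonlinear encoding the dark ramp is spread over more distinct codes (`gamma_codes` exceeds `linear_codes`), which is why shadows band less at the same bit depth.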

gradient filter with no smoothing;

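gradient filter [-1 1] [-1 0 1]: this is how the derivative of image data along the x-axis is taken... Are continuous space and discrete space different worlds? Image data is written on a discrete grid, so letting x -> 0 is not possible. But since a single pixel is very small, is it reasonable to think the filter behaves almost like the true derivative? Not smoothing means that much less loss of whatever edges are present... though I also wonder whether gradient data means anything unless there is some kind of shape in the image...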
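
On the continuous-versus-discrete question: the [-1 0 1] mask is the centered finite difference f(x+1) - f(x-1), that is, the usual derivative approximation with the step fixed at one pixel and the factor 1/2 dropped. A minimal sketch, assuming borders are simply left at zero (the function name and the toy image are mine):

```python
import numpy as np

def centered_gradients(image):
    """Apply the [-1, 0, 1] mask along x and y without any smoothing."""
    image = np.asarray(image, dtype=np.float64)
    gx = np.zeros_like(image)
    gy = np.zeros_like(image)
    # Interior pixels: right neighbor minus left neighbor (and below minus above).
    gx[:, 1:-1] = image[:, 2:] - image[:, :-2]
    gy[1:-1, :] = image[2:, :] - image[:-2, :]
    return gx, gy

# Toy image with a vertical step edge between columns 2 and 3.
image = np.array([[0, 0, 0, 10, 10, 10]] * 4)
gx, gy = centered_gradients(image)
magnitude = np.hypot(gx, gy)
orientation = np.degrees(np.arctan2(gy, gx)) % 180.0  # unsigned, 0-180 degrees
```

The step edge shows up as a two-pixel-wide response in `gx` and nothing in `gy`, so the response stays localized at the edge because no smoothing was applied first.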

linear gradient voting into 9 orientation bins in 0~180;

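What is a "linear" gradient value? ... Probably just the ordinary gradient...
Nine orientation values means the range is split into 20-degree bins. The problem with nine bins is that when many values fall near the bin boundaries at 20, 40, 60, 80, 100, 120, 140, 160, and 180 (0) degrees, images that look identical are still likely to produce different histograms. So each bin is usually corrected by averaging it with its neighboring bins. How many neighboring bins to include depends on how finely the data is binned, though...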
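
One common way to soften the boundary problem described above is to split each vote linearly between the two nearest bin centers instead of smoothing the histogram afterwards. The sketch below does that for nine bins over 0-180 degrees; it is my illustration of the idea, not code from the paper, and the function name is hypothetical.

```python
import numpy as np

def orientation_histogram(angles_deg, magnitudes, n_bins=9):
    """Vote magnitudes into unsigned-orientation bins over 0-180 degrees,
    splitting each vote linearly between the two nearest bin centers."""
    bin_width = 180.0 / n_bins
    hist = np.zeros(n_bins)
    for angle, mag in zip(angles_deg, magnitudes):
        # Position measured in bins, relative to the first bin center (10 degrees).
        pos = (angle % 180.0) / bin_width - 0.5
        lower = int(np.floor(pos))
        frac = pos - lower
        # Modulo wraps the vote around, since 0 and 180 degrees are the same orientation.
        hist[lower % n_bins] += mag * (1.0 - frac)
        hist[(lower + 1) % n_bins] += mag * frac
    return hist

# Two gradients that straddle the 20-degree boundary.
h_a = orientation_histogram([19.0], [1.0])
h_b = orientation_histogram([21.0], [1.0])
```

With hard binning these two gradients would land entirely in different bins; with the split vote, `h_a` and `h_b` differ only slightly in their first two entries.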

16x16 pixel blocks of four 8x8 pixel cells;

What is this? Presumably some operation is carried out per block...
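
As far as I can tell, the per-block operation is contrast normalization: the four cell histograms in a block are concatenated and normalized together, and blocks overlap so each cell contributes to several blocks. The sketch below assumes a 64x128 detection window, a block stride of one cell (8 pixels), and plain L2 normalization; those settings come from my reading of the HOG paper rather than from these notes, and the cell histograms are random placeholders.

```python
import numpy as np

def block_descriptor(cell_hists, cells_per_block=2, eps=1e-6):
    """Group cell histograms into overlapping blocks and L2-normalize each block."""
    n_cy, n_cx, n_bins = cell_hists.shape
    blocks = []
    # Slide the block one cell at a time, so neighboring blocks share cells.
    for by in range(n_cy - cells_per_block + 1):
        for bx in range(n_cx - cells_per_block + 1):
            block = cell_hists[by:by + cells_per_block,
                               bx:bx + cells_per_block].ravel()
            blocks.append(block / np.sqrt(np.sum(block ** 2) + eps ** 2))
    return np.concatenate(blocks)

# Assumed 64x128 window of 8x8-pixel cells: 16 rows by 8 columns of cells.
rng = np.random.default_rng(0)
cell_hists = rng.random((16, 8, 9))  # placeholder histograms, 9 bins per cell
descriptor = block_descriptor(cell_hists)
```

Under these assumptions there are 15 x 7 blocks of 4 x 9 values each, so `descriptor` has 3780 entries.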

Gaussian spatial window with sigma = 8 pixels

Looks like a Gaussian filter is applied, with sigma 8... but why is the word "pixel" attached?
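
A likely answer to the "pixel" question: sigma is a spatial width, so it has to be stated in units of distance, here pixels. As I understand the paper, the window is not a blur of the image but a weight on each pixel's vote inside a 16x16 block, so that pixels near the block edge count less. A minimal sketch under that reading (the function name is mine):

```python
import numpy as np

def gaussian_block_window(block_size=16, sigma=8.0):
    """2-D Gaussian weights over a block, with sigma measured in pixels."""
    # Pixel-center coordinates relative to the block center.
    coords = np.arange(block_size) - (block_size - 1) / 2.0
    g = np.exp(-(coords ** 2) / (2.0 * sigma ** 2))
    # The 2-D Gaussian is separable, so the outer product gives the full window.
    return np.outer(g, g)

window = gaussian_block_window()
center_weight = window[7, 7]
corner_weight = window[0, 0]
# Usage: multiply a 16x16 patch of gradient magnitudes by `window` before voting.
```

With sigma equal to half the block width, the corner weight is well below the center weight but far from zero, so edge pixels are damped rather than discarded.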

1. Input image
2. Normalize gamma & colour
3. Compute gradients
4. Weighted vote into spatial & orientation cells
5. Contrast normalize over overlapping spatial blocks
6. Collect HOGs over detection window
7. Linear SVM

false positives per window (FPPW)


