Please use this identifier to cite or link to this item: http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/815
Full metadata record
DC Field: Value
dc.contributor.author: Kirishanthy, T.
dc.contributor.author: Ramanan, A.
dc.date.accessioned: 2016-01-04T07:18:19Z
dc.date.accessioned: 2022-06-28T04:51:45Z
dc.date.available: 2016-01-04T07:18:19Z
dc.date.available: 2022-06-28T04:51:45Z
dc.date.issued: 2015-11-24
dc.identifier.uri: http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/815
dc.description.abstract: In a patch-based object recognition system, the key role of a visual vocabulary is to map low-level features into a fixed-length vector in histogram space, to which standard classifiers can be directly applied. The discriminative power of a visual vocabulary determines the quality of the vocabulary model, whereas its size controls the model's complexity. A compact visual vocabulary provides a lower-dimensional representation, whereas a large vocabulary may overfit the distribution of visual words in an image and impose a heavy computational load. The generic bag-of-features framework follows a standard routine: extract local image descriptors and cluster them with a user-designated number of clusters. The problem with this routine is that constructing a separate vocabulary for every dataset is inefficient. Vocabulary construction is usually performed by cluster analysis with the K-means algorithm, but one of its drawbacks is the choice of a suitable value for K, which determines the size of the visual vocabulary. This choice must balance recognition rate against computational cost. In this paper we propose a two-staged approach that maps an initial high-dimensional vocabulary into a compact vocabulary while maintaining its discriminative power. Using an initial larger vocabulary, we first represent the training images with a coding scheme that encodes the importance of each visual word within an image as visual bits. These visual bits of the images then form a sparse representation of every visual word with respect to the set of category-specific training images, which is used for the compression. We have tested our vocabulary compression technique on four computer vision tasks: (i) Xerox7, (ii) PASCAL VOC Challenge 2007, (iii) UIUC texture, and (iv) MPEG7 CE Shape-1 Part B silhouette image datasets.
Test results show that the proposed method slightly outperforms vocabularies learnt by K-means while using only half the size of the initial vocabulary. Our compression technique can help optimise larger vocabularies into fewer visual words with stable performance. (en_US)
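The two-staged idea described in the abstract (binarise the importance of each visual word within an image into "visual bits", then use each word's sparse bit signature across the training images to compress the vocabulary) can be sketched roughly as follows. This is a toy illustration with random stand-in histograms; the thresholding rule and the merge-identical-signatures compression step are assumptions for illustration, not the paper's exact algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data: histograms of visual-word counts per image,
# assumed already obtained by assigning local descriptors (e.g. SIFT)
# to an initial large K-means vocabulary.
n_images, vocab_size = 40, 64
hists = rng.poisson(1.0, size=(n_images, vocab_size)).astype(float)

# Stage 1: "visual bits" -- binarise the importance of each visual
# word within each image (here: a word is "on" if its count exceeds
# the image's mean count; this threshold is an assumption).
bits = (hists > hists.mean(axis=1, keepdims=True)).astype(int)

# Each visual word is now a sparse binary signature over the
# training images (one column of `bits`).
signatures = bits.T  # shape: (vocab_size, n_images)

# Stage 2: compress -- collapse words whose bit signatures coincide,
# so redundant words merge into a single compact visual word.
unique_sigs, inv = np.unique(signatures, axis=0, return_inverse=True)
word_to_compact = inv.ravel()
compact_size = unique_sigs.shape[0]

# Re-pool the original histograms onto the compact vocabulary.
compact_hists = np.zeros((n_images, compact_size))
for w, c in enumerate(word_to_compact):
    compact_hists[:, c] += hists[:, w]

print(vocab_size, "->", compact_size, "visual words")
```

The pooled histograms `compact_hists` keep the total descriptor mass of the originals while living in a lower-dimensional space, which is the trade-off the abstract targets: fewer visual words with comparable discriminative power.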
dc.language.iso: en (en_US)
dc.publisher: IEEE (en_US)
dc.subject: Bag-of-features; Compact Visual Vocabulary; Object recognition; Visual bit representation (en_US)
dc.title: Creating Compact and Discriminative Visual Vocabularies using Visual Bits (en_US)
dc.type: Article (en_US)
Appears in Collections:Computer Science

Files in This Item:
File: Visual Vocabularies Using Visual Bits.pdf
Size: 177.14 kB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.