Results 1 - 10 of 10
The fastest deformable part model for object detection
In CVPR, 2014
Cited by 13 (2 self)
This paper solves the speed bottleneck of the deformable part model (DPM) while maintaining detection accuracy on challenging datasets. Three prohibitive steps in the cascade version of DPM are accelerated: 2D correlation between the root filter and the feature map, cascade part pruning, and HOG feature extraction. For 2D correlation, the root filter is constrained to be low rank, so that the 2D correlation can be computed as a more efficient linear combination of 1D correlations. A proximal gradient algorithm is adopted to progressively learn the low-rank filter in a discriminative manner. For cascade part pruning, a neighborhood-aware cascade is proposed to capture the dependence between neighborhood regions for aggressive pruning. Instead of explicitly computing part scores, hypotheses can be pruned by the scores of their neighborhoods under a first-order approximation. For HOG feature extraction, look-up tables are constructed to replace expensive calculations of orientation partition and magnitude with simpler matrix indexing operations. Extensive experiments show that (a) the proposed method is 4 times faster than the current fastest DPM method with similar accuracy on Pascal VOC, and (b) it achieves state-of-the-art accuracy on pedestrian and face detection tasks at frame-rate speed.
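The low-rank trick in this abstract can be illustrated with a toy example: in the extreme case where a 2D filter is rank-1, full 2D correlation decomposes exactly into two 1D correlations. The NumPy sketch below demonstrates that identity only, not the paper's learned-filter pipeline; all names and sizes are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))
u = rng.standard_normal(5)      # column profile of the filter
v = rng.standard_normal(5)      # row profile of the filter
F = np.outer(u, v)              # a rank-1 (separable) 5x5 filter

def corr2d_valid(img, filt):
    """Naive 'valid' 2D cross-correlation, used as the reference."""
    h, w = filt.shape
    H, W = img.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + h, j:j + w] * filt)
    return out

def corr1d_rows(arr, kernel):
    """1D 'valid' correlation of every row with the given kernel."""
    return np.stack([np.correlate(row, kernel, mode='valid') for row in arr])

# Two cheap 1D passes (rows with v, then columns with u) ...
separable = corr1d_rows(corr1d_rows(image, v).T, u).T
# ... reproduce the expensive 2D pass exactly.
full = corr2d_valid(image, F)
print(np.allclose(separable, full))  # True
```

A general low-rank filter is a short sum of such separable terms, so the cost per output pixel drops from h·w multiplies to roughly rank·(h+w).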
Predicting parameters in deep learning
In Proc. NIPS, 2013
Cited by 7 (0 self)
We demonstrate that there is significant redundancy in the parameterization of several deep learning models. Given only a few weight values for each feature, it is possible to accurately predict the remaining values. Moreover, we show that not only can the parameter values be predicted, but many of them need not be learned at all. We train several different architectures by learning only a small number of weights and predicting the rest. In the best case we are able to predict more than 95% of the weights of a network without any drop in accuracy.
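The redundancy claim can be made concrete with a toy: if a weight matrix is exactly low rank and a suitable basis is known, observing only a few entries per column is enough to fill in the rest by least squares. This is a simplified stand-in for the paper's approach (which learns smooth dictionaries from data); every name and size here is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, r = 100, 50, 5
U = rng.standard_normal((n, r))      # basis capturing the redundancy
W = U @ rng.standard_normal((r, m))  # a weight matrix with low-rank structure

k = 15                               # observe only k of the n weights per column
W_hat = np.empty_like(W)
for j in range(m):
    idx = rng.choice(n, size=k, replace=False)
    # Fit the column's coefficients in the basis from the k observed entries,
    # then predict the full column.
    coef, *_ = np.linalg.lstsq(U[idx], W[idx, j], rcond=None)
    W_hat[:, j] = U @ coef

print(np.allclose(W_hat, W))  # True: 85% of the entries were never observed
```

Real networks are only approximately low rank, so the paper's reconstruction is approximate rather than exact; the toy just shows why a few samples per feature can carry almost all the information.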
Toward Fast Transform Learning
2013
Cited by 3 (2 self)
The dictionary learning problem aims at finding a dictionary of atoms that best represents an image according to a given objective. The most common objective is to represent an image, or a class of images, sparsely. Most dictionary learning algorithms iteratively estimate the dictionary and a sparse representation of the images using this dictionary. Dictionary learning has led to many state-of-the-art algorithms in image processing. However, its numerical complexity restricts its use to atoms with small support, since computations with the constructed dictionaries require too many resources to be deployed in large-scale applications. To alleviate these issues, this paper introduces a new strategy to learn dictionaries composed of atoms obtained as a composition of K convolutions with S-sparse kernels. The dictionary update step associated with this strategy is a non-convex optimization problem. We reformulate the problem to reduce the number of its irrelevant stationary points and introduce a Gauss-Seidel-type algorithm, referred to as the Alternative Least Square Algorithm, to solve it. The search space of the optimization problem has dimension KS, which is typically smaller than the size of the target atom and much smaller than the size of the image. The complexity of the algorithm is linear in the size of the image. Our experiments show that we can approximate with very high accuracy many atoms such as modified DCT, curvelets, sinc functions, or cosines when K is large (say K = 10). We also argue empirically that, perhaps surprisingly, the algorithm generally converges to a global minimum for large values of K and S.
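To see why the search space has dimension KS, consider building one atom as a composition of K convolutions with S-sparse kernels, as the abstract describes. The toy below (with made-up sizes) counts the free parameters against the atom's support:

```python
import numpy as np

rng = np.random.default_rng(0)
K, S, ksize = 10, 3, 5          # K kernels, each S-sparse, each of length ksize

atom = np.array([1.0])
n_params = 0
for _ in range(K):
    kernel = np.zeros(ksize)
    support = rng.choice(ksize, size=S, replace=False)
    kernel[support] = rng.standard_normal(S)
    atom = np.convolve(atom, kernel)   # compose the K convolutions
    n_params += S                      # only the S nonzeros are free parameters

# The atom's support grows to K*(ksize-1) + 1 = 41 samples,
# yet it is parameterized by only K*S = 30 values.
print(len(atom), n_params)  # 41 30
```

Applying such an atom also costs K sparse convolutions (KS multiplies per sample) rather than one dense convolution over the full 41-sample support, which is where the linear complexity comes from.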
Sparse space-time deconvolution for Calcium image analysis
Cited by 1 (0 self)
We describe a unified formulation and algorithm to find an extremely sparse representation of calcium image sequences in terms of cell locations, cell shapes, spike timings, and impulse responses. Solving a single optimization problem yields cell segmentations and activity estimates that are on par with the state of the art, without the need for heuristic pre- or postprocessing. Experiments on real and synthetic data demonstrate the viability of the proposed method.
Multiscale Centerline
Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence
Finding the centerline and estimating the radius of linear structures is a critical first step in many applications, ranging from road delineation in 2D aerial images to modeling blood vessels, lung bronchi, and dendritic arbors in 3D biomedical image stacks. Existing techniques rely either on filters designed to respond to ideal cylindrical structures or on classification techniques. The former tend to become unreliable when the linear structures are very irregular, while the latter often have difficulty distinguishing centerline locations from neighboring ones, thus losing accuracy. We solve this problem by reformulating centerline detection as a regression problem. We first train regressors to return the distances to the closest centerline in scale-space, and we apply them to the input images or volumes. The centerlines and the corresponding scales then correspond to the regressors' local maxima, which can be easily identified. We show that our method outperforms state-of-the-art techniques on various 2D and 3D datasets. Moreover, our approach is very generic and also performs well on contour detection, where we show an improvement over recent contour detection algorithms on the BSDS500 dataset.
Dynamic Texture Recognition via Orthogonal Tensor Dictionary Learning
Dynamic textures (DTs) are video sequences with stationary properties that exhibit repetitive patterns over space and time. This paper investigates sparse-coding-based approaches to characterizing local DT patterns for recognition. Owing to the high dimensionality of DT sequences, existing dictionary learning algorithms are not suitable for our purpose due to their high computational costs and poor scalability. To overcome these obstacles, we propose a structured tensor dictionary learning method for sparse coding, which learns a dictionary structured with orthogonality and separability. The proposed method is very fast and scales to high-dimensional data better than existing ones. In addition, based on the proposed dictionary learning method, a DT descriptor is developed which has better adaptivity, discriminability, and scalability than existing approaches. These advantages are demonstrated by experiments on multiple datasets.
TILDE: A Temporally Invariant Learned DEtector
We introduce a learning-based approach to detect repeatable keypoints under drastic imaging changes of weather and lighting conditions, to which state-of-the-art keypoint detectors are surprisingly sensitive. We first identify good keypoint candidates in multiple training images taken from the same viewpoint. We then train a regressor to predict a score map whose maxima are those points, so that they can be found by simple non-maximum suppression. As there are no standard datasets to test the influence of these kinds of changes, we created our own, which we will make publicly available. We show that our method significantly outperforms state-of-the-art methods in such challenging conditions, while still achieving state-of-the-art performance on the untrained standard Oxford dataset.
Speeding up Convolutional Neural Networks with Low Rank Expansions
Jaderberg, Vedaldi, and Zisserman
The focus of this paper is speeding up the evaluation of convolutional neural networks. While delivering impressive results across a range of computer vision and machine learning tasks, these networks are computationally demanding, limiting their deployability. Convolutional layers generally consume the bulk of the processing time, so in this work we present two simple schemes for drastically speeding up these layers. This is achieved by exploiting cross-channel or filter redundancy to construct a low-rank basis of filters that are rank-1 in the spatial domain. Our methods are architecture agnostic and can easily be applied to existing CPU and GPU convolutional frameworks for tuneable speedup. We demonstrate this with a real-world network designed for scene text character recognition, showing a possible 2.5× speedup with no loss in accuracy, and a 4.5× speedup with less than a 1% drop in accuracy, still achieving state-of-the-art results on standard benchmarks.
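A minimal way to see the "rank-1 in the spatial domain" idea is to expand a single 2D filter by its SVD and keep only the leading separable terms. This toy ignores the cross-channel redundancy and joint optimization the paper actually uses; the sizes and the rank k are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
filt = rng.standard_normal((7, 7))  # one convolutional filter

# SVD writes the filter as a sum of rank-1 (separable) terms; each kept term
# costs two cheap 1D passes instead of one full 2D pass.
U, s, Vt = np.linalg.svd(filt)
k = 3
approx = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(k))

err = np.linalg.norm(filt - approx) / np.linalg.norm(filt)
print(f"rank-{k} relative error: {err:.3f}")
```

For a random filter the truncation error is large; trained filters tend to be much closer to low rank, which is exactly the redundancy such methods exploit.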
Separable Cosparse Analysis Operator Learning
Published by EURASIP
The ability to represent a certain class of signals sparsely has many applications in data analysis, image processing, and other research fields. Among sparse representations, the cosparse analysis model has recently gained increasing interest. Many signals exhibit a multidimensional structure, e.g. images or three-dimensional MRI scans. Most data analysis and learning algorithms use vectorized signals and thereby do not account for this underlying structure. The drawback of not taking the inherent structure into account is a dramatic increase in computational cost. We propose an algorithm for learning a cosparse analysis operator that adheres to the preexisting structure of the data and thus allows for a very efficient implementation. This is achieved by enforcing a separable structure on the learned operator. Our learning algorithm is able to deal with multidimensional data of arbitrary order. We evaluate our method on volumetric data, using three-dimensional MRI scans as an example.
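The efficiency of a separable operator comes from a standard Kronecker identity: applying two small per-dimension operators to the signal matrix equals applying one big Kronecker-structured operator to the vectorized signal. A NumPy sketch of the identity (all sizes made up):

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2 = 16, 16
X = rng.standard_normal((n1, n2))    # a 2D signal, e.g. an image patch
O1 = rng.standard_normal((24, n1))   # small operator for dimension 1
O2 = rng.standard_normal((24, n2))   # small operator for dimension 2

# Separable analysis: two small matrix products ...
sep = O1 @ X @ O2.T

# ... equal the full Kronecker-structured operator on the vectorized signal:
# vec(O1 X O2^T) = (O2 kron O1) vec(X), with column-major vec.
big = np.kron(O2, O1) @ X.reshape(-1, order='F')
print(np.allclose(sep.reshape(-1, order='F'), big))  # True
```

The separable route only ever touches the small 24×16 factors, whereas the explicit operator here is already 576×256; the gap grows rapidly with signal size and tensor order.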
Supervised by
2013
Learned image features can provide great accuracy in many computer vision tasks. However, when the convolution filters used to learn image features are numerous and not separable, feature extraction becomes computationally demanding and impractical to use in real-world situations. In this thesis work, a method is developed for learning a small number of separable filters that approximate an arbitrary non-separable filter bank. In this approach, separable filters are learned by grouping the arbitrary filters into a tensor and optimizing a tensor decomposition problem. The separable filter learning with tensor decomposition is general and can be applied to generic filter banks to reduce the computational burden of convolutions without a loss in performance. Moreover, the proposed approach is orders of magnitude faster than the approach of a very recent paper based on ℓ1-norm minimization [34].
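The computational argument behind the thesis can be sketched with a back-of-the-envelope count: a d×d non-separable filter costs d² multiplies per pixel, while a separable filter costs about 2d (one d-tap pass per axis). The numbers below are purely illustrative, not taken from the thesis:

```python
# Filtering an H x W image with d x d filters: multiply counts per scheme.
H, W, d = 512, 512, 9
n_filters = 64       # original non-separable filter bank
m_separable = 16     # learned separable filters approximating the bank

cost_full = H * W * d * d * n_filters   # d^2 multiplies per pixel per filter
cost_sep = H * W * 2 * d * m_separable  # 2d multiplies per pixel per filter
print(cost_full / cost_sep)             # 18.0x fewer multiplies
```

Recombining the m separable responses into approximations of the n original responses adds only an n·m per-pixel mixing cost, which is small next to the savings for moderate filter sizes.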