
## Learning body pose via specialized maps (2001)

Venue: NIPS

Citations: 53 (2 self)

### Citations

992 | A view of the EM algorithm that justifies incremental, sparse and other variants
- Neal, Hinton
- 1998
Citation Context: ...function number $y_i$ generated data point number $i$. Using Bayes' rule and assuming independence of observations given $\theta$, we have the log-probability of our data given the model, $\log p(Z \mid \theta)$, which we want to maximize:

$$\arg\max_{\theta} \sum_i \log \sum_k p(\psi_i \mid v_i, y_i = k, \theta)\, P(y_i = k \mid \theta)\, p(v_i), \qquad (1)$$

where we used the independence assumption $p(v \mid \theta) = p(v)$. This is also equivalent to maximizing the conditional likelihood of the model. Because of the log-sum encountered, this problem is intractable in general. However, there exist practical approximate optimization procedures, one of them being Expectation Maximization (EM) [3, 4, 12]. 3.1 Learning: The EM algorithm is well known, so here we provide only the derivations specific to SMAs. The E-step consists of finding $P(y = k \mid z, \theta) = \tilde{P}(y)$. Note that the variables $y_i$ are assumed independent given $z_i$; thus, factorizing $\tilde{P}(y)$:

$$\tilde{P}(y) = \prod_i P^{(t)}(y_i) = \prod_i \frac{\lambda_{y_i}\, p(\psi_i \mid v_i, y_i, \theta)}{\sum_{k \in C} \lambda_k\, p(\psi_i \mid v_i, y_i = k, \theta)} \qquad (2)$$

However, $p(\psi_i \mid v_i, y_i = k, \theta)$ is still undefined. For the implementation described in this paper we use $\mathcal{N}(\psi_i;\, \phi_k(v_i, \theta_k), \Sigma_k)$, where $\theta_k$ are the parameters of the $k$-th specialized function and $\Sigma_k$ is the error covariance of specialized function $k$. One way to inte...
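The E-step in equation (2) is a responsibility computation: each sample is softly assigned to the specialized function whose Gaussian error model explains it best. A minimal NumPy sketch, assuming for concreteness linear specialized functions $\phi_k(v) = W_k v$ (the paper leaves $\phi_k$ general, e.g. neural networks; all variable names here are illustrative, not from the paper):

```python
import numpy as np

def e_step(psi, v, W, lam, Sigma):
    """Responsibilities P(y_i = k | z_i, theta) for each sample i and map k.

    psi   : (n, t)    target vectors psi_i
    v     : (n, c)    input vectors v_i
    W     : (m, t, c) linear specialized maps, phi_k(v) = W_k @ v (illustrative)
    lam   : (m,)      mixing weights lambda_k
    Sigma : (m, t, t) per-map error covariances
    """
    n, t = psi.shape
    m = W.shape[0]
    log_r = np.empty((n, m))
    for k in range(m):
        resid = psi - v @ W[k].T                      # psi_i - phi_k(v_i)
        inv = np.linalg.inv(Sigma[k])
        _, logdet = np.linalg.slogdet(Sigma[k])
        mahal = np.einsum('ij,jk,ik->i', resid, inv, resid)
        # log lambda_k + log N(psi_i; phi_k(v_i), Sigma_k)
        log_r[:, k] = np.log(lam[k]) - 0.5 * (mahal + logdet + t * np.log(2 * np.pi))
    log_r -= log_r.max(axis=1, keepdims=True)         # stabilize normalization
    r = np.exp(log_r)
    return r / r.sum(axis=1, keepdims=True)           # rows sum to 1, as in eq. (2)
```

Working in the log domain and subtracting the row-wise maximum keeps the normalization in equation (2) stable when individual per-map likelihoods underflow.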

879 | Hierarchical mixtures of experts and the em algorithm
- Jordan, Jacobs
- 1994
Citation Context: ...supervised learning problems. Our approach consists in generating a series of $m$ functions $\phi_k : \mathbb{R}^c \to \mathbb{R}^t$. Each of these functions is specialized to map only certain inputs (a specialized sub-domain) better than others. For example, each sub-domain can be a region of the input space; however, the specialized sub-domain of $\phi_k$ can be more general than just a connected region in the input space. Several other learning models use a similar concept of fitting surfaces to the observed data by splitting the input space into several regions and approximating simpler functions in these regions (e.g., [11, 7, 6]). However, in these approaches the inverse map is not incorporated in the estimation algorithm, because it is not considered in the problem definition and the forward model is usually more complex, making inference and learning more difficult. The key algorithmic problems are those of estimating the specialized domains and functions in an optimal way (taking into account the form of the specialized functions), and using the knowledge of the inverse function to formulate efficient inference and learning algorithms. (Footnote: thus $\zeta$ is a computer-graphics rendering, in general called forward kinematics.) ...
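The role of the known forward map in this setup can be sketched as follows: at inference time each specialized function proposes an output, and the forward (rendering) function is used to score which proposal best explains the observed input. This is a simplified toy illustration under an assumed selection criterion, not the paper's exact inference procedure; all names are illustrative:

```python
import numpy as np

def infer_pose(v, maps, zeta):
    """Pick the specialized map whose output best explains v under the forward map.

    v    : (c,) observed feature vector
    maps : list of callables phi_k, each mapping R^c -> R^t
    zeta : callable, the known forward (rendering) function R^t -> R^c
    """
    candidates = [phi(v) for phi in maps]                 # one hypothesis per map
    errors = [np.linalg.norm(v - zeta(x)) for x in candidates]
    best = int(np.argmin(errors))                         # rendering closest to v
    return candidates[best], best
```

A quick synthetic check: with a linear forward map `zeta(x) = A @ x` and one specialized map built from the pseudo-inverse of `A`, that map's proposal renders back to the observation exactly, so it is selected over a bogus competitor.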

834 | Visual perception of biological motion and a model for its analysis. In: Perception and Psychophysics
- Johansson
- 1973
Citation Context: ...solutions to these problems employ the EM algorithm and alternating choices of conditional independence assumptions. Performance of the approach is evaluated with synthetic and real video sequences of human motion. 1 Introduction: In everyday life, humans can easily estimate body part locations (body pose) from relatively low-resolution images of the projected 3D world (e.g., when viewing a photograph or a video). However, body pose estimation is a very difficult computer vision problem. It is believed that humans employ extensive prior knowledge about human body structure and motion in this task [10]. Assuming this, we consider how a computer might learn the underlying structure and thereby infer body pose. In computer vision, this task is usually posed as a tracking problem. Typically, models comprised of 2D or 3D geometric primitives are designed for tracking a specific articulated body [13, 5, 2, 15]. At each frame, these models are fitted to the image to optimize some cost function. Careful manual placement of the model on the first frame is required, and tracking in subsequent frames tends to be sensitive to errors in initialization and numerical drift. Generally, these systems cann...

659 | Contour tracking by stochastic propagation of conditional density
- Isard, Blake
- 1996
Citation Context: ...task is usually posed as a tracking problem. Typically, models comprised of 2D or 3D geometric primitives are designed for tracking a specific articulated body [13, 5, 2, 15]. At each frame, these models are fitted to the image to optimize some cost function. Careful manual placement of the model on the first frame is required, and tracking in subsequent frames tends to be sensitive to errors in initialization and numerical drift. Generally, these systems cannot recover from tracking errors in the middle of a sequence. To address these weaknesses, more complex dynamic models have been proposed [14, 13, 9]; these methods learn a prior over some specific motion (such as walking). This strong prior, however, substantially limits the generality of the motions that can be tracked. Departing from the aforementioned tracking paradigm, in [8] a Gaussian probability model was learned for short human motion sequences. In [17] dynamic programming was used to calculate the best global labeling according to the learned joint probability density function of the position and velocity of body features. Still, in these approaches, the joint locations, correspondences, or model initialization must be provided by...

609 | Maximum likelihood estimation from incomplete data via the EM algorithm (with Discussion)
- Dempster, Laird, et al.
- 1977

493 | Articulated body motion capture by annealed particle filtering
- Deutscher, Blake, et al.
- 2000
Citation Context: ...se) from relatively low-resolution images of the projected 3D world (e.g., when viewing a photograph or a video). However, body pose estimation is a very difficult computer vision problem. It is believed that humans employ extensive prior knowledge about human body structure and motion in this task [10]. Assuming this, we consider how a computer might learn the underlying structure and thereby infer body pose. In computer vision, this task is usually posed as a tracking problem. Typically, models comprised of 2D or 3D geometric primitives are designed for tracking a specific articulated body [13, 5, 2, 15]. At each frame, these models are fitted to the image to optimize some cost function. Careful manual placement of the model on the first frame is required, and tracking in subsequent frames tends to be sensitive to errors in initialization and numerical drift. Generally, these systems cannot recover from tracking errors in the middle of a sequence. To address these weaknesses, more complex dynamic models have been proposed [14, 13, 9]; these methods learn a prior over some specific motion (such as walking). This strong prior, however, substantially limits the generality of the motions that can b...

447 | Tracking people with twists and exponential maps
- Bregler, Malik
- 1998

296 | Voice puppetry
- Brand
- 1999
Citation Context: ...thods learn a prior over some specific motion (such as walking). This strong prior, however, substantially limits the generality of the motions that can be tracked. Departing from the aforementioned tracking paradigm, in [8] a Gaussian probability model was learned for short human motion sequences. In [17] dynamic programming was used to calculate the best global labeling according to the learned joint probability density function of the position and velocity of body features. Still, in these approaches, the joint locations, correspondences, or model initialization must be provided by hand. In [1], the manifold of human body dynamics was modeled via a hidden Markov model and learned via entropic minimization. In all of these approaches models were learned. Although the approach presented here can be used to model dynamics, we argue that when general human motion dynamics are intended to be learned, the amount of training data, model complexity, and computational resources required are impractical. As a consequence, models with large priors towards specific motions (e.g., walking) are generated. In this paper we describe a non-linear supervised learning algorithm, the Specialized Maps ...

243 | Model-based tracking of self-occluding articulated objects. In: Computer Vision
- Rehg, Kanade
- 1995

212 | Multivariate adaptive regression splines. Annals of Statistics
- Friedman
- 1991

194 | Information geometry and alternating minimization procedures
- Csiszár, Tusnady
- 1984

148 | Bayesian Reconstruction of 3D Human Motion from Single-Camera Video
- Howe, Leventon, et al.
- 1999
Citation Context: ...ome cost function. Careful manual placement of the model on the first frame is required, and tracking in subsequent frames tends to be sensitive to errors in initialization and numerical drift. Generally, these systems cannot recover from tracking errors in the middle of a sequence. To address these weaknesses, more complex dynamic models have been proposed [14, 13, 9]; these methods learn a prior over some specific motion (such as walking). This strong prior, however, substantially limits the generality of the motions that can be tracked. Departing from the aforementioned tracking paradigm, in [8] a Gaussian probability model was learned for short human motion sequences. In [17] dynamic programming was used to calculate the best global labeling according to the learned joint probability density function of the position and velocity of body features. Still, in these approaches, the joint locations, correspondences, or model initialization must be provided by hand. In [1], the manifold of human body dynamics was modeled via a hidden Markov model and learned via entropic minimization. In all of these approaches models were learned. Although the approach presented here can be used to model...

136 | Learning Switching Linear Models of Human Motion
- Pavlovic, Rehg, et al.
- 2000

75 | Towards detection of human motion.
- Song, Feng, et al.
- 2000
Citation Context: ...quired, and tracking in subsequent frames tends to be sensitive to errors in initialization and numerical drift. Generally, these systems cannot recover from tracking errors in the middle of a sequence. To address these weaknesses, more complex dynamic models have been proposed [14, 13, 9]; these methods learn a prior over some specific motion (such as walking). This strong prior, however, substantially limits the generality of the motions that can be tracked. Departing from the aforementioned tracking paradigm, in [8] a Gaussian probability model was learned for short human motion sequences. In [17] dynamic programming was used to calculate the best global labeling according to the learned joint probability density function of the position and velocity of body features. Still, in these approaches, the joint locations, correspondences, or model initialization must be provided by hand. In [1], the manifold of human body dynamics was modeled via a hidden Markov model and learned via entropic minimization. In all of these approaches models were learned. Although the approach presented here can be used to model dynamics, we argue that when general human motion dynamics are intended to be lear...

46 | Learning and tracking cyclic human motion
- Ormoneit, Sidenbladh, et al.
- 2001

35 | A hierarchical community of experts
- Hinton, Sallans, et al.
- 1998

17 | Specialized mappings and the estimation of body pose from a single image
- Rosales, Sclaroff
- 2000
Citation Context: ...(equation (2)). However, $p(\psi_i \mid v_i, y_i = k, \theta)$ is still undefined. For the implementation described in this paper we use $\mathcal{N}(\psi_i;\, \phi_k(v_i, \theta_k), \Sigma_k)$, where $\theta_k$ are the parameters of the $k$-th specialized function, and $\Sigma_k$ the error covariance of specialized function $k$. One way to interpret this choice is to think that the error cost in estimating $\psi$, once we know the specialized function to use, is a Gaussian distribution whose mean is the output of the specialized function, with a map-dependent covariance. This also led to tractable further derivations. Other choices were given in [16]. The M-step consists of finding $\theta^{(t)} = \arg\max_{\theta} E_{\tilde{P}^{(t)}}[\log p(Z, y \mid \theta)]$. In our case we can show that this is equivalent to finding:

$$\arg\min_{\theta} \sum_i \sum_k P^{(t)}(y_i = k)\,(\psi_i - \phi_k(v_i, \theta_k))^T \Sigma_k^{-1} (\psi_i - \phi_k(v_i, \theta_k)) \qquad (3)$$

This gives the following update rules for $\lambda_k$ and $\Sigma_k$ (where Lagrange multipliers were used to incorporate the constraint that the $\lambda_k$'s sum to 1):

$$\lambda_k = \frac{1}{n} \sum_i P(y_i = k \mid z_i, \theta) \qquad (4)$$

In keeping the formulation general, we have not defined the form of the specialized functions $\phi_k$. Whether or not we can find a closed-form solution for the update of $\theta_k$ depends on the form of $\phi_k$. For example, if $\phi_k$ ...
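For linear specialized functions, the weighted objective in equation (3) does admit a closed-form M-step: each map solves a responsibility-weighted least-squares problem, the mixing weights follow equation (4), and the covariances become responsibility-weighted residual covariances. A sketch under that linearity assumption (the paper leaves $\phi_k$ general; all names here are illustrative):

```python
import numpy as np

def m_step(psi, v, r):
    """One M-step for linear specialized functions phi_k(v) = W_k @ v (illustrative).

    psi : (n, t) targets, v : (n, c) inputs, r : (n, m) E-step responsibilities.
    Returns updated W (m, t, c), lam (m,), Sigma (m, t, t).
    """
    n, t = psi.shape
    m = r.shape[1]
    c = v.shape[1]
    W = np.empty((m, t, c))
    Sigma = np.empty((m, t, t))
    lam = r.mean(axis=0)                      # eq. (4): lambda_k = (1/n) sum_i r_ik
    for k in range(m):
        w = r[:, k]
        Vw = v * w[:, None]
        # weighted least squares: minimizes the k-th term of eq. (3);
        # a tiny ridge keeps the normal equations well-conditioned
        W[k] = np.linalg.solve(v.T @ Vw + 1e-8 * np.eye(c), Vw.T @ psi).T
        resid = psi - v @ W[k].T
        Sigma[k] = (resid * w[:, None]).T @ resid / max(w.sum(), 1e-12)
        Sigma[k] += 1e-8 * np.eye(t)          # keep covariance invertible
    return W, lam, Sigma
```

Note that for this quadratic objective the optimal $W_k$ is independent of $\Sigma_k$, so each map's regression and covariance can be updated in sequence within one M-step.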

1 | A hierarchical community of experts
- Hinton, Sallans, et al.
- 1998