Action Recognition with HOJ3D and HMM

http://cvrc.ece.utexas.edu/Publications/Xia_HAU3D12.pdf

View Invariant Human Action Recognition Using Histograms of 3D Joints

The HOJ3D computed from the action depth sequences are reprojected using LDA and then clustered into k posture visual words, which represent the prototypical poses of actions. The temporal evolutions of those visual words are modeled by discrete hidden Markov models (HMMs).
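As a rough illustration of the last step, the sketch below scores a sequence of posture visual words under a discrete HMM with the scaled forward algorithm and picks the action whose model gives the highest log-likelihood. The function names, the two action labels, and all probability tables are hypothetical toy values chosen for this example; they are not taken from the paper, and training the models (for example with Baum-Welch) is not shown.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    obs   : list of visual-word indices
    start : start[i]    = P(first state is i)
    trans : trans[i][j] = P(next state is j | current state is i)
    emit  : emit[i][k]  = P(visual word k | state i)
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step, then weight by the emission probability.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot generate this sequence
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_prob += math.log(scale)
    return log_prob

def classify(obs, models):
    """Return the label of the model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

# Toy models: 2 hidden states, 3 visual words (all numbers are made up).
MODELS = {
    "wave": ([1.0, 0.0],
             [[0.7, 0.3], [0.3, 0.7]],
             [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "sit":  ([1.0, 0.0],
             [[0.7, 0.3], [0.0, 1.0]],
             [[0.1, 0.1, 0.8], [0.1, 0.8, 0.1]]),
}

print(classify([0, 0, 1, 1, 0], MODELS))
```

One model per action class keeps recognition simple: a new sequence is evaluated against every model and assigned to the best-scoring one.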

Feature Definition

In this representation, the 3D space is partitioned into n bins using a modified spherical coordinate system. We manually select 12 informative joints to build a compact representation of human posture. To make our representation robust against minor posture variation, votes of 3D skeletal joints are cast into neighboring bins using a Gaussian weight function.

We acquire the 3D locations of 20 skeletal joints, which comprise hip center, spine, shoulder center, head, L/R shoulder, L/R elbow, L/R wrist, L/R hand, L/R hip, L/R knee, L/R ankle and L/R foot.

We compute our histogram-based representation of postures from 12 of the 20 joints: head, L/R elbow, L/R hand, L/R knee, L/R foot, hip center and L/R hip. We take the hip center as the center of the reference coordinate system, and define the x-direction according to the L/R hips. The remaining 9 joints are used to compute the 3D spatial histogram.

To achieve view invariance (the same posture observed from different viewpoints is classified into the same category), we align the spherical coordinates with the person's specific direction. We define the center of the spherical coordinates as the hip center joint. We define the horizontal reference vector α to be the vector from the left hip to the right hip, projected on the horizontal plane (parallel to the ground), and the zenith reference vector θ as the vector that is perpendicular to the ground plane and passes through the coordinate center.
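The alignment can be sketched as follows. This is a minimal illustration, assuming joints are (x, y, z) tuples with the y-axis perpendicular to the ground; the function name and the sign convention of the azimuth are choices made for this example, not details given in the source.

```python
import math

def aligned_angles(joint, hip_center, left_hip, right_hip):
    """Return (azimuth, inclination) in degrees for one joint.

    Assumption: the y-axis points up, so the ground plane is the x-z plane.
    """
    # Horizontal reference vector alpha: left hip -> right hip,
    # projected on the ground plane (the y component is dropped).
    ax = right_hip[0] - left_hip[0]
    az = right_hip[2] - left_hip[2]
    # Joint position relative to the hip center (the coordinate origin).
    dx = joint[0] - hip_center[0]
    dy = joint[1] - hip_center[1]
    dz = joint[2] - hip_center[2]
    # Azimuth: angle in the ground plane measured from alpha, in [0, 360].
    azimuth = math.degrees(math.atan2(dz, dx) - math.atan2(az, ax)) % 360.0
    # Inclination: angle from the zenith (up) vector, in [0, 180].
    inclination = math.degrees(math.atan2(math.hypot(dx, dz), dy))
    return azimuth, inclination

hip_center, left_hip, right_hip = (0.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)
print(aligned_angles((1.0, 1.0, 1.0), hip_center, left_hip, right_hip))
```

Because both angles are measured relative to the hips and the vertical, rotating the whole skeleton about the vertical axis leaves them unchanged, which is the source of the view invariance. The radial distance is simply never computed here.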

We partition the 3D space into n bins.

The inclination angle is divided into 7 bins from the zenith vector θ: [0, 15], [15, 45], [45, 75], [75, 105], [105, 135], [135, 165], [165, 180] (in degrees).

Our HOJ3D descriptor is computed by casting the remaining 9 joints into the corresponding spatial histogram bins.
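Before any weighting is applied, the casting step amounts to a plain histogram count. The sketch below assumes that the 7 inclination bins tile [0, 180] degrees contiguously with edges at 0, 15, 45, 75, 105, 135, 165 and 180, and that the azimuth is split into 12 equal bins of 30 degrees. The azimuth partition is an assumption of this example, since these notes do not state it. The function names are hypothetical.

```python
from bisect import bisect_right

INCLINATION_EDGES = [0, 15, 45, 75, 105, 135, 165, 180]  # degrees, 7 bins
N_INCLINATION_BINS = len(INCLINATION_EDGES) - 1
N_AZIMUTH_BINS = 12  # assumption: 12 equal azimuth bins of 30 degrees

def bin_index(azimuth, inclination):
    """Map a pair of angles (degrees) to a flat bin index."""
    # Index of the inclination interval, clamped so that 180 falls in the last bin.
    t = bisect_right(INCLINATION_EDGES, inclination) - 1
    t = min(max(t, 0), N_INCLINATION_BINS - 1)
    # Index of the azimuth interval, wrapping around at 360 degrees.
    a = int((azimuth % 360.0) // (360.0 / N_AZIMUTH_BINS)) % N_AZIMUTH_BINS
    return t * N_AZIMUTH_BINS + a

def hard_histogram(joint_angles):
    """Count one vote per joint; joint_angles is a list of (azimuth, inclination)."""
    hist = [0] * (N_INCLINATION_BINS * N_AZIMUTH_BINS)
    for azimuth, inclination in joint_angles:
        hist[bin_index(azimuth, inclination)] += 1
    return hist
```

Under these assumptions the descriptor has 7 × 12 = 84 entries, and a posture with 9 voting joints yields a histogram whose entries sum to 9.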

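To make the representation robust against minor errors of joint locations, we vote the 3D bins using a Gaussian weight function, i.e. the standard Gaussian probability density with mean vector μ and covariance matrix Σ, where d is the dimension of X:

$$p(X; \mu, \Sigma) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(X-\mu)^{\top}\Sigma^{-1}(X-\mu)\right)$$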
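A Gaussian-weighted vote for a single joint might look like the sketch below. It assumes that the weight of a bin along each angle is the Gaussian probability mass inside that bin's angular interval (a difference of two CDF values), centered on the joint's own angle, and that the joint votes only for its own bin and the adjacent ones. The standard deviation SIGMA_DEG, the 12-bin azimuth partition, and all function names are assumptions of this example, not values given in these notes.

```python
import math
from bisect import bisect_right

INCLINATION_EDGES = [0, 15, 45, 75, 105, 135, 165, 180]  # degrees, 7 bins
N_INCLINATION_BINS = len(INCLINATION_EDGES) - 1
N_AZIMUTH_BINS = 12                 # assumption: 12 equal azimuth bins
AZ_WIDTH = 360.0 / N_AZIMUTH_BINS
SIGMA_DEG = 10.0                    # hypothetical standard deviation, in degrees

def gauss_cdf(x, mu, sigma):
    """Cumulative distribution function of a 1D Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def interval_mass(lo, hi, mu, sigma):
    """Probability that a Gaussian sample falls inside [lo, hi]."""
    return gauss_cdf(hi, mu, sigma) - gauss_cdf(lo, mu, sigma)

def soft_vote(hist, azimuth, inclination, sigma=SIGMA_DEG):
    """Add one joint's weighted votes to hist (7 rows x 12 columns)."""
    az = azimuth % 360.0
    a0 = int(az // AZ_WIDTH)  # kept unwrapped so neighbor intervals stay next to az
    t0 = min(max(bisect_right(INCLINATION_EDGES, inclination) - 1, 0),
             N_INCLINATION_BINS - 1)
    for dt in (-1, 0, 1):
        t = t0 + dt
        if not 0 <= t < N_INCLINATION_BINS:
            continue  # there is no inclination bin beyond 0 or 180 degrees
        p_theta = interval_mass(INCLINATION_EDGES[t], INCLINATION_EDGES[t + 1],
                                inclination, sigma)
        for da in (-1, 0, 1):
            lo = (a0 + da) * AZ_WIDTH
            p_alpha = interval_mass(lo, lo + AZ_WIDTH, az, sigma)
            # The bin weight is the product of the two independent probabilities.
            hist[t][(a0 + da) % N_AZIMUTH_BINS] += p_theta * p_alpha

hist = [[0.0] * N_AZIMUTH_BINS for _ in range(N_INCLINATION_BINS)]
soft_vote(hist, azimuth=45.0, inclination=90.0)
```

With this weighting, a joint near a bin boundary splits its vote between the two adjacent bins instead of flipping from one to the other when its position jitters slightly.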

For each joint, we only vote over the bin it is in and the 8 neighboring bins. We calculate the probabilistic voting on θ and α separately since they are independent (see Fig. 4). The probabilistic voting for each of the 9 bins is the product of the probability in the α direction and the probability in the θ direction. Let the joint location be