1. Preprocessing method in the NTU paper
We translate them to the body coordinate system with its origin on the “middle of the spine” joint (number 2 in Figure 1), followed by a 3D rotation to fix the X axis parallel to the 3D vector from “right shoulder” to “left shoulder”, and Y axis towards the 3D vector from “spine base” to “spine”. The Z axis is fixed as the new X × Y. In the last step of normalization, we scale all the 3D points based on the distance between “spine base” and “spine” joints. In the cases of having more than one body in the scene, we transform all of them with regard to the main actor’s skeleton.
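In summary, each video is processed independently:
- Translate the skeleton so that the “middle of the spine” joint is the origin;
- Rotate the XYZ axes so that X is parallel to the “right shoulder” to “left shoulder” vector, Y points towards the “spine base” to “spine” vector, and Z = X × Y;
- Normalize the scale by the distance between “spine base” and “spine”.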
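The three steps above can be sketched in NumPy as follows. This is only a minimal sketch, not the NTU authors' code. The function name `normalize_skeleton` is hypothetical, and the zero-based joint indices used as defaults (spine base 0, middle of the spine 1, left shoulder 4, right shoulder 8, spine 20) are an assumption about the joint ordering. Estimating the axes from a single reference frame and applying them to the whole clip is also an assumption, since the quoted passage does not say which frame is used.

```python
import numpy as np

def normalize_skeleton(joints, spine_base=0, spine_mid=1, left_shoulder=4,
                       right_shoulder=8, spine=20, ref_frame=0):
    """Body-centred normalization of one clip; joints has shape (T, V, 3).

    The joint indices and the use of a single reference frame are assumptions.
    """
    joints = np.asarray(joints, dtype=np.float64)
    ref = joints[ref_frame]

    # Step 1: the origin is the "middle of the spine" joint.
    origin = ref[spine_mid]

    # Step 2: X is parallel to right shoulder -> left shoulder.
    x = ref[left_shoulder] - ref[right_shoulder]
    x = x / np.linalg.norm(x)
    # Y points towards spine base -> spine, made orthogonal to X (Gram-Schmidt).
    y = ref[spine] - ref[spine_base]
    y = y - np.dot(y, x) * x
    y = y / np.linalg.norm(y)
    # Z completes the right-handed frame.
    z = np.cross(x, y)
    rotation = np.stack([x, y, z])  # rows are the new axes

    # Step 3: scale by the spine base -> spine distance.
    scale = np.linalg.norm(ref[spine] - ref[spine_base])

    return (joints - origin) @ rotation.T / scale
```

For a scene with more than one body, the same `origin`, `rotation`, and `scale` computed from the main actor would be applied to every skeleton, as the quoted passage describes.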
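2. Preprocessing method in the HCN paper
This method comes from the IJCAI 2018 paper "Co-occurrence Feature Learning from Skeleton Data for Action Recognition and Detection with Hierarchical Aggregation".
The paper processes skeleton data with convolutions. Taking the NTU skeleton dataset as an example, its skeleton preprocessing handles all videos together:
- Pack all skeleton data into a single 5-dimensional array in which every video is 300 frames long; videos shorter than 300 frames are zero-padded at the end;
- Find the minimum and maximum of X, Y, and Z separately over all skeleton data, then normalize with these minimum and maximum values.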
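The two steps above can be sketched roughly as follows. This is a minimal illustration and not the code from the linked repository. The function names `pad_and_stack` and `minmax_normalize`, the axis order (sample, channel, frame, joint, body), the truncation of clips longer than `max_frames`, and the small `eps` guard are all assumptions made for the example.

```python
import numpy as np

def pad_and_stack(clips, max_frames=300):
    """Stack clips of shape (C, T_i, V, M) into one (N, C, max_frames, V, M) array.

    Shorter clips are zero-padded at the end; longer ones are truncated
    (the truncation is an assumption).
    """
    c, _, v, m = clips[0].shape
    data = np.zeros((len(clips), c, max_frames, v, m), dtype=np.float32)
    for i, clip in enumerate(clips):
        t = min(clip.shape[1], max_frames)
        data[i, :, :t] = clip[:, :t]
    return data

def minmax_normalize(data, eps=1e-8):
    """Min-max normalize each coordinate channel (X, Y, Z) over the whole dataset."""
    # Reduce over every axis except the channel axis. Note that the padded
    # zeros also take part in the minimum and maximum here.
    lo = data.min(axis=(0, 2, 3, 4), keepdims=True)
    hi = data.max(axis=(0, 2, 3, 4), keepdims=True)
    return (data - lo) / (hi - lo + eps)
```

Because the minimum and maximum are taken over the entire dataset, values from the training set would normally be reused for the test set so that both are scaled the same way.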
Code implementation: https://github.com/huguyuehuhu/HCN-pytorch/blob/master/feeder/feeder.py