Tags: 2D to 3D, 3D human pose, Human body reconstruction, Self-attention
Abstract:
Despite recent progress in 3D human pose and shape estimation, accurately predicting complex human poses remains challenging. To improve the accuracy of human mesh reconstruction, we propose an end-to-end framework called Multi-level Attention Network (MANet). MANet consists of three modules that model attention at different levels: Intra Part Attention Network (IntraPA-Net), Inter Part Attention Network (InterPA-Net), and Hierarchical Pose Regressor (HPR). IntraPA-Net applies pixel attention to aggregate pixel-level features for each body part, InterPA-Net establishes attention between different body parts, and HPR implicitly captures attention among joints in a hierarchical structure. Experimental results demonstrate that MANet reconstructs the human mesh with high accuracy and that the reconstructed mesh aligns well with images containing flexible human motion.
MANet: Multi-Level Attention Network for 3D Human Shape and Pose Estimation
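
The first two attention levels named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `intra_part_pool` and `inter_part_attention`, the use of softmax-normalized attention maps, and the projection-free scaled dot-product attention are all assumptions made for illustration. The abstract gives no detail on how HPR is built, so it is not sketched here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def intra_part_pool(pixel_feats, part_logits):
    """Pixel-level attention (assumed form): one feature vector per body part.

    pixel_feats: N pixel feature vectors, each of length C.
    part_logits: P attention maps, each holding N unnormalized scores.
    Returns P pooled vectors of length C (attention-weighted pixel averages).
    """
    channels = len(pixel_feats[0])
    pooled = []
    for logits in part_logits:
        weights = softmax(logits)  # attention over pixels for this part
        pooled.append([
            sum(w * feat[c] for w, feat in zip(weights, pixel_feats))
            for c in range(channels)
        ])
    return pooled


def inter_part_attention(part_feats):
    """Part-level attention (assumed form): scaled dot-product self-attention.

    Learned query/key/value projections are omitted to keep the sketch short,
    so each output is a convex combination of the input part features.
    """
    dim = len(part_feats[0])
    refined = []
    for query in part_feats:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in part_feats
        ]
        weights = softmax(scores)  # attention over body parts
        refined.append([
            sum(w * feat[c] for w, feat in zip(weights, part_feats))
            for c in range(dim)
        ])
    return refined
```

In this sketch, calling `inter_part_attention(intra_part_pool(pixel_feats, part_logits))` chains the two levels: pixels are first pooled into per-part features, and each part feature is then refined using the features of all other parts.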