Tags: activity recognition, construction monitoring, digital twin construction, labor productivity, vision transformers
Abstract:
In response to activity-based productivity concerns in construction environments, we developed a multifaceted computer vision approach integrated with BIM models. High-level process information is derived from continuously acquired site images through the following processing chain: (1) Worker activities are classified by the proposed vision transformer network, ViTPoseActivity, which leverages human pose features. (2) On-site labor activities are analyzed according to their on-site impact and fused with the corresponding BIM geometry. ViTPoseActivity achieved 92.31% accuracy while exceeding the prediction speeds of previous approaches, demonstrating an effective trade-off between computational cost and precision in activity analysis. Unlike previous studies, our approach was deployed on a large real-world dataset, carefully investigating subtasks and affording productivity insights on reinforcement activities. Integrating as-performed and geometry information supports construction management by facilitating better decision-making regarding worker group definition and task allocation. Our research fills a crucial gap by providing a robust and efficient method to assess on-site labor productivity.
ViTPoseActivity: A Multifaceted Computer Vision Approach to On-Site Activity Monitoring