Human tracking dataset of 3D anatomical landmarks and pose keypoints

Published: 26 January 2024 | Version 2 | DOI: 10.17632/493s6f753v.2


Image-based pose detectors in which the pose is defined by anatomical landmarks are scarce, which impedes progress in markerless motion capture and analysis methods applied to biomechanics. Temporal 3D scanning (or 4D scanning) systems capture human bodies in motion with high precision and also provide a realistic 3D avatar of the scanned person. These two aspects allow the pose to be obtained in two ways. The first locates anatomical landmarks directly on the mesh surface. The second renders virtual images of the mesh from different points of view, from which a neural network can estimate the location of body markers.

This dataset associates 2D and 3D human pose keypoints, estimated from images using MediaPipe, with the locations of their corresponding 3D anatomical landmarks. It consists of 567 movement sequences from 71 participants, recorded in A-pose and while performing 7 movements (walking, running, squatting, and four types of jump). The participants were scanned with Move4D to build a collection of textured 3D moving human meshes with anatomical correspondence (51,051 poses in total). From each mesh of that collection, the 3D locations of 53 anatomical landmarks were obtained and 48 images were created using virtual cameras with different perspectives. 2D pose keypoints were obtained from those images using MediaPipe's Pose landmarker model, and their corresponding 3D keypoints were calculated by linear triangulation.

There is one folder per participant, which contains two Track Row Column (TRC) files and one JSON file per movement sequence. One TRC file stores the triangulated 3D keypoints and the other contains the 3D anatomical landmarks. The JSON file stores the 2D keypoints and the calibration parameters of the virtual cameras. The anthropometric characteristics of the participants (height, weight, age and sex) are annotated in a single CSV file.
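As an illustration of the linear triangulation step mentioned above, the following is a minimal sketch of the standard direct linear transform (DLT) approach. It assumes that each camera is described by a 3x4 projection matrix mapping homogeneous 3D points to homogeneous pixel coordinates. The function names (triangulate_dlt, project) and the two synthetic cameras are hypothetical and used only for the demonstration; the exact procedure used to build the dataset (for example, which views were used or whether keypoint scores were taken into account) is not specified here.

```python
import numpy as np


def triangulate_dlt(proj_matrices, points_2d):
    """Linear (DLT) triangulation of one 3D point seen in two or more views.

    proj_matrices: iterable of 3x4 projection matrices (assumed convention).
    points_2d: iterable of (u, v) pixel coordinates, one per view.
    """
    rows = []
    for P, (u, v) in zip(proj_matrices, points_2d):
        P = np.asarray(P, dtype=float).reshape(3, 4)
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]


def project(P, point_3d):
    """Project a 3D point to pixel coordinates with a 3x4 matrix."""
    x = P @ np.append(point_3d, 1.0)
    return x[:2] / x[2]


# Synthetic two-camera setup (hypothetical values, not taken from the dataset).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

point_true = np.array([0.2, -0.1, 4.0])
observations = [project(P1, point_true), project(P2, point_true)]
point_est = triangulate_dlt([P1, P2], observations)
```

With noise-free observations, point_est recovers point_true up to floating-point error. With real detections, adding more views generally makes the least-squares solution more stable.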
The files are named according to the scheme [CODE]_[MOVEMENT]_[DATA].[EXT], in which:

[CODE]: participant code. The last character (F/M) indicates the sex ("F" female, "M" male).
[MOVEMENT]: name of the movement (A-POSE / F-JUMP / GAIT / J-JACKS / JUMP / RUNNING / SQUATS / T-JUMP).
[DATA]: reference data (AL = anatomical landmarks / KP2D = keypoints in pixels and camera calibration / KP3D = 3D keypoints obtained by triangulating the 2D keypoints).
[EXT]: file extension (TRC / JSON).

The JSON files contain general sequence data ("subject", "movement", "fps") and a list of annotations ("frame", "camera", "keypoint_scores", "proj_matrix", "proj_matrix_rows", "proj_matrix_cols"). "keypoint_scores" is a vector of size 99 (u1, v1, score1, … u33, v33, score33), where (ui, vi) is the 2D location of keypoint i and scorei is its associated score.

The AVI file shows an example of the 3D landmarks contained in both TRC files for participant TDB_001_F.
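The naming scheme and the layout of "keypoint_scores" can be handled with a few lines of code. The sketch below is illustrative only. The helper names (parse_filename, unpack_keypoints, reshape_matrix) are hypothetical, the example file name is simply built from the scheme above, and the row-major ordering of "proj_matrix" is an assumption that should be checked against the actual files.

```python
import re

MOVEMENTS = ("A-POSE", "F-JUMP", "GAIT", "J-JACKS", "JUMP",
             "RUNNING", "SQUATS", "T-JUMP")
DATA_KINDS = ("AL", "KP2D", "KP3D")

# [CODE]_[MOVEMENT]_[DATA].[EXT]; the extension is matched case-insensitively
# (an assumption, since the case used on disk is not stated).
FILENAME_RE = re.compile(
    r"^(?P<code>.+_(?P<sex>[FM]))"
    r"_(?P<movement>" + "|".join(MOVEMENTS) + r")"
    r"_(?P<data>" + "|".join(DATA_KINDS) + r")"
    r"\.(?P<ext>(?i:TRC|JSON))$"
)


def parse_filename(name):
    """Split a dataset file name into its documented components."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return match.groupdict()


def unpack_keypoints(keypoint_scores):
    """Turn the flat 99-vector into 33 (u, v, score) tuples."""
    if len(keypoint_scores) != 99:
        raise ValueError("expected 33 keypoints x (u, v, score) = 99 values")
    return [tuple(keypoint_scores[i:i + 3]) for i in range(0, 99, 3)]


def reshape_matrix(flat, rows, cols):
    """Rebuild a rows x cols matrix from a flat list.

    ASSUMPTION: values are stored in row-major order.
    """
    if len(flat) != rows * cols:
        raise ValueError("matrix size does not match rows x cols")
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]


# Demonstration with a name built from the scheme and synthetic values.
info = parse_filename("TDB_001_F_F-JUMP_KP2D.json")
keypoints = unpack_keypoints([float(i) for i in range(99)])
matrix = reshape_matrix(list(range(12)), 3, 4)
```

In practice, the annotations would be read with json.load, and the values of "proj_matrix_rows" and "proj_matrix_cols" would be passed to reshape_matrix instead of the fixed 3 and 4 used here.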



Instituto de Biomecanica de Valencia, Universitat Politecnica de Valencia


Computer Science, Motion Capture, Gait


Conselleria de Innovación, Industria, Comercio y Turismo (GVA)


Instituto Valenciano de Competitividad Empresarial (IVACE)