University of Washington Indoor Object Manipulation (UW IOM) Dataset

Published: 15 Jun 2019 | Version 2 | DOI: 10.17632/xwzzkxtf9s.2

Description of this data

The University of Washington Indoor Object Manipulation (UW IOM) dataset comprises videos, with corresponding skeletal tracking information, of twenty participants aged 18-25 years, fifteen male and five female. The videos are recorded using a Kinect Sensor for Xbox One at an average rate of twelve frames per second.

Each participant carries out the same set of tasks: picking up six objects (three identical empty boxes and three identical rods) from three different vertical racks, placing them on a table, putting them back on the racks from which they were picked up, and then walking out of the scene carrying the box from the middle rack. The boxes are manipulated with both hands, while the rods are manipulated with only one hand. These tasks are repeated in the same sequence three times, so that every video lasts approximately three minutes.

We categorize the actions into seventeen labels, each following a four-tier hierarchy. The first tier indicates whether a box or a rod is manipulated; the second tier denotes human motion (walk, stand, or bend); the third tier captures the type of object manipulation, if applicable (reach, pick-up, place, or hold); and the fourth tier represents the relative height of the surface where the manipulation takes place (low, medium, or high).
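As an illustration of the hierarchy, a label can be modeled programmatically. The following is a minimal Python sketch: the tier vocabularies come from the description above, but the tuple representation and the `make_label` helper are hypothetical conveniences, not the dataset's actual label encoding.

```python
# Hypothetical model of the four-tier action labels; the actual
# encoding used in the VideoLabels files may differ.

OBJECTS = {"box", "rod"}                                # tier 1: object
MOTIONS = {"walk", "stand", "bend"}                     # tier 2: human motion
MANIPULATIONS = {"reach", "pick-up", "place", "hold"}   # tier 3: manipulation (optional)
HEIGHTS = {"low", "medium", "high"}                     # tier 4: surface height (optional)

def make_label(obj, motion, manipulation=None, height=None):
    """Validate and return a four-tier action label as a tuple."""
    if obj not in OBJECTS:
        raise ValueError(f"unknown object tier: {obj!r}")
    if motion not in MOTIONS:
        raise ValueError(f"unknown motion tier: {motion!r}")
    if manipulation is not None and manipulation not in MANIPULATIONS:
        raise ValueError(f"unknown manipulation tier: {manipulation!r}")
    if height is not None and height not in HEIGHTS:
        raise ValueError(f"unknown height tier: {height!r}")
    return (obj, motion, manipulation, height)

# Example: bending to pick up a box from the low rack
print(make_label("box", "bend", "pick-up", "low"))
```

Tiers three and four default to `None` to cover actions such as walking or standing, where no object manipulation takes place.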

Experiment data files

  • UW IOM Dataset
    • JointPositions
    • VideoLabels
    • Videos
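The JointPositions files hold the skeletal tracking output of the Kinect sensor. Kinect for Xbox One (Kinect v2) tracks 25 skeletal joints per frame; the sketch below is a hypothetical loader that assumes each frame is stored as a flat row of 75 values (25 joints x 3 coordinates) — the actual on-disk layout is not described on this page, so check the files before relying on this shape.

```python
import numpy as np

N_JOINTS = 25  # joints in the Kinect v2 skeleton

def frames_to_skeletons(flat_frames):
    """Reshape (n_frames, 75) flat rows into a (n_frames, 25, 3) array.

    Assumes each row holds 25 (x, y, z) joint triples in order; this
    layout is a guess, not a documented property of the dataset.
    """
    arr = np.asarray(flat_frames, dtype=float)
    if arr.ndim != 2 or arr.shape[1] != N_JOINTS * 3:
        raise ValueError(f"expected rows of {N_JOINTS * 3} values, got {arr.shape}")
    return arr.reshape(-1, N_JOINTS, 3)

# Example with two synthetic all-zero frames
skeletons = frames_to_skeletons(np.zeros((2, 75)))
print(skeletons.shape)  # (2, 25, 3)
```

Once in this per-joint shape, standard operations such as computing joint velocities between consecutive frames reduce to simple array arithmetic along the first axis.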

    Cite this dataset

    Parsa, Behnoosh; Samani, Ekta U.; Hendrix, Rose; Devine, Cameron; Singh, Shashi M.; Devasia, Santosh; Banerjee, Ashis G. (2019), “University of Washington Indoor Object Manipulation (UW IOM) Dataset”, Mendeley Data, v2




Institution: University of Washington

Categories: Computer Vision, Ergonomics, Activity Recognition, Video Processing, Motion Capture, Deep Learning


CC BY 4.0

The files associated with this dataset are licensed under a Creative Commons Attribution 4.0 International licence.

What does this mean?
You can share, copy and modify this dataset so long as you give appropriate credit, provide a link to the CC BY license, and indicate if changes were made, but you may not do so in a way that suggests the rights holder has endorsed you or your use of the dataset. Note that further permission may be required for any content within the dataset that is identified as belonging to a third party.