PASMVS: a dataset for multi-view stereopsis training and reconstruction applications

Published: 09-08-2020 | Version 2 | DOI: 10.17632/fhzfnwsnzf.2
André Broekman,
Petrus J. Gräbe


A large collection of high-precision, synthetic, path-traced renderings for use in multi-view stereopsis and 3D reconstruction applications. The material properties are primarily highly specular (reflective metals). Ground truth depth maps, model geometry, object masks, and camera extrinsic and intrinsic data are provided together with the rendered images. A total of 18,000 samples are provided (45 camera views for each of 400 scenes). The 400 scenes vary the illumination using five high-definition HDR environment textures, four models (teapot, bunny, armadillo and dragon), ten material properties (bricks, brushed metal, ceramic, checkerboard, concrete, copper, grunge metal, marble, piano/ivory, and steel) and two camera focal lengths (35 mm and 50 mm).

File descriptions:
+ PASMVS.blend - Blender file (developed using Blender version 2.8.1) used to generate the dataset. The Blender file packs only one of the HDR environment textures, to serve as an example.
+ PASMVS_sample.jpg - A visual collage of 8 scenes, illustrating the variability introduced by using different models, illumination, material properties and camera focal lengths.
+ PASMVS_576p.tar.xz - The full PASMVS dataset containing all 400 scenes (768x576 pixel resolution). Note that a tar.xz archive is used due to file size limitations (7.8 GB); uncompressed, the total file size is approximately 35 GB. The "index.csv" file in the root folder provides a summary of all 18,000 samples, specifying the corresponding file names alongside the subfolder.
+ Python example source code for loading a ground truth depth map (PFM file format) as a numpy array.
+ modelSTL - Stereolithography (STL) files for the models used in the dataset.
+ A single PASMVS scene sample for easy viewing and download.
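The depth maps are distributed in the PFM file format, and the repository ships its own Python loader. As an illustration only, a minimal reader might look like the sketch below. It assumes the conventional PFM layout (a three-line ASCII header followed by raw 32-bit floats, with rows stored bottom-to-top and a negative scale value marking little-endian data). The function name load_pfm is hypothetical and is not taken from the dataset's example code, and whether the PASMVS depth maps follow every one of these conventions should be checked against the provided loader.

```python
import numpy as np


def load_pfm(path):
    """Read a PFM image into a float32 numpy array with rows ordered top-to-bottom.

    Assumes the conventional PFM layout: 'Pf' (1 channel) or 'PF' (3 channels),
    then 'width height', then a scale whose sign encodes the byte order.
    """
    with open(path, "rb") as f:
        magic = f.readline().decode("ascii").strip()
        if magic == "Pf":
            channels = 1
        elif magic == "PF":
            channels = 3
        else:
            raise ValueError("not a PFM file: unexpected header %r" % magic)

        width, height = (int(v) for v in f.readline().decode("ascii").split())

        # A negative scale means little-endian samples, positive means big-endian.
        scale = float(f.readline().decode("ascii").strip())
        byte_order = "<" if scale < 0 else ">"

        raw = f.read()

    data = np.frombuffer(raw, dtype=byte_order + "f4", count=width * height * channels)
    shape = (height, width) if channels == 1 else (height, width, channels)

    # PFM stores the bottom row first, so flip vertically to get image order.
    image = np.flipud(data.reshape(shape))
    return np.ascontiguousarray(image, dtype=np.float32)
```

Calling load_pfm on a depth map path taken from "index.csv" would then be expected to return an array of shape (576, 768) for this dataset, given the stated 768x576 pixel resolution.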


Steps to reproduce

The open source Blender software suite was used to generate the dataset, with the entire pipeline developed using the exposed Python API. The camera trajectory is kept fixed for all 400 scenes. The camera information was initially exported as a single CSV file (scene.csv) for every scene, from which the camera information files were generated; this includes the focal length (focalLengthmm), image sensor dimensions (pixelDimensionX, pixelDimensionY), position coordinate vector (vectC) and rotation vector (vectR). The STL model files, as provided in this data repository, were exported directly from Blender, such that the geometry/scenes can be reproduced.

The data processing below is written for a Python implementation, transforming the information from Blender's coordinate system into universal rotation (R_world2cv) and translation (T_world2cv) matrices. The following packages are required:

    import numpy as np
    from scipy.spatial.transform import Rotation as R

The intrinsic matrix K is constructed using the following formulation:

    focalLengthPixel = focalLengthmm * pixelDimensionX / sensorWidthmm
    K = [[focalLengthPixel, 0, pixelDimensionX/2], [0, focalLengthPixel, pixelDimensionY/2], [0, 0, 1]]

The rotation vector as provided by Blender is first transformed to a rotation matrix:

    r = R.from_euler('xyz', vectR, degrees=True)
    matR = r.as_matrix()

Transpose the rotation matrix to find the matrix from the WORLD to the BLENDER camera coordinate system:

    R_world2bcam = np.transpose(matR)

The matrix describing the transformation from BLENDER to CV/STANDARD coordinates is:

    R_bcam2cv = np.array([[1, 0, 0], [0, -1, 0], [0, 0, -1]])

Thus the rotation from WORLD to CV/STANDARD coordinates is the product of the two:

    R_world2cv = R_bcam2cv @ R_world2bcam

The camera coordinate vector requires a similar transformation, first from WORLD to BLENDER camera coordinates and then to CV/STANDARD coordinates:

    T_world2bcam = -1 * R_world2bcam @ vectC
    T_world2cv = R_bcam2cv @ T_world2bcam

The resulting R_world2cv and T_world2cv matrices are written to the camera information file using exactly the same format as that of BlendedMVS, developed by Dr. Yao. The original rotation and translation information can be recovered by following the process in reverse. Note that additional steps were required to convert from Blender's unique coordinate system to that of OpenCV; this ensures universal compatibility in the way that the camera intrinsic and extrinsic information is provided.
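The steps above can be collected into two small functions, given here as a sketch rather than the pipeline's actual code. The function names intrinsic_matrix and world_to_cv are hypothetical. The two matrix products in world_to_cv are written under the assumption that the WORLD-to-CV transform is the composition of R_world2bcam with R_bcam2cv and that the translation follows the usual t = -R C relation for a camera centred at vectC. The sensor width of 32.0 mm used in the usage note is a placeholder, since the description does not state the value used for the dataset.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

# Fixed change of basis from Blender's camera frame (looking along -Z, Y up)
# to the OpenCV convention (looking along +Z, Y down).
R_BCAM2CV = np.array([[1.0, 0.0, 0.0],
                      [0.0, -1.0, 0.0],
                      [0.0, 0.0, -1.0]])


def intrinsic_matrix(focalLengthmm, sensorWidthmm, pixelDimensionX, pixelDimensionY):
    """Build K with square pixels and the principal point at the image centre."""
    focalLengthPixel = focalLengthmm * pixelDimensionX / sensorWidthmm
    return np.array([[focalLengthPixel, 0.0, pixelDimensionX / 2.0],
                     [0.0, focalLengthPixel, pixelDimensionY / 2.0],
                     [0.0, 0.0, 1.0]])


def world_to_cv(vectR, vectC):
    """Convert Blender Euler angles (degrees) and camera position to OpenCV extrinsics."""
    # Blender's rotation vector describes the camera-to-world rotation.
    matR = R.from_euler('xyz', vectR, degrees=True).as_matrix()

    # Its transpose (inverse) maps world coordinates into the Blender camera frame.
    R_world2bcam = np.transpose(matR)

    # Assumed relation t = -R C: the camera centre maps to the camera-frame origin.
    T_world2bcam = -1 * R_world2bcam @ np.asarray(vectC, dtype=float)

    # Assumed composition with the Blender-to-CV change of basis.
    R_world2cv = R_BCAM2CV @ R_world2bcam
    T_world2cv = R_BCAM2CV @ T_world2bcam
    return R_world2cv, T_world2cv
```

For example, intrinsic_matrix(35.0, 32.0, 768, 576) builds K for the 35 mm focal length under the placeholder sensor width, and world_to_cv(vectR, vectC) takes the two vectors of one scene.csv row. A quick sanity check is that R_world2cv @ vectC + T_world2cv is the zero vector, since the camera centre must map to the origin of the camera frame.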