Published: 29 November 2023 | Version 1 | DOI: 10.17632/j7pz7x4wmb.1
Shuchang Xu, Fangtao Mao, Menghui Ji, Haohao Xu, Wenzhen Yang


The dataset contains raw visual images, visualized tactile images along the X- and Z-axes, and an Excel file that organizes every sample and its correspondences in order. The tactile images are interpolated from the raw haptic signal to align with the visual images. Both the visual and tactile images have an identical resolution of 620 × 410. The dataset consists of 567 records. Each record includes one visual image, two tactile images along the X and Z axes, and one defect segmentation image. Tactile image filenames ending with x and z denote the X and Z components, respectively. The samples in the dataset exhibit a wide range of colors and textures. Moreover, the dataset demonstrates the advantage of cross-modal data fusion. As a flexible material, leather may have defects on its surface and underside, which can be observed in the visual and tactile images, respectively. Combining visual and tactile images therefore provides better information on the distribution of defects.
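The record layout described above can be sketched as a small helper that pairs the four files of one sample. Note that the exact filename pattern used here (`<id>.png`, `<id>_x.png`, `<id>_z.png`, `<id>_seg.png`) is a hypothetical illustration; the description only specifies that tactile filenames end with x and z.

```python
# Sketch of pairing the four files of one record.
# NOTE: the filename pattern below is hypothetical; the dataset only
# specifies that tactile filenames end with "x" and "z".

def record_paths(sample_id: int, root: str = "dataset") -> dict:
    """Return the hypothetical file paths making up one record."""
    stem = f"{root}/{sample_id:03d}"
    return {
        "visual": f"{stem}.png",           # raw visual image, 620 x 410
        "tactile_x": f"{stem}_x.png",      # tactile X component
        "tactile_z": f"{stem}_z.png",      # tactile Z component
        "segmentation": f"{stem}_seg.png"  # defect segmentation mask
    }

paths = record_paths(7)
```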


Steps to reproduce

In order to capture visual and tactile data simultaneously, we designed a dedicated platform. The platform primarily consists of a three-axis linkage, a digital camera, and a tactile data sampling module for acquiring visual and tactile data, respectively. The tactile sensor, featuring a spherical contact tip with a radius of 0.5 mm, is mounted on the Z-axis and can be moved within the plane along the X- or Y-axis. Dense tactile data of the object is collected as the tactile sensor is carried along the X- and Y-axes by servo motors, which are controlled by a motion controller. The tactile sensor has a measurement range of 2 N and an accuracy of approximately 0.5%. The visual data is captured straightforwardly by a digital IP camera. However, the process of collecting tactile data is relatively complex and time-consuming; we will describe the whole process in a separate paper. Initially, we apply red ink to the contact tip of the tactile sensor to mark the tactile data region and facilitate subsequent alignment of the visual-tactile data. The tactile data collection proceeds as follows:
1) Moving to the starting point. The tactile sensor descends with the Z-axis until it makes contact with the object surface. The contact point is then marked as A(x1, y1).
2) Collecting dense tactile data line by line. The tactile sensor is moved to acquire tactile data along path1. Every 5 ms, the axis servo motor and the tactile sensor continuously provide the corresponding coordinates and pressure values, respectively, after sampling by the FPGA module. Specifically, the pressure value contains three components (X, Y, Z). Upon reaching point B, the sensor returns to point A and moves up by Δy along path3. A new round of line tactile data collection then repeats until data collection is completed within the region ABCD.
3) Aligning the visual data. Typically, the range of visual data acquisition is much larger than that of tactile data.
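Steps 1)–2) amount to a line-by-line raster scan of region ABCD: scan from A toward B, return, step up by Δy, repeat. A minimal sketch of the resulting sample schedule follows; all numeric parameters (line length, Δy, sample spacing) are hypothetical placeholders, not the actual platform settings.

```python
# Sketch of the line-by-line tactile scan described in steps 1)-2).
# The sensor scans each line from A toward B, returns to the A side,
# steps up by dy (path3), and repeats until the ABCD region is covered.
# All parameter values used below are hypothetical placeholders.

def scan_points(x1, y1, line_len, dy, n_lines, dx):
    """Return (x, y) sample positions of the raster scan."""
    points = []
    for line in range(n_lines):
        y = y1 + line * dy             # path3: step up by dy per line
        x = x1
        while x <= x1 + line_len:      # path1: scan from A toward B
            points.append((x, y))
            x += dx                    # spacing = scan speed x 5 ms period
        # the sensor returns to the A side before starting the next line
    return points

pts = scan_points(x1=0.0, y1=0.0, line_len=1.0, dy=0.5, n_lines=3, dx=0.25)
```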
Therefore, we clip the visual image to the same data region according to the four corner marks left by the contact tip of the tactile sensor.
4) Tactile data interpolation. The tactile data is initially sampled by the FPGA module and stored in the control center. Subsequently, we apply cubic spline interpolation to the tactile data, followed by data visualization, to match the resolution of the visual image. Given the sampling distance along the Y direction, the Y component of the tactile data is relatively insignificant. Therefore, the proposed dataset only includes the tactile components of the X and Z axes.
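The interpolation in step 4) can be sketched with an off-the-shelf bivariate cubic spline. The coarse grid size and test signal below are hypothetical, and the authors' exact interpolation code is not reproduced here; this only illustrates upsampling a sparse tactile grid to the 620 × 410 visual resolution.

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

# Hypothetical coarse tactile grid (one pressure component),
# far sparser than the 620 x 410 visual image.
ny_coarse, nx_coarse = 21, 31
y = np.linspace(0.0, 1.0, ny_coarse)
x = np.linspace(0.0, 1.0, nx_coarse)
pressure = np.sin(np.pi * y)[:, None] * np.cos(np.pi * x)[None, :]

# Cubic spline in both directions (kx = ky = 3), as in step 4).
spline = RectBivariateSpline(y, x, pressure, kx=3, ky=3)

# Evaluate on a dense grid matching the visual resolution (620 x 410).
y_fine = np.linspace(0.0, 1.0, 410)
x_fine = np.linspace(0.0, 1.0, 620)
tactile_image = spline(y_fine, x_fine)   # array of shape (410, 620)
```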


Hangzhou Normal University, Zhejiang Lab


Computer Vision, Data Fusion, Artifact Detection, Tactile Perception, Haptics


National Key Research and Development Program of China


Key Research Project of the Zhejiang Lab

K2022PG1BB01, 2022MG0AC04

Youth Fund of the Zhejiang Lab

K2023MG0AA11, K2023MG0AA02