MV-RED Object Dataset-2016


Introduction

In our track, we release a real-world and 3D-printed object dataset with multi-view and multimodal information, named the Multi-view RGB-D Object Dataset (MV-RED), recorded by the Multimedia Institute of Tianjin University. The MV-RED dataset consists of 605 objects spanning everyday categories such as apple, cap, scarf, cup, mushroom, and toy. For each object, both RGB and depth information were recorded. Some example views are provided in Fig.1. The dataset will be made publicly available so as to enable rapid progress based on this promising technology.

Fig 1. RGB image and depth image of a banana object.


Dataset

This dataset was recorded under three different settings.

(1) 202 objects were recorded in the first setting, completed in March 2014. They were recorded with three Kinect sensors (1st generation) mounted as shown in Fig.2. Camera 1 and Camera 2 each captured 36 RGB and 36 depth images by uniformly rotating the table, which was controlled by a step motor. Camera 3 captured only one RGB image and one depth image from a top-down view. In this way, each object has 73 RGB images and 73 depth images in total. The resolution of both RGB and depth images is 640 × 480.

Fig 2. Recording environment of the first version of the dataset

(2) 303 objects were recorded under the second setting, completed in November 2014. These objects belong to 58 categories. They were recorded with three Kinect sensors (1st generation) mounted as shown in Fig.3. Camera 1 and Camera 2 each captured 36 RGB and 36 depth images by uniformly rotating the table, which was controlled by a step motor. Camera 3 captured only one RGB image and one depth image from a top-down view. In this way, each object has 73 RGB images and 73 depth images in total. The resolution of both RGB and depth images is 640 × 480.

(3) 100 objects were 3D printed. These objects were recorded under the second setting, completed in November 2015, and belong to 29 categories. They were recorded with three Kinect sensors (1st generation) mounted as shown in Fig.3. Camera 1 and Camera 2 each captured 36 RGB and 36 depth images by uniformly rotating the table, which was controlled by a step motor. Camera 3 captured only one RGB image and one depth image from a top-down view. In this way, each object has 73 RGB images and 73 depth images in total. Some examples are shown in Fig.4.
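The per-object image count described above follows directly from the capture protocol. The small sketch below, which is illustrative and not part of any official MV-RED tooling, makes the arithmetic explicit; the parameter names are our own.

```python
# Sketch of the per-object image count implied by the recording protocol:
# two rotating cameras capture one image per turntable step, plus one
# top-down camera that captures a single view.

def views_per_modality(rotating_cameras: int = 2,
                       rotation_steps: int = 36,
                       top_down_views: int = 1) -> int:
    """Number of images per modality (RGB or depth) for one object."""
    return rotating_cameras * rotation_steps + top_down_views

print(views_per_modality())        # 73 images per modality
print(2 * views_per_modality())    # 146 images (RGB + depth) per object
```

With the default parameters this reproduces the 73 RGB and 73 depth images per object stated above.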

The difference between the first and second settings lies in the directions of view acquisition, which increases the difficulty of view matching.

We performed foreground segmentation on the RGB images of both versions. The mask of each object will be provided for this contest, as shown in Fig.5.
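Participants will typically want to apply the provided masks to isolate the object from the background. A minimal NumPy sketch is given below; it assumes a binary mask (0 = background, nonzero = foreground) at the same 640 × 480 resolution as the RGB image, since the exact mask file format is not specified here.

```python
import numpy as np

def apply_mask(rgb: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out background pixels, keeping only the segmented object.

    rgb:  (H, W, 3) uint8 image.
    mask: (H, W) array; nonzero entries mark the foreground.
    """
    assert rgb.shape[:2] == mask.shape, "mask must match the image size"
    return rgb * (mask > 0)[..., np.newaxis]

# Toy example: a synthetic 480x640 image with a square foreground region.
rgb = np.full((480, 640, 3), 200, dtype=np.uint8)
mask = np.zeros((480, 640), dtype=np.uint8)
mask[100:300, 200:400] = 255

fg = apply_mask(rgb, mask)
print(fg[150, 250])  # foreground pixel kept: [200 200 200]
print(fg[0, 0])      # background pixel zeroed: [0 0 0]
```

The same masking applies unchanged to single-channel depth maps by dropping the trailing channel axis.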




Fig 3. Recording environment of the second version of the dataset

Fig 4. Some examples of 3D-printed objects in our dataset recorded with the second setting


Fig 5. The results of foreground segmentation



Download MV-RED Object Dataset


People

Weizhi Nie Anan Liu Yuting Su Xixi Li
Qun Cao Zhongyang Wang Xiaorong Zhu Ning Xu
Fan Yu Yang Li Xiaoxue Li Yaoyao Liu
Fuwu Li Yang Shi Yahui Hao Zhengyu Zhao


Acknowledgement

This work was supported by the National Natural Science Foundation of China (61472275, 61100124), the Tianjin Research Program of Application Foundation and Advanced Technology, and a grant from the Elite Scholar Program of Tianjin University.


Last Update: 07/09/2016