MV-RED Object Dataset

Introduction

In our track, we will release a real-world object dataset with multi-view and multimodal information, named the Multi-view RGB-D Object Dataset (MV-RED), which was recorded by the Multimedia Institute of Tianjin University. The MV-RED dataset consists of 505 objects, including apples, caps, scarves, cups, mushrooms, toys and so on. For each object, both RGB and depth information were recorded. Some example views are provided in Fig.1. The dataset will be made publicly available to enable rapid progress in research based on multi-view RGB-D data.

Fig 1. RGB image and depth image of a banana object.

Dataset

The dataset was recorded under two different settings.

(1) 202 objects were recorded in the first setting, which was completed in March 2014. The objects were recorded with three first-generation Kinect sensors mounted as shown in Fig.2. Camera 1 and Camera 2 each captured 360 RGB images and 360 depth images while the table was rotated uniformly by a step motor. Camera 3 captured only one RGB image and one depth image from the top-down view. In this way, each object has 721 RGB images and 721 depth images in total. The resolution of each RGB and depth image is 640 × 480. In addition, we uniformly sampled the images from Camera 1 and Camera 2 at a step of 10 degrees to provide a compact version with 73 images each for the RGB and depth data: 36 images from Camera 1, 36 images from Camera 2, and 1 image from Camera 3.

Fig 2. Recording environment of the first version of the dataset

(2) 303 objects were recorded in the second setting, which was completed in November 2014. These objects belong to 58 categories. The objects were recorded with three first-generation Kinect sensors mounted as shown in Fig.3. Camera 1 and Camera 2 each captured 360 RGB images and 360 depth images while the table was rotated uniformly by a step motor. Camera 3 captured only one RGB image and one depth image from the top-down view. In this way, each object has 721 RGB images and 721 depth images in total. The resolution of each RGB and depth image is 640 × 480. In addition, we uniformly sampled the images from Camera 1 and Camera 2 at a step of 10 degrees to provide a compact version with 73 images each for the RGB and depth data: 36 images from Camera 1, 36 images from Camera 2, and 1 image from Camera 3.

The two settings differ in the directions from which the views were acquired, which increases the difficulty of view matching.
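The per-object image counts quoted for both settings follow from simple arithmetic, which the short Python sketch below reproduces. It assumes that each side camera takes one view per degree of table rotation (which is consistent with 360 views and a 10-degree sampling step) and that sampling starts at angle 0; the starting angle and all names in the code are illustrative assumptions, not part of the dataset specification.

```python
# Assumption: one view per degree of turntable rotation for Camera 1 and Camera 2.
FULL_VIEWS_PER_SIDE_CAMERA = 360
SAMPLING_STEP_DEGREES = 10
TOP_DOWN_VIEWS = 1  # Camera 3 contributes a single top-down view


def sampled_angles(step=SAMPLING_STEP_DEGREES, full=FULL_VIEWS_PER_SIDE_CAMERA):
    """Return the turntable angles kept when sampling every `step` degrees.

    Starting at angle 0 is an assumption made for illustration.
    """
    return list(range(0, full, step))


def images_per_modality(views_per_side_camera):
    """Count images of one modality (RGB or depth) for a single object."""
    return 2 * views_per_side_camera + TOP_DOWN_VIEWS


complete = images_per_modality(FULL_VIEWS_PER_SIDE_CAMERA)
compact = images_per_modality(len(sampled_angles()))
print(complete, compact)  # prints "721 73"
```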

We performed foreground segmentation on the RGB images of both versions. The mask of each object will be provided for this contest, as illustrated in Fig.4. To sum up, the MV-RED dataset contains two versions of the data, which are summarized in Table 1.



Versions             Complete Version    Concise Version
# of RGB Images      721                 73
# of Depth Images    721                 73

Table 1. Number of images per object in the two versions of the dataset


Fig 3. Recording environment of the second version of the dataset

Fig 4. The results of foreground segmentation
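As an illustration of how a provided mask might be used, the sketch below keeps the foreground pixels of an RGB image and blanks out the background. It assumes that the mask has the same resolution as the RGB image and that nonzero mask values mark the object; the actual mask encoding is not specified here, so both points are assumptions, and the function name `apply_mask` is hypothetical. Images are represented as nested Python lists to keep the example free of dependencies.

```python
def apply_mask(rgb, mask, background=(0, 0, 0)):
    """Keep pixels where the mask is nonzero; replace the rest with `background`.

    `rgb` is a list of rows of (R, G, B) tuples and `mask` is a list of rows of
    integers. Treating nonzero as foreground is an assumption.
    """
    same_height = len(rgb) == len(mask)
    same_width = all(len(row) == len(mrow) for row, mrow in zip(rgb, mask))
    if not (same_height and same_width):
        raise ValueError("image and mask must have the same size")
    return [
        [pixel if m else background for pixel, m in zip(row, mrow)]
        for row, mrow in zip(rgb, mask)
    ]


# Tiny 2 x 2 example: only the top-left and bottom-right pixels are foreground.
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 0)]]
mask = [[1, 0],
        [0, 255]]
segmented = apply_mask(image, mask)
```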



Download MV-RED Object Dataset

Download

In this track, we provide 311 objects as query objects and 505 objects as testing objects. Each object is provided in two versions: one with 73 images and the other with 721 images. For each version, we provide RGB images, depth images and mask images. The dataset is hosted on an FTP server whose IP address is 123.56.92.50. To register, participants should send an email to truman.nie@gmail.com or weizhinie@tju.edu.cn; we will then send a username and password for downloading the dataset.

Note: Participants also need to download and fill in the Agreement and Disclaimer Form and send it back to us with their registration email. We will then email the instructions for downloading the dataset.
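Once credentials have been received, files can be fetched with any FTP client. The sketch below uses Python's standard `ftplib` as one possible approach. The directory layout and file names on the server are not described here, so the remote file name, the helper names `download_file` and `fetch`, and the credentials shown in the usage comment are placeholders only.

```python
import os
from ftplib import FTP

FTP_HOST = "123.56.92.50"  # server address given in the text above


def download_file(ftp, remote_name, local_path):
    """Fetch one remote file in binary mode and write it to `local_path`."""
    with open(local_path, "wb") as fh:
        ftp.retrbinary(f"RETR {remote_name}", fh.write)


def fetch(username, password, remote_name, local_dir="."):
    """Log in with the credentials received by email and download one file."""
    local_path = os.path.join(local_dir, os.path.basename(remote_name))
    with FTP(FTP_HOST) as ftp:
        ftp.login(user=username, passwd=password)
        download_file(ftp, remote_name, local_path)
    return local_path


# Usage (placeholders; the real file names come with the download instructions):
# fetch("your-username", "your-password", "some_archive.zip")
```

Separating `download_file` from the login step keeps the transfer logic reusable across several files within a single session.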