Multimedia Institute


Multi-view RGB-D Object Dataset

Keywords: 3D Model, Depth Image, MultiModal


Data Description:


This dataset was recorded under two different settings.
  1. 202 objects were recorded under the first setting, which was completed in March 2014. The recording used three Kinect sensors. Camera 1 and Camera 2 each captured 360 RGB images and 360 depth images while a step motor rotated the table uniformly. Camera 3 captured only one RGB image and one depth image from the top-down view. Each object therefore has 721 RGB images and 721 depth images in total. The resolution of each RGB/depth image is 640 × 480. We also uniformly sampled the images from Camera 1 and Camera 2 at a step of 10 degrees and provide a compact version with 73 images each for RGB and depth data: 36 images from Camera 1, 36 from Camera 2, and 1 from Camera 3.

  2. 303 objects were recorded under the second setting, which was completed in November 2014. These objects belong to 58 categories. The recording used three Kinect sensors. Camera 1 and Camera 2 each captured 360 RGB images and 360 depth images while a step motor rotated the table uniformly. Camera 3 captured only one RGB image and one depth image from the top-down view. Each object therefore has 721 RGB images and 721 depth images in total. The resolution of each RGB/depth image is 640 × 480. We also uniformly sampled the images from Camera 1 and Camera 2 at a step of 10 degrees and provide a compact version with 73 images each for RGB and depth data: 36 images from Camera 1, 36 from Camera 2, and 1 from Camera 3.
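
The image counts above can be checked with a small sketch. It assumes that each rotating camera stores one frame per degree of table rotation and that sampling starts at frame index 0; neither the indexing convention nor the function names below come from the dataset itself, so treat them as illustrative only.

```python
# Illustrative sketch only: frame indexing (one frame per degree, starting at 0)
# is an assumption, not a documented property of the dataset files.

FRAMES_PER_ROTATING_CAMERA = 360  # Camera 1 and Camera 2, per modality
ROTATING_CAMERAS = 2
TOP_DOWN_FRAMES = 1               # Camera 3, per modality


def compact_view_indices(frames=FRAMES_PER_ROTATING_CAMERA, step_degrees=10):
    """Frame indices kept for one rotating camera when sampling every `step_degrees`."""
    return list(range(0, frames, step_degrees))


def full_view_count():
    """Images per object and per modality (RGB or depth) in the full version."""
    return ROTATING_CAMERAS * FRAMES_PER_ROTATING_CAMERA + TOP_DOWN_FRAMES


def compact_view_count(step_degrees=10):
    """Images per object and per modality (RGB or depth) in the compact version."""
    kept = len(compact_view_indices(step_degrees=step_degrees))
    return ROTATING_CAMERAS * kept + TOP_DOWN_FRAMES


if __name__ == "__main__":
    print(full_view_count())     # full version, per modality
    print(compact_view_count())  # compact version, per modality
```

With the default step of 10 degrees, each rotating camera contributes 36 frames, which matches the 73-image compact version described above.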

More information is available here.

Representative Publications:

  1. Nie, Wei-Zhi; Liu, An-An; Su, Yu-Ting; "3D object retrieval based on sparse coding in weak supervision", Journal of Visual Communication and Image Representation, Academic Press, 2015.

  2. Gao, Yue; Liu, Anan; Nie, Weizhi; Su, Yuting; Dai, Qionghai; Chen, Fuhai; Chen, Yingying; Cheng, Yanhua; Dong, Shuilong; Duan, Xingyue; "3D Object Retrieval with Multimodal Views", The Eurographics Association, 2015.
* You can download the dataset here.

M2I Dataset

Keywords: Action recognition, Multi-modal, Multi-view, Interaction


Data Description:

  1. The Multi-modal & Multi-view & Interactive (M2I) dataset provides person-person interaction actions and person-object interaction actions. It consists of 22 action categories performed by a total of 22 unique individuals. Each action is performed twice by 20 groups (two persons per group). All RGB images, depth data, and skeleton data are preprocessed to remove noise. For evaluation, the samples are divided by group into a training set (8 groups), a validation set (6 groups), and a test set (6 groups).
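
A group-wise split like the one described above can be sketched as follows. The group IDs 1 to 20 and the in-order assignment of groups to sets are hypothetical; the description does not state which groups belong to which set. The per-set counts assume that every group performs each of the 22 actions twice, and they count performances rather than files, since the number of files also depends on the views and modalities recorded.

```python
# Illustrative sketch only: group IDs and their assignment to sets are
# hypothetical; the dataset's official group lists should be used instead.

NUM_ACTIONS = 22
REPETITIONS = 2
SPLIT_SIZES = {"train": 8, "validation": 6, "test": 6}


def split_groups(group_ids, sizes=SPLIT_SIZES):
    """Assign whole groups to sets so that no group appears in two sets."""
    group_ids = list(group_ids)
    if len(group_ids) != sum(sizes.values()):
        raise ValueError("number of groups must match the total split size")
    splits, start = {}, 0
    for name, size in sizes.items():
        splits[name] = group_ids[start:start + size]
        start += size
    return splits


def performances_per_split(splits):
    """Number of action performances per set, assuming every group does every action twice."""
    return {name: len(groups) * NUM_ACTIONS * REPETITIONS
            for name, groups in splits.items()}


if __name__ == "__main__":
    splits = split_groups(range(1, 21))
    print(splits)
    print(performances_per_split(splits))
```

Splitting by group rather than by sample keeps the same pair of performers from appearing in both the training data and the evaluation data.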

Representative Publications:

  1. Xu, Ning; Liu, Anan; Nie, Weizhi; Wong, Yongkang; Li, Fuwu; Su, Yuting; "Multi-modal & Multi-view & Interactive Benchmark Dataset for Human Action Recognition", Proceedings of the 23rd Annual ACM Conference on Multimedia, pp. 1195-1198, ACM, 2015.

* You can download the dataset here.

Media Security And Forensics

PI: Su Yu-Ting, Zhang Jing, and Zhang Cheng-Qian

Keywords: Information Hiding, Digital Watermarking, Tamper Detection


Representative Publications:

  1. Junyu Xu, Yuting Su, Xingang You, Detection of video transcoding for digital forensics, Audio, Language and Image Processing, pp.160-164, 2012.
  2. Yuting Su, Chengqian Zhang, Chuntian Zhang, A video steganalytic algorithm against motion-vector-based steganography, Signal Processing, 2011.
  3. Yuting Su, Jing Zhang, Yu Han, Jing Chen, Qingzhong Liu, Exposing digital video logo-removal forgery by inconsistency of blur, International Journal of Pattern Recognition and Artificial Intelligence, November 2010, 24(7): 1027-1046.