SHREC 16 Track

3D Object Retrieval with Multimodal Views





Why SHREC?

The aim of SHREC '16 is to provide a common benchmark for the evaluation of the effectiveness of 3D-shape retrieval algorithms. This year the competition is organized into different tracks. This page addresses the track of 3D Object Retrieval with Multimodal Views.


Description of the track

Fig 1. RGB images and depth images of a banana object captured from different angles.

View-based 3D object retrieval aims to retrieve 3D objects that are represented by groups of multiple views. Most existing methods start from 3D model information, but such model information is hard to obtain in real-world applications. When no 3D model is available, model-based methods require a 3D reconstruction procedure that generates a virtual model from a collection of images. However, 3D model reconstruction is computationally expensive and its performance depends heavily on the sampled images, which severely limits the practical applicability of model-based methods.

With the wide availability of color and depth acquisition devices, such as the Kinect and camera-equipped mobile devices, it has become feasible to record color and/or depth visual information for real objects. In this way, 3D object retrieval can be further extended to real objects in the world.

Starting from the Light Field Descriptor in 2003, much research attention has focused on view-based methods in recent years. Nevertheless, retrieving objects from views remains a hard task; the challenges lie in view extraction, visual feature extraction, and object distance measurement.

We successfully organized the “3D Object Retrieval with Multimodal Views” track in SHREC 2015, which attracted 6 teams with 13 groups of results. This year, we plan to further expand the impact of this track and to explore the influence of different types of objects, i.e., real objects and 3D printed objects, on retrieval performance.

We have therefore built an object dataset with two parts, real objects and 3D printed objects, captured using Kinect. The real-object part contains 505 objects and the 3D printed part contains 100 objects from 60 categories. For each 3D object, two groups of color and depth images, a complete image set and a selected image set, are recorded for representation. 3D object retrieval is then based on the multimodal information, i.e., multiple color and depth views. With this new benchmark, we organize this track to further foster focused attention on the latest research progress in this area. In our task, 100 real objects and all 100 3D printed objects are selected as queries, and the remaining 405 real objects are used as the retrieval dataset.
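To make the split concrete, here is a minimal Python sketch. Only the counts come from the description above; the object identifiers and the particular choice of which 100 real objects serve as queries are hypothetical placeholders.

# Minimal sketch of the benchmark split described above.
# Only the counts come from the track description; identifiers and the
# particular choice of 100 real query objects are hypothetical placeholders.

REAL_OBJECTS = ["real_%03d" % i for i in range(505)]        # 505 real objects
PRINTED_OBJECTS = ["printed_%03d" % i for i in range(100)]  # 100 3D printed objects

QUERY_SET = REAL_OBJECTS[:100] + PRINTED_OBJECTS  # 100 real + 100 printed queries
GALLERY_SET = REAL_OBJECTS[100:]                   # remaining 405 real objects

assert len(QUERY_SET) == 200 and len(GALLERY_SET) == 405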


What to do for participation

Each participant is requested to:



Submission Format

For our track, the 605 objects belong to 61 categories, and the number of objects in each category ranges from 1 to 20. The 100 selected real objects and all 100 3D printed objects are each used as a query once.

For each object, we provide 73 color images and 73 depth images as its representation. Participants who have more than one method or more than one result should provide a separate folder for each result, and each folder should be named after the corresponding method.

For example, suppose a participant's method is named CCFV. The author should create a text file named author-CCFV.txt, where author is replaced by the first author's surname and CCFV is the method's name. In this track, 200 objects form the query set and 405 objects form the test set, so each result file should contain 200 rows, one per query. In each row, the first entry is the query model and the following entries are the ranked retrieval results, separated by spaces.
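As an illustration of this layout, the short Python sketch below writes one result file in the required format. Here rank_gallery is a placeholder for a participant's own retrieval method, and the surname and method name are just the example values from above.

# Sketch of writing a submission file in the layout described above:
# one row per query, the query identifier first, followed by the ranked
# gallery objects, all separated by spaces.

def rank_gallery(query_id, gallery_ids):
    # Placeholder: return gallery identifiers ordered from most to least
    # similar to the query. Replace with the actual retrieval method.
    return sorted(gallery_ids)

def write_submission(query_ids, gallery_ids, surname="author", method="CCFV"):
    with open("%s-%s.txt" % (surname, method), "w") as f:
        for q in query_ids:
            ranked = rank_gallery(q, gallery_ids)
            f.write(" ".join([q] + list(ranked)) + "\n")

# Example: write_submission(QUERY_SET, GALLERY_SET) produces 200 rows.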

You can download the dataset here. A sample retrieval result file in the standard format is available here.


Timeline



Results

Coming soon.


Evaluation

To evaluate the performance of the different methods on this task, the Precision-Recall curve (PR curve), Nearest Neighbor (NN), First Tier (FT), Second Tier (ST), F1-measure (F1), normalized discounted cumulative gain (NDCG), and average normalized modified retrieval rank (ANMRR) are employed as evaluation criteria; a small illustrative sketch of several of these measures follows the list below.

1. The PR curve comprehensively demonstrates retrieval performance; it is assessed in terms of average recall and average precision, and has been widely used in multimedia applications.

2. NN evaluates the retrieval accuracy of the first returned result.

3. FT is defined as the recall of the top T results, where T is the number of relevant objects for the query.

4. ST is defined as the recall of the top 2T results.

5. F1 jointly evaluates the precision and the recall of the top returned results. In our experiments, the top 20 retrieved results are used for the F1 calculation.

6. NDCG is a statistic that assigns higher weights to relevant results at top ranking positions, under the assumption that a user is less likely to consider lower-ranked results.

7. ANMRR is a rank-based measure that considers the ranking positions of relevant objects among the retrieved objects. A lower ANMRR value indicates better performance, i.e., relevant objects are ranked at top positions.
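The Python sketch below illustrates, under simplified assumptions, how NN, FT, ST, and F1 (top 20) can be computed for a single query from its ranked result list; the official evaluation code may differ in details such as tie handling and averaging over queries.

# Illustrative computation of NN, FT, ST, and F1 (top 20) for one query,
# given its ranked gallery list and the set of relevant objects
# (objects of the same category). This is a sketch, not the official script.

def nn_ft_st_f1(ranked, relevant, f1_cutoff=20):
    T = len(relevant)                                  # relevant objects for this query
    nn = 1.0 if ranked and ranked[0] in relevant else 0.0
    ft = sum(1 for r in ranked[:T] if r in relevant) / float(T)      # recall of top T
    st = sum(1 for r in ranked[:2 * T] if r in relevant) / float(T)  # recall of top 2T
    hits = sum(1 for r in ranked[:f1_cutoff] if r in relevant)
    precision = hits / float(f1_cutoff)
    recall = hits / float(T)
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return nn, ft, st, f1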


Organizers


Tianjin University & Tsinghua University


Last Update: 07/09/2016