SHREC '15 Track
3D Object Retrieval with Multimodal Views
The aim of SHREC '15 is to provide a common benchmark for the evaluation of the effectiveness of
3D-shape retrieval algorithms. This year the competition is organized into different tracks.
This page addresses the track of 3D Object Retrieval with Multimodal Views.
View-based 3D object retrieval aims to retrieve 3D objects that are represented by a group of multiple views. Most existing methods start from 3D model information, which is hard to obtain in real-world applications. When no 3D model is available, model-based methods require a 3D model construction procedure to generate a virtual model from a collection of images. 3D model reconstruction is computationally expensive, and its performance depends heavily on the sampled images, which severely limits practical applications of model-based methods.
With the wide availability of color and/or depth acquisition devices, such as Kinect and mobile devices with cameras, it has become feasible to record color and/or depth visual information for real objects. In this way, the application of 3D object retrieval can be extended to real objects in the world.
Starting with the Light Field Descriptor in 2003, much research attention has focused on view-based methods in recent years. Retrieving objects via views nevertheless remains a hard task. The challenges lie in view extraction, visual feature extraction, and object distance measurement.
We have built a real object dataset using Kinect, containing 505 objects from 61 categories. For each 3D object, two groups of color and depth images, a complete image set and a selected image set, are recorded for representation. 3D object retrieval is then based on the multimodal information, i.e., multiple color and depth views. Based on this new benchmark, we plan to organize this track to foster focused attention on the latest research progress in this area.
Each participant is requested to:
In our track, the 505 objects belong to 61 categories, and the number of objects in each category ranges from 1 to 20. The categories containing at least 10 objects are selected as the queries; 23 categories qualify, and together they contain 311 objects.
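The query selection rule above can be sketched as follows. This is only an illustration: the `labels` dictionary (object identifier mapped to category name), the function name `select_queries`, and the toy data are assumptions, not part of the released dataset.

```python
from collections import Counter


def select_queries(labels, min_size=10):
    """Return (query categories, query objects).

    `labels` is a hypothetical dict mapping object id -> category name.
    A category qualifies when it contains at least `min_size` objects.
    """
    counts = Counter(labels.values())
    query_categories = {c for c, n in counts.items() if n >= min_size}
    queries = [obj for obj, c in labels.items() if c in query_categories]
    return query_categories, queries


# Toy data: one category with 12 objects, one with 3 objects.
toy_labels = {f"cup_{i}": "cup" for i in range(12)}
toy_labels.update({f"shoe_{i}": "shoe" for i in range(3)})

categories, queries = select_queries(toy_labels)
print(sorted(categories), len(queries))
```

On the real dataset, the same rule would be expected to yield the 23 categories and 311 query objects stated above.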
In our track, two 3D object retrieval tasks are launched, which employ the complete version (721 images) and the concise version (73 images) of the data, respectively. In each task, each of the 311 query objects is used as the query once. For each object, we provide two image sets (73 images and 721 images). Thus, a successful submission should include 2 files, which contain the retrieval results based on the two image sets. Participants who have more than one method and more than one result should provide a separate folder for each result. Each folder should be named after the method. Participants should also provide the code corresponding to each result.
For example, suppose the method is named CCFV. The author should create a folder named author-CCFV, where author is replaced with the first author's surname and CCFV is the method's name. This folder should include two files, named 73.txt and 721.txt. In this track, we provide 311 objects as the query dataset and 505 objects as the test dataset. Thus, each of the two txt files should contain 311 rows and 505 columns. Each row represents the retrieval result of one query. The first column represents the query model and the remaining columns represent the retrieval results. The results are separated by spaces.
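A minimal sketch of writing and re-reading a result file in the layout described above (one row per query, query identifier first, results separated by spaces). The helper names and the object identifiers are assumptions; the standard retrieval result file remains the authoritative reference for the exact identifier format and column count, which this sketch does not enforce.

```python
import os
import tempfile


def write_results(path, rankings):
    """Write one row per query: query id, then the ranked result ids."""
    with open(path, "w") as f:
        for query_id, ranked_ids in rankings:
            row = [str(query_id)] + [str(r) for r in ranked_ids]
            f.write(" ".join(row) + "\n")


def read_results(path):
    """Read the file back as a list of (query id, ranked result ids)."""
    rows = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if fields:
                rows.append((fields[0], fields[1:]))
    return rows


# Toy usage: folder "author-CCFV" holding 73.txt (ids are made up).
folder = os.path.join(tempfile.mkdtemp(), "author-CCFV")
os.makedirs(folder)
toy_rankings = [("q1", ["o3", "o1", "o2"]), ("q2", ["o2", "o3", "o1"])]
write_results(os.path.join(folder, "73.txt"), toy_rankings)
print(read_results(os.path.join(folder, "73.txt")))
```

Reading the file back before submission is a cheap way to check that every row has the same number of columns.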
You can download the dataset here. You can find a standard retrieval result file here.
You can download the paper here.
To evaluate the performance of different methods in the task, the following evaluation criteria are employed: the Precision-Recall curve (PR curve), Nearest Neighbor (NN), First Tier (FT), Second Tier (ST), F1-measure (F1), normalized discounted cumulative gain (NDCG), and average normalized modified retrieval rank (ANMRR).
1. PR curve comprehensively demonstrates retrieval performance; it is assessed in terms of average recall and average precision, and has been widely used in multimedia applications.
2. NN evaluates the retrieval accuracy of the first returned result.
3. FT is defined as the recall of the top T results, where T is the number of relevant objects for the query.
4. ST is defined as the recall of the top 2T results.
5. F1 jointly evaluates the precision and the recall of top returned results. In our experiments, top 20 retrieved results are used for F1 calculation.
6. NDCG is a statistic that assigns relevant results at the top ranking positions with higher weights under the assumption that a user is less likely to consider lower results.
7. ANMRR is a rank-based measure, and it considers the ranking information of relevant objects among the retrieved objects. A lower ANMRR value indicates a better performance, i.e., relevant objects rank at top positions.
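Several of the measures above can be sketched for a single query as follows. This is an illustration rather than the official evaluation code. It assumes that `ranked` is the returned list with the query itself excluded, that `relevant` is the non-empty set of objects sharing the query's category, and that NDCG uses binary relevance with a log2 discount (one common variant). All function names are hypothetical; PR curves and ANMRR are omitted.

```python
import math


def nearest_neighbor(ranked, relevant):
    """NN: 1.0 if the first returned result is relevant, else 0.0."""
    return 1.0 if ranked and ranked[0] in relevant else 0.0


def recall_at(ranked, relevant, k):
    """Fraction of relevant objects found in the top k results."""
    hits = sum(1 for r in ranked[:k] if r in relevant)
    return hits / len(relevant)


def first_tier(ranked, relevant):
    """FT: recall of the top T results, T = number of relevant objects."""
    return recall_at(ranked, relevant, len(relevant))


def second_tier(ranked, relevant):
    """ST: recall of the top 2T results."""
    return recall_at(ranked, relevant, 2 * len(relevant))


def f1_at(ranked, relevant, k=20):
    """F1 of precision and recall over the top k results."""
    top = ranked[:k]
    hits = sum(1 for r in top if r in relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(top)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


def ndcg(ranked, relevant):
    """NDCG with binary gains and a 1 / log2(rank + 1) discount (assumed)."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, r in enumerate(ranked) if r in relevant)
    ideal = sum(1.0 / math.log2(i + 2)
                for i in range(min(len(relevant), len(ranked))))
    return dcg / ideal if ideal > 0 else 0.0


# Toy query: three relevant objects (a, b, c) among six results.
toy_ranked = ["a", "x", "b", "y", "c", "z"]
toy_relevant = {"a", "b", "c"}
print(nearest_neighbor(toy_ranked, toy_relevant),
      first_tier(toy_ranked, toy_relevant),
      second_tier(toy_ranked, toy_relevant))
```

Per-query scores would then be averaged over all queries to obtain the reported figures.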
Last update: 23/12/2014