The key idea of our virtual rephotography evaluation method is to evaluate an image-based reconstruction by comparing its renderings with 2D ground truth: the original photographs. To exclude fraudulent reconstructions and prevent evaluation bias, each benchmark dataset contains only a subset of all dataset photos. We keep the remaining images secret and evaluate your submissions against them.
For more information regarding the method and its properties, please refer to our paper:
The renderings, or virtual rephotos, of your reconstruction will be rated with respect to completeness and visual error.
Visual error is measured by comparing virtual rephotos with the test images using a patch-based image metric (1-NCC, based on normalized cross-correlation). In this comparison we skip all image patches that are invalid in your virtual rephotos (i.e., contain pixels that you left black) or invalid in the test images (i.e., contain ill-exposed pixels).
Completeness is the fraction of valid rephoto image patches among all valid test-image patches.
This means that you can leave regions black if you are uncertain about their correct color, without harming your error score. However, this will decrease your completeness score.
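The interplay of the two scores can be sketched in Python. Note that this is only an illustrative sketch, not the benchmark's implementation: the patch size, the non-overlapping patch grid, and the validity rules (black = value 0, ill-exposed = clipped at 0 or 255) are assumptions here; the authoritative code is the image_comparison application in the code section.

```python
import numpy as np

def patch_ncc(a, b, eps=1e-8):
    """Normalized cross-correlation of two equally sized grayscale patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum()) + eps
    return float((a * b).sum() / denom)

def evaluate(rephoto, test, patch=9):
    """Return (mean 1-NCC error, completeness) over non-overlapping patches.

    Assumed validity rules: a test patch is invalid if it contains
    ill-exposed (clipped) pixels; a rephoto patch is invalid if it
    contains black pixels, which count against completeness only.
    """
    h, w = test.shape
    errors, valid_test, valid_both = [], 0, 0
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            t = test[y:y + patch, x:x + patch].astype(float)
            r = rephoto[y:y + patch, x:x + patch].astype(float)
            if (t <= 0).any() or (t >= 255).any():
                continue                  # ill-exposed test patch: skipped entirely
            valid_test += 1
            if (r <= 0).any():
                continue                  # black rephoto pixels: incomplete, no error
            valid_both += 1
            errors.append(1.0 - patch_ncc(r, t))
    error = float(np.mean(errors)) if errors else float("nan")
    completeness = valid_both / valid_test if valid_test else 0.0
    return error, completeness
```

A rephoto identical to the test image yields an error near 0 and completeness 1; blacking out a region lowers completeness but leaves the error untouched.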
You can find the core application (image_comparison) that performs this comparison in the code section.
If you are not focusing on a complete image-based modeling and rendering pipeline but only on parts of it, have a look at the dataset and code sections. We provide shortcuts for typical steps there (e.g., a point cloud or texturing code).
We offer multiple datasets ranging from a controlled lab setup with regularly sampled images to more complex outdoor scenes with varying camera intrinsics and drastically varying viewpoints. For each dataset we provide the input images and their camera parameters as well as the parameters for the views that we want you to render and submit.
Additionally, we provide our reconstructions and their intermediate steps, which you can use as a shortcut if you only want to evaluate a certain pipeline step or a rendering method that requires further input, such as image-based rendering (IBR). All these reconstructions have been created with our algorithms. Links to these algorithms and further helpers, such as a view renderer for simple models, can be found within the code section. In the code section you can also find the implementation of our evaluation method, which you may use for testing purposes on the training images. However, since we do not publish the test images, you will not be able to exactly reproduce our evaluation results.
Each dataset is available as images and CAM files (or alternatively as an MVE scene). The CAM files provide intrinsic and extrinsic parameters for the images. They consist of two lines and are structured as follows:
First line: Extrinsics - translation vector and rotation matrix
Second line: Intrinsics - focal length (normalized by dividing by the larger image dimension), distortion coefficients, pixel aspect ratio, and principal point (in normalized coordinates [0,1]). A full definition can be found in the MVE Math Cookbook. Because the provided images are already undistorted, the distortion coefficients are zero.
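A minimal reader for this two-line format could look as follows. This is a sketch that assumes the field order described above (translation before rotation, row-major rotation matrix, two distortion coefficients); consult the MVE Math Cookbook for the authoritative definition.

```python
import numpy as np

def read_cam(path):
    """Parse a CAM file: line 1 = extrinsics, line 2 = intrinsics (assumed order)."""
    with open(path) as f:
        ext = [float(v) for v in f.readline().split()]
        intr = [float(v) for v in f.readline().split()]
    t = np.array(ext[:3])                  # translation vector
    R = np.array(ext[3:12]).reshape(3, 3)  # row-major rotation matrix
    f_norm, d0, d1, paspect, ppx, ppy = intr
    return t, R, dict(focal=f_norm, dist=(d0, d1),
                      aspect=paspect, principal=(ppx, ppy))

def intrinsic_matrix(focal, aspect, ppx, ppy, width, height):
    """Build a pixel-space K matrix from the normalized parameters.

    The focal length is normalized by the larger image dimension; the
    principal point is assumed normalized by width/height respectively.
    """
    f_px = focal * max(width, height)
    return np.array([[f_px,           0.0, ppx * width],
                     [0.0, f_px * aspect, ppy * height],
                     [0.0,           0.0,          1.0]])
```

Since the provided images are undistorted, d0 and d1 can simply be ignored when projecting points.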
The Image Comparison application measures the similarity of two images using different patch- and pixel-based image metrics. We use this application to evaluate the quality of your renderings against the test images. Although this application computes several image metrics, the only one used in the context of this benchmark is 1-NCC.
The Virtual Rephotography application renders views from a 3D model according to extrinsic and intrinsic parameters. This is potentially the fastest way to render your model for submission. Currently it supports PLY meshes with vertex color and OBJ models with texture.
The Multi-View Environment is an implementation of a complete end-to-end pipeline for image-based geometry reconstruction. It features Structure-from-Motion, Multi-View Stereo and Surface Reconstruction. The individual steps of the pipeline are available as command line applications, but most features are also available from its user interface UMVE. If you use parts of our pipeline for your reconstruction, please mention that in the submission form.
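As a rough sketch, a full reconstruction with the MVE command line applications might be invoked like this. Directory names and scale flags are placeholders; consult the MVE documentation for the exact options.

```shell
# Create an MVE scene from an image directory (paths are hypothetical).
makescene -i images/ scene/
# Structure-from-Motion: camera poses and a sparse point cloud.
sfmrecon scene/
# Multi-View Stereo: per-view depth maps at a reduced scale.
dmrecon -s2 scene/
# Merge the depth maps into a dense point set.
scene2pset -F2 scene/ pset.ply
# Surface reconstruction and mesh cleanup.
fssrecon pset.ply surface.ply
meshclean surface.ply mesh.ply
```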
The Multi-View Stereo Texturing algorithm creates high quality textures for triangle meshes from images given their intrinsic and extrinsic parameters. It addresses many challenges occurring in real-world datasets such as occluders, (out-of-focus) blur, varying scale, exposure time and illumination. If you use our texturing algorithm for your reconstruction, please mention that in the submission form.
You may submit one or multiple datasets within an archive (.tar.gz, .zip). Please create a folder for each dataset, named exactly like the dataset. This folder should contain your renderings for the test camera parameters as PNG images (8-bit, no alpha channel). The prefix of each rendering has to be exactly the same as that of the .CAM file it was created from.
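A submission archive could be assembled like this; the dataset name "example_dataset" is a placeholder for the real dataset name.

```shell
# One folder per dataset, named exactly like the dataset.
mkdir -p submission/example_dataset
# Place your renderings there as 8-bit PNGs without alpha channel,
# each sharing its filename prefix with the .CAM file it was rendered from:
#   view_0000.cam -> view_0000.png
# cp renders/*.png submission/example_dataset/
tar czf submission.tar.gz -C submission example_dataset
```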
If you need assistance, want to change your submission name, add a project website, etc., please feel free to contact us: