Figure 1: Schematic process of 3D reconstruction
This subsystem deals with the conversion of sensor data (camera images, 3D point clouds, 3D audio and other sources) into a 3D Scene Graph, and with the reverse operation of rendering a 3D Scene Graph into a 2D image. The 3D Scene Graph in turn forms the input for the Plan Recognition subsystem, which tries to determine the Goals and Plans underlying any changes in the 3D Scenes. Below we specify a possible algorithm that complies with the requirements of the subsystems higher up in the processing hierarchy.
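To make the data flow concrete, here is a minimal sketch of what the Scene Graph passed between these subsystems might look like. The field names and the two-dataclass shape are assumptions for illustration; a real Scene Graph would also carry textures, lighting and camera filter information.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class SceneNode:
    """One object in the scene; children are spatially attached objects."""
    name: str
    pose: Tuple[float, float, float]              # position in scene coordinates
    children: List["SceneNode"] = field(default_factory=list)

@dataclass
class SceneGraph:
    """Output of 3D reconstruction and input to Plan Recognition."""
    root: SceneNode
    timestamp: float                              # capture time of the sensor frame

# A toy scene, as the reconstruction subsystem might emit it:
scene = SceneGraph(
    root=SceneNode("room", (0.0, 0.0, 0.0), children=[
        SceneNode("table", (2.0, 0.0, 0.0), children=[
            SceneNode("ball", (2.0, 0.9, 0.0)),
        ]),
    ]),
    timestamp=0.0,
)
print(scene.root.children[0].children[0].name)  # → ball
```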
For "rendering" a 3D Scene Graph into a 2D image we propose to use an off-the-shelf Rendering Engine, as available in several computer gaming platforms. In addition to the Scene Graph, the Rendering Engine needs information about the textures of the objects, the lighting, and filters resembling the optical characteristics of the camera used to record the real-world scene. With this information the Rendering Engine can create 2D images very similar to the original sensor data. For the reverse operation of reconstructing a 3D Scene Graph from 2D sensor data we propose an iterative algorithm that arranges and configures the objects in a scene graph, together with lighting and filters, so as to render a 2D image that minimizes a distance measure (delta) to the original 2D image.
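The iterative render-and-compare loop can be sketched as follows. This is a toy analysis-by-synthesis example, not the actual implementation: the "Rendering Engine" is replaced by a function that draws a rectangle parameterized by two scene values, and the optimizer is a simple randomized hill climber over those parameters.

```python
import random

GRID = 16  # toy image resolution

def render(params):
    """Toy stand-in for the Rendering Engine: draws a filled rectangle of
    width params['x'] and height params['y'] into a GRID x GRID image."""
    return [[255 if j < params["x"] and i < params["y"] else 0
             for j in range(GRID)] for i in range(GRID)]

def image_distance(a, b):
    """Pixel-wise L1 distance (the 'delta' measure) between two images."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def reconstruct(target, iterations=500, seed=0):
    """Iteratively perturb the scene parameters, keeping only changes that
    reduce the distance between the re-rendered and the observed image."""
    rng = random.Random(seed)
    params = {"x": 1, "y": 1}
    best = image_distance(render(params), target)
    for _ in range(iterations):
        cand = dict(params)
        key = rng.choice(["x", "y"])
        cand[key] = max(1, min(GRID - 1, cand[key] + rng.choice([-1, 1])))
        d = image_distance(render(cand), target)
        if d < best:
            params, best = cand, d
    return params, best

# "Sensor data": an image of a scene whose true parameters we pretend not to know.
observed = render({"x": 7, "y": 3})
estimate, delta = reconstruct(observed)
print(estimate, delta)  # converges to {'x': 7, 'y': 3} with delta 0
```

A real system would perturb object poses, lighting and camera filters in the same spirit, typically with a gradient-based or sampling-based optimizer rather than single-parameter hill climbing.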
Rendering and parsing Scripts from video sequences works analogously, allowing large amounts of training data to be acquired efficiently from readily available video material.
Video Compression as a Side Application
An interesting practical side application of this type of 3D reconstruction is a video encoder that achieves much higher compression than current systems: instead of transmitting pixel data, it can transmit the reconstructed Scene Graph once and then only the per-frame changes to it.
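A minimal sketch of this idea, assuming the reconstruction stage emits one scene description per frame (the object names and pose fields below are invented for illustration): the encoder stores the first frame's scene in full and thereafter only the parameters that changed, which is lossless at the scene-graph level and far smaller than repeating every frame.

```python
import json

# Hypothetical per-frame scene descriptions (object -> pose) for a
# 100-frame clip in which only the ball moves.
frames = [{"ball": {"x": 0.0 + 0.01 * t, "y": 1.0},
           "table": {"x": 2.0, "y": 0.0}}
          for t in range(100)]

def encode(frames):
    """Delta-encode scene graphs: first frame in full, then only the
    objects whose parameters changed relative to the previous frame."""
    stream = [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        stream.append({obj: pose for obj, pose in cur.items()
                       if prev.get(obj) != pose})
    return stream

def decode(stream):
    """Rebuild the full per-frame scene graphs from the delta stream."""
    out, state = [], {}
    for delta in stream:
        state = {**state, **delta}
        out.append(dict(state))
    return out

stream = encode(frames)
assert decode(stream) == frames                     # lossless round trip
raw = len(json.dumps(frames))
compressed = len(json.dumps(stream))
print(raw, compressed)  # the delta stream is substantially smaller
```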
Comparison and References
TinyCog does not implement 3D reconstruction, because it is based on a simulation environment where the scene is known beforehand. Other implementations of Scene Based Reasoning may focus more on this aspect of the architecture.