Introduction
A number of software products now exist for building 3D models of objects and scenes from sets of images (for example, Autodesk 123D or PhotoModeler). Working with such programs has already been described in the articles
http://habrahabr.ru/post/134781/ and
http://habrahabr.ru/post/64080/ . In this article I want to describe the general methodology for solving this problem and what each of its stages can do. The article is aimed primarily at readers who are far from this topic but would like to understand how it works and what results to expect.
Description of the general scheme
First, we describe the requirements for photographing the object (see Fig. 1). The overlap between a pair of adjacent frames of the photographed area should be at least 50% (otherwise the model will have gaps). In addition, the survey should ensure that every three adjacent images overlap (for example, in Figure 1, images 1, 2, 3 or 4, 5, 6 can be treated as adjacent). The resulting 3D model is then determined up to a single scale factor.
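The overlap requirement can be checked with simple arithmetic before shooting. The sketch below is only an illustration under simplifying assumptions that are not in the original: a flat object, a camera moving parallel to it, and a constant footprint width for every frame. The function name strip_overlaps and its parameters are hypothetical.

```python
def strip_overlaps(footprint_width, step):
    """Return (pair_overlap, triple_overlap) as fractions of one frame.

    footprint_width: width of the area covered by one frame (assumed constant)
    step: how far the footprint shifts between consecutive frames
    """
    if footprint_width <= 0 or step < 0:
        raise ValueError("footprint_width must be positive and step non-negative")
    shift = step / footprint_width
    pair = max(0.0, 1.0 - shift)          # area shared by frames i and i+1
    triple = max(0.0, 1.0 - 2.0 * shift)  # area shared by frames i, i+1 and i+2
    return pair, triple


# Example: each frame covers 10 m of the object, the camera moves 4 m per shot.
pair, triple = strip_overlaps(footprint_width=10.0, step=4.0)
print(f"pair overlap: {pair:.0%}, triple overlap: {triple:.0%}")
```

Under these assumptions a pair overlap of exactly 50% leaves no area shared by three frames, so in practice the step has to be somewhat smaller than half of the footprint.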

Figure 1. The scheme of photographing the object

Figure 2. Sample snapshots for building a 3D model
Now suppose we have a set of images (see Fig. 2). Image processing (namely, finding the same object points in different images and solving a system of nonlinear equations built from the matches found) then yields the camera parameters (focal length, etc.) and the position and orientation of the camera at the moment each image was taken, relative to one of the images (the base image, for example the first one; see Fig. 3).
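To give a feeling for what such nonlinear equations look like, here is a minimal sketch of a pinhole projection and the residual that an orientation solver would try to drive to zero for every matched point. It assumes an ideal pinhole camera without lens distortion, a single focal length in pixels, and a rotation about the vertical axis only. All function names and numbers are hypothetical, and this is not necessarily the exact model used by any particular product.

```python
import math


def rotation_y(angle_rad):
    """Rotation matrix for a turn of the camera about its vertical (Y) axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]


def project(point, focal, principal, rotation, center):
    """Project a 3D point (base-image coordinates) to pixel coordinates (u, v)."""
    # Move the point into the camera frame: Xc = R * (X - C)
    d = [point[i] - center[i] for i in range(3)]
    xc, yc, zc = (sum(rotation[r][i] * d[i] for i in range(3)) for r in range(3))
    if zc <= 0:
        raise ValueError("point is behind the camera")
    return (focal * xc / zc + principal[0], focal * yc / zc + principal[1])


def reprojection_residual(observed, point, focal, principal, rotation, center):
    """Difference between the observed pixel and the predicted projection."""
    u, v = project(point, focal, principal, rotation, center)
    return (observed[0] - u, observed[1] - v)


# Base camera at the origin, second camera shifted by 1 unit along X.
point = (0.5, -0.2, 5.0)
base = project(point, 1000.0, (640.0, 480.0), rotation_y(0.0), (0.0, 0.0, 0.0))
second = project(point, 1000.0, (640.0, 480.0), rotation_y(0.0), (1.0, 0.0, 0.0))
print(base, second)
```

Each matched point contributes two such residuals per image. The unknowns are the focal length, the rotations, the camera centers and the point coordinates, which is why the system is nonlinear and is usually solved iteratively.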

Figure 3. Oriented snapshots
For the oriented images, all matching points are then found on adjacent image pairs (the so-called dense maps or depth maps), after which the positions of these points in space are calculated (see Fig. 4) in the coordinate system of the base image, using the computed camera parameters (focal length, position and orientation, etc.).
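Computing the position of a matched point in space amounts to intersecting two viewing rays, one from each camera. Because of noise the rays rarely meet exactly, so a common simplification is to take the midpoint of the shortest segment between them. The sketch below shows this midpoint method. It is an illustration rather than the method of any specific product, and the name triangulate_midpoint is hypothetical. Camera centers and ray directions are assumed to be already expressed in the coordinate system of the base image.

```python
import math


def triangulate_midpoint(c1, d1, c2, d2):
    """Return (point, gap) for the rays c1 + s*d1 and c2 + t*d2.

    point: midpoint of the shortest segment connecting the two rays
    gap:   length of that segment (0 when the rays intersect exactly)
    """
    dot = lambda u, v: sum(x * y for x, y in zip(u, v))
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) <= 1e-12 * a * c:
        raise ValueError("rays are (nearly) parallel; depth is undefined")
    # Ray parameters that minimise the distance between the two rays
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    point = [(u + v) / 2.0 for u, v in zip(p1, p2)]
    return point, math.dist(p1, p2)


# Two cameras 1 unit apart, both looking at the same object point.
point, gap = triangulate_midpoint((0.0, 0.0, 0.0), (0.5, -0.2, 5.0),
                                  (1.0, 0.0, 0.0), (-0.5, -0.2, 5.0))
print(point, gap)
```

The returned gap is a cheap quality indicator: a large value suggests a wrong match or a poorly oriented camera.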

Figure 4. 3D object model

Figure 5. Textured 3D model of the object
As a rule, the point set is represented as a triangulated mesh (see Fig. 6; the mesh is built using Delaunay triangulation), which is convenient for subsequent texturing (see Fig. 5; for example, using OpenGL) or for transforming images.
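The defining property of a Delaunay triangulation is that no point of the set lies inside the circumscribed circle of any triangle. The sketch below checks this property by brute force for a handful of 2D points (for example, the image coordinates of reconstructed points in the base image, which is an assumption of this example). It is meant only to illustrate the criterion: it is far too slow for real point clouds, assumes no four points lie on one circle, and the function names are hypothetical.

```python
from itertools import combinations


def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circle through a, b and c."""
    if orient(a, b, c) < 0:  # the determinant test expects counter-clockwise order
        b, c = c, b
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - by * cx)
           - (bx * bx + by * by) * (ax * cy - ay * cx)
           + (cx * cx + cy * cy) * (ax * by - ay * bx))
    return det > 0


def delaunay_bruteforce(points):
    """Return index triples of all triangles with an empty circumcircle."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        a, b, c = points[i], points[j], points[k]
        if orient(a, b, c) == 0:
            continue  # collinear points do not form a triangle
        others = (points[m] for m in range(len(points)) if m not in (i, j, k))
        if not any(in_circumcircle(a, b, c, p) for p in others):
            triangles.append((i, j, k))
    return triangles


points = [(0, 0), (4, 0), (0, 3), (5, 4)]
print(delaunay_bruteforce(points))
```

For these four points the two triangles that share the diagonal between (4, 0) and (0, 3) pass the test, while the two triangles built on the other diagonal are rejected.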

Figure 6. Triangulated model of the object (source: habrahabr.ru/post/134781)
Details of each stage, as well as the algorithms used, can be found in this work.
Conclusion
The described technology is used in most modern commercial software products for building terrain models from aerial photography, mobile mapping, and so on. Software such as 123D, with fully automatic processing, is completely free, but it does not guarantee any result (and therefore no accuracy either). Where a guaranteed result is required, you have to pay, and the paid products provide dedicated functionality for controlling each processing step.