Technical implementation of the thermal potential method for territory analysis

In the first publication (Using thermal potentials for territory analysis), we described in general terms how thermal potentials can be used to analyze territories. Later publications were planned to describe how information about spatial objects is stored in databases, how models are built from the principal components, and what kinds of territory analysis problems exist. But first things first.

First of all, the thermal potential method gives a general picture of the territory of interest. For example, taking the initial information for Barcelona (Catalonia) from OSM and carrying out an integral analysis without selecting parameters, we can obtain “thermal” images of the first principal components. We already discussed “heat maps” in the first article, but it is worth recalling that the term arose from the physical meaning of the potentials used for integral analysis. That is, in physics problems the potential is temperature, while in territory analysis problems the potential represents the total impact of all factors influencing a specific point of the territory.

Below is an example of a “thermal” map of Barcelona, obtained as a result of an integral analysis.

"Heat" map of the first main component, without the selection of parameters, Barcelona

By setting a specific parameter (in this case, industry), you can get a "heat" map for that parameter alone.

"Heat" map of the first main component, industry, Barcelona

Of course, the tasks of analysis are much wider and more varied than a general assessment of the selected territory. As an example, in this article we look at the task of finding the best place for a new object and at the technical implementation of the thermal potential method that solves it. Other tasks are left for other publications.

Solving the problem of finding the best place for a new object helps determine how “ready” the territory is to accept it, how it will relate to the objects already present in the territory, how valuable it will be for the territory, and what value it will add.

Stages of technical implementation

The technical implementation can be represented as the following sequence of procedures:

Preparation of the information environment.
Search, collection and processing of source information.
Building a grid of nodes in the analyzed area.
Dividing the factors of the territory into fragments.
Calculation of potentials from factors.
Selection of factors for creating thematic integral characteristics of the territory.
Application of the principal component method for obtaining the integral indicators of the territory.
Creating models for choosing a place for the construction of a new object.

Stage 1. Preparation of the information environment

At this stage, it is necessary to choose a database management system (DBMS) and to determine the sources of information, the methods of collecting it, and the amount of information to be collected.
For our work we used a PostgreSQL database (DB), but any other database that supports SQL queries is suitable.

The database stores the initial information, i.e. spatial data about the objects: data types (points, lines, polygons), their coordinates and other characteristics (length, area, quantity), as well as all values calculated in the course of the work and the results themselves.

Statistical information is also presented as spatial data (for example, areas of the region with statistical data assigned to these areas).

As a result of the transformation and processing of the collected initial information, tables are formed containing information about linear, point and area factors, their identifiers and coordinates.
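To make the storage layout more concrete, here is a minimal sketch of three such tables. The article does not publish its schema, so every table and column name below is hypothetical, and SQLite (through Python's standard `sqlite3` module) stands in for PostgreSQL only to keep the example self-contained.

```python
import sqlite3

# Hypothetical schema: the table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE grid_node (
    node_id INTEGER PRIMARY KEY,
    x REAL NOT NULL,
    y REAL NOT NULL
);
CREATE TABLE factor_fragment (
    fragment_id INTEGER PRIMARY KEY,
    factor_id INTEGER NOT NULL,
    x REAL NOT NULL,       -- centroid of the fragment
    y REAL NOT NULL,
    weight REAL NOT NULL   -- length, area, or 1 for a point object
);
CREATE TABLE node_potential (
    node_id INTEGER NOT NULL REFERENCES grid_node(node_id),
    factor_id INTEGER NOT NULL,
    potential REAL NOT NULL,
    PRIMARY KEY (node_id, factor_id)
);
"""


def create_database(path=":memory:"):
    """Create the three illustrative tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = create_database()
conn.execute("INSERT INTO grid_node VALUES (1, 0.0, 0.0)")
conn.execute("INSERT INTO factor_fragment VALUES (1, 7, 30.0, 40.0, 25.0)")
table_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(table_names)
```

In a real PostgreSQL setup the coordinates would more likely be stored as geometry columns; plain `x`, `y` columns are used here only for simplicity.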

Stage 2. Search, collection and processing of source information

As initial information for solving this problem, we use open cartographic sources that contain information about the territory. The leader, in our opinion, is OSM, whose data is updated daily around the world. However, if you manage to collect information from other sources as well, it will only help.
Processing the information means bringing it to a uniform format, excluding inaccurate records, and preparing it for loading into the database.

Stage 3. Building a grid of nodes in the analyzed area

To ensure the continuity of the analyzed territory, it is necessary to build a grid on it, the nodes of which have coordinates in a given coordinate system. At each node of the grid, the potential value will be determined later. This will allow you to visualize homogeneous areas, clusters and final results of the analysis.

Depending on the task being solved, there are two options for building a grid:
- Grid with a regular step (S1): an overview grid covering the entire territory. It is used to calculate the potentials from the factors, to determine the integral characteristics of the territory (principal components and clusters), and to display the modeling results.

When choosing this grid, you must specify:

the grid step, i.e. the interval at which the grid nodes are placed;
the boundary of the analyzed territory, which may follow the administrative-territorial division or be an arbitrary polygon on the map that limits the calculation area.

- Grid with an irregular step (S2): describes selected points of the territory (for example, centroids). It is also used to calculate the potentials from the factors and to determine the integral characteristics of the territory (principal components and clusters). Modeling with the calculated principal components is performed on the irregular grid. To visualize the modeling results, the cluster numbers are transferred from the nodes of the irregular grid to the nodes of the regular grid by proximity of coordinates.

In the database, the coordinates of the grid nodes are stored in a table containing the following information for each node:

node identifier;
coordinates of the node (x, y).

Examples of regular grids with different steps for different territories are shown in the figures below.

Grid covering N.Novgorod (red dots). Grid covering Nizhny Novgorod region (blue dots).
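The two grid types can be sketched as follows. This is a minimal illustration, not the implementation used in the article: the function names are hypothetical, nodes are placed at cell centers (an assumption), the boundary is tested with simple ray casting, and labels are transferred from the irregular grid to the regular one by nearest node.

```python
import math


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def regular_grid(polygon, step):
    """Return [(node_id, x, y)] for regular-grid nodes that fall inside polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    nx = math.ceil((max(xs) - min(xs)) / step)
    ny = math.ceil((max(ys) - min(ys)) / step)
    nodes = []
    node_id = 1
    for iy in range(ny):
        for ix in range(nx):
            # Assumption: nodes sit at the centers of the grid cells.
            x = min(xs) + (ix + 0.5) * step
            y = min(ys) + (iy + 0.5) * step
            if point_in_polygon(x, y, polygon):
                nodes.append((node_id, x, y))
                node_id += 1
    return nodes


def transfer_labels(regular_nodes, irregular_nodes, labels):
    """Give every regular node the label of the nearest irregular node."""
    result = {}
    for node_id, x, y in regular_nodes:
        nearest = min(
            irregular_nodes,
            key=lambda n: (n[1] - x) ** 2 + (n[2] - y) ** 2,
        )
        result[node_id] = labels[nearest[0]]
    return result


boundary = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
grid = regular_grid(boundary, 50.0)
irregular = [(10, 10.0, 40.0), (11, 90.0, 70.0)]   # e.g. centroids
cluster_of = {10: 0, 11: 1}                        # cluster numbers on S2
regular_labels = transfer_labels(grid, irregular, cluster_of)
print(grid)
print(regular_labels)
```

For large grids the brute-force nearest-node search would normally be replaced by a spatial index, either in the database or in code.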

Stage 4. Dividing the factors of the territory into fragments

For further analysis, extended factors of the territory need to be converted into an array of discrete factors, so that each grid node receives information about every factor present near it. Linear factors are broken into segments, areal factors into fragments.

The splitting step is selected based on the area of the territory and the specifics of the factor. For large territories (regions) the splitting step can be 100-150 m; for smaller territories (cities) it can be 25-50 m.

The database contains information about the results of the partitioning in the form of a table containing the following information for each fragment:

factor identifier;
coordinates of centroids of the obtained fragments of the partition (x, y);
length / area of fragments of the partition.
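For linear factors, the partition can be sketched as below. This is only an illustration under our own assumptions: the function name is hypothetical, and each straight section of a polyline is cut into equal pieces no longer than the chosen step, with the centroid and length recorded for every piece.

```python
import math


def split_polyline(points, step):
    """Split a polyline into pieces no longer than step.

    Returns a list of (x, y, length): the centroid and length of every piece.
    """
    fragments = []
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        section_length = math.hypot(x2 - x1, y2 - y1)
        if section_length == 0:
            continue  # skip duplicated vertices
        pieces = max(1, math.ceil(section_length / step))
        piece_length = section_length / pieces
        for k in range(pieces):
            t = (k + 0.5) / pieces  # relative position of the piece centroid
            fragments.append(
                (x1 + t * (x2 - x1), y1 + t * (y2 - y1), piece_length)
            )
    return fragments


road = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0)]
road_fragments = split_polyline(road, 25.0)
for fragment in road_fragments:
    print(fragment)
```

Areal factors can be handled in the same spirit, for example by overlaying a fine auxiliary grid on the polygon and recording the centroid and area of each cell that falls inside it.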

Stage 5. Calculation of potentials from factors

One of the possible and understandable approaches to the analysis of initial information is the consideration of factors as potentials from objects of influence.

We use the fundamental solution of the Laplace equation for the two-dimensional case: the logarithm of the distance from the point.

Taking into account the requirement that the potential be finite at zero distance and bounded at large distances, the potential is defined as follows:

$F(r) = \ln(r_1 / r_2)$ for $r < r_1$ (1)

$F(r) = \ln(r / r_2)$ for $r_1 \le r < r_2$

$F(r) = 0$ for $r \ge r_2$

Type of influence potential from a point object

The logarithmic function must be bounded at zero distance and reasonably cut off at some distance from the factor. If the potential were not cut off at large distances, we would have to take into account a huge amount of information located far from the analyzed point, which has practically no effect on the analysis. Therefore, we introduce the radius of the factor, beyond which the contribution of the factor to the potential is zero.

For a city, the radius of the factor is taken to be a half-hour walking distance, 2,000 meters. For a region, we should speak of half-hour transport accessibility, 20,000 meters.

Thus, as a result of calculating the potential values, we have the total potential from each factor at each node of the regular grid.
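Formula (1) and the summation over fragments can be sketched as follows. The function names are hypothetical, and two points are our own assumptions rather than statements from the article: each fragment is weighted by its length or area, and the inner radius `r1` is set to 50 m. Note that formula (1), as printed, gives non-positive values inside the radius of the factor; the sketch keeps the printed form.

```python
import math


def potential(r, r1, r2):
    """Formula (1): capped logarithmic potential of a single point object."""
    if r >= r2:
        return 0.0                    # no influence beyond the factor radius
    return math.log(max(r, r1) / r2)  # constant for r < r1, logarithmic otherwise


def node_potential(node, fragments, r1, r2):
    """Total potential of one factor at a grid node.

    fragments is a list of (x, y, weight). Weighting by fragment
    length/area is an assumption of this sketch.
    """
    x, y = node
    total = 0.0
    for fx, fy, weight in fragments:
        distance = math.hypot(fx - x, fy - y)
        total += weight * potential(distance, r1, r2)
    return total


R1, R2 = 50.0, 2000.0   # R2 is the city radius from the text; R1 is assumed
fragments = [(0.0, 1000.0, 1.0), (0.0, 3000.0, 1.0)]
print(node_potential((0.0, 0.0), fragments, R1, R2))
```

The second fragment lies beyond 2,000 m and contributes nothing, which is exactly the purpose of the cut-off radius.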

Stage 6. Selection of factors for creating thematic integral characteristics of a territory

At this stage, the most significant and informative factors are selected to create thematic integral characteristics of the territory.

The selection of factors can be carried out automatically, by setting limits on certain parameters (correlation, percentage of influence, etc.), or by an expert who knows the subject of the problem and has some idea of the territory.

After the most significant and informative factors are selected, we can proceed to the next stages: obtaining and interpreting the principal components.

Stage 7. Application of the principal component method for obtaining the integral indicators of a territory. Clustering

The initial information about the factors of the territory, transformed at the previous stages into potentials calculated for each grid node, is combined into new integral indicators: the principal components.

The principal component method analyzes the variability of the factors over the territory under study and, based on this analysis, finds their most variable linear combination, which makes it possible to calculate a measure of their change: the variance over the territory.
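As an illustration of "the most variable linear combination", the sketch below finds the first principal component of a small set of factor potentials. The article does not say how the components are computed or whether the potentials are standardized first; here we assume standardization and use power iteration on the correlation matrix, which returns only the first component. In practice a library eigen-solver would normally be used. All names are hypothetical.

```python
import math


def standardize(columns):
    """Center every factor column and scale it to unit variance.

    Assumes no column is constant (otherwise the variance is zero).
    """
    result = []
    for col in columns:
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col))
        result.append([(v - mean) / std for v in col])
    return result


def first_principal_component(columns, iterations=200):
    """Return (weights, scores) of the first principal component.

    columns[f][j] is the potential of factor f at grid node j.
    """
    z = standardize(columns)
    m, k = len(z), len(z[0])
    corr = [
        [sum(z[a][j] * z[b][j] for j in range(k)) / k for b in range(m)]
        for a in range(m)
    ]
    # Deliberately non-symmetric start vector for the power iteration.
    w = [1.0 / (f + 1) for f in range(m)]
    for _ in range(iterations):
        v = [sum(corr[a][b] * w[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(c * c for c in v))
        w = [c / norm for c in v]
    scores = [sum(w[f] * z[f][j] for f in range(m)) for j in range(k)]
    return w, scores


potentials = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.1, 5.9, 8.2, 9.8],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
weights, scores = first_principal_component(potentials)
print(weights)
print(scores)
```

The `scores` list is the value of the first component at every node; it is this kind of per-node value that can be drawn as a "heat" map.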

Consider the general problem of fitting a linear model to given values:

$\sum_{i=1}^{n} A_i \cdot PCA_{i,j} + B = POT_j$ (2)
where i is the component number;
n is the number of components involved in the calculation;
j is the index of a grid node of the territory, j = 1..k;
k is the number of all grid nodes for which the principal components were calculated;
$A_i$ is the model coefficient of the i-th principal component;
$PCA_{i,j}$ is the value of the i-th principal component at the j-th node;
B is the intercept of the model;
$POT_j$ is the potential, at the j-th node, of the factor for which we are building the model.

We determine the unknowns in equation (2) by the least squares method, using the following properties of the principal components:

$\sum_{j=1}^{k} PCA_{i,j} \cdot PCA_{i_2,j} = 0$ (3)
where i and i2 are component numbers, $i \ne i_2$;
j is the index of a grid node;
k is the number of all grid nodes of the territory.

$\sum_{j=1}^{k} PCA_{i,j} = 0$ (4)

Property (3) means that there is no correlation between the components.
Property (4) means that the sum of any component over all nodes is zero.

We get:

$A_i = \frac{\sum_{j=1}^{k} PCA_{i,j} \cdot POT_j}{\sum_{j=1}^{k} PCA_{i,j}^2}$

$B = avg(POT_j)$ (5)
The notation is the same as in equation (2); $avg(POT_j)$ denotes the mean potential over all nodes.
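Formulas (5) are easy to check numerically. The sketch below uses hypothetical function names and a tiny hand-made data set in which two components satisfy properties (3) and (4) exactly, so each coefficient can be computed independently of the others.

```python
def fit_model(components, pot):
    """Formulas (5): components[i][j] is PCA_{i,j}, pot[j] is POT_j."""
    b = sum(pot) / len(pot)
    a = [
        sum(c * p for c, p in zip(comp, pot)) / sum(c * c for c in comp)
        for comp in components
    ]
    return a, b


def predict(components, a, b):
    """Left-hand side of equation (2) at every node."""
    k = len(components[0])
    return [
        b + sum(a[i] * components[i][j] for i in range(len(a)))
        for j in range(k)
    ]


# Two components that are mutually orthogonal and sum to zero, as in (3) and (4).
pc1 = [-1.0, -1.0, 1.0, 1.0]
pc2 = [-1.0, 1.0, -1.0, 1.0]
pot = [1.0, 2.0, 4.0, 5.0]

a, b = fit_model([pc1, pc2], pot)
model = predict([pc1, pc2], a, b)
print(a, b)    # [1.5, 0.5] 3.0
print(model)   # [1.0, 2.0, 4.0, 5.0]
```

Because of property (3), adding or removing a component does not change the coefficients of the others, which is what makes the model a sum of independent corrections to the mean.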

This result can be interpreted as follows:
The model is a simple expression consisting of the mean of the modeled value plus a simple correction to it for each of the components. At a minimum, the result must include the intercept B and the first principal component. Below are examples of heat maps of the first principal components for the Nizhny Novgorod region.

From the calculated principal components, homogeneous regions can be constructed. This can be done using all parameters or, for example, only the pricing parameters, i.e. by clustering. The K-means method can be used for this. For each homogeneous region, the average value of the first principal component is calculated; it characterizes the level of development of the territory.
An example of clustering by pricing parameters for the Nizhny Novgorod region is given below.
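The clustering step can be sketched with a plain K-means (Lloyd) iteration. This is an illustration only: the function names are hypothetical, the initial centers are supplied by the caller instead of being chosen at random, and the data is a toy set of (first component, second component) values at four nodes.

```python
def kmeans(points, centers, iterations=20):
    """Plain Lloyd iterations; centers are initial guesses supplied by the caller."""
    centers = [tuple(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to the nearest center.
        labels = [
            min(
                range(len(centers)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centers[c])),
            )
            for pt in points
        ]
        # Move every center to the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[c])  # keep an empty cluster in place
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def cluster_means(values, labels):
    """Average of a value (e.g. the first principal component) per cluster."""
    sums, counts = {}, {}
    for value, label in zip(values, labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


node_components = [(-2.0, 0.1), (-1.8, -0.1), (1.9, 0.0), (2.1, 0.2)]
labels, centers = kmeans(node_components, [node_components[0], node_components[2]])
pc1_by_cluster = cluster_means([p[0] for p in node_components], labels)
print(labels)
print(pc1_by_cluster)
```

The per-cluster mean of the first component is the value the text uses to characterize the level of development of each homogeneous region.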

Also, using the obtained principal components as parameters of a cost model, we can obtain the price surface of the territory.

Price surface of the city of N. Novgorod

Stage 8. Creating models for choosing a place for the construction of a new object

To choose the most attractive place for a new object (hereinafter, the “object”), the location of the “object” must be compared with its surrounding infrastructure. For the “object” to function, there must be enough resources to support it, and a large number of factors with both positive and negative impact on the “object” must be taken into account. The whole combination of these factors can be described as a “nutrient” environment for the functioning of the “object”. A balance between the number of objects and the resources of the territory is the basis for the stable functioning of the “object”.

The result of this comparison is a potential calculated for each point of the territory, which makes it possible to analyze the choice of location for the new “object” both visually and analytically.

For trade, for example, a constant flow of buyers is important, among other things. This means that the list of factors considered for trade objects should include those that provide this flow (for example, social infrastructure, workplaces, places of residence, highways, etc.).

On the other hand, when all the conditions for the functioning of trade objects are met, the density of trade objects must also be taken into account, since the “consumption” of the environment reduces the opportunity for purchases. The flow of people is not unlimited, and the same applies to their financial resources and physical capabilities.

The algorithm for choosing the best location for an object is as follows. First, a model potential is built as a function of the principal components so that it approximates, as closely as possible, the potential of the existing objects of the “object” type. Then the difference between the model potential and the potential of the existing objects of this type is calculated. The contribution of one “object” is subtracted from this difference. Finally, the resulting negative values are replaced by zero, i.e. the places where there are not enough resources for a new “object” to operate are removed.

As a result of these actions, we get the points of the territory with a positive potential value, that is, the places favorable for locating our “object”.
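The last three steps of this algorithm amount to a few lines of arithmetic per grid node. In the sketch below the function name and the numbers are hypothetical, and the contribution of one new “object” is treated as a single constant, which is a simplifying assumption.

```python
def placement_potential(model, existing, single_contribution):
    """Free potential of the territory for one more object.

    model[j]    - model potential at node j (a function of the principal components)
    existing[j] - potential of the already existing objects of this type at node j
    single_contribution - potential added by one new object (assumed constant)
    """
    return [
        max(0.0, m - e - single_contribution)  # negative values are replaced by zero
        for m, e in zip(model, existing)
    ]


model_potential = [5.0, 3.0, 4.0]
existing_potential = [2.0, 2.5, 4.5]
free = placement_potential(model_potential, existing_potential, 1.0)
best_node = max(range(len(free)), key=lambda j: free[j])
print(free)       # [2.0, 0.0, 0.0]
print(best_node)  # 0
```

Only the first node keeps a positive value: there the environment can support more objects of this type than are already present, even after one more is added.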

In other words, we have at our disposal the calculated potentials of all the factors, together with the factor for which we want to build a model and analyze the chosen thematic area (trade, industry, culture, social sphere, etc.).

To do this, we need to select the factors from which the environment variables (the principal components) are constructed, and then calculate the models from them.
We suggest selecting factors by analyzing the correlation of every factor with the reference factor of the thematic area. For example, for culture the reference factor can be theaters, for the education system it can be schools, and so on.

We calculate the correlation of the potential of the reference factor with the potentials of all other factors. We select the factors whose correlation coefficients exceed a certain threshold in absolute value (a minimum correlation coefficient of 0.3 is often used):

$|Kkorr_i| > 0.3$ (6)
where $|Kkorr_i|$ is the absolute value of the correlation coefficient of the i-th factor with the reference factor.

Correlation is calculated across all grid nodes covering the territory.
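Criterion (6) can be sketched as follows. The article does not state which correlation coefficient is used; the sketch assumes the Pearson coefficient, and the function names, factor names and values are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def select_factors(reference, factors, threshold=0.3):
    """Keep the factors whose |correlation| with the reference exceeds threshold."""
    selected = {}
    for name, values in factors.items():
        k = pearson(reference, values)
        if abs(k) > threshold:
            selected[name] = k
    return selected


# Potentials at five grid nodes (toy values).
reference_potential = [1.0, 2.0, 3.0, 4.0, 5.0]
factor_potentials = {
    "roads": [2.0, 4.0, 6.0, 8.0, 10.0],   # strongly positive correlation
    "parks": [5.0, 4.0, 3.0, 2.0, 1.0],    # strongly negative correlation
    "noise": [1.0, -1.0, 0.0, -1.0, 1.0],  # uncorrelated with the reference
}
selected = select_factors(reference_potential, factor_potentials)
print(sorted(selected))
```

Negatively correlated factors are kept as well, since the criterion uses the absolute value of the coefficient.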

The difference between the model potential from equation (2) and the potential of the existing objects of the same type as the new object shows the potential of the territory that can be used to locate new objects.

As a result, we obtain a potential value that characterizes how beneficial it is to locate the “object” at each place in the study area.

An example of a graphical display of the areas recommended for a new “object” is given below.

Thus, the result of solving the problem of choosing the best location for a new object can be represented as a score assigned to every point of the territory. The score gives an idea of the potential for placing an investment object: the higher the score, the more profitable it is to locate the object there.

In conclusion, in this article we considered only one task that can be solved by analyzing territories using data from open sources. In fact, the method can solve many more tasks; their number is limited only by your imagination.

Source: https://habr.com/ru/post/455048/
