
Choosing a chart for one-dimensional data: a geometric model

Data visualization is always a graphical construction that helps us explore the available data. We build a geometric model and modify it to represent different aspects of the data. We also run into a limitation imposed by visual perception: a visualization cannot have more than two dimensions. All available graphic media, whether a sheet of paper or a monitor screen, are two-dimensional.


Using charts for one-dimensional data as an example, let's see how the geometric model is constructed, how it is modified, and how the dimensionality of the data and of the visualization shows up.





The simplest geometric model of numeric values


Consider a series of values of one variable (speed, temperature, price, etc.), for example:


3, 1, 5, 4, 6, 5, 4, 7, 9, 8

By one-dimensional data we mean that there is only one variable. To explore the properties of a number series, we construct a geometric model, that is, a model in which data elements (numerical values) are represented by geometric objects: points, lines, and circles.


For a numerical series, the simplest approach is to map each number to a line whose length is proportional to its value. For example, the line corresponding to the number 3 is three times longer than the line corresponding to the number 1. The result is the usual vertical bar chart (column chart):


Numbers and corresponding lines (bar graph)
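The mapping from numbers to proportional lines can be sketched in a few lines of Python. This is only an illustration: the helper name `bar` and the `#` drawing unit are assumptions made for the example, and the bars are drawn horizontally as text because a terminal has no graphics.

```python
# Series from the text; each value becomes a line of proportional length.
values = [3, 1, 5, 4, 6, 5, 4, 7, 9, 8]


def bar(value, unit="#"):
    """Return a text line whose length is proportional to value."""
    return unit * value


# One line per value, kept in the original (time) order.
bars = [bar(v) for v in values]

for v, line in zip(values, bars):
    print(f"{v:>2} {line}")
```

The line for 3 comes out three times as long as the line for 1, which is the only property the model relies on.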


Transform visualization to explore different aspects of data


Now we will change the simplest model of a number series to explore its various aspects.


The significant parameter of the vertical bar chart is the distance from the base of the chart (the horizontal axis) to the top of each line. This distance is proportional to the value of the variable at a given moment in time. If you keep only the top points and connect them, you get a line chart. On it, the points are ordered by time from left to right:


Line chart


If you sort the lines by value, in ascending or descending order, instead of by time, you get a bar chart. This chart is well suited to presenting rankings: it shows the values of the variable ordered from top to bottom by rank, so it looks like an ordered list:


Bar chart
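As a rough sketch of this step, the series can be sorted in descending order and printed as a ranked list. The variable name `ranked` and the text rendering are assumptions made for the example.

```python
values = [3, 1, 5, 4, 6, 5, 4, 7, 9, 8]

# Drop the time order and sort by value, largest first.
ranked = sorted(values, reverse=True)

# Print the lines from top to bottom by rank, like an ordered list.
for rank, v in enumerate(ranked, start=1):
    print(f"{rank:>2}. {'#' * v} {v}")
```

Sorting discards the time information, which is why this chart answers a different question than the line chart does.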


The next transformation is more complicated. We divide the sorted set of lines into groups. Each group contains only lines whose length lies between specified boundary values. For each group, we count the number of lines (values) that fall into the given interval and map the resulting count to a new line. It is as if the original lines have become tiles that we stack one on top of another. We then place the new lines in ascending order of the upper (or lower) boundary of their intervals and obtain a histogram.


Histogram
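The grouping and counting described above might look like this in Python. The interval boundaries are an assumption chosen for the example (three half-open intervals of width 3 starting at 1); the text does not prescribe any particular boundaries.

```python
values = [3, 1, 5, 4, 6, 5, 4, 7, 9, 8]


def histogram(data, lower, width, n_bins):
    """Count values in n_bins half-open intervals [a, a + width)."""
    counts = [0] * n_bins
    for x in data:
        index = int((x - lower) // width)
        if 0 <= index < n_bins:  # values outside all intervals are ignored
            counts[index] += 1
    return counts


# Assumed intervals: [1, 4), [4, 7), [7, 10)
lower, width, n_bins = 1, 3, 3
counts = histogram(values, lower, width, n_bins)

# New lines are ordered by the lower boundary of their interval.
for i, count in enumerate(counts):
    start = lower + i * width
    print(f"[{start}, {start + width}) {'#' * count}")
```

Each `#` in the output is one of the original lines stacked into its interval, so the counts add up to the number of values in the series.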


Unlike the bar chart, the histogram shows the values of the original variable on the horizontal axis. It is therefore better to draw the bar chart horizontally, so that it is not confused with a histogram, especially when the two are used together.


Reducing the dimension of the visual model


Note that all the diagrams above are two-dimensional, even though they visualize one-dimensional data.



That is, the dimension of the visualization does not necessarily coincide with the dimension of the data.


Effectively increasing the dimension of a visualization is difficult, but reducing it can be quite easy. Such a modification yields several more diagrams for visualizing and modeling the values of one variable.


The one-dimensional analogue of the line chart is the interval chart, or candlestick chart, often used to display stock prices. To build it, we keep only four values of the variable: initial, final, minimum, and maximum. Instead of studying the time interval in detail, we look only at the boundary values (in time and in magnitude). In the interval chart, the rectangle is left unfilled if the final value is greater than the initial one (growth) and is filled if the final value is lower (decline).


Interval chart (Japanese candlestick)
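A minimal sketch of this reduction: the whole series collapses to four numbers plus a fill flag. The function name `candle` and the dictionary keys are assumptions made for the example, as is the choice to draw a candle with equal initial and final values unfilled.

```python
values = [3, 1, 5, 4, 6, 5, 4, 7, 9, 8]


def candle(series):
    """Reduce a time-ordered series to the four values a candlestick keeps."""
    first, last = series[0], series[-1]
    return {
        "open": first,        # initial value
        "close": last,        # final value
        "low": min(series),
        "high": max(series),
        # Filled only on decline; the equal case is an assumption.
        "filled": last < first,
    }


summary = candle(values)
print(summary)
```

For the example series the final value (8) is greater than the initial one (3), so the rectangle would be drawn unfilled.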


Now take all the lines that make up the bar chart and connect them end to end. We take the longest line, corresponding to the maximum value, attach the next largest to it, and so on. Then we join the start and end points so that we get a circle. Each line, corresponding to one value of the variable, thus becomes an arc of the circle, and the circle itself corresponds to the whole, that is, the sum of all values. The fraction of each value corresponds to a sector of the circle whose angle is proportional to that fraction.


Formula for pie chart


The result is a pie chart.


Pie chart
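The sector angles can be computed directly from the proportionality described above: each value gets 360 degrees times its share of the total. The sketch below assumes angles measured in degrees and sectors laid out from the largest value onward; the variable names are illustrative.

```python
values = [3, 1, 5, 4, 6, 5, 4, 7, 9, 8]

# Largest value first, as when the lines are joined end to end.
ordered = sorted(values, reverse=True)
total = sum(ordered)  # the whole circle corresponds to this sum

sectors = []  # (value, start angle, sector angle), all in degrees
start = 0.0
for v in ordered:
    angle = 360.0 * v / total  # angle proportional to the value's share
    sectors.append((v, start, angle))
    start += angle

for v, begin, angle in sectors:
    print(f"value {v}: starts at {begin:6.1f}, spans {angle:5.1f} degrees")
```

After the last sector the running start angle is back at 360 degrees (up to floating-point rounding), which is the closing of the circle.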


Finally, we reduce the dimension of the histogram. By analogy with the interval chart, we keep only a few basic values that characterize the distribution: the minimum and maximum, the two quartiles, and the median. We obtain a span diagram, or box-and-whisker plot (box plot), in which the quartiles define the boundaries of the rectangle and the vertical line in the middle is the median.


Span chart (box-and-whisker plot)


The lower version, proposed by the “minimalist” Tufte, clearly demonstrates the one-dimensionality of this visualization.
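The five values behind the box plot can be sketched with Python's standard `statistics` module. Quartile definitions vary between tools; the `"inclusive"` method used here is an assumption made for the example, and other conventions give slightly different box boundaries. The function name `five_number_summary` is also illustrative.

```python
import statistics

values = [3, 1, 5, 4, 6, 5, 4, 7, 9, 8]


def five_number_summary(data):
    """Return the minimum, two quartiles, median, and maximum of data."""
    # Assumption: "inclusive" quartiles; other methods differ slightly.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": min(data),   # end of one whisker
        "q1": q1,           # one edge of the box
        "median": median,   # line inside the box
        "q3": q3,           # other edge of the box
        "max": max(data),   # end of the other whisker
    }


box = five_number_summary(values)
print(box)
```

All five numbers lie on a single axis, which is the sense in which this diagram is one-dimensional.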


The interval chart (Japanese candlestick) and the span diagram (box-and-whisker plot) look very similar. Therefore, especially when they are used together, it is better to orient the candlestick vertically and the box horizontally.


In general, a lower-dimensional, compressed view makes it possible to build visualizations that compare several series of values.


Choosing a chart for visualizing one-dimensional data


Now we build a table that helps you choose a chart for visualizing one-dimensional data. The six diagrams considered are classified according to the following aspects of visualization:



Dimension | Time | Ratio | Distribution
2D | Line chart | Bar chart | Histogram
1D | Interval chart (“candlestick”) | Pie chart | Span diagram (“box-and-whisker plot”)

Conclusions



Links




Source: https://habr.com/ru/post/326026/

