All (or almost all) about the space

As the title suggests, this article is about an integral part of any Russian-language text (and not only Russian): the space. We will touch on the history of the space, the kinds of spaces, and the use of spaces in web typography.

Generally speaking, a space is any empty area in text that is handwritten, printed, or displayed on any other medium, so spaces come in several kinds. The discussion below focuses on interword spaces, which separate words and functionally belong to the punctuation marks.

History of the interword space

The interword space is a relatively late invention in the history of human thought. Its history is described in depth in Paul Saenger’s book “Space Between Words: The Origins of Silent Reading” and, in less depth, in Johannes Friedrich’s “The History of Writing”.

There is also a good article by Anton Bizyaev on spaces and their history, “There were no spaces at the beginning”, published in 1997 in the magazine “Publish”.

In short, the space first appeared in writing systems where the lack of word separation made reading difficult (so-called consonantal writing, in which only consonant sounds are recorded). In Greek and Latin, however, where vowels were also written, the use of the space was lost. Paul Saenger attributes this to the fact that reading was done aloud, which made it easier to tell words apart when perceiving the text.

The space came back into use around the 7th–9th centuries AD. The tradition came from Ireland, where the native language of scribes and readers was Old Irish while religious literature was written in Latin. Apparently for this reason the monks had difficulty reading aloud. The appearance of the space is believed to be closely tied to the gradual transition from reading aloud to silent reading. Examples of Latin books with interword spaces are monuments of British literature: the Book of Durrow (7th century) and the Book of Kells (8th–9th centuries).

In Glagolitic and Cyrillic writing the space was also absent, and it has been used in its usual sense only since the 17th century.

Before the invention of movable type there was no special classification of interword spaces: scribes placed spaces and justified lines by eye. Recall (we wrote about this in the article “Justification”) that handwriting and woodblock printing (xylography) are ways of producing text without movable letters. Naturally, spaces could have different widths, since they were made by hand.

Spaces in hand typesetting


When movable letters appeared (that is, with the advent of movable type), a question naturally arose: how should spaces be set so that lines stay justified?

In hand typesetting a finished line is clamped tight in the composing stick and then in the galley, so its width must be almost exactly equal to the width of the type page (for details on hand typesetting, see M. Schulmeister’s book of the same name).

A line was assembled from sorts (metal bars bearing on one end a raised mirror image of the letter to be printed on paper), and interword spaces were made with so-called spacing material: bars of various thicknesses with no printing surface on the end. Spacing material was cast separately for each type size and therefore had different widths. For example, for 10-point type (the standard size for most text editions), spaces of 10, 5, 4, 3, 2 and 1 points were produced.

Spaces as wide as the type size were called full-size or round spaces (the equivalent of the em space). Spaces half the type size wide were called half-size or semicircular spaces (the en space). There is also the term “thin space”, meaning a space 1–2 points thick for type sizes of 8–12 points. That is, for 10-point type a thin space is usually 2 points (in other words, 1⁄5 of the type size). However, because the thin space has no precise definition, reference books for publishers, editors and typesetters usually speak not of setting off by a thin space but of setting off by a given number of points (assuming a 10-point type size).

So keep in mind that, depending on the type size, a given fraction of the round space (a third, a quarter, and so on) has a different width in points, and vice versa.

Traditional width of the interword space


Now that we know what round and semicircular spaces are, let us turn to the width of the interword space itself as adopted in Russian typesetting.

Schulmeister writes (p. 94) that while a line is being set, a semicircular space is placed between words. Once the line has been set to the end, its width is in most cases either smaller or larger than the width of the type page. The typesetter therefore has to change the width of the spaces, reducing them to no less than 1⁄4 of a round space and increasing them to no more than 3⁄4 of a round space (so in 10-point type the interword spaces can vary from 3 to 7 points). Naturally, there are nuances that depend on the format of the publication, but we will not go into them.
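
The arithmetic behind these limits can be sketched in a few lines of Python. This is only an illustration, under the assumption that spaces are available in whole points (as in the 10-point example above); the function name `space_range` is hypothetical and does not come from Schulmeister.

```python
import math


def space_range(type_size_pt: int) -> tuple[int, int]:
    """Return the (narrowest, widest) interword space in whole points.

    Assumes limits of 1/4 and 3/4 of the type size, rounded inward
    because spacing material is assumed to come in whole points.
    """
    narrowest = math.ceil(type_size_pt / 4)    # not less than 1/4
    widest = math.floor(type_size_pt * 3 / 4)  # not more than 3/4
    return narrowest, widest


print(space_range(10))
```

For 10-point type this gives the same 3 to 7 points quoted in the text.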

However, Schulmeister adds a caveat about the semicircular interword space itself: a standard space of 1⁄3 of a round space is more economical in terms of paper and often looks better. A semicircular interword space is also not recommended for narrow typefaces.

With the advent of line-casting machines, spaces became uniform in width within a line, and the width of the interword space came to vary around 1⁄3 of a round space.

Computer typesetting and web typography

Today we are limited by the capabilities of the fonts in use and, of course, by the Unicode character set. Bear in mind that by no means all fonts contain most of the Unicode whitespace characters.

With the move to computer typesetting, space widths came to be specified in fractions of a round space rather than in points, because fonts now scale easily to any size and whitespace must stay proportional to the type size.

Unicode space characters


Unicode includes a number of space characters for Western typesetting.
Regular and non-breaking interword spaces are present in any font and are displayed correctly by all user agents, with one exception: some word processors and browsers do not stretch or shrink the non-breaking space in justified text (which violates the recommendations). For example, Firefox scales non-breaking spaces correctly, while MSIE 7.0 does not scale them at all.

All other whitespace characters have a fixed width and do not stretch when lines are justified. However, according to the Unicode line breaking algorithm, all of them should be treated as line break opportunities.
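
To see which characters are meant, the standard Python `unicodedata` module can print their code points and official names. The selection below is an illustrative subset chosen for this sketch, and the helper name `describe` is hypothetical.

```python
import unicodedata

# A few Unicode space characters, given by code point.
SPACES = ["\u0020", "\u00a0", "\u2002", "\u2003", "\u2009", "\u200a"]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in SPACES:
    print(describe(ch))
```

The HTML entities used in this article correspond to these characters: &nbsp; is U+00A0, &ensp; is U+2002 and &thinsp; is U+2009.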

Using different spaces


Since the width of the interword space is defined in the font and changes automatically in justified text, using other whitespace characters as interword spaces makes sense only in print work, and only with a deep understanding of what you are doing.

In ordinary web layout, regular and non-breaking interword spaces are enough to separate words.

At the same time, the rules of Russian typography call for a thin space in a number of places (more precisely, reference books speak of a two-point space, but we will use the term “thin space” as the closest fit both to established terminology and to computer typesetting).

The basic rules for using spaces are described below; in general, we recommend the following principle for web layout.

When preparing HTML documents for publication on the Internet, only three whitespace elements should be used: the regular space, the non-breaking space &nbsp; and the thin space &thinsp;. If the author expects the page to be viewed with agents that handle &thinsp; incorrectly, a regular or non-breaking space should be used instead of the thin space.

Using only the thin space out of the whole variety of whitespace elements makes it possible, first, to preserve a harmonious look of the text and, second, not to burden the author with assorted rules for spaces of various fractional widths.

How browsers and search engines handle spaces


While preparing the article we ran a small experiment on a specially prepared page. Yandex and Google cope well with non-standard characters, replacing all non-standard whitespace with ordinary spaces (we think this is the right behavior). That is, they make no distinction between the texts “two words”, “two&ensp;words”, “two&thinsp;words”, and so on.
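
The observed behavior can be imitated with a small normalization step. How the search engines actually implement it is not known to us; the sketch below only assumes that every special space is folded into a regular one, and the name `normalize_spaces` is hypothetical.

```python
import re

# Space-like characters to fold: NO-BREAK SPACE, the fixed-width
# spaces U+2000..U+200A, and NARROW NO-BREAK SPACE.
_SPACE_LIKE = re.compile("[\u00a0\u2000-\u200a\u202f]")


def normalize_spaces(text: str) -> str:
    """Replace every special space character with a regular space."""
    return _SPACE_LIKE.sub(" ", text)


print(normalize_spaces("two\u2009words") == "two words")
```

After this step, texts that differ only in the kind of space compare as equal.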

As it turned out, browsers render non-standard whitespace elements poorly. Only Firefox 3.0 on Windows XP and *nix, and MSIE 7.0 and Safari on Windows XP, handle the task properly. We have no data on MSIE 8.0, but most likely it is fine too.
It is not quite clear why all whitespace elements have equal width in all browsers on the Mac. The built-in fonts are probably the cause.

Basic rules for using spaces

So, once again, we emphasize that in all the rules listed below the thin space &thinsp; should be used only if the author sees no risk that site visitors use browsers that display the thin space incorrectly. Such browsers include some browsers on *nix (possibly because of the built-in fonts), MSIE 6.0 and earlier, browsers for Mac (these can be disregarded, since the rendering error affects only the width), and possibly some browsers for mobile phones and PDAs.

If such browsers are likely to be used, we recommend regular or non-breaking interword spaces instead of thin spaces.

As described above, according to the Unicode recommendations a thin space is a space at which a line break is allowed. Where the rules require a thin space and forbid a line break (for example, between the digit groups of a number), use a construct such as <span style="white-space: nowrap;">250&thinsp;000</span>. The nobr HTML element is proprietary and must not be used.
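
A build script could apply both recommendations at once: wrap the fragment in a nowrap span when thin spaces are considered safe, and fall back to non-breaking spaces otherwise. This is a minimal sketch; the function name `protect_thin_spaces` and the `thinsp_safe` flag are hypothetical.

```python
def protect_thin_spaces(fragment: str, thinsp_safe: bool = True) -> str:
    """Make an HTML fragment containing &thinsp; unbreakable."""
    if thinsp_safe:
        # Keep the thin spaces and forbid line breaks with CSS.
        return f'<span style="white-space: nowrap;">{fragment}</span>'
    # Fallback for agents that render &thinsp; badly.
    return fragment.replace("&thinsp;", "&nbsp;")


print(protect_thin_spaces("250&thinsp;000"))
print(protect_thin_spaces("250&thinsp;000", thinsp_safe=False))
```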

Below we describe the spacing rules that, in our observation, are most often violated in text layout. More detailed information on typesetting rules can be found, for example, in the “Handbook of the Publisher and Author” by A. E. Milchin and L. K. Cheltsova.

Abbreviations and Symbols

  1. In abbreviations such as и т. д. (“and so on”), и т. п. (“and the like”), т. к. (“because”), т. е. (“that is”), и др. (“and others”), до н. э. (“BC”), ю. ш. (“southern latitude”) and the like, all elements of the abbreviation are separated by a non-breaking space.
    и т. д. - и&nbsp;т.&nbsp;д.
    и т. п. - и&nbsp;т.&nbsp;п.
    т. к. - т.&nbsp;к.
    т. е. - т.&nbsp;е.
    и др. - и&nbsp;др.
    до н. э. - до&nbsp;н.&nbsp;э.
    ю. ш. - ю.&nbsp;ш.
  2. Initials are set off from each other and from the surname by a non-breaking space.
    A. S. Pushkin - A.&nbsp;S.&nbsp;Pushkin
    J. R. R. Tolkien - J.&nbsp;R.&nbsp;R.&nbsp;Tolkien
    It is also acceptable to set initials off from each other and from the following surname by a thin space, but a line break between the initials or before the surname is forbidden. Whichever style you choose, keep it consistent throughout the document or site.
    V. V. Putin - V.&thinsp;V.&thinsp;Putin
    V. Putin - V.&thinsp;Putin
    Putin V. V. - Putin&nbsp;V.&thinsp;V.
    Putin V. - Putin&nbsp;V.
  3. An abbreviated word is set off from the proper name it refers to by a non-breaking space.
    ул. Щорса (“Shchors St.”) - ул.&nbsp;Щорса
    г. Москва (“the city of Moscow”) - г.&nbsp;Москва
    метро им. Ленина (“Lenin metro station”) - метро им.&nbsp;Ленина
  4. A number and the counted word that goes with it are set off by a non-breaking space.
    12 billion rubles - 12&nbsp;billion rubles
    Ch. IV - Ch.&nbsp;IV
    pp. 3–6 - pp.&nbsp;3–6
    Fig. 42 - Fig.&nbsp;42
    XX в. (“20th century”) - XX&nbsp;в.
    1941–1945 гг. (“the years 1941–1945”) - 1941–1945&nbsp;гг.
    Ward № 6 - Ward №&nbsp;6
    § 22 - §&nbsp;22
    25 % - 25&nbsp;%
    97,5 ‰ - 97,5&nbsp;‰
    16 ¢ - 16&nbsp;¢
  5. A number and its unit of measurement (except the degree, minute and second signs) are set off by a thin space; a line break between them is forbidden.
    400 m - 400&thinsp;m
    100 t - 100&thinsp;t
    451 °F - 451&thinsp;°F
    but 59°, 57′, 00″.
  6. The degree, minute and second signs are set off from the digits that follow by a thin space.
    59° 57′ 00″ - 59°&thinsp;57′&thinsp;00″
Bear in mind that typographers have no fully settled rule on setting off percent signs and currency signs, so setting the percent sign and currency symbols close up to the number is not an error, provided this is done uniformly throughout the site. We believe, however, that a space here improves the readability of the text.
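
The rule on initials lends itself to automation. The sketch below is a heuristic, not a complete typographer: it assumes that a single capital letter (Latin or Cyrillic) followed by a period and a space is an initial, and it handles only initials that precede the surname. The name `bind_initials` is hypothetical.

```python
import re

# One capital letter (Latin or Cyrillic) followed by a period and a space.
_INITIAL = re.compile(r"\b([A-ZА-ЯЁ]\.) ")


def bind_initials(text: str, entity: str = "&nbsp;") -> str:
    """Replace the space after each initial with the given entity."""
    return _INITIAL.sub(lambda m: m.group(1) + entity, text)


print(bind_initials("A. S. Pushkin"))
print(bind_initials("J. R. R. Tolkien", "&thinsp;"))
```

When the thin-space variant is used, the result still has to be protected from line breaks separately, because &thinsp; on its own allows a break.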

Numbers and intervals

  1. The integer and fractional parts of a number are not set off by a space from the decimal comma: 0,62, 345,5.
  2. The digit groups of a number are set off from each other by a thin space, except in dates, numbers (for example, of documents), and designations of machines and mechanisms.
    25 563,42 - 25&thinsp;563,42
    1 652 - 1&thinsp;652
    1 298 300 - 1&thinsp;298&thinsp;300
    but 1999, GOST 20283, incoming No. 982364
  3. In numeric ranges, the dash is not set off from the bounds of the range.
    50–100 m - 50–100&thinsp;m
    1 500–2 000 - 1&thinsp;500–2&thinsp;000
    1,5–2 thousand - 1,5–2&nbsp;thousand
    15–20 % - 15–20&nbsp;%
  4. The unary plus, minus and plus-minus signs are not set off from the number that follows: +20 °C, −42, ±0,1.
  5. Binary signs of mathematical operations and relations are set off on both sides by a thin space.
    2 + 3 = 5 is set as 2&thinsp;+&thinsp;3&thinsp;=&thinsp;5
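
Digit grouping is easy to script. The sketch below assumes the input is a plain digit string with an optional decimal comma and groups every number longer than three digits, as in the examples above. Deciding whether a string is a date or a document number (which must not be grouped) is left to the caller. The name `group_digits` is hypothetical.

```python
def group_digits(number: str, sep: str = "&thinsp;") -> str:
    """Group the integer part of a decimal string in threes.

    `number` is a plain digit string with an optional decimal comma,
    for example "25563,42".
    """
    integer, comma, fraction = number.partition(",")
    groups = []
    while len(integer) > 3:
        groups.insert(0, integer[-3:])  # peel off the last three digits
        integer = integer[:-3]
    groups.insert(0, integer)
    return sep.join(groups) + comma + fraction


print(group_digits("25563,42"))
print(group_digits("1298300"))
```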

Punctuation marks

  1. A period, comma, colon, question mark, exclamation mark and semicolon are not set off by a space from the preceding word and are set off by a space from the following one: Ha, ha. Ha? Ha!
  2. An ellipsis is not set off from the preceding word if it ends a sentence or part of a sentence, and not from the following word if it begins a sentence: Wow… What? …Nothing.
  3. Quotation marks are not set off by spaces from the text they enclose: the battleship “Potemkin”.
  4. Brackets are not set off by spaces from the text they enclose and are set off by spaces on the outside (except when a closing bracket is followed by a punctuation mark): Text in&nbsp;brackets interests no one (usually).
  5. A dash is set off from the preceding word by a non-breaking space and from the following word by a regular space (including when a range is given in words rather than digits).
    Vitenka&nbsp;&mdash; well done!
    all we need is a cucumber fifteen&nbsp;&mdash; twenty centimeters long
    the Molotov&nbsp;&mdash; Ribbentrop Pact.
  6. If two numbers written in words do not form a range but mean “either one number or the other”, a hyphen is placed between them and is not set off by spaces: выпил два-три стакана (“drank two or three glasses”).
There is a recommendation to set the dash off by a thin space, or not to set it off at all from a period, comma or quotation mark. This can be justified when typesetting printed text in a specific font, since it makes the spacing more even. On the web, however, users’ fonts can be quite diverse, which is why the space to the left of the dash often ends up narrower than the one to the right.
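
The dash rule can be applied mechanically to text in which dashes are typed with plain spaces on both sides. The sketch below is a heuristic: it treats any spaced hyphen, en dash or em dash as a dash in running text, which would be wrong for arithmetic such as "5 - 2". The name `fix_dash` is hypothetical.

```python
import re

# A hyphen, en dash or em dash with a plain space on both sides.
_SPACED_DASH = re.compile(" [-\u2013\u2014] ")


def fix_dash(text: str) -> str:
    """Set a spaced dash as: non-breaking space, dash, regular space."""
    return _SPACED_DASH.sub("&nbsp;&mdash; ", text)


print(fix_dash("Vitenka - well done!"))
```

Unspaced hyphens, as in rule 6, are left untouched.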

Unwanted line breaks

  1. Short words and conjunctions (a, and, but, I, you, and so on) should be bound to the following word with a non-breaking space, because a short word hanging at the end of a line impairs readability. In particular, it is highly desirable not to allow a line break between a particle and the verb that follows it.
  2. The particles же, бы, ли should preferably be bound to the preceding word with a non-breaking space: то&nbsp;же, сказал&nbsp;бы, подумал&nbsp;ли я.
  3. It is advisable not to separate prepositions at the beginning of a sentence from the words that follow them (even prepositions longer than one or two letters).
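
The first rule can be approximated with a regular expression. The sketch below assumes that "short" means one or two letters and binds every such word to the word after it; a real tool would more likely work from an explicit word list. The name `bind_short_words` is hypothetical.

```python
import re

# One or two letters (any script) followed by a plain space.
_SHORT_WORD = re.compile(r"\b([^\W\d_]{1,2}) ")


def bind_short_words(text: str) -> str:
    """Attach each one- or two-letter word to the word after it."""
    return _SHORT_WORD.sub(lambda m: m.group(1) + "&nbsp;", text)


print(bind_short_words("I went to a shop"))
```

Because the pattern matches letters of any script, the same function works on Russian text.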

Source: https://habr.com/ru/post/23250/

