The first unit of information technology is the bit, the second is the qubit, and beyond that lies a dark forest. The purpose of what follows is to provide an initial development of IT, defining the bit as a boundary condition: the minimum unit of information.
Open the folder "/Logic/IT", create the directory "binary technologies" in it, take the halting problem in place of the coherence criterion of logical truth - its ultimate solution being the possibility of completely testing a program for the correctness of its implementation - and move on.
1. Digitality
So, bits are all around, and something needs to be done with them. What first? Perhaps pack them into a minimum unit of memory. There is no getting away from memory in IT; clearly it should be structured somehow, and the easiest way is to present it as a linear sequence of cells of equal width. If memory determines the spatial (static) aspect of an information system, then the temporal (dynamic) aspect corresponds to a process, which, regardless of its target objectives, can be represented as a set of actions that update memory. Thus we have memory and the devices controlling it, which determine the dynamics of the information process, together with an information structure defining how the devices interact with the memory and with each other. A reasonably rich architecture assumes a hierarchical management structure, and for all the potential diversity of its functional devices, the first step in developing binary technologies is to define a control device able to drive the whole system through a formal language - in a word, a processor is needed. A moderately rich information system can be managed by a single processor, although this solution can hardly be considered typical, since a "division of labor" is always welcome in information processes. As the tasks grow more complex, that number can be increased from one to, say, a million processors (distributed computing), which form their own management hierarchy. The upper limit is the "postman": if the complexity of the process being modeled exceeds the capabilities of this "postman" (for example, you need to shovel through the entire tree of chess variations in a second), then problems of that level will require quantum technologies, which admit qubits into the arithmetic.
This allows us to mark out the horizon within which the use of binary technologies remains expedient.
There are two principal ways of interpreting the data stored in memory: as data proper, and as code executed by the arithmetic logic unit (ALU), which defines the set of elementary ways of manipulating data that make up a formalized low-level language - an intermediary between the hardware and the abstract model of the information process. All control actions can be divided into internal ones, in which processor registers interact with each other and with the memory they address directly, and external ones, which implement the "metabolism" between the functional core and the "environment" through direct and feedback links. Thus, one of the top priorities in designing an information system is the creation of a machine language (assembler) defining a set of atomic actions that provide storage, exchange and transformation of data. Clearly, the device that handles all this should be logical and, to some extent, arithmetic - hence the abbreviation ALU.
At this stage we have an indefinite number of ways to combine a machine language with principles of communication between devices. The first step of formalization consists in defining rules valid for any functional system that combines centralized control of information flows through machine language with a linear, uniform-width memory organization. Historically, the concept of "bitness" (word width) lost its relevance because of the rapid development of IT under Moore's law, on account of which (the law, not Moore) everything had to be done hastily, holes plugged afterwards, and heaps of ballast information produced to solve compatibility issues. To avoid inconsistencies at least at the basic level of organizing an information system, it is advisable to postulate the word width as a fundamental constant, assuming that the opportunities thereby lost will be insignificant compared with the advantage offered by the mono-bit method of quantizing data.
2. Memory page
So, the next element in the list of definitions after the bit (memory, processor and devices we take for granted) is the word width. Having ticked the "mono-bit" option, let us go through the other global settings and see what values they hold by default.
By default we have a linear memory organization and, consequently, the most trivial way of numbering and identifying the data stored in it, while the contents of a cell, depending on the situation, can be interpreted either as data or as the code of a command being executed. Let us fix the key parameters determining the architecture of an information system which, given the agreements adopted so far, can be endowed with the predicates "binary" and "mono-bit":
R - bit depth (width of a memory unit in bits)
A - the number of addressable units of memory (the amount of address space)
C - the number of machine instructions (the amount of command space)
We set the key condition, based on the mono-bit principle, that ensures the integrity and completeness of the architecture:
A = C = 2^R
Compliance with this condition means that any memory cell can hold the address of any other cell, and likewise exhaustive information about the action being performed. Shrinking the address space invites references to non-existent memory, while enlarging it yields redundant information needed to address all of memory. The same applies to the command space (hereafter abbreviated KP).
Are there other conditions that meet the above criteria and are obvious without going into the details of the development of low-level architecture? Yes, at least one such condition exists:
R = 2^N
Compliance with this condition means that specifying the number of any bit within a cell takes an integer number of bits (here too, excess or deficiency is undesirable, for ease of writing and reliability of the machine code).
Obviously, N = 4 is the minimum acceptable value, since the 256 bytes of memory given by N = 3 are suitable only for programming Christmas-tree garlands. With N = 5 we get an amount of memory corresponding to the latest achievements of IT. But if we take the main purpose of processor-addressed memory to be storing machine code (the engine), and move most of the data manipulated by that engine outside the addressable space (with external devices as its carriers), then at this memory size (16 GB) the machine code would occupy only a small fraction, the rest being data. Whereas a megabit of machine code at N = 4, written in an efficient assembler with the data moved out of the address space, could perhaps hold the core of a full-fledged operating system. So, in principle, a mono-16-bit architecture can provide the necessary minimum of resources, namely: 65536 words of addressable memory and the same number of machine commands making up the complete list of elementary actions performed by the processor.
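These totals are easy to check; a minimal sketch (Python, purely illustrative) derives them from the conditions A = C = 2^R and R = 2^N:

```python
# A quick check (a sketch, not a spec) of the totals quoted above for
# N = 3, 4, 5 under the mono-bit conditions R = 2^N, A = C = 2^R.
def mono_bit_params(n):
    r = 2 ** n          # R: word width in bits
    a = 2 ** r          # A (= C): addressable cells / machine commands
    page_bits = a * r   # size of one fully addressable memory page, in bits
    return r, a, page_bits

for n in (3, 4, 5):
    r, a, page_bits = mono_bit_params(n)
    print(f"N={n}: R={r}-bit words, A={a} cells, page={page_bits} bits")
# N=3 gives 256 bytes; N=4 gives 1 megabit (65536 words);
# N=5 gives 2^37 bits = 16 gigabytes.
```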
The limitation is significant and creates a number of inconveniences, caused above all by the small amount of addressable memory, which forces the processed data outside the address space. It is convenient when data are "at hand" - directly accessible to the processor - rather than "somewhere out there" in external memory, from which they must be fetched into internal memory, worked on, and sent back. This is perhaps the bottleneck of the mono-16-bit architecture, and it threatens to complicate the high-level language, since programs written in it will not be able to abstract away the nuances of organizing data exchange with external devices. Proper placement of accents allows all these problems to be solved; in general, when writing a program it would do the programmer no harm to picture the dynamics of the information process in full, and not at the level of abstract abstractions. Note also that the ideology of programming in this case shifts the emphasis towards holistic, complete solutions that fit into one memory page of a megabit - a considerable volume, enough for quite complex functionality. Fundamental ways of resolving the problems arising from reducing the addressable memory to 64 kilowords will be discussed later; here I want to focus on the benefits of taking N = 4 as a fundamental constant.
First of all this concerns the command space (KP): 16 bits is enough to implement the ALU in its classic form. Indeed, the growth of modern processors' word width to 32 and then 64 bits had practically no effect on their machine instruction sets: in essence, the assembler remained 16-bit. In principle nothing prevents inventing a machine language whose command system effectively uses 32 or more bits, but as practice shows there is no urgent need for it, so it is advisable to take a 16-bit KP volume as the default. In addition, N = 4 provides a further convenience:
R % N = 0
Compliance with this condition means that an integer number of values in the range [0 ... R) fits into one memory cell (for N = 5, for example, the condition fails: R = 32 bits is not divisible by 5 without remainder). Since the word width is a fundamental parameter, this property will bring additional benefits later, though compared with the previous arguments it is of third-order importance.
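A one-line check (illustrative) of which widths satisfy this divisibility:

```python
# Sketch: which widths R = 2^N satisfy R % N == 0, i.e. pack an integer
# number of bit-index values into one word without remainder.
divisible = {n: (2 ** n) % n == 0 for n in range(1, 9)}
print(divisible)   # True for N = 1, 2, 4, 8; False for N = 3, 5, 6, 7
```

Note that after N = 4 the condition next holds only at N = 8, i.e. a word width of 256 bits.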
3. External RAM
Thus, adopting the number 16 as a fundamental constant determines the basic compatibility of the spatial and temporal aspects of the information process, namely: a memory page of 64 kilowords and a list of atomic actions of the same size and format. Hence the notion of word width, providing this common volume and format - a megabit "porridge" of zeros and ones, primarily structured into quanta of 16 bits each constituting a data word. As a result, a single memory cell can hold the address of any other cell, or exhaustive information about the action currently being performed. It is assumed that regardless of the details of the system architecture's implementation, the word width ensures the compatibility of its functional components at the first level of abstraction from physical processes, thereby making the architecture easier to formalize. At the same time, the opportunities forgone by accepting such a restriction are assumed to be insignificant - that is, I proceed from the assumption that if these initial resources are managed rationally, an effective information model of, say, any complexity within the "postman" can be built on their basis. A megabit of data is not the upper limit of information resources, but the next (specifically, the third, after bit and word) level of formalization. The growing complexity of the modeled processes naturally leads to a hierarchical management structure, and assuming the word width remains relevant for the next steps of their organization, we obtain an amount of external RAM convenient for its compatibility with the 16-bit architecture: 65536 pages * 65536 words per page = 8 GB. Translated into conventional bytes, as you can see, this figure agrees well with current requirements for resource-intensive technologies.
The principal difference is that this shared memory cannot be addressed by the processor directly - or rather, two memory words are needed to do so. The latter is not a problem in itself; however, since the devices and processors making up the system must somehow divide this memory among themselves, many nuances arise in distributing it and ensuring access to it. Assuming the way this is done to be initially arbitrary, I will nevertheless try to highlight the key points that a formalized model of the information process, built on the mono-bit philosophy, must necessarily take into account.
4. Integrated system
Up to this point, the addressable memory was assumed to be machine-code storage, though this did not exclude storing data in it. But a memory page cannot be processed directly as a whole, because at least some part of it is taken up by the processing program itself. Since the page is to be accepted as the standard unit of data fragmentation, this problem has to be solved somehow. The easiest way is to include an additional page in the processor's visibility zone, serving as a buffer that exchanges data with external memory without affecting the machine-code page (with two status bits allocated for source and receiver respectively, so that data can be shuttled both within either of the two memory pages and between them). Thanks to the small page size, it will not be technologically difficult to transfer data in one frame, without splitting the megabit into one-word quanta - which reduces to nothing the time cost of fragmenting large volumes of external data intended for processing. One can also avoid shuttling data back and forth altogether by using substitution: selecting any of the 65536 pages of external RAM and addressing it directly by the processor. If the same is done with the code page, programs stored in external memory can be executed, removing the 64-kiloword limit on their size and extending it to 4 gigawords.
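To make the substitution idea concrete, here is a toy model (all names are mine and purely hypothetical - the text does not prescribe an implementation) of external RAM and a processor that maps its data page onto any of the 65536 external pages without copying:

```python
# Toy model: external RAM as 65536 pages of 65536 words each;
# "substitution" remaps the processor's data page onto any external
# page, so writes go straight through with no shuttling of data.
PAGE_WORDS = 65536

class ExternalRAM:
    def __init__(self):
        self.pages = {}                      # page number -> list of words

    def page(self, n):
        if not 0 <= n < 65536:
            raise IndexError("page number must fit in one 16-bit word")
        return self.pages.setdefault(n, [0] * PAGE_WORDS)

class Processor:
    def __init__(self, ram):
        self.ram = ram
        self.data_page = ram.page(0)         # currently substituted page

    def substitute(self, n):
        """Select external page n for direct addressing by the processor."""
        self.data_page = self.ram.page(n)

ram = ExternalRAM()
cpu = Processor(ram)
cpu.substitute(42)
cpu.data_page[0] = 0xBEEF                    # write through the substituted page
assert ram.page(42)[0] == 0xBEEF
```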
Based on the criterion of whether substitution of a code/data page from external RAM into the processor-addressed area is admissible, it is advisable to divide processors into leading and driven. On the one hand this separation is conditional, since the management hierarchy can be structured however you like; on the other, it makes sense to single out one or several processors that form the upper level of system management and are capable of configuring it - the operating system kernel, the external memory manager, a BIOS analogue, and others that rely on private storage of executable code and for which all these data substitutions (and especially unknown code from external RAM) would be undesirable for reliability reasons. Indeed, the number 16 was derived precisely in order to provide a natural primary markup of information resources according to the requirements for speed and the degree of autonomy of the processes operating on them. It is all rather trivial: starting from the idea that binary code is the basis for data manipulation, we take a two and write out its fundamental sequences:
n^2 | 2^n
- | 2^0 = 1
2*2 = 4 | 2^1 = 2
4*4 = 16 | 2^2 = 4
16*16 = 256 | 2^4 = 16
256*256 = 65536 | 2^16 = 65536
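The two sequences written out above can be checked with a short sketch:

```python
# Sketch: the two "fundamental sequences of two" tabulated above --
# successive squares of 2, and the tower p -> 2^p -- which meet at 65536.
squares = [2]
while squares[-1] < 65536:
    squares.append(squares[-1] ** 2)

towers = [1]
while towers[-1] < 65536:
    towers.append(2 ** towers[-1])

print(squares)   # [2, 4, 16, 256, 65536]
print(towers)    # [1, 2, 4, 16, 65536]
```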
As you can see, the number 16 is not only the minimum allowable value but also the "roundest" among its neighbors - that is, without going into the implementation details of the information model, we can note the convenience of portioning information in quanta of exactly this width (the condition R % N = 0 above is next satisfied only at a width of as much as 256). These considerations also suggest that the initial unfolding of the mono-16-bit principle into a management hierarchy agrees well with practice: information processes, depending on the degree of autonomy of the system element in question, are differentiated by a performance criterion inversely proportional to the information capacity of the process. The "cache", laid out by levels of system organization - "first-level cache", "second-level cache" - usually fits the whole hierarchy into a few floors. With R = 16 we have the following indicators:
0th-level cache (one 16-bit word): the processor registers intended for addressing internal memory (these exchange data among themselves fastest, with no need to go beyond the physical locality of the ALU).

1st-level cache (a memory block of 64 kilowords = 1 megabit): the internal memory pages intended for direct addressing by the processor (one or two - more, given the possibility of a full page refresh in a few processor cycles, are unnecessary). For this type of memory I will fix the abbreviation AM - addressable memory.

2nd-level cache (external RAM with a capacity of 64 kilopages = 64 gigabits): memory shared by the devices making up an integrated system - an autonomous unit, a complete model of an information process capable of interacting with other systems on a common basis (from the system's point of view, the device providing intersystem interaction is just one more device, whose functioning can be described by the same expressive means that reflect data exchange within the autonomously operating system).

3rd-level cache (ROM with a capacity of up to 64 kilo-RAMs = 4 petabits): a disk of this size would hold, for example, every movie ever made, so the information capacity of this level can be considered unlimited. The next multiplication by 65536 can likewise be considered irrelevant: operating with such volumes blurs any idea of what the data could be for and how it could be structured. Besides, this memory is too slow to count as a "cache" - so, unlike the previous levels, there is no sense in strictly limiting it. The 4 gigawords of RAM are assumed sufficient that data need not be shuttled to ROM and back too often; if the computation proves more resource-intensive, the natural solution is a sufficient number of processors with the functional load distributed effectively among them.
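The recursive multiplication by 65536 that generates these levels can be checked with a short sketch (level names follow the text; sizes in bits):

```python
# Sketch: each storage level multiplies capacity by 65536 = 2^16,
# reproducing the figures quoted for the hierarchy above.
SCALE = 65536
bits = 16
for name in ("word", "memory page (AM)", "external RAM", "ROM"):
    print(f"{name}: 2^{bits.bit_length() - 1} bits")
    bits *= SCALE
# word: 2^4 bits; page: 2^20 bits (1 megabit);
# external RAM: 2^36 bits (64 gigabits = 8 GB); ROM: 2^52 bits (4 petabits).
```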
So, on the basis of the word width, the recursive method forms 3 levels of abstraction, defining the structural nesting depth of processes named by an enumerable set of language constructions, which in turn define the format and boundary conditions of applicability of the actions associated with them:
0 => 1: the system of machine instructions, exhaustively defining the primary way of quantizing actions.
1 => 2: a high-level programming language (its object orientation is assumed natural, and with it the specialization of external RAM for storing objects).
2 => 3: the language of system queries (macros) - for example, a command line or address bar allowing complex sequences of actions to be performed by a single command with a variety of parameters. In effect, this level can be considered the sphere where the key functions of the operating system are formalized.
3 => 0: initially assumed conditional in view of the conditional inexhaustibility of the resources it comprises. Proceeding from the idea that a single point action "by a mouse click" can involve an information process of any capacity and complexity, we may place here any named actions that did not fall into the three previous categories of language abstraction. What allows this level to be endowed with specificity is its closure onto the level of "iron" - hence the considerations that follow.
The formal language of the next level (if we count it not as zero but as fourth) implies going beyond the autonomous system, and should accordingly be designed to name actions that organize intersystem interaction. But as has been said, intersystem data exchange forms its own hierarchical structure - starting with low-level protocols for digitizing analog signals, through the simple and universal TCP/IP protocol on which the world wide web rests, and so on - with implementations becoming ever more difficult as the application horizons expand. Here it is enough to note that the formalization of network messages has its own specifics and, consequently, its own terminological structure - beginning with simple algorithms for digitizing continuous signals, followed by the step-by-step reduction of this "porridge of bits" into a form "edible" for the programs "consuming" it, themselves lining up into a hierarchy as the principles of organizing data exchange grow more complex from low to high.
That is, all this should proceed in parallel with the development of the formalization languages, without exerting significant influence on them, being essentially just one of an indefinite set of task categories those languages solve. And since all these issues should be settled at the operating system level, the methods of their formal display should somehow be distributed across the first three levels of system organization. For example, TCP/IP itself is written in assembler, a program converting data packets into objects is written in C++, and support for the address bar is written as an OS macro - the macro being a quite tangible concept in terms of functionality and capacity, operating on objects already laid out in RAM by object-oriented means. It may contain nested macros, each performing rather complicated actions on its own - so that with further deepening of nesting there is no reason to introduce fundamentally new information.
The next level of abstraction from specificity (that is, from physical processes localized in the hardware) should further increase the expressive power of the conceptual base - that is, the transition should be qualitative, rather than merely increasing the resource intensity of the abstractions associated with it. These conditions are satisfied by the possibility of automatically constructing an architecture of arbitrary complexity at the physical level from an information model exhaustively described by a formal language. That is: we draw a communication scheme in the editor, define input/output connectors for attaching external devices, select firmware for the one or several processors making up the functional core of the system, press the big "create" button - and the device this program drives (hypothetical for the moment) prints circuit boards, packed into a box inside which the required architecture is implemented, no worse than a printer prints electronic documents. We connect external devices, click "start", and everything works - provided, of course, that the engines "wired" into the program memory of the control processors are correctly written and fit for collaboration. The classic implementation assumes a single processor on which, among other system utilities, runs a program capable of loading the remaining kernel engines. Those, in turn, can employ whatever other processors the system provides, according to the target tasks - working with specific devices, implementing multitasking, or organizing distributed computing. In principle, a variant with one processor carrying the entire system is admissible, although this approach is fraught with a serious load on performance.
Now these details are not so important; here I would like to focus on the expediency of singling out, among the language abstractions of the closing level, the term firmware - a program (static code) compiled into machine language, no larger than one page of memory, intended for execution by one processor, and implementing some holistic solution for managing an information process.
5. Arithmetic and geometric aspects of the information process
Here I will sum up the interim results.
So, the word width R = 16, fixed as a fundamental parameter of the information system, unites the spatial (AM) and temporal (KP) aspects of the information process, ensuring the basic compatibility of functional devices and thereby simplifying the construction of formalized models of various processes. Given such initial resources, the recursive method successively determines the three levels of system organization listed above, implying a mapping onto a hierarchy of language abstractions of corresponding expressive power. To adapt the engineering terminology {bit => cell => AM => RAM => ROM} for linguistic purposes, I suggest associating the sequence {letter => word => page => volume => library}.
The last level of abstraction, which closes the control hierarchy back to zero (that is, back onto physical processes, only from the reverse side), falls outside this list, since its elements make sense only once there is a ready architecture for them to build upon - while at the design stage this architecture has yet to materialize. In view of the above, the design method consists in determining the communication scheme and choosing the firmware of the control processors. Hence there are two principal objects of formalization: the first comprises the information channels for exchanging data and the communication nodes tied into them; the second is machine code quantized into elementary data-conversion actions. The first aspect may be called "geometric", since its formal display is conveniently graphical - diagrams with arrows indicating the direction of information flows in the system, plus rectangles and other symbols for its functional components. The second, accordingly, is "arithmetic", since the processor's instruction set is oriented towards basic mathematical calculation. One can also state the criterion that fundamentally separates their functional purposes: "geometric" abstractions determine how data transfer between devices is organized, while "arithmetic" ones determine how data is transformed.
Both aspects should be reflected in the command system; that is, C = G ∪ A, where C is the command space with a total volume of 65536 machine instructions, G is the set of instructions initiating external data-exchange mechanisms, and A is the set of internal addressing commands working through registers. Since the instructions making up G abstract from data conversion and essentially reduce to a command MOV source -> destination, their formal definition reduces to identifying the source and destination. But while the set A obviously contains within itself all its methods of identification, the commands of G, engaged in the "metabolism with the environment", cannot know in advance from where and to where they will have to transfer data, since the communication scheme is not represented inside the ALU. For such knowledge to be available, the scheme would have to be fixed at system design time - which would deprive the design process of flexibility, so further consideration of that route makes no sense. If, on the other hand, the functionality of the G commands is taken as indefinable, and given that the essence of their action always reduces to shifting data from some source to some receiver, the load of G on the command space can be neglected and reduced to one element - a single parameterless command (I will call it, tritely, CMD). The information about what exactly is to be sent, and where, can then be drawn from registers referring to arbitrary areas of the AM and external RAM - so there are no fundamental restrictions on the volume or structure of the target data that CMD manipulates.
In the proposed case, information about this method is transferred to the hardware implementation of the part of the processor responsible for external communications, and so becomes an integral part of the circuitry. The KP now belongs almost entirely to the set A, and the tasks of formally defining the machine instruction system and the principles of organizing data management no longer prevent each other from being solved independently.
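As an illustration of this reduction (the opcode value is my assumption, not the text's), a toy decoder splits the command space into the single CMD and everything else:

```python
# Toy illustration: with the whole "geometric" set G collapsed into one
# parameterless CMD, decoding an opcode splits the command space into
# CMD and the internal set A.
CMD = 0x0000   # hypothetical opcode reserved for the single exchange command

def decode(opcode):
    if opcode == CMD:
        # Source and destination are taken from registers pointing into
        # AM or external RAM; the routing itself is wired into hardware.
        return "G"
    return "A"   # internal register/memory operation

assert decode(CMD) == "G"
assert decode(0x1234) == "A"
# Of the 65536 opcodes, 65535 remain available for the set A.
```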
To summarize: knowing the communication scheme and the contents of the leading processors' code memory, and assuming both the correct operation of each firmware separately and the reliability of their joint operation, we can give a correct and comprehensive formal description of the system as a whole.
6. ALU
For the time being I leave the geometric aspect aside, assuming that a necessary and sufficient minimum of expressive means, optimally consistent with the ideas of low-level programming, can be determined for it. The subject of my immediate interest is the most effective way of distributing the KP - that is, the task of defining the 16-bit machine instruction system that best matches the principle of necessity and sufficiency ("there is everything that is needed and nothing superfluous"). At the first step this correlation can be traced clearly and unambiguously by identifying 5 necessary and sufficient entities that ensure the full functioning of a program:
1. PC (Program Counter) reflects the principle of step-by-step program execution in the direction of increasing command addresses. The standard sequence of actions with the program counter: send to the ALU the command whose code is contained in the cell of addressable memory pointed to by the PC; then increment the PC (it now points to the word following the code of the instruction being executed).
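The fetch step just described can be sketched as follows (names are illustrative):

```python
# The word addressed by PC is handed to the ALU, then PC is incremented so
# that it points at the next word of addressable memory.
def fetch(ram, pc):
    instruction = ram[pc]  # send the cell PC points at for execution
    return instruction, pc + 1

ram = [0x1234, 0x5678]
instruction, pc = fetch(ram, 0)
print(hex(instruction), pc)  # 0x1234 1
```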
2. SP (Stack Pointer) reflects the principle of building a program as a hierarchical structure of subroutines. The return address must be remembered so that control can come back to the place a subroutine was called from, and since calls can be nested, some memory must be allocated to store all these addresses. Hence the organization of the stack according to the LIFO principle (Last In, First Out) and two principal ways of using it: pushing the current address onto the stack (PC -> (-SP)) and popping it from there back into the program counter ((SP+) -> PC). Other ways of using the stack are possible, but these two actions are what is necessary for the program to be structured by classical means. Note: here and below, the signs of pre-decrement (-) and post-increment (+) are placed immediately before or after what they apply to; various types of addressing will appear later in the text, among which there are cases of ambiguous interpretation when the PDP-11 assembler mnemonics are used, and that instruction set will serve as a useful basic prototype on first consideration.
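The two stack actions can be sketched like this (the RAM size and layout are assumed for illustration): CALL pushes the return address with pre-decrement, PC -> (-SP), and RET pops it back with post-increment, (SP+) -> PC, which gives exactly the LIFO nesting described above.

```python
class Machine:
    def __init__(self, size=256):
        self.ram = [0] * size
        self.pc = 0
        self.sp = size  # the stack grows downward from the top of RAM

    def call(self, target):
        self.sp -= 1                 # pre-decrement
        self.ram[self.sp] = self.pc  # remember the return address
        self.pc = target

    def ret(self):
        self.pc = self.ram[self.sp]  # restore the return address
        self.sp += 1                 # post-increment

m = Machine()
m.pc = 10
m.call(100)  # outer subroutine
m.call(200)  # nested subroutine
m.ret()
print(m.pc)  # 100: back in the outer subroutine
m.ret()
print(m.pc)  # 10: back in the main program
```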
3. STATUS contains the flags needed for conditional jumps. A condition is either satisfied or not, and state flags are needed to determine which. There are only three obligatory flags, reflecting the results of the simplest mathematical operations: zero/non-zero, positive/negative, and carry/no-carry (the latter is needed so that bits carried out of a memory cell are not lost). For compatibility with the 16-bit architecture it is certainly convenient to expand the status register to 16 flags, for which applications will not be hard to find later; at the first stage of formalization, however, I single out only the elements of the first degree of necessity. To these we must add the two flags mentioned above, by switching which one can address, in addition to the code memory, an additional data page. The remaining 11 flags, the distribution of whose functions requires more imagination than is permissible at the first step of formalization, are for now beyond the scope of consideration.
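How the three obligatory flags fall out of a 16-bit operation can be shown on addition (the flag names z, n, c follow the text; the function itself is my illustration):

```python
def add16(a, b):
    full = a + b
    result = full & 0xFFFF
    z = result == 0            # zero / non-zero
    n = bool(result & 0x8000)  # most significant bit: positive / negative
    c = full > 0xFFFF          # the bit carried out of the cell is not lost
    return result, z, n, c

print(add16(0xFFFF, 1))  # (0, True, False, True)
```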
4. R: general-purpose registers play the role of local variables serving the immediate needs of the subroutine currently being executed. They can also transfer values between subroutines, interact with the stack, and address memory; in a word, unlike the first three they are unspecialized. Their number does not take long to choose: 16 is optimal not only for ideological reasons, but because from some experience of programming in assembler I can say that 8 is not enough (these registers tend to be in great demand, so with that few of them the code wastes itself on temporarily shuffling their values to the stack and back), while 32 registers for local purposes is already overkill (variable values can, after all, also be stored in addressable memory). Further on, for brevity, general-purpose registers will be denoted RON, and special-purpose ones, respectively, RSN.
5. Interrupt: interrupts from external devices. Since the command system being defined abstracts from external information flows, it is enough to know that an interrupt can occur at any time, that a subroutine is needed to process it (and hence an indication of the address at which it starts), and that it must be possible to return to the place where the main program was interrupted. So as not to multiply entities beyond need and not to fill memory with redundant data (for example, a table of handler addresses for every interrupt), we take address zero as the start of processing for any interrupt, and alongside the return address and the current value of the status register we push information about the interrupt itself. If that information is not sufficient to determine the actions needed to handle the interrupt, additional data can be loaded into addressable memory by the command(s) from the set G. The performance penalty compared with the address-table method is insignificant: the handler located at address zero merely has to begin with a number of checks and conditional jumps that branch on the interrupt number placed on top of the stack at the moment it occurred. The checks themselves can be ranked by priority, so that the interrupts that distract the processor most frequently and demand the fastest processing are checked first. These minor costs in speed are more than compensated by the monolithic nature of the addressable memory page containing the executable code, which benefits the integrity of the software implementation. Moreover, all internal information flows are then concentrated in three specialized registers and the general-purpose registers RON, which is necessary and sufficient to provide the basic functionality of low-level programming tools.
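The zero-address dispatcher can be sketched as a chain of checks on the interrupt number taken from the top of the stack (the handler names and numbers below are invented for illustration):

```python
# The interrupt number is pushed on top of the stack when the interrupt
# occurs; the handler at address 0 branches on it, with the most frequent
# sources checked first, as the text recommends.
def dispatch(stack):
    irq = stack.pop()  # the number placed on the stack at interrupt time
    if irq == 3:       # highest-frequency source is checked first
        return "timer"
    if irq == 1:
        return "uart"
    return "spurious"

stack = [0x00F0, 0x1234, 3]  # saved STATUS and return address below, number on top
print(dispatch(stack))  # timer
```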
An architecture like this is easy to keep in mind, and the memory can be distributed any way you like, provided that what comes from address zero is executable code rather than data, and that the program correctly processes interrupts if they are provided for. At processor start all registers are reset; the PC is incremented after the next command is sent to the ALU, and the SP is decremented before the next data word is sent to the stack, so with initial zero values no overlap should arise between the stack and the program code.
With this, the first step of formalization can be considered complete, and next I will try to give an idea of how much a megabit of command space "weighs". If you are familiar with the PDP-11 processor architecture, the further calculations will be easier to visualize. In ancient times programmers used those machines, until they were supplanted by the hostile ideology of the Intel engines, and nostalgia for the times when the number of "holes" in the organization of computer architecture was still tolerable helped me dig this topic up from the beginning. From this point of view Intel's assembler can be considered an exemplary study in how not to do it, since the key inconsistencies with the approach proposed here show up on first acquaintance with its mnemonics. For example, a two-operand command should name the source first and the destination second, in the natural sequence reflecting the causal relationship between the physical processes taking place in the hardware; Intel's notation reverses it. Likewise, Intel's machine instructions make no explicit use of the PC: clear signs of a CISC-oriented ideology that seeks to hide the hardware implementation of machine instructions "behind the screen". In short, RISC rules, and CISC is left by the wayside.
In the next section the degree of subjectivity in my reasoning will increase somewhat, and it will become more focused on enthusiasts of low-level programming.
6. Machine instruction system
So, we have at our disposal 3 special and 16 general registers, to which a number of memory addressing methods apply. In its initial state the memory comes clean and "unspotted" by specialized information, requiring no entities beyond those listed at the first iteration step. The processor starts from address zero of the code page, sequentially executes commands, performs conditional and unconditional jumps, calls subroutines, and occasionally diverts to interrupts (the latter, generally speaking, need not even be mapped into the set A, since the return from an interrupt can borrow the usual way of completing a subroutine: the RET command, which boils down to restoring PC and STATUS from the stack so that the program can correctly continue its work).
To make it easier to "weigh" the command space offhand, note that most of its volume goes to machine instructions with two operands (three-operand ones can certainly be dispensed with: they clearly do not fit into the command space, and they sit poorly with the low level anyway). Three bits of the 16 are "bitten off" immediately for the command number, to accommodate the classic seven {MOV, CMP, ADD, SUB, AND, OR, XOR}; the loss of any of these would mean a significant loss of capability, while there is no urgent need to add new commands to the list, since everything else can be done through them. Multiplication and division are implemented algorithmically; the listed commands, unlike them, have an elementary hardware implementation on multiplexers. It is advisable to include this requirement in the list of default decisions (say, by ticking a "trivial instructions only" option): every command making up the instruction set should be no more complex than the addition/subtraction operations. At the level of information system design it makes no sense to complicate the processor, since, given the possibility of assembling it, ideally, atom by atom, all problems of distributing the computational load can be solved by adjusting the relative position of processors in space, where the speed of light is the only limit on the speed of their interaction. It will therefore be preferable to achieve the required speed by increasing the number of processors and organizing their effective interaction in software, while each processor separately, owing to the simplicity of its design, can deliver sufficiently high performance. So the RISC ideology is clearly friendly to the proposed approach, all the more so because room in the command space is so tight. Finally, the main convenience of adopting the agreement on the triviality of the machine instructions making up the ALU is that it levels actions by complexity at the fundamental level of designing information systems; in this case the term "low-level language" will fully correspond to the desired solution.
Further evaluation of the information capacity of the command space rests on the observation that 7/8 of its volume is occupied by instructions with two operands, one represented in the command code by the source addressing field and the other by the destination addressing field. An addressing field must carry complete information both about the register through which the addressing is performed and about the addressing method itself. There are 19 registers in all, so indicating a register number requires on average log2(19) bits. Three bits went to the command number; the remaining 13 give two addressing fields of 6.5 bits each, so we have 2^[6.5 - log2(19)] ~ 4.8 addressing modes per operand. The PDP-11 instruction set uses 3 bits for the addressing mode and the same number for the register number; that is, it covers 8 registers and 8 ways of addressing through them, with PC and SP counted among the RONs. That property is useful in that it makes the command system orthogonal, letting one memorize the instructions separately from the methods of access to their operands (which is precisely what endeared these processors to programmers in their day). Here, however, this approach is unacceptable, for two complementary reasons:
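The bit-budget estimate above can be reproduced directly:

```python
# 3 opcode bits leave 13 for two addressing fields of 6.5 bits each; naming
# one of 19 registers costs about log2(19) bits, so roughly 4.8 addressing
# modes remain per operand, as quoted in the text.
import math

opcode_bits = 3                      # 2**3 = 8 slots cover the classic seven
field_bits = (16 - opcode_bits) / 2  # 6.5 bits per addressing field
register_bits = math.log2(19)        # ~4.25 bits for one of 19 registers
modes = 2 ** (field_bits - register_bits)
print(round(modes, 1))  # 4.8
```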
1. For the specialized registers (chiefly the PC), many addressing modes are inapplicable, because they lead either to a guaranteed hang ((-PC)) or to a program crash from executing data instead of code ((PC)). An orthogonal set thus contains a considerable number of instructions that are formally admissible but inapplicable in practice. Direct addressing of the RSNs should generally be avoided; from the point of view of addressing, this is their fundamental difference from the RONs, for which direct addressing is natural, while using RSNs for temporary storage of values is understandably unacceptable. That is, for PC and SP the point of departure is indirect addressing rather than direct, because what interests us is not the value of these registers as such but the contents of the cell they refer to. This does not mean completely excluding direct addressing of the RSNs from the ALU: if the programmer needs, say, to restore the stack directly or to jump farther than local transitions allow, appropriate instructions should be provided. The difference is that such instructions enter the command system not on a general basis but as a small number of specialized ("piece") commands, so as not to burden the command space with practically inapplicable addressing modes. Take, for example, applying the bitwise operations AND, OR and XOR to PC and SP: hypothetically this is permissible, but it would take considerable imagination to invent a situation where such a need arises (and if it does, the worst that happens is having to do the same thing with two commands instead of one). Thus the use of the RSNs in a program has its own specifics, in accordance with which the set of permissible actions on each of them should be determined.
There are other fundamental differences from the PDP-11 command set as well, because of which its further comparison with the ALU implementation defined here will become less and less revealing.
2. The second reason, as already said, is that there is too little room in the command space to allow the kind of largesse described in the previous paragraph, so the orthogonal command set has to be abandoned. The increase in the number of RONs from 6 (PDP-11) to 16 entails a reduction in the number of addressing modes from 8 to an average of 4.8 per operand, so that addressing becomes the "bottleneck" of a 16-bit ALU using 19 registers in total. For the RONs, for example, only 4 standard addressing modes fit into the command space: R, (R), (R+), (-R). In principle there is no special need for more, since double indirect addressing (taking the address of an address) is, for the reasons mentioned above, relevant only for PC and SP. Of the rest of the 8 PDP-11 addressing modes one can say that they would not hurt, but excluding them is no great loss, given how atypical the RONs look in those other modes. Generally speaking, only two addressing modes are applicable to the PC: (PC+) and &(PC+) (call the first "indirect auto-increment" and the second "double indirect auto-increment"). The post-increment is needed to jump over the data word after the ALU has taken it for processing, so as to avoid executing a random command in place of the intended next one. The first mode therefore means taking a constant, the value that at the moment of execution sits in the cell of addressable memory following the code of the command (call this cell the "current" one), and the second means accessing a variable (this time the current cell contains the address of the cell where the variable's value is stored: hence the double indirection).
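The four RON addressing modes just listed can be sketched as a decoder (the mode names follow the text; the dictionaries standing in for registers and memory are my illustration):

```python
def operand(mode, reg, regs, ram):
    if mode == "R":     # register direct: the value is in the register
        return regs[reg]
    if mode == "(R)":   # register indirect: the register holds an address
        return ram[regs[reg]]
    if mode == "(R+)":  # indirect with post-increment: stepping through an array
        value = ram[regs[reg]]
        regs[reg] += 1
        return value
    if mode == "(-R)":  # pre-decrement then indirect: stack-like access
        regs[reg] -= 1
        return ram[regs[reg]]

regs = {0: 5}
ram = {4: 40, 5: 50}
print(operand("R", 0, regs, ram))     # 5
print(operand("(R+)", 0, regs, ram))  # 50, and regs[0] becomes 6
```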
Using the first mode makes sense only for the source, since data is usually sent from a constant and not the other way round, into a constant. In principle nothing prevents writing arbitrary data at the current address, but that only makes sense if it is meant to be used later, for which the address of the current cell must be known. The constant then turns into a variable that could just as well be stored in any other cell; that is, on the destination side this addressing mode is redundant, and under a meticulous selection of addressings by the criterion of utility it should be deleted. Thus, if for the PC there is no sense in addressing beyond these two modes (formally not even two, but one and a half), then for the RONs more modes simply do not fit into the command space, and while we stay within the classical implementation of the ALU these modes can be considered necessary and sufficient.
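The two PC addressings can be sketched side by side ((PC+) takes a constant from the cell following the command code; &(PC+) treats that cell as the address of a variable; the memory layout below is illustrative):

```python
def pc_immediate(ram, pc):  # (PC+): the current cell holds the constant
    return ram[pc], pc + 1  # post-increment skips the data word

def pc_indirect(ram, pc):   # &(PC+): the current cell holds an address
    return ram[ram[pc]], pc + 1

ram = {1: 7, 2: 10, 10: 99}  # cell 10 stores a variable's value
print(pc_immediate(ram, 1))  # (7, 2)
print(pc_indirect(ram, 2))   # (99, 3)
```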
For SP and STATUS, on the other hand, there is enough room in the command space to provide richer addressing. The stack is usually used often enough in a program to deserve attention and a more solid set of addressing modes. In the current implementation this set contains 16 modes, half of them addressing with an offset: {N(SP), N+(SP), -N(SP), &N(SP), &N+(SP), &-N(SP), &N(SP)+, &N-(SP)}, where N is the offset byte, for which half of the status register is allocated; that is, the offset from the top of the stack cannot exceed 255. Unlike in the PDP-11 instruction set, here N is a variable rather than a constant, which allows auto-increment/auto-decrement to be applied to it, extracting values from different places in the stack while leaving its top untouched. As a rule we are interested in data actually pushed onto the stack, not lying above its top (cells above the top are conventionally regarded as already worked out), so it is impractical to include negative values in the offset range. The other half of the addressing modes is as follows: {(SP), (+SP-), (-SP+), &(SP), &(SP+), &(SP-), &(SP)+, &-(SP)}. To avoid confusion in interpreting the second and third elements of the list, I note that they are among the few exceptions to the asymmetry of addressing between source and destination (here the source side is assumed to use post-increment/decrement), while the last modes use the stack's target cell as a counter initialized with the starting address of a memory array. The elements of the list were selected by the principle of least artificiality and greatest utility, and every time I review this collection, its composition, tailored to the "round" number 16, strikes me as the best.
The status register turned out to be the most fruitful in the variety of its addressing methods. Asking how best to deal with the remaining 11 bits, I concluded that STATUS is best split into two bytes, one containing special-purpose flags and the other general-purpose ones (by analogy with the basic classification of registers). Each byte is in turn divided into two nibbles. The first specialized nibble contains 4 arithmetic flags, 3 of which were described above; the second contains 4 "geometric" flags, 2 of which are reserved as the switches of the addressable memory pages. The arithmetic flags can be "finished off" with a parity flag checking the low-order bit of the result (for symmetry with the sign flag checking the most significant bit), and the two remaining "geometric" flags are reserved for external communication functions not yet defined. The special-purpose flags lie beyond further consideration; from the point of view of addressing it is the general-purpose byte that is of particular interest here, one example of which has already appeared above (in connection with SP addressing with offset). Denoting its nibbles by the variables Bx and By, I present the initial list of status register addressings: {B, Bx, By}; that is, direct addressing applies to the byte as a whole and to each nibble separately, and values can be stored there and converted on a common basis. But the main goal of this further fragmentation of the status register was to store a register number in a nibble and thereby make indexing of the RONs possible: {Rx, Ry, (Rx), (Ry)}. Example: MOV (Rx), (-SP) pushes onto the stack the value of the variable whose address lies in the register whose number is given in Bx.
Now short arrays can be kept in the RONs and referred to by value; in general, an additional floor is formed in the memory access hierarchy: a 4-bit mini-register selects one of 2^4 = 16 RONs (a 16-bit word), which in turn can refer to a cell of the 2^16 = 65536-word addressable memory (a page). This is very useful and promising, expanding the possibilities for structuring data processing while being undemanding of room in the command space. Another possibility I think should not be neglected is to associate the general-purpose nibbles with the arithmetic flags (appearing below under the indices z, n, c, p): {Rz, (Rz), Rn, (Rn), Rc, (Rc), Rp, (Rp), Bz, Bn, Bc, Bp}. Example: MOV (SP+), (Rn) takes the value from the top of the stack and sends it to the cell whose address lies in the register whose number, depending on the state of the flag n, is taken from either Bx or By (the remaining actions in the list can be defined by analogy). The proposed use of the status nibbles significantly increases the expressive power of the abstractions making up the command space, while the hardware implementation of the associated actions stays within the bounds of triviality, a necessary condition for conforming to the RISC ideology. Here, incidentally, the above-mentioned advantage of 16 as the "roundest" number comes in handy: the 4-bit register number fits into another 16-bit register (in this case the status register) exactly 4 times.
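Nibble indexing can be sketched as one extra level of indirection (the bit layout of STATUS below is assumed for illustration):

```python
# A 4-bit field of STATUS names a RON, which in turn can hold the address
# of a memory cell: mini-register -> register -> cell.
def ron_via_nibble(status, which, regs):
    shift = 0 if which == "x" else 4
    number = (status >> shift) & 0xF  # the register number kept in Bx or By
    return regs[number]

regs = [0] * 16
regs[7] = 0x2000              # R7 holds the address of a variable
ram = {0x2000: 99}
status = 0x0007               # Bx = 7
value = ram[ron_via_nibble(status, "x", regs)]  # MOV (Rx), ... fetches it
print(value)  # 99
```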
At this stage of the definition of the ALU, the command space is divided into 8 equal parts, 7 of which are assigned to the two-operand commands while the 8th is reserved for everything else. Next in capacity comes the group of instructions performing local jumps, which include in their code an offset relative to the place of execution. There is no sense in an offset of less than a byte, but a larger one will not fit, for the same banal reason of lack of room. From the remaining 13 bits we "bite off" the offset byte, and 5 bits remain to indicate the command number; that is, there can be no more than 32 such commands, and it is difficult to manage with fewer than 24 jump commands in this group. Room must also be allocated for the single-operand commands; since they "weigh" half as much (needing 7 rather than 8 bits for the operand representation), the space of 8 jump commands yields 16 single-operand ones, which is quite enough under the austerity of command resources. These figures are close to the PDP-11 ones, and the experience of laying out this mosaic by the criteria set in the conditions of the problem shows such proportions to be optimal and stable: the two-operand commands occupy 7/8 of the command space, and the remaining eighth is shared by single-operand commands and local jumps in a 1:3 ratio by volume. The full system of ALU machine instructions is not limited to these three groups, but they are the most frequent in a program and, from the point of view of load on the command space, the defining ones. For the rest (operand-less, flag and other commands, some perhaps not fitting the general format), space is assumed to be allocated from the remnants untouched by the addressing tables of the three basic sets of machine instructions.
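The proportions above can be totted up directly:

```python
# 7/8 of the 65536-slot command space goes to two-operand commands; the
# last eighth is split 1:3 between single-operand commands and local jumps.
total = 2 ** 16
two_operand = total * 7 // 8         # 7 commands * 8192 slots each
remainder = total - two_operand
single_operand = remainder // 4      # 16 commands * 128 slots each
jumps = remainder - single_operand   # 24 commands * 256 slots each
print(two_operand, single_operand, jumps)  # 57344 2048 6144
```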
Thus at the second iteration step we get 7 + 16 + 24 + {approximately as many again auxiliary} commands making up the full implementation of the ALU. Only the first group, the "classic seven", has an unambiguous composition; the second group comprises a dozen standard commands found in almost any ALU implementation, plus a few more tools useful at the level of low-level programming (for example, swapping the bytes of a word); the third group contains the branching and looping commands within the range of the offset byte; the rest provide additional convenience and flexibility in writing machine code. A single command here occupies from 1 (operand-less) to 8192 (two-operand) slots of the command space.
It would not hurt to provide the processor with additional registers, such as TMC (system timer counter), TMR (system timer preset register specifying the counting period), PTI (input port), PTO (output port) and CFG (a system configuration register containing prescalers and the like).
In this case a number of instructions exchanging data between the main registers and the additional ones just listed will have to be added to G.
With that, perhaps, we are done, as long as we do not stray far beyond the essential elements needed for the basic functionality of the processor. As the above calculations show, a 16-bit command space is necessary and sufficient for an acceptable implementation of low-level (RISC-oriented) programming tools, and in striving to satisfy the first half of the problem's condition ("everything necessary") there is almost no room left in it for violating the second ("nothing superfluous"). So in essence the question reduced to distributing the commands and addressings into groups, with their preliminary allocation and subsequent coordination with the capacity of the data page.
7. Results
The above reasoning is aimed at finding the most effective expressive means for describing a formalized model of the information process, such that it can easily be kept in mind at least at the level of matching physical processes with logical abstractions. The principal way to do this is to fix two key points:
1. We accept the bit width as a fundamental constant, designed to unite the spatial and temporal aspects of the information process, and through it recursively define a three-tier architecture whose spatial aspect sets the capacity of the memory unit at each level {word (cell) => page (addressable memory) => volume (RAM)}, and whose temporal aspect sets the complexity of the processes initiated by the control actions distributed over these levels {machine instruction (command) => control firmware (program) => integrated system (the computer proper)}.
2. We proceed from the concept of the fundamental possibility of an exhaustive description of this architecture by the expressive means of two non-intersecting abstract languages, the first of which (the notation of information channels and nodes, or the "geometric language") serves the exhaustive definition of the interaction between its functional components, while the second (the system of machine instructions, or the "arithmetic language") provides the formal basis for developing the algorithm of the system's operation at the low level.
Formally the two languages should intersect in the command space: the commands managing external communications ought to be among those processed by the ALU. In reality, however, the latter can be represented in the command system by a single instruction whose result depends on the contents of registers recording the source and receiver, the amount of information to be sent, and related data defining the nature of the exchange. This information does not require much room: 2 registers can hold the number of any of 4 billion computers, another 2 the number of any of 4 billion files on that computer, and the next 2 can specify any cell in a file of up to 4 gigawords. That is, to uniquely identify the source and the receiver (say, within the world wide web and down to the address of a specific data word stored in the memory of a specific computer), 6 * 2 = 12 registers are needed. Two more are required to indicate the amount of information sent, after which 2 remain for clarifying data, so it is perfectly possible to keep within 16 registers. If an exhaustive description of the actions to be performed needs more information, nothing prevents using some of the registers as pointers to memory blocks placed in addressable memory.
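The register budget above checks out arithmetically:

```python
# A pair of 16-bit registers holds one 32-bit identifier (computer, file,
# or cell number); a full source-plus-destination description therefore
# takes 12 registers, with 2 for the transfer length and 2 spare of the 16.
pair = 2 ** (16 * 2)                   # one register pair: ~4 billion values
ids_per_endpoint = 3                   # computer number, file number, cell number
addressing = 2 * ids_per_endpoint * 2  # two registers per id, two endpoints
spare = 16 - addressing - 2            # minus the length pair
print(pair, addressing, spare)  # 4294967296 12 2
```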
Thus all the commands of the command space, except one (and perhaps a small number still to be defined), are completely abstracted from external communications, and their "visibility zone" is exhausted by 16 + 3 registers and two pages of addressable memory, thanks to which the "arithmetic" and "geometric" languages can be developed independently. On first view (by default) the "mono-channel" and "mono-command" options can be ticked, implying that all executable code is written in the same version of the assembler and that all principles of external communication are projected onto the contents of the RONs before the CMD call, as spelled out by a single list of agreements. At the system design level, removing the first tick would simply mean that the ALU for each processor can be chosen from a list of alternative versions, provided every such ALU includes the CMD instruction and 16 RONs capable of holding the information needed for an exhaustive description of all the kinds of communication supported by the hardware ("geometric") component of the process being modeled. That is, the question of unifying the ALU, as the means of describing the program ("arithmetic") component, is the less fundamental one, since, unlike the processors and the low-level means of formalization adjacent to them, the communication scheme must in any case be represented in the system for the sake of architectural integrity. At the level of completed architectural decisions this reflects the natural sequence of actions when the "mono-channel" and "mono-command" options are switched off: creating a new project, we choose the way of describing the communication scheme (defining the format of the file containing an exhaustive description of the information model), and as processors are added to this scheme we define for each of them a command system from the list of available assembler implementations.
The development of the "geometric" means of formalizing binary technologies has its own specifics, different from the "arithmetic" ones. Whereas a multitude of machine instructions must fit into a specific volume of command space (and, in fact, the considerations of the previous section were meant to establish that 16 bits, though a tight fit, are enough to accommodate a full-fledged RISC-oriented assembler with 16 RONs), with respect to the "geometric" aspect there are no objective limits on the number of constituent language units. Nevertheless this number too should strive toward the necessary minimum, and the principle "everything that is needed, nothing superfluous" is no less relevant here, since the formal basis on which the next floors of the formalization of the information process are expected to be built must meet the requirements of compactness and easy digestibility. I suppose that in this area too a classical solution can be found by simple analysis: one that, with minimal informational costs and limited means for building a formalized model of the information process, would ensure reliability of operation and ease of design of the architecture created on the basis of this model.
The automation of the physical design of information systems is presumably the next step in the development of information technologies, and the complexity of a system providing such capabilities is roughly an order of magnitude above current achievements. For these purposes redundant information, the kind intended to solve compatibility problems at every level of the system architecture's organization, becomes a significant hindrance, as happens with the spontaneous development of languages and levels of formalization of information processes. Here I have tried to place only the key accents, varying within certain limits the severity of the restrictions adopted, so that by removing the checkmarks in reverse order (namely "binary", "single bit", "single channel", "single command" and "RISC-oriented architecture") one could return to the root directory "/Logic".