
CGI programming in assembler?!? - Easy!

This article has been circulating on the Internet for quite some time, but as its author I think I have the right to repost it here. Much (if not all) of what is written here is outdated and may seem useless at first glance, but having gone down this road, six years later I can say it was not a waste of time. So, here we go.
In this article I want to talk about the CGI interface in general, its implementation on Windows, and in particular the use of assembly language for writing CGI programs. A full description of CGI is beyond the scope of this article: there is a sea of material on the subject on the Internet, and I see no reason to retell it all.


CGI Theory

CGI stands for Common Gateway Interface. As the name suggests, this interface serves as a gateway between the server (here I mean the server program) and any external program written for the OS on which that server is running. Thus CGI defines exactly how data is passed from the server program to the CGI program and back. The interface imposes no restrictions on the language the CGI program is written in; it can be a regular executable file or any other file, as long as the server can run it (on Windows, for example, it can be a file whose extension is associated with some program).
From the moment you invoke the CGI program (for example, by clicking the button of a form to which the CGI program is attached) until you get the result in the browser window, the following happens:
- The web client (for example, a browser) opens a connection to the server specified in the URL;
- The web client sends a request to the server, usually using one of two methods, GET or POST;
- The server passes the data from the client's request (for example, the values of the form fields) through the CGI interface to the CGI program specified in the URL;
- The CGI program processes the client data received from the server and generates a response, which it passes back to the server through the same CGI interface; the server in turn sends it on to the client;
- The server closes the connection with the client.
The standard CGI specification assumes that the server can exchange data with the program in the following ways:
- Environment variables: the server can set them when it starts the program;
- Standard input stream (STDIN): the server can use it to pass data to the program;
- Standard output stream (STDOUT): the program writes its output to it, and the output is passed to the server;
- Command line: the server can use it to pass some parameters to the program.
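The four channels listed above can be seen together in a very small program. The following is only an illustrative sketch in Python (not the assembler used later in this article); the function name handle_request is hypothetical, and the streams are passed in as parameters so that the function can be tried without a server.

```python
import os
import sys


def handle_request(environ, stdin, stdout, argv):
    """Echo what arrived over each CGI channel (hypothetical helper)."""
    # Channel 1: environment variables set by the server.
    method = environ.get("REQUEST_METHOD", "GET")
    query = environ.get("QUERY_STRING", "")
    length = int(environ.get("CONTENT_LENGTH") or 0)
    # Channel 2: standard input carries the request body, if any.
    body = stdin.read(length) if length else ""
    # Channel 3: standard output carries the response back to the server.
    stdout.write("Content-Type: text/plain\n\n")
    stdout.write(f"method={method}\n")
    stdout.write(f"query={query}\n")
    stdout.write(f"body={body}\n")
    # Channel 4: command-line parameters, if the server passed any.
    stdout.write(f"args={argv[1:]}\n")


if __name__ == "__main__":
    handle_request(os.environ, sys.stdin, sys.stdout, sys.argv)
```

Run from a plain command line, it simply reports a GET method with an empty query and body, because none of the CGI variables are set there.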
Standard I/O streams are very convenient and widely used on UNIX systems, which cannot be said of Windows, so there is a CGI specification developed specifically for Windows systems, called “Windows CGI”. Of course, the standard input/output streams can also be used for CGI programming on Windows. I will not cover the “Windows CGI” standard here, for at least two reasons. The first and most important is that at the moment not all HTTP servers for Windows support this specification (in particular, my favorite Apache 1.3.19 does not). You can observe the second reason by typing “Windows CGI” into any search engine. I will note only the general details of this interface: all data from the server to the program is passed through an ordinary Windows *.ini file, whose name is passed to the program on the command line. All the data in the file is already neatly divided into sections by the server, and all you need to do is extract it with the “GetPrivateProfile*” functions. The response is passed back to the server through a file as well, whose name is given in the corresponding entry of the ini file.
What data can the client pass to the CGI program? Almost any. In the general case the program receives the values of the form fields filled in by the client, but it can also be arbitrary binary data, such as a picture or a music file. Data can be sent to the server by two different methods, GET and POST. When we create a form on our page, we state explicitly which method should be used to send the data entered by the user; this is done with the method attribute of the opening form tag (method="GET" or method="POST").
When data is sent using the GET method, the browser reads the data from the form and appends it to the script URL after a question mark. If the form has several filled-in fields, they are separated by the “&” sign, and each field name is joined to its value by “=”. For example, suppose the form's button is attached to the script “/cgi-bin/test.exe”, the first field is called “your_name” and the second “your_age”. The request generated by the browser may then look like this:
GET /cgi-bin/test.exe?your_name=Pupkin&your_age=90 HTTP/1.0
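To make the format of such a request line concrete, here is a small Python sketch (Python is used only for illustration; the helper names build_get_request_line and parse_request_line are hypothetical). It relies on the standard urllib.parse module to join the fields with “&” and “=” and to split them again.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_get_request_line(script, fields):
    """Append the form fields to the script URL after a question mark."""
    return f"GET {script}?{urlencode(fields)} HTTP/1.0"


def parse_request_line(line):
    """Split a request line back into method, script path and fields."""
    method, target, _version = line.split(" ")
    parts = urlsplit(target)
    return method, parts.path, parse_qs(parts.query)


if __name__ == "__main__":
    line = build_get_request_line(
        "/cgi-bin/test.exe", {"your_name": "Pupkin", "your_age": "90"}
    )
    print(line)
    print(parse_request_line(line))
```

Note that parse_qs returns a list of values for every field name, because the same name may appear in a query string more than once.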
The GET method has several weaknesses. The first and most important: because the data is passed in the URL, the amount of data that can be transmitted is limited. The second weakness also stems from the URL: confidentiality. With this kind of transfer the data remains completely open. So it is fine if the form has 2-3 small fields, but what do we do if there is more data? The answer is to use the POST method!
With the POST method the data is sent to the server as a data block rather than in the URL, which frees our hands somewhat to increase the amount of information transmitted. For the form example above, the POST block sent to the server will look something like this:

POST /cgi-bin/test.exe HTTP/1.0
Accept: text/plain
Accept: text/html
Accept: */*
Content-type: application/x-www-form-urlencoded
Content-length: 28

your_name=Pupkin&your_age=90
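A POST block of this kind can be assembled mechanically. The Python sketch below is an illustration only (the function name build_post_request is hypothetical, and the Accept headers are left out for brevity). It shows the two points that matter: the Content-Length value is counted from the encoded body, and an empty line separates the headers from the body.

```python
from urllib.parse import urlencode


def build_post_request(script, fields):
    """Build a minimal HTTP/1.0 POST request as bytes."""
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {script} HTTP/1.0\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # the empty line that ends the headers
    ).encode("ascii")
    return head + body


if __name__ == "__main__":
    request = build_post_request(
        "/cgi-bin/test.exe", {"your_name": "Pupkin", "your_age": "90"}
    )
    print(request.decode("ascii"))
```

For the two fields used in this article the encoded body your_name=Pupkin&your_age=90 is 28 bytes long.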


As mentioned above, after receiving the data the server must convert it and pass it to the CGI program. In the standard CGI specification, for a GET request the server places the data entered by the client into the program's QUERY_STRING environment variable. For a POST request the data is placed into the application's standard input stream, from which it can be read. In addition, for such a request the server sets two more environment variables, CONTENT_LENGTH and CONTENT_TYPE, which give the length of the request body in bytes and its content type.
Besides the data itself, the server sets other environment variables for the called program. Here are some of them:
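On the program side, this rule comes down to one branch on REQUEST_METHOD. The following Python sketch is illustrative only; the function name read_form_data is hypothetical, and it assumes the body is application/x-www-form-urlencoded ASCII text and that stdin is a binary stream (in a real Python CGI script that would be sys.stdin.buffer).

```python
from urllib.parse import parse_qs


def read_form_data(environ, stdin):
    """Return the form fields for either a GET or a POST request."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the body arrives on stdin; read exactly CONTENT_LENGTH bytes.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii")
    else:
        # GET: the data is in the QUERY_STRING environment variable.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)
```

Reading exactly CONTENT_LENGTH bytes matters: the server is not obliged to close the stream after the body, so waiting for end-of-file may block.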
REQUEST_METHOD
Describes how exactly the data was received.
Example: REQUEST_METHOD=GET

QUERY_STRING
The query string, if the GET method was used.
Example: QUERY_STRING=your_name=Pupkin&your_age=90&hobby=asm

CONTENT_LENGTH
Length of the request body in bytes.
Example: CONTENT_LENGTH=31

CONTENT_TYPE
Type of the request body.

GATEWAY_INTERFACE
CGI protocol version.
Example: GATEWAY_INTERFACE=CGI/1.1

REMOTE_ADDR
The IP address of the remote host, that is, of the client who pressed the button in the form.
Example: REMOTE_ADDR=10.21.23.10

REMOTE_HOST
The name of the remote host. It can be its domain name or, for example, the name of the computer in a Windows network; if neither can be obtained, the field contains its IP address.
Example: REMOTE_HOST=wasm.ru

SCRIPT_NAME
The name of the script used in the request.
Example: SCRIPT_NAME=/cgi-bin/gols.pl

SCRIPT_FILENAME
The name of the script file on the server.
Example: SCRIPT_FILENAME=c:/page/cgi-bin/gols.pl

SERVER_SOFTWARE
Server software.
Example: SERVER_SOFTWARE=Apache/1.3.19 (Win32)
The called CGI program can read any of the environment variables set by the server and use them to its advantage.
That, in short, is everything. For more detailed information about the Common Gateway Interface, see the specialized documentation; I wrote this description as a reminder, and to bring you up to date if you did not know it. Let's try to do something in practice.

Practical part

For practice we will need at least three things. The first is an HTTP server for Windows; I tried all the examples on Apache 1.3.19 for Windows. The server is free, you can download it from i
And we need not just any server, but one configured to run CGI scripts! See the documentation of the server you are using for how to do this. The second thing we need is, of course, an assembler. The assembler must also support building WIN32 console applications; I use Tasm, but Fasm, Masm and many other *asm tools will do just as well. And finally, the most important thing required is the desire to do it.
So, I assume that you have installed and configured the server, so that the root directory of the server documents contains an index.html file that displays nicely in the browser when you type http://127.0.0.1. I will also assume that somewhere in the depths of the server folders there is a “cgi-bin” folder in which scripts are allowed to run.
Let's check the server setup and write a small script at the same time. Our script will be an ordinary *.bat file. I anticipate the questions: how? really? Yes, it is an ordinary batch file. As mentioned above, the CGI specification does not distinguish between file types; the main thing is that the server can run the file and that the file, in turn, has access to stdin/stdout and the environment variables. A bat file has this access, if not fully, but for an example it suits us well enough. Create a file with approximately the following content:

@echo off
rem Response header
echo Content-type: text/html
echo.
rem Response body
echo "Hello!<br>
echo "Data received with the GET request: %QUERY_STRING%


Name the file test.bat and put it in the directory for running scripts, most likely the “cgi-bin” directory. Next we need to call this script somehow. In principle this can be done directly by typing something like “http://127.0.0.1/cgi-bin/test.bat” into the browser's address bar, but let's call it from our main page and check how the GET method works at the same time. Create the index.html file in the server root with the following content:

<html> <head>
<meta content="text/html; charset=windows-1251" http-equiv="Content-Type">
</head>
<body>
<form name="Test" method="get" action="/cgi-bin/test.bat">
<center> <table border="1" cellspacing="0">
<tr bgcolor="#e0e0e0"> <td colspan="2"> <font face="arial" size="2">
<center> <b>Enter the data to send to the server:</b> </center>
</font> </td> </tr>
<tr> <td> <font face="arial" size="2">
<b>Data:</b> </font> </td>
<td> <font face="arial" size="2">
<textarea rows="3" cols="55" wrap="physical" name="data"></textarea>
</font> </td> </tr> <tr bgcolor="#e0e0e0">
<td colspan="2"> <center>
<input type="submit" value="Send!">
</center> </td>
</tr> </table> </center> </form> </body> </html>


Now when you open the server (http://127.0.0.1 in the browser's address bar) a form should appear. Type something into it and click the “Send!” button; if everything was done correctly, you will see the response of our bat script. Now let's look at what we have written.
As you can guess, the “echo” command writes to stdout. First of all we send the server the header of our response: “echo Content-type: text/html”. This is a standard CGI header which says that we are going to send text or an HTML document; there are other headers as well. A very important point is that the header must be separated from the body of the response by an empty line, which we do with the next command, “echo.”. Then comes the body of the response itself, an ordinary HTML document. For clarity, in the body I display one of the environment variables passed to us by the server, “QUERY_STRING”. As already mentioned, with the GET method (which is our case) this variable holds all the data entered by the user, and we can see it in the script's response. You may have noticed the “out of place” quotation marks in the last two lines of the file, right after “echo”. They are there because of a peculiarity of bat files: HTML tags are framed by the “<” and “>” characters, and in bat files these same characters redirect input/output, so we cannot use them freely here.
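The structure of the response (header, empty line, body) can be written down as a short Python sketch. It is only an illustration, and the names build_cgi_response and split_cgi_response are hypothetical. The first function produces what the bat file prints; the second shows how the receiving side can tell the header from the body by looking for the empty line.

```python
def build_cgi_response(body_html, content_type="text/html"):
    """Header line, then an empty line, then the body."""
    return f"Content-Type: {content_type}\r\n\r\n{body_html}"


def split_cgi_response(response):
    """Split a response into a header dictionary and the body text."""
    head, _, body = response.partition("\r\n\r\n")
    headers = dict(line.split(": ", 1) for line in head.split("\r\n"))
    return headers, body


if __name__ == "__main__":
    response = build_cgi_response("Hello!<br>")
    print(split_cgi_response(response))
```

Without the empty line the receiving side has no way to know where the headers end, which is why a missing “echo.” typically results in a server error rather than a page.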
I recommend playing around a little with similar bat scripts, it can be very useful; try looking at the other environment variables. Digressing slightly from the topic: on UNIX systems the command interpreter languages are very highly developed, and the line between programming in a shell language and programming in a “real” programming language is in some cases very blurred. That is why simple scripts on UNIX systems are quite often written in shell languages, whereas the Windows interpreter cmd.exe (or, earlier, command.com) is clearly too weak for these purposes.
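Now let's move on to the main task of this article, actually writing a CGI program in assembler. In principle, considering everything said above about CGI, we can conclude what the CGI interface requires of our program:
- Read the standard input stream (stdin), to receive the data sent by the client;
- Write to the standard output stream (stdout), to pass the result of its work to the server;
- Read the environment variables set by the server.
This is quite enough to create a full-fledged CGI application.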
Let's start with the last item. To access the environment variables of a Windows application, the API function “GetEnvironmentStrings” is used. The function takes no arguments and returns a pointer to an array of environment variables (NAME=VALUE strings), each terminated by a zero byte; the array as a whole ends with a double zero. When the program is started by the server, the standard CGI variables described above are added to the program's ordinary environment variables; naturally, you will not see them when you start the program from the command line.
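The layout of that array is easy to misread, so here is a Python sketch that walks over a block of the same shape. It is an illustration only: the function name parse_environment_block is hypothetical, and the block is built by hand as a byte string rather than obtained from the Windows API.

```python
def parse_environment_block(block):
    """Parse zero-terminated NAME=VALUE entries ending with a double zero."""
    env = {}
    for entry in block.split(b"\0"):
        if not entry:
            break  # an empty entry marks the double zero at the end
        # Search for "=" from index 1: Windows blocks may contain entries
        # such as "=C:=C:\dir" whose name itself starts with "=".
        pos = entry.find(b"=", 1)
        if pos == -1:
            continue  # not a NAME=VALUE entry; skip it
        env[entry[:pos].decode("latin-1")] = entry[pos + 1:].decode("latin-1")
    return env


if __name__ == "__main__":
    sample = b"REQUEST_METHOD=GET\0QUERY_STRING=your_name=Pupkin\0\0"
    print(parse_environment_block(sample))
```

Only the first “=” after the name is treated as the separator, so a value such as a query string may itself contain “=” signs.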
In order to write something to stdout or read from stdin, we first need to get the handles of these streams. This is done with the API function “GetStdHandle”, which takes one of the following values as its parameter:
STD_INPUT_HANDLE = -10 (stdin)
STD_OUTPUT_HANDLE = -11 (stdout)
STD_ERROR_HANDLE = -12 (stderr)

The function returns the handle we need for read/write operations. The next thing to do is to actually write to and read from these streams. This is done with the usual file read/write operations, i.e. ReadFile and WriteFile. There is one subtlety here. You might think that WriteConsole/ReadConsole could be used for this purpose. That is indeed valid for the console and works fine there: the results are output to the console just as with WriteFile. But it only works until we run our program as a script on the server. The reason is that when the server starts our program, the handles returned by “GetStdHandle” are no longer console handles as such; they are pipe handles, which is what is needed to connect two applications.
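The difference between a console and a pipe can be observed without a web server. In the Python sketch below (an illustration only; run_cgi and the CHILD script are hypothetical names), the parent process plays the role of the server: it starts a child with CGI-style environment variables, feeds the request body to the child's stdin through a pipe and collects its stdout through another pipe. The child reports whether its stdout is a terminal.

```python
import os
import subprocess
import sys

# Source of the child program, standing in for the CGI program.
CHILD = r'''
import os, sys
body = sys.stdin.read(int(os.environ["CONTENT_LENGTH"]))
sys.stdout.write("Content-Type: text/plain\n\n")
sys.stdout.write("tty=%s\n" % sys.stdout.isatty())
sys.stdout.write("body=%s\n" % body)
'''


def run_cgi(body):
    """Run the child the way a server would and return its raw output."""
    env = dict(os.environ, REQUEST_METHOD="POST", CONTENT_LENGTH=str(len(body)))
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=body,              # connected to the child's stdin by a pipe
        stdout=subprocess.PIPE,  # the child's stdout is a pipe as well
        env=env,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(run_cgi(b"your_name=Pupkin").decode("ascii"))
```

The child prints tty=False because its output goes to a pipe, which is the situation in which console-only functions stop working.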
Here is a small example of what an assembly language CGI program should look like:

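.386
.model flat,stdcall
includelib import32.lib
.const
PAGE_READWRITE = 4h
MEM_COMMIT = 1000h
MEM_RESERVE = 2000h
MEM_RELEASE = 8000h
STD_INPUT_HANDLE = -10
STD_OUTPUT_HANDLE = -11
BUF_SIZE = 10000h ; 64 KB work buffer (not bounds-checked below)

.data
hStdout dd ?
hStdin dd ?
hMem dd ?
header:
db 'Content-Type: text/html',13,10,13,10,0
start_html:
db '<html><body>CGI program environment:<br>',13,10,0
for_stdin:
db 'STDIN of the program contains:<br>',13,10,0
end_html:
db '</body></html>',13,10,0
nwritten dd ?
toscr db 10 dup (32)
db ' - ',0
.code
_start:

; ebx = 0 is used as a ready-made zero argument
xor ebx,ebx
; get the stdout and stdin handles (pipes when run by the server)
call GetStdHandle,STD_OUTPUT_HANDLE
mov hStdout,eax
call GetStdHandle,STD_INPUT_HANDLE
mov hStdin,eax

; send the response header (ends with an empty line) and the page start
call write_stdout, offset header
call write_stdout, offset start_html

; allocate the work buffer
call VirtualAlloc,ebx,BUF_SIZE,MEM_COMMIT+MEM_RESERVE,PAGE_READWRITE
mov hMem,eax
mov edi,eax
; esi = pointer to the environment block (NAME=VALUE,0 ... 0)
call GetEnvironmentStringsA
mov esi,eax
; copy the block into the buffer, replacing each terminating zero by '<br>'
next_symbol:
mov al,[esi]
or al,al
jz end_string
mov [edi],al
next_string:
cmpsb ; used only to advance esi and edi by one
jmp short next_symbol
end_string:
mov dword ptr [edi],'>rb<' ; stored in memory as '<br>'
add edi,3
cmp byte ptr [esi+1],0 ; double zero = end of the block
jnz next_string
inc edi
stosb ; al = 0 here: terminate the output string
call write_stdout, hMem
call write_stdout, offset for_stdin

; read the request body from stdin and echo it back
call GetFileSize,[hStdin],ebx
mov edi,hMem
call ReadFile,[hStdin],edi, eax,offset nwritten, ebx
add edi,[nwritten]
mov byte ptr [edi],0
call write_stdout, hMem
call write_stdout, offset end_html
; VirtualFree takes three arguments; the size must be 0 with MEM_RELEASE
call VirtualFree,hMem,ebx,MEM_RELEASE
call ExitProcess,ebx ; exit code 0

; write a zero-terminated string to stdout
write_stdout proc bufOffs:dword
call lstrlen,bufOffs
call WriteFile,[hStdout],bufOffs,eax,offset nwritten,0
ret
write_stdout endp
extrn GetEnvironmentStringsA:near
extrn GetStdHandle:near
extrn ReadFile:near
extrn WriteFile:near
extrn GetFileSize:near
extrn VirtualAlloc:near
extrn VirtualFree:near
extrn ExitProcess:near
extrn lstrlen:near
ends
end _start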


The executable file is built with the commands:
tasm32.exe /ml test.asm
tlink32.exe /Tpe /ap /o test.obj
Do not forget that the program must be a console application.
You can call this program from the HTML form described above: just change the name test.bat in the form to test.exe and copy the executable to /cgi-bin/ accordingly. You can also set the request method in the form to POST; the program handles it.
I also want to note that the program can be called in another way. You can create a file in the cgi-bin directory, for example test.cgi, containing the single line “#!c:/_path_/test.exe” and call it in requests. The server will then read its first line and run the exe file; for this, the *.cgi extension must be registered as a script extension in the HTTP server settings. With this approach the server launches our program with the command line “test.exe path_to_test.cgi”, which has several advantages. The first is that the person calling our script will not even guess what the script is written in. The second is that, since we are passed the name of the file containing our line, we can for example add any settings for our script to that file, which simplifies debugging. By the way, all interpreters work this way: you have probably noticed that every perl/php/etc program has a similar line pointing to the interpreter. So, when launching a CGI program whose extension is registered as a script in the settings, the server reads the first line of the file, and if it has the format described above, it runs the program named in that line, passing it the name of this file after a space. Suppose the line names the Perl interpreter. It receives this gift and starts executing it; since “#” begins a comment in Perl, it skips the first line and the script runs on. All in all, a convenient thing.
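The first-line mechanism can be sketched in a few lines of Python. This is an illustration of the idea, not the server's actual code: the function name command_for_script is hypothetical, and for simplicity the sketch splits the line on whitespace, so interpreter paths containing spaces are not handled.

```python
def command_for_script(script_path):
    """Return the command line a server might build for a script file."""
    with open(script_path, "r", encoding="ascii", errors="replace") as f:
        first_line = f.readline().strip()
    if not first_line.startswith("#!"):
        # No interpreter line: the file has to be runnable by itself.
        return [script_path]
    # Interpreter (plus any arguments) first, then the script's own path.
    return first_line[2:].split() + [script_path]
```

For a file test.cgi whose first line is “#!c:/_path_/test.exe”, the result is the interpreter path followed by the path of test.cgi, which is how the program learns where its settings file is.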

Source: https://habr.com/ru/post/111131/

