This is another of my articles that has been circulating on the Internet for a long time, and once again, as the author, I am reposting it. I think it will be useful here.
Introduction
In this article I do not set out to retell all the RFCs related to the FTP protocol, of which there are quite a few; you can find much more information in them. I will only try to introduce you, in general terms, to the FTP protocol and to the basic techniques for working with it from the client side.
FTP Protocol Overview
FTP (File Transfer Protocol) is a protocol for transferring files over TCP/IP networks. It was created specifically to simplify and standardize the programming of file transfers between a client and a server. Like all high-level protocols, it does not deal with the actual transmission of data (that is the job of the lower-level protocol, TCP, and the protocols beneath it); it only describes how the client and the server “talk” to each other.
Let us proceed directly to the description of the protocol. Its distinctive feature is the use of two connections between the client and the server. One connection (the command, or control, connection) is used to send commands to the server and to receive the responses to them. The second connection (the data connection) is used for the actual sending or receiving of data. The control connection is always made by the client to port 21 of the server and remains open throughout the entire session. The data connection is opened and closed as needed to send or receive data.
After the control connection is established, the client can send commands to the server over it. Each command consists of 3 or 4 uppercase ASCII characters, optionally followed by one or more spaces and the arguments; some commands take no arguments. Every command ends with the pair CR, LF, that is, the bytes 0dh, 0ah familiar to anyone who has worked with DOS / Windows. In general, a command looks like this:
Command [argument(s)] CR, LF
In total, there are just over 30 commands (33 in RFC 959) that can be sent to the server, but this does not mean that a given server supports them all. Here are the most frequently used commands:
USER username - specifies the user name
PASS password - specifies the user's password
LIST [path] - requests a list of files
PORT n1,n2,n3,n4,n5,n6 - specifies the IP address and port for the data connection
RETR filename - gets a file from the server
STOR filename - puts a file on the server
TYPE type - sets the type of the data to be transferred
QUIT - disconnects from the server
ABOR - cancels the previous command and terminates the data transfer
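The command format described above can be sketched in a few lines of Python. This is only an illustration of the "verb, arguments, CR LF" rule; the function name format_command is my own and does not come from the client discussed later.

```python
def format_command(verb: str, *args: str) -> bytes:
    """Build one FTP command line: verb, optional arguments, then CR LF."""
    line = " ".join([verb.upper(), *args])
    return line.encode("ascii") + b"\r\n"


print(format_command("USER", "anonymous"))  # b'USER anonymous\r\n'
print(format_command("quit"))               # b'QUIT\r\n'
```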
When the server receives a command, it sends a response over the same control connection. The response consists of three ASCII digits, followed by optional text that usually explains the numeric code, followed by the usual CR, LF. For example, the response may be: 226 File send OK. Here the server tells us that it has finished sending the file (which does not at all mean that the client has already received it). The first digit of the response is the most significant and gives an unambiguous idea of whether the command succeeded or failed. Its possible values are:
- 1xx The command is in progress; you must wait for another message before giving the next command.
- 2xx The command completed. The server is waiting for the next one.
- 3xx The command was accepted, but one more command is needed to continue.
- 4xx The command was not executed; you should wait and repeat the command.
- 5xx The command was not executed, and repeating it will not help.
The second digit of the response indicates what kind of situation the response relates to:
- x0x Syntax error.
- x1x Information.
- x2x The response refers to the state of the control or data connection.
- x3x The response refers to user authentication or accounting.
- x4x Not defined.
- x5x The response refers to the state of the file system.
Finally, the third digit of the response carries additional information.
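As a rough illustration of the two tables above, here is a small Python sketch that splits a single-line response into its code and text and looks up the meaning of the first two digits. The names parse_reply, FIRST_DIGIT and SECOND_DIGIT are mine, and multi-line responses are deliberately not handled.

```python
FIRST_DIGIT = {
    "1": "in progress, wait for another response",
    "2": "completed",
    "3": "accepted, one more command is needed",
    "4": "failed, may succeed if repeated later",
    "5": "failed, repeating will not help",
}
SECOND_DIGIT = {
    "0": "syntax",
    "1": "information",
    "2": "connections",
    "3": "authentication and accounting",
    "4": "not defined",
    "5": "file system",
}


def parse_reply(line: str):
    """Return (code, outcome, topic, text) for a single-line server response."""
    code, _, text = line.rstrip("\r\n").partition(" ")
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("not an FTP response: %r" % line)
    outcome = FIRST_DIGIT.get(code[0], "unknown")
    topic = SECOND_DIGIT.get(code[1], "unknown")
    return int(code), outcome, topic, text


print(parse_reply("226 File send OK.\r\n"))
```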
Note that although the server answers most commands with a single response, some widely used commands produce several responses. The first digit of the first response will then be “1”, which, as the tables above show, means that we must wait for another message from the server before sending the next command. An example is the RETR command: when the server accepts it and starts sending data, it answers with something like “150 Opening BINARY mode data connection for HIDE.ASM (958 bytes).”, which boils down to “data transfer started”. Then, when the server has sent all the data (again, this does not mean that the client has received it), it sends another response over the control connection, “226 File send OK.”, i.e. “file sent”. Only after this second message is the server ready for the next command. Instead of the last message we may well receive an error message starting with “4”, if there were problems transferring the file.
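The rule "keep reading while the first digit is 1" can be sketched as follows. This is a simplified Python illustration, not the client's code: final_reply is a hypothetical helper, and it takes response lines from any iterable instead of reading a real socket.

```python
def final_reply(replies):
    """Skip preliminary (1xx) responses and return the first final one."""
    for line in replies:
        if not line.startswith("1"):
            return line
    raise ConnectionError("control connection closed before the final response")


replies = iter([
    "150 Opening BINARY mode data connection for HIDE.ASM (958 bytes).",
    "226 File send OK.",
])
print(final_reply(replies))  # 226 File send OK.
```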
In general terms, this is all about the control connection.
Now let's talk about the data connection. As mentioned above, the data connection is set up as needed and is closed every time after data has been sent or received. This is because data is transferred between the client and the server in stream mode, and in this mode the end of the data is signalled by closing the connection. From this we draw one important conclusion: we can tell that the server has finished sending data by the fact that the connection has been closed.
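To show what "the end of the data is the closing of the connection" means in practice, here is a minimal Python sketch. It uses a local socket pair in place of a real data connection, and read_until_closed is my own name: recv returning an empty result is how the receiver learns that the sender has closed its side.

```python
import socket


def read_until_closed(sock, chunk_size=4096):
    """Read from sock until the peer closes the connection (recv returns b'')."""
    chunks = []
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:  # the sender has closed: end of data
            break
        chunks.append(chunk)
    return b"".join(chunks)


# A local socket pair stands in for the data connection.
sender, receiver = socket.socketpair()
sender.sendall(b"file contents")
sender.close()  # in stream mode this is the end-of-file marker
data = read_until_closed(receiver)
receiver.close()
print(data)  # b'file contents'
```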
Typically, a data connection is opened as follows:
- the client selects a free port on its host and performs a passive open on it;
- the client tells the server, over the control connection, its IP address and the number of the port on which it performed the passive open;
- the server, having received the IP address and port, performs an active open to them;
- data is sent or received;
- the connection is closed by whichever side was sending the data.
A small digression: if you read the second step carefully, a question may arise: “What happens if we give the server a fictitious address and port?” The answer is not clear-cut: the server may check the IP address, but not every server does, and this leads to some interesting “problems” involving fictitious addresses.
A note on the port the client chooses for the data connection. Usually a port dynamically assigned by the OS is used: the client asks the system for a port and gets the first free one. If the client does not tell the server which port to connect to, the server connects to the port from which the control connection was made (this is not recommended). The server always opens the data connection from its port 20.
That is all I wanted to say about the data connection.
Now that we know why and how both connections work, I want to note one more thing (you can skip it on first reading). The LIST command returns a list of the files in the current directory, and returns it over the data connection. The list is a set of ASCII lines, each ending in CR, LF. Each line describes one element of the requested directory. The general pattern of a line is:
Txxxxxxxxx [] uk [] user [] group [] size [] mm [] dd [] yytt [] name CR, LF
where:
T - type of element (“d” - directory, “-” - file, “l” - link, etc.);
xxxxxxxxx - file protection attributes;
user - the user, the owner of the file;
group - owner group;
size - the size of the element;
mm - the month of the element's timestamp (usually the time of last modification), in text form, for example “Jul”;
dd - the day of the month of that timestamp;
yytt - either the year or the time of that timestamp;
name - the name of the element (file, directory, link);
[] - one or more spaces.
Yes, the number of spaces between these fields can vary. We should be grateful that different server implementations at least kept the same number of significant columns; still, the variable spacing has to be taken into account when parsing the file table. Also bear in mind that the first line of the table is not always a meaningful line describing the first element of the directory: in some FTP server implementations (for example, ftpd on FreeBSD), the first line of the list is “total NN”.
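A possible way to parse such a line, sketched in Python. It relies only on the nine-column layout described above; parse_list_line and the dictionary keys are my own naming, and real servers may produce lines that need more careful handling (device files, for instance).

```python
def parse_list_line(line: str):
    """Parse one line of a LIST reply; return None for lines such as 'total NN'."""
    line = line.rstrip("\r\n")
    if not line or line.startswith("total "):
        return None
    # Split on runs of spaces, but keep spaces inside the name (9th column).
    fields = line.split(None, 8)
    if len(fields) < 9:
        return None
    mode, links, user, group, size, month, day, year_or_time, name = fields
    return {
        "type": mode[0],            # 'd', '-', 'l', ...
        "permissions": mode[1:],
        "user": user,
        "group": group,
        "size": int(size) if size.isdigit() else None,
        "month": month,
        "day": day,
        "year_or_time": year_or_time,
        "name": name,
    }


entry = parse_list_line("-rw-r--r--    1 ftp      ftp           958 Jul 12  2003 HIDE.ASM\r\n")
print(entry["type"], entry["size"], entry["name"])  # - 958 HIDE.ASM
```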
How should this work?
Let's digress a bit and look at what an FTP session that fetches a file looks like “from the inside”. So, we start the client. By this time the server has already performed a passive open and is listening on port 21. First of all we need to create the control connection, that is, connect to the server on port 21. What next? As soon as we have connected, the server sends a greeting over the newly created control connection, something like “220 VSFTP deamon base on Alt Linux 2.2, Shpakovsky”.
The next step is to log in. Suppose we are connecting to an anonymous server: the client sends the USER anonymous command over the control connection, and if the server supports the anonymous user, we get the answer “331 Please specify the password.” Note the digit “3” in the server's response: it means that one more command is required to continue. The client duly sends it: PASS 1@1, giving a fictitious e-mail address as the password. The server replies “230 Login successful. Have fun.”
Now our actions depend on what we want to do. As mentioned above, we want to get a file from the server; let it be, for example, the file “HIDE.EXE” in the server's root directory. Before receiving or sending data, we must specify what type of data will be transferred. This is done with the command TYPE N, where N = “A” for ASCII data and N = “I” for binary data. The client sends the TYPE I command to the server and receives the answer “200 Switching to Binary mode.”
All that remains is to get the file. To do this, the client must open a data connection. The client selects a free port and performs a passive open on it, i.e. starts listening on it. Next, the client needs to tell the server its IP address and the number of the port it has just opened (assume the client's IP address is 10.21.23.10 and the port number is 2000). The client sends the command PORT 10,21,23,10,7,208 over the control connection. “What is 7,208?”, you ask. It is the port number, split into two bytes: 7 * 256 + 208 = 2000. Having received this command, the server remembers the address and port for its active open and, if the command is acceptable, returns something like “200 PORT command successful. Consider using PASV.”
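The arithmetic behind the PORT argument is easy to check with a short Python sketch. The helper name port_command is hypothetical; the values are the ones from the example above.

```python
def port_command(ip: str, port: int) -> str:
    """Build the PORT command text (without CR LF) for an IPv4 address and port."""
    octets = ip.split(".")
    if len(octets) != 4 or not 0 < port < 65536:
        raise ValueError("need a dotted IPv4 address and a port in 1..65535")
    high, low = divmod(port, 256)  # port = high * 256 + low
    return "PORT " + ",".join(octets + [str(high), str(low)])


print(port_command("10.21.23.10", 2000))  # PORT 10,21,23,10,7,208
```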
Now that everything is ready for the data connection, the client gives the server the command to transfer the data: RETR HIDE.EXE. If all is well (the file exists and can be transferred), the server replies “150 Opening BINARY mode data connection for HIDE.EXE (4096 bytes).” and starts sending the file over the data connection. Again, note the first digit of the response. When the file has been sent completely, the server sends the message “226 File send OK.” and closes the data connection.
The client waits until all the data has been received on its side (as indicated by the server's message plus the closing of the data connection; there are some nuances here, but more on them later) and then closes the data connection port on its side.
So, the client has received the file. It remains to close the control connection: the client sends the QUIT command, the server responds with “221 Goodbye.” and terminates the connection.
That is the most important theoretical information about the protocol. Before proceeding to practice, I strongly advise you to try controlling a connection to an FTP server by hand using telnet: you will not be able to create a data connection, but the commands and the responses to them will be visible. I also recommend working with any console FTP client while watching the connections being created and closed with one of the many utilities for this that are freely available on the Internet.
Implementation
Now about the implementation itself. In this client implementation I use non-blocking sockets, so the client model is event-driven: the client can perform an action on one of its sockets only when the corresponding event occurs (for example, a connection being closed, or a notification that data has arrived). These events are delivered as messages to the main window procedure. In addition, the program is multi-threaded: one thread reads the data connection, another reads the control connection, and there is the main client thread, which starts when you click the “connect” button. To synchronize these three threads (and the main window's message procedure), events are used. Do not confuse these events, which the program uses as flags with the value 1 or 0 (the event has or has not occurred), with the socket events that arrive at the main window procedure.
So, let's begin. When the main application window is created, we perform the main initialization of the program. I will explain the key points:
call VirtualAlloc,ebx,1024000,MEM_COMMIT+MEM_RESERVE,PAGE_READWRITE
mov ReciveDataBufferOffset,eax
call VirtualAlloc,ebx,10240,MEM_COMMIT+MEM_RESERVE,PAGE_READWRITE
mov ReciveCommandBufferOffset,eax
- allocate memory for the data receive buffer (about 1 MB) and for the command receive buffer (10 KB).
call CreateEventA,ebx,ebx,ebx,ebx
mov HDataReciveEvent,eax
……
- create the events used to synchronize the threads.
call CreateThread,ebx,ebx,offset ReciveThread,offset ReciveDataThreadStruc, \
NORMAL_PRIORITY_CLASS,offset ThreadID_data
call CreateThread,ebx,ebx,offset ReciveThread,offset ReciveCommandThreadStruc,\
NORMAL_PRIORITY_CLASS,offset ThreadID_command
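- create 2 threads: one for the data connection and one for the control connection. Both run the same procedure, ReciveThread, and differ only in the structure passed to it.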
call gethostname, offset HostName,64
call gethostbyname,offset HostName
…..
mov PortInPort,esi
ret 0
The purpose of the lines above is to obtain the IP address of our host, convert it slightly and store it in a separate place; we will need the host's address to build the PORT command.
At this point initialization ends, and the program waits for a user command. Let's see what happens when the user clicks the "connect" button.
The main window procedure creates the application's main thread; let us look at its key points.
Right at the start we initialize the variables related to receiving data and read the connection parameters entered by the user (server, password, etc.) from the dialog window. After that we need to create the control connection to the server, which we do as follows:
- create the socket for the control connection;
call socket, AF_INET, SOCK_STREAM, IPPROTO_TCP
mov ReciveCommandSock,eax
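- switch the socket to asynchronous mode: the main window will receive
the WM_COMMANDSOCK message when data arrives (FD_READ)
or when the connection is established (FD_CONNECT).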
call WSAAsyncSelect, ReciveCommandSock, newhwnd, WM_COMMANDSOCK,FD_READ+FD_CONNECT
- connect to the server
…..
call connect,ReciveCommandSock,offset sockaddr_in,16
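- when the connection is established, the main window procedure receives FD_CONNECT
and executes call SetEvent,HWaitConnectEvent;
here we wait for that event
for at most 5 seconds.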
call WaitForSingleObject,HWaitConnectEvent,5000
call ResetEvent,HWaitConnectEvent
- after connecting, wait up to 5 seconds
for the server's greeting. This is done by the WaitAnswerRecive function.
call WaitAnswerRecive,5000
or eax,eax
jnz errorwithregisration
- the input parameter of the function is the interval during which it will
wait for the server's response; if no response is received within that interval, the function
displays an error message and returns a non-zero value in the eax register.
WaitAnswerRecive proc TimeToWait:dword
call WaitForSingleObject,HWaitCommandEvent,TimeToWait
- wait for the HWaitCommandEvent event,
which the receiving thread sets when a complete response has arrived.
or eax,eax
jz NoTimeOutGet
call MessageBoxA,newhwnd,offset ErrTimeOutCommand,offset ErrorCap,40h
call ResetEvent,HWaitCommandEvent
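- reset HWaitCommandEvent; ResetEvent returns a non-zero value on success,
so eax also carries the error result back to the caller.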
NoTimeOutGet:
ret
WaitAnswerRecive endp
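For readers who do not read assembler, the same "wait for an event with a timeout" idea can be sketched in Python. This is an analogy rather than a translation of the procedure above: threading.Event stands in for the Win32 event, wait_answer is my own name, and the message box is left out.

```python
import threading


def wait_answer(event: threading.Event, timeout_ms: int) -> int:
    """Return 0 if the event was signalled in time, non-zero on timeout."""
    if event.wait(timeout_ms / 1000):
        event.clear()  # make the event ready for the next response
        return 0
    return 1


response_ready = threading.Event()
# A timer plays the role of the receiving thread that signals a response.
threading.Timer(0.05, response_ready.set).start()
print(wait_answer(response_ready, 5000))  # 0
```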
Now it is time to look at the receiving threads. As mentioned above, these threads are created when the main window is initialized and spend their time waiting for new data. A thread is woken up by the main window procedure when it receives a message that new data has arrived. The message for the control connection was defined at the very beginning of the main thread with the WSAAsyncSelect function; the message for the data connection is defined when that connection is created, as we will see later.
The universal thread procedure for receiving data on the control and data connections is given below.
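- the thread's parameter is the address of one of the structures ReciveDataThreadStruc
or ReciveCommandThreadStruc.
The ReciveCommandThreadStruc structure:
- event that wakes the thread up when there is data to read;
HCommandReciveEvent dd ?
- event that the thread sets when reception is complete;
HWaitCommandEvent dd ?
- address of the receive buffer;
ReciveCommandBufferOffset dd ?
- number of bytes received so far;
BytesCommandRecived dd 0
- socket from which the data is read;
ReciveCommandSock dd ?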
ReciveThread proc parametr:dword
mov edi,parametr
InfinityLoop:
- wait for the event signalling that data has arrived;
call WaitForSingleObject,dword ptr [edi],-1
- esi points to where the new data goes: buffer address + number of bytes
already received;
mov esi,[edi+8]
add esi,[edi+12]
- read up to 4096 bytes from the socket;
call recv,dword ptr [edi+16],esi,4096,0
- add the number of bytes read to the counter of received bytes;
add [edi+12],eax
- ebx holds the handle of the event to set when reception is complete;
mov ebx,[edi+4]
- determine which connection
this thread serves, the control connection
or the data connection;
cmp edi,offset ReciveDataThreadStruc
je comparefordata
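- on the control connection a response is complete when the received data
ends with 0dh, 0ah, so check the last received byte;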
mov eax,[edi+12]
mov esi,[edi+8]
cmp byte ptr [esi+eax-1],10
je short CallEvent
jmp InfinityLoop
comparefordata:
- on the data connection reception is complete when bytes received = file length;
mov eax,[edi+12]
cmp FileLenght,eax
jne InfinityLoop
CallEvent:
- signal that reception is complete;
call SetEvent,ebx
jmp InfinityLoop
ReciveThread endp
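The decision the thread makes after each read (is reception complete?) can be summarized in a small Python sketch. It mirrors the two checks in the procedure above under my own naming: reception_complete is hypothetical, and expected_length=None stands for the control connection.

```python
def reception_complete(buffer: bytes, expected_length=None) -> bool:
    """Decide whether the data gathered so far forms a complete unit.

    Control connection (expected_length is None): the buffer must end with LF.
    Data connection: the number of bytes received must equal the file length.
    """
    if expected_length is None:
        return buffer.endswith(b"\n")
    return len(buffer) == expected_length


print(reception_complete(b"220 Welcome\r\n"))               # True
print(reception_complete(b"220 Welc"))                      # False
print(reception_complete(b"\x00" * 958, expected_length=958))  # True
```

Like the original check, this treats any buffer that ends in LF as a complete response, which is a simplification.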
Now back to the main thread. We have successfully received the server's response saying that it is ready to accept commands, so now we can send them. In this implementation the SendCommandInSocket function is responsible for sending commands to the server; the main thread calls it to send the commands USER, PASS, TYPE, CWD, PORT and LIST in sequence. The function itself looks like this:
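- input parameters: the socket to send to and the address of the buffer
holding the command, which must end with CR, LF;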
SendCommandInSocket proc uses ebx ecx esi edi, hSocket:dword, OutBufOffset:dword
- compute the length of the command, up to and including the 0ah byte;
mov edi,OutBufOffset
push edi
mov eax,0ah
mov ecx,100
repne scasb
sub edi,OutBufOffset
mov ecx,edi
pop esi
push edi
- copy the command into the command receive buffer,
right after the data already received;
mov edi,ReciveCommandBufferOffset
add edi,BytesCommandRecived
rep movsb
pop edi
add BytesCommandRecived,edi
- send the command to the server;
call send,hSocket,OutBufOffset,edi,ebx
- wait for the response (up to 5000 ms) by calling WaitAnswerRecive;
mov eax,5001
Wait2Answer:
dec eax
push eax
call WaitAnswerRecive
or eax,eax
jnz ErrorProcessed
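- the response has been received; the buffer may hold several lines,
so scan backwards from the end of the buffer for the previous 0ah byte
to find the start of the last line.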
mov edi,ReciveCommandBufferOffset
mov ecx,BytesCommandRecived
dec ecx
dec ecx
add edi,ecx
mov al,0ah
std
repne scasb
cld
xor eax,eax
- take the first digit of the response code;
mov cl,[edi+2]
cmp cl,'1'
- "1": the command is in progress, wait for the next response
jz Wait2Answer
cmp cl,'3'
- "3" or lower: no error;
jna NoErrorProcessed
call MessageBoxA,newhwnd,edi,offset ErrorCap,40h
ErrorProcessed:
xor eax,eax
inc eax
NoErrorProcessed:
ret
SendCommandInSocket endp
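The decision logic at the end of this function (find the last line in the buffer, look at its first digit) can be sketched in Python as follows. classify_last_reply is my own name, and the sketch is slightly stricter than the assembler: it accepts only “2” and “3” as success, whereas the original treats any character not above “3” as "no error".

```python
def classify_last_reply(buffer: bytes) -> str:
    """Return 'wait', 'ok' or 'error' based on the last line in the buffer."""
    lines = buffer.rstrip(b"\r\n").split(b"\n")
    first_digit = lines[-1][:1]
    if first_digit == b"1":
        return "wait"   # preliminary response: another one will follow
    if first_digit in (b"2", b"3"):
        return "ok"     # completed, or waiting for the next command
    return "error"      # 4xx, 5xx or something unexpected


log = b"USER anonymous\r\n331 Please specify the password.\r\n"
print(classify_last_reply(log))  # ok
```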
There is one more thing to consider: before sending the PORT command we need to create a listening socket. We do this by calling the CreateListenSock procedure.
CreateListenSock proc
pushad
- create the socket;
call socket, AF_INET, SOCK_STREAM, IPPROTO_TCP
mov datasock,eax
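- switch it to asynchronous mode: the main window will receive the WM_DATASOCK message
when the server connects (FD_ACCEPT),
when data arrives (FD_READ) and when the connection is closed (FD_CLOSE);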
call WSAAsyncSelect, datasock, newhwnd, WM_DATASOCK, FD_ACCEPT+FD_READ+FD_CLOSE
- bind the socket to a local address;
mov sin_port,0 ; port 0: the system chooses a free port itself
mov sin_family,AF_INET
mov sin_addr,INADDR_ANY
call bind, datasock, offset sockaddr_in, 16
- find out which port the system assigned;
call getsockname,datasock,offset sockaddr_in,offset szSockaddr_in
- convert the port number to host byte order and take its high byte;
xor eax,eax
mov ax,sin_port
call ntohs,eax
push eax
shr eax,8
- convert it to ASCII;
call DECtoASCII,eax,PortInPort
- finish building the PORT command: a comma, the low byte of the port, CR, LF
mov al,','
stosb
pop eax
and eax,0ffh
call DECtoASCII,eax,edi
mov ax,0a0dh
stosw
mov esi,PortInPort
- start listening on the socket;
call listen, datasock, 1
popad
ret
CreateListenSock endp
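The same sequence (bind to port 0, ask the system which port it chose, build the PORT command, start listening) looks like this as a Python sketch. It is an illustration under assumptions, not a port of the procedure: open_listening_socket is my own name, the socket is blocking, and it binds to the loopback address so that the example runs anywhere.

```python
import socket


def open_listening_socket(host="127.0.0.1"):
    """Open a listening socket on a system-chosen port and build the PORT command."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))             # port 0: let the system pick a free port
    listener.listen(1)
    ip, port = listener.getsockname()    # the port actually assigned
    command = "PORT {},{},{}\r\n".format(ip.replace(".", ","), port >> 8, port & 0xFF)
    return listener, command


listener, command = open_listening_socket()
print(command.strip())  # for example: PORT 127,0,0,1,N1,N2
listener.close()
```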
So, the last command sent was LIST. After it, the list of files in the current directory should arrive over the data connection, so after sending the command we need to wait until the list has been received. Even if the server has told us that it has finished sending all the data, this does not mean that our receiving thread has already received everything, so we wait for reception to complete with the WaitTransferComplete function.
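- the input parameter is the interval, in milliseconds,
during which the function waits for reception to complete.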
WaitTransferComplete proc uses ecx edi, TimeToWaitEndTransfer:dword
WaitProgress:
- wait, without a time limit,
until the data connection is closed;
call WaitForSingleObject,HWaitCloseEvent,-1
- wait for the receiving thread to report
that all the data has been received;
call WaitForSingleObject,HWaitDataEvent,TimeToWaitEndTransfer
or eax,eax
jz CloseDataSocks
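- the wait timed out; an interval of exactly 1000 is a special case
that is not treated as an error,
otherwise an error message is shown;
cmp TimeToWaitEndTransfer,1000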
jz CloseDataSocks
call MessageBoxA,newhwnd,offset ErrTimeOutCommand,offset ErrorCap,40h
CloseDataSocks:
- reset the event;
call ResetEvent,HWaitDataEvent
- close the data sockets;
call closesocket,ReciveDataSock
call closesocket,datasock
ret
WaitTransferComplete endp
If the procedure above completes successfully, the directory listing will be in the data receive buffer. Further on, the program processes the listing and fetches each file found in it in turn. Getting a file is no different from getting the directory listing, so I will not describe it here. After all the files have been received and saved, we close the control connection and end the thread.
Conclusion
We have discussed the basic principles of working with the FTP protocol from the client side. Of course, not every aspect of the task was covered. For example, sending files to the server was not considered, but I think that, having carefully studied the material above and the accompanying source code, you can do this without any problems. Let further study of the FTP protocol from the server side be your “homework”.