What exactly happens when a user types google.com in the address bar? Part 1

Translation of the first part of the material from github , which thoroughly explains the work of the Internet: what exactly happens when a user types google.com in the address bar?

The "enter" button returns to its original position

To start counting, we will choose the moment when the “enter” button is recessed. At this moment, the circuit responsible for this button closes. A small current passes through the logical contours of the keyboard. They scan the status of all switches, extinguish parasitic electrical impulses, and convert the presses into key code 13. The controller encodes the code for transmission to the computer. Now it is almost always done via USB or Bluetooth, and earlier PS / 2 or ADB participated in the process.

If it is USB

The USB in the keyboard is powered with a voltage of 5V across the first pin of the USB host controller in the computer. The key code is stored in the keyboard's memory in the “endpoint” register. Every 10 milliseconds, the USB controller requests data from this register. So he gets the saved codes. The code is transferred to the USB SIE (Serial Interface Engine) and converted to one or more low-level USB protocol packets. Packages are sent via a differential electrical signal on pins D + and D- with a maximum speed of 1.5 Mb / s, since the HID (Human Interface Device) is considered a low-speed device.

Then the serial signal is decoded in the controller and interpreted by the keyboard HID driver. The code value is transferred to the operating system hardware abstraction layer.
')

If it is a virtual keyboard (touch screen)

When the user places a finger on the capacitive screen, a small current flows between it and the finger. This closes the circuit in the electrostatic field of the conductive layer and creates a voltage drop at this point of the screen. The screen controller raises the interrupt, giving the touch coordinates.

The mobile OS sends a message to the current application about a click on one of the GUI elements (in this case, the virtual keyboard keys). The keyboard raises an interrupt to send a keystroke message to the OS.

The occurrence of an interrupt (not on a USB keyboard)

The keyboard sends an interrupt request (IRQ), which the interrupt controller maps to the interrupt vector. The CPU uses the IDT interrupt description table to map vectors to the interrupt handler functions that the kernel provides. When the interrupt arrives, the CPU starts the required handler. So we get to the core.

(Windows) A WM_KEYDOWN message is sent to the application.

HID transmits the click event to the KBDHID.sys driver, which converts it to a scancode. In our case, it is equal to VK_RETURN (0x0D). This driver communicates with the KBDCLASS.sys driver. The latter is responsible for handling keyboard input in a safe manner. It calls Win32K.sys (possibly after sending a message through different keyboard filters). This happens in kernel mode.

Win32K.sys finds out which window is active via the GetForegroundWindow () API. This API provides a handler for the browser input line. Then the Windows message handling system calls SendMessage (hWnd, WM_KEYDOWN, VK_RETURN, lParam). lParam - a bitmask containing additional information about the click - repetitions, scancode, whether additional keys are pressed, etc.

The SendMessage API adds a message to the queue for the specified window handle (hWnd). Later, the WindowProc message handling function is called to handle the queue.

The active hWnd window is the edit window, in which case WindowProc has a message handler for WM_KEYDOWN. Since the VK_RETURN code was transmitted, he knows that the user pressed Enter.

(OS X) KeyDown NSEvent is passed to the application.

The interrupt signal triggers an event in the I / O Kit driver. It converts the signal to the key code and transfers it to the WindowServer process. That creates an event for active applications through their Mach port. Events are queued by these applications. They are read from there by threads that have the appropriate level of access using the mach_ipc_dispatch function. This is most often done through the main NSApplication loop using NSEvent / NSEventType KeyDown.

(GNU / Linux) Xorg Server Tracks Codes

When using the X graphic server, the evdev driver will be used to get the code. According to the rules, the key code will be converted to a scancode. After that, the symbol is transferred to the window manager (DWM, metacity, i3, etc.), which in turn passes the symbol to the window that is in focus. The window graphics API retrieves the symbol and displays the corresponding symbol in the window in which the focus is located.

Parsing URL

The browser now has the following information from the URL (Uniform Resource Locator):

Protocol "http" - use the 'Hyper Text Transfer Protocol'
Resource "/" - request home page

Is it a URL or search query?

If no protocol is specified and the string is not a valid domain name, the browser sends this text to the default search engine.

HSTS check list

The browser checks the list of HSTS (HTTP Strict Transport Security). This is a list of sites that need to be accessed only via HTTPS. If the site is listed, the browser sends a request via HTTPS. Otherwise, via HTTP.

We translate characters in the server name that do not belong to the ASCII table.

The browser checks if there are characters from non az, AZ, 0-9, -, ranges.
Since we have google.com, there will be no such characters. Otherwise, Punycode encoding will be applied to the server name.

DNS query

The browser checks if there is a domain in the cache. If not, the gethostbyname library function is called (depending on the OS). It checks whether the server’s address can be recognized by name based on information from the local hosts file. If this does not help, a request is made to the DNS server, which is specified in the network settings. This is either a local router or the provider's DNS server. If the DNS server is on the same subnet, the library works on ARP protocol with the server. Otherwise, the request is sent to the IP address of the standard gateway.

Protocol ARP (Address Resolution Protocol)

To send an ARP broadcast request, the network stack needs to know the recipient's IP address and the MAC address of the interface that will be used for this.

First, the ARP cache is checked for the recipient's IP. If it is in the cache, the result is "Recipient IP = MAC".

If it is not in the cache, then the routing table is scanned for the presence of an ip-address in any of the local subnets. If it is there, the interface assigned to this subnet is used. If not, the library uses the main gateway's subnet interface. Then the MAC address of the selected interface is searched and a Layer 2 ARP request is sent:

Sender MAC: interface:mac:address:here Sender IP: interface.ip.goes.here Target MAC: FF:FF:FF:FF:FF:FF (Broadcast) Target IP: target.ip.goes.here

If the computer is connected to the router directly, the router responds with ARP Reply. If the connection goes through the hub, it sends a request to all ports. If there is a router, he will give an answer. If the connection is via a switch, it will determine by its CAM / MAC table which port has the correct MAC address. If it does not, it will distribute the request to all ports. And if it does, it will send a request on the port where the MAC is available.

Reply ARP Reply:

 Sender MAC: target:mac:address:here Sender IP: target.ip.goes.here Target MAC: interface:mac:address:here Target IP: interface.ip.goes.here

Now the library has the ip-address of either the DNS server or the main gateway. You can continue the process of recognizing the domain. Port 53 is opened and a UDP request is sent to the server (for large requests, TCP is used). If there is no information from the DNS server, then a recursive search is requested that goes through the list of DNS servers until it reaches the SOA and the necessary answer is not found.

Socket opening

When the browser receives the ip address of the destination server, it together with the port (for the HTTP port the default is 80, for HTTPS it is 443) uses them as parameters for calling the socket function and requests the TCP socket stream — AF_INET and SOCK_STREAM.

First, this request is transmitted to the transport layer, where a TCP segment is created. The destination port is added to the header, and the source port is selected dynamically from the list of kernel ports (on Linux, this is ip_local_port_range).

The segment is sent to the network layer, where it is added an ip-header, which contains the ip-address of the destination server and the ip-address of our computer. Then the package falls into Link Layer. To it is added the frame header, which contains the MAC address of the computer and gateway (local router). If the kernel does not know the MAC address of the gateway, ARP broadcasts are sent to clarify it. And now our package is ready to be sent via Ethernet, WiFi or mobile connection.

In most cases, the packet from the computer passes through the local network, then enters the modem (modulator / demodulator), where it is converted from digital to analog. Such a signal can be transmitted by telephone, cable or wireless connection. The modem of the receiving party converts it back to digital form, from where it arrives at the next network node, where the sender and recipient addresses are further parsed.

Sometimes the packet is sent immediately via Ethernet or optics, then it remains digital and reaches the next network node. In the end, the signal reaches the local subnet router. From there, it goes through the AS and other AS border routers, and reaches the destination server. Each router on the way retrieves the ip-address of the destination and sends the packet to the next hop. In this case, the TTL in the packet is reduced by one. A packet is dropped if it reaches zero, or if the current router has a place in the queue. Sending and receiving packets occur multiple times within a TCP connection.

First, the client selects the initial sequence number (ISN) and sends the packet to the server, setting the SYN bit so that it is clear that this is an ISN.

If the server receives a SYN and is in a good mood, then it selects its ISN, sets the SYN to indicate that the packet contains the ISN, copies the client's ISN + 1 in the ACK field and adds the ACK flag to confirm receipt of the first packet.

The client confirms the connection by sending a packet, where its ISN increases, the sender's ISN increases, and the ACK field is set.

Data is transmitted as follows: when one side sends N bytes, it increases the SEQ by that number. When the other party acknowledges receipt of the packet (or their chain), it sends an ACK packet, where the ACK value equals the last received from the other side of the sequence.

To close the connection, the closing party sends a FIN packet, the other party acknowledges the receipt of the packet and sends its FIN, and the first one confirms its receipt.

Source: https://habr.com/ru/post/251373/

All Articles