Meet the Windows pseudo console (ConPTY)

Article published August 2, 2018

This is the second article in a series about the Windows command line. Here we discuss the new Windows Pseudo Console (ConPTY) infrastructure and API: why we built it, what it is for, how it works, how to use it, and more.

In the previous article, “The heavy legacy of the past. Windows command line problems”, we covered the background to the emergence of the terminal and the evolution of the command line in Windows, and began to explore the internals of the Windows Console and the Windows command-line infrastructure. We also discussed the console's many advantages and its major drawbacks.

One of those drawbacks is that Windows tries to be “helpful” but in doing so gets in the way of developers of alternative and third-party consoles, services, and so on. When building a console or service, developers need to access, or provide, the communication channels through which their terminal / service talks to command-line applications. In the *NIX world this is not a problem, because *NIX provides a “pseudo-terminal” (PTY) infrastructure that makes it easy to create communication channels for a console or service. But Windows had nothing of the kind ...

… until now!

From TTY to PTY

Before discussing our development in detail, let's briefly return to the development of terminals.

TTY was first

As discussed in the previous article, in the early days of computing, users controlled computers using electromechanical teletypes (TTY) connected to a computer through some kind of serial communication channel (usually a 20 mA current loop).

Ken Thompson and Dennis Ritchie (standing) working on a DEC PDP-11 via teletype (note: no electronic display)

The spread of terminals

Teletypes were replaced by computerized terminals with electronic displays (usually CRT screens). As a rule, terminals were very simple devices (hence the term “dumb terminal”), containing only the electronics and computing power needed for the following tasks:

Receiving text input from the keyboard.
Buffering the entered text one line at a time (including local editing before sending).
Sending / receiving text over a serial channel (usually via the once-ubiquitous RS-232 interface).
Displaying the received text on the terminal's display.

Despite their simplicity (or perhaps because of it), terminals quickly became the main tool for managing minicomputers, mainframes and servers: most data entry operators, computer operators, system administrators, scientists, researchers, software developers and industry luminaries worked on terminals from DEC, IBM, Wyse and many others.

Admiral Grace Hopper in her office with a DEC VT220 terminal on the desk

The spread of software terminals

From the mid-1980s, specialized terminals gave way to general-purpose computers, which were becoming more affordable, popular, and powerful. Many early PCs and other computers of the 1980s had terminal applications that opened a connection over the machine's RS-232 port and communicated with whatever was at the other end of the connection.

As general-purpose computers became more sophisticated, a graphical user interface (GUI) and a whole new world of simultaneously running applications, including terminal applications, appeared.

But a problem arose: how can a terminal application talk to another command-line application running on the same machine? And how do you “physically” connect a serial cable between two applications running on the same computer?

The emergence of a pseudo terminal (PTY)

In the *NIX world, the problem was solved by introducing the pseudo-terminal (PTY).

A PTY emulates serial communication hardware in software by exposing a pair of pseudo-devices, a “master” and a “slave”: terminal applications connect to the master pseudo-device, and command-line applications (for example, shells like cmd, PowerShell, and bash) connect to the slave pseudo-device. When a terminal client sends text and / or control commands (encoded as text) to the master pseudo-device, the text is relayed to the associated slave. Text output by the application is sent to the slave pseudo-device, then passed back to the master and thus to the terminal. Data is always sent / received asynchronously.

Pseudo-terminal application / shell

It is important to note that the “slave” pseudo-device emulates the behavior of a physical terminal and converts control characters into POSIX signals. For example, if the user presses CTRL + C in the terminal, the ASCII value for CTRL + C (0x03) is sent through the master device. When it arrives at the slave pseudo-device, the value 0x03 is removed from the input stream and a SIGINT signal is generated.

This PTY infrastructure is widely used by *NIX terminal applications, text-pane managers (for example, screen, tmux), etc. Such applications call openpty(), which returns a pair of file descriptors (fd) for the PTY master and slave devices. The application can then fork / exec a child command-line application (for example, bash), which uses its slave fd to read input from, and write text back to, the connected terminal.

This mechanism allows terminal applications to "talk" directly with command-line applications running locally, as the terminal would talk to a remote computer via a serial / network connection.
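The master / slave pairing described above can be sketched in a few lines of C++ on a POSIX system. This is only an illustration under stated assumptions: instead of openpty() it uses the libc calls posix_openpt(), grantpt(), unlockpt() and ptsname() (which do the same job without an extra library), it switches the slave to raw mode so that bytes pass through unmodified, and the names PtyPair, open_pty_pair and relay are made up for this example.

```cpp
#include <cassert>
#include <fcntl.h>
#include <stdlib.h>
#include <string>
#include <termios.h>
#include <unistd.h>

// Both ends of a pseudo-terminal; -1 means "not open".
struct PtyPair {
    int master = -1;
    int slave = -1;
};

// Opens a PTY pair using the POSIX primitives that openpty() wraps.
PtyPair open_pty_pair() {
    PtyPair pty;
    int master = posix_openpt(O_RDWR | O_NOCTTY);
    if (master < 0) return pty;

    const char* slave_name = nullptr;
    if (grantpt(master) != 0 || unlockpt(master) != 0 ||
        (slave_name = ptsname(master)) == nullptr) {
        close(master);
        return pty;
    }
    int slave = open(slave_name, O_RDWR | O_NOCTTY);
    if (slave < 0) {
        close(master);
        return pty;
    }

    // Raw mode: no echo, no line buffering, no newline translation.
    struct termios attrs{};
    if (tcgetattr(slave, &attrs) == 0) {
        cfmakeraw(&attrs);
        tcsetattr(slave, TCSANOW, &attrs);
    }
    pty.master = master;
    pty.slave = slave;
    return pty;
}

// Writes text into one end of the pair and reads it back from the other end.
std::string relay(int from_fd, int to_fd, const std::string& text) {
    assert(!text.empty());
    if (write(from_fd, text.data(), text.size()) !=
        static_cast<ssize_t>(text.size())) {
        return {};
    }
    std::string received(text.size(), '\0');
    std::size_t total = 0;
    while (total < received.size()) {
        ssize_t n = read(to_fd, &received[total], received.size() - total);
        if (n <= 0) break;
        total += static_cast<std::size_t>(n);
    }
    received.resize(total);
    return received;
}
```

A terminal would call relay(pty.master, pty.slave, ...) for keystrokes and relay(pty.slave, pty.master, ...) for application output; a real terminal would instead hand the slave fd to a forked child and poll the master asynchronously.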

What, no pseudo-console in Windows?

As we discussed in the previous article, while the Windows console is conceptually similar to the traditional *NIX terminal, it differs in several key ways, especially at the lowest levels, and these differences cause problems for developers of Windows command-line applications, third-party terminals / consoles, and server applications:

Windows has no PTY infrastructure: when a user starts a command-line application (for example, Cmd, PowerShell, wsl, ipconfig, etc.), Windows itself “connects” the application to a new or existing console instance.
Windows gets in the way of third-party consoles and server applications: Windows (until now) gave terminals no way to supply the communication channels through which they want to interact with a command-line application. Third-party terminals have to create a console off-screen, send it the user's input and scrape its output, redrawing that output on the third-party console's own display!
Only Windows has a Console API: Windows command-line applications rely on the Win32 Console API, which reduces code portability, since every other platform supports text / VT rather than an API.
Non-standard remote access: the dependence of command-line applications on the Console API significantly complicates interoperability and remoting scenarios.

What to do?

Many, many developers have often requested a PTY-like mechanism under Windows, especially those who work with ConEmu / Cmder, Console2 / ConsoleZ, Hyper, VSCode, Visual Studio, WSL, Docker and OpenSSH tools.

Even Peter Bright, technology editor at Ars Technica, asked us to implement a PTY mechanism just a few days after I started working on the Console team:

And recently again:

Well, we finally did it: we created a pseudo console for Windows:

Welcome to the Windows pseudo console (ConPTY)

Since the Console team was formed about four years ago, the group has been overhauling the Windows console and the internals of the command line. Along the way, we regularly and thoroughly considered the issues described above, along with many other related problems. But the infrastructure and code were not ready to make a pseudo console release possible ... until now!

The new Windows pseudo console (ConPTY) infrastructure, its API and some other related changes will eliminate or ease a whole class of problems ... without breaking backward compatibility with existing command-line applications!

The new Win32 ConPTY API (official documentation will be published soon) is now available in the latest Windows 10 Insider builds and the corresponding Windows 10 Insider Preview SDK. It will appear in the next major release of Windows 10 (sometime in autumn / winter 2018).

ConHost Console Architecture

To understand ConPTY, you need to study the architecture of the Windows console, or rather ... ConHost!

It is important to understand that although ConHost implements everything you see and know as the Windows Console application, it also contains and implements most of the Windows command-line infrastructure! From now on, ConHost becomes a true “console host”, supporting all command-line applications and / or GUI applications that interact with command-line applications!

How? Why? What? Let's take a closer look.

Here is a high-level view of the internal console / ConHost architecture:

Compared to the architecture from the previous article, ConHost now contains several additional modules for processing VT and a new ConPTY module that implements the public API:

ConPTY API: the new Win32 ConPTY API provides a mechanism similar to the POSIX PTY model, but adapted to Windows.
VT Interactivity: receives incoming UTF-8 encoded text, converts each displayable text character into a corresponding INPUT_RECORD and stores it in the input buffer. It also processes control sequences, such as 0x03 (CTRL + C), converting them into KEY_EVENT_RECORDs that trigger the appropriate control action.
VT Renderer: generates the VT sequences needed to move the cursor and render text and styling for the areas of the output buffer that have changed since the previous frame.
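To make the VT Interactivity step more concrete, here is a minimal, portable sketch of turning a UTF-8 byte stream into key records. It is not ConHost's code: KeyRecord is a simplified, hypothetical stand-in for the real INPUT_RECORD / KEY_EVENT_RECORD structures, decode_input is an invented name, and multi-byte VT escape sequences (such as arrow keys) are deliberately not parsed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified, hypothetical stand-in for a console KEY_EVENT_RECORD.
struct KeyRecord {
    char32_t ch;  // decoded code point, or the letter for a control key
    bool ctrl;    // true when the byte was a C0 control code such as 0x03
};

// Converts UTF-8 input into one record per character.
// Control bytes (< 0x20) become Ctrl+<letter>, e.g. 0x03 -> Ctrl+C.
// Malformed bytes become U+FFFD (the Unicode replacement character).
std::vector<KeyRecord> decode_input(const std::string& utf8) {
    std::vector<KeyRecord> records;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        if (lead < 0x20) {
            records.push_back({static_cast<char32_t>(lead + 0x40), true});
            ++i;
            continue;
        }
        // Work out the sequence length from the lead byte.
        std::size_t len = 0;
        char32_t cp = 0;
        if (lead < 0x80)              { len = 1; cp = lead; }
        else if ((lead >> 5) == 0x06) { len = 2; cp = lead & 0x1F; }
        else if ((lead >> 4) == 0x0E) { len = 3; cp = lead & 0x0F; }
        else if ((lead >> 3) == 0x1E) { len = 4; cp = lead & 0x07; }

        // Fold in the continuation bytes (each must look like 10xxxxxx).
        bool valid = len != 0 && i + len <= utf8.size();
        for (std::size_t k = 1; valid && k < len; ++k) {
            const unsigned char next = static_cast<unsigned char>(utf8[i + k]);
            valid = (next & 0xC0) == 0x80;
            cp = (cp << 6) | (next & 0x3F);
        }
        if (valid) {
            records.push_back({cp, false});
            i += len;
        } else {
            records.push_back({0xFFFD, false});
            ++i;
        }
    }
    return records;
}
```

For the input bytes 'a', 0x03, 0xC3 0xA9 this yields three records: the letter a, Ctrl+C, and the single character U+00E9.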

Ok, but what does that really mean?

How do Windows command line applications work?

To better understand the impact of the new ConPTY infrastructure, let's look at how Windows console and command-line applications have worked so far.

Whenever a user starts a command-line application, such as Cmd, PowerShell, or ssh, Windows creates a new Win32 process into which it loads the executable binary of the application and any dependencies (resources or libraries).

The newly created process usually inherits the stdin and stdout handles from its parent. If the parent process was a Windows GUI process, there are no stdin and stdout handles to inherit, so Windows creates a new console instance and attaches the new application to it. Communication between command-line applications and their console is carried via ConDrv.

For example, when launched from a non-elevated PowerShell instance, the new application process inherits the parent's stdin / stdout handles and therefore reads its input from, and writes its output to, the same console as its parent.

A small caveat is needed here: in some cases command-line applications are launched attached to a new console instance, particularly for security reasons, but the description above is usually accurate.

Ultimately, when a command-line / shell application is launched, Windows connects it to the console instance (ConHost.exe) via ConDrv:

How does ConHost work?

Whenever a command line application is executed, Windows connects the application to a new or existing instance of ConHost. An application and its console instance are connected through the kernel-mode console driver (ConDrv), which sends / receives IOCTL messages containing serialized API call requests and / or text data.

As described in the previous article, the work ConHost has traditionally done is relatively simple:

The user generates input from the keyboard / mouse / pen / touchpad, which is converted to KEY_EVENT_RECORD or MOUSE_EVENT_RECORD entries and stored in the input buffer.
The input buffer is drained one record at a time, performing the requested input actions, such as displaying text on the screen, moving the cursor, copying / pasting text, etc. Many of these actions change the contents of the output buffer. The modified areas are recorded by ConHost's state engine.
On each frame, the console renders the modified areas of the output buffer.

When a command-line application calls the Windows Console API, the API calls are serialized into IOCTL messages and sent via the ConDrv driver. ConDrv then delivers the IOCTL messages to the attached console, which decodes them and performs the requested API call. Return / output values are serialized back into an IOCTL message and sent back to the application via ConDrv.
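The request / reply round trip can be illustrated with a toy model. The real ConDrv message format is not described in this article, so everything here is an assumption made for illustration: the layout (a 4-byte opcode, a 4-byte length, then the payload), the Op::WriteText opcode and the function names are all invented.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Made-up opcode; real console IOCTLs are not modelled here.
enum class Op : std::uint32_t { WriteText = 1 };

using Message = std::vector<std::uint8_t>;

// Client side: packs a "write this text" API call into a flat byte message.
Message serialize_write(const std::string& text) {
    const std::uint32_t op = static_cast<std::uint32_t>(Op::WriteText);
    const std::uint32_t len = static_cast<std::uint32_t>(text.size());
    Message msg(sizeof op + sizeof len + text.size());
    std::memcpy(msg.data(), &op, sizeof op);
    std::memcpy(msg.data() + sizeof op, &len, sizeof len);
    std::memcpy(msg.data() + sizeof op + sizeof len, text.data(), text.size());
    return msg;
}

// Host side: decodes the message, performs the call against the output
// buffer, and returns a serialized reply (the number of characters written).
Message handle_message(const Message& msg, std::string& output_buffer) {
    std::uint32_t op = 0, len = 0, written = 0;
    const std::size_t header = sizeof op + sizeof len;
    if (msg.size() >= header) {
        std::memcpy(&op, msg.data(), sizeof op);
        std::memcpy(&len, msg.data() + sizeof op, sizeof len);
        if (op == static_cast<std::uint32_t>(Op::WriteText) &&
            msg.size() - header >= len) {
            output_buffer.append(
                reinterpret_cast<const char*>(msg.data() + header), len);
            written = len;
        }
    }
    Message reply(sizeof written);
    std::memcpy(reply.data(), &written, sizeof written);
    return reply;
}

// Client side: unpacks the reply produced by handle_message().
std::uint32_t decode_reply(const Message& reply) {
    std::uint32_t written = 0;
    if (reply.size() >= sizeof written) {
        std::memcpy(&written, reply.data(), sizeof written);
    }
    return written;
}
```

In this model the "driver" is simply the function call boundary between serialize_write() and handle_message(); the point is that the application never touches the output buffer directly.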

ConHost: a contribution to the past for the sake of the future

Microsoft tries to maintain backward compatibility with existing applications and tools whenever possible. Especially for the command line. In fact, 32-bit versions of Windows 10 can still run many / most 16-bit Win16 applications and executables!

As mentioned above, one of ConHost's key roles is to provide services to its command-line applications, especially legacy applications that call and rely on the Win32 Console API. ConHost now also offers new services:

Seamless PTY-like infrastructure for communicating with modern consoles and terminals
Modernization of legacy / traditional command-line applications:
- Receiving UTF-8 text / VT and converting it to input records (as if entered by the user)
- Making Console API calls on behalf of the hosted application, updating its output buffer accordingly
- Rendering the modified output buffer areas as UTF-8 encoded text / VT

Below is an example of how a modern console application communicates with a command-line application via a ConPTY ConHost.

In this new model:

Console:
1. Creates its own communication channels
2. Calls the ConPTY API to create a ConPTY, causing Windows to start a ConHost instance connected to the other end of the channels
3. Creates an instance of a command-line application (for example, PowerShell) connected to that ConHost, as usual
ConHost:
1. Reads UTF-8 text / VT from its input and converts it to INPUT_RECORD entries that are sent to the command-line application
2. Performs the API calls made by the command-line application, which can modify the contents of the output buffer
3. Renders changes in the output buffer as UTF-8 encoded text / VT and sends the resulting text to its console
Command-line application:
1. Works as usual, reading input and calling the Console API, with no idea that its ConPTY ConHost is translating its input and output from / to UTF-8!

That last point is important! When an older command-line application calls Console API functions like WriteConsoleOutput(...), the specified text is written to the corresponding ConHost output buffer. Periodically, ConHost renders the modified output buffer areas as text / VT, which is sent back to the console via stdout.

In the end, even traditional command-line applications “speak” text / VT to the outside world, without any changes!
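The idea of rendering only the modified areas of the output buffer as text / VT can be sketched as a diff between two frames. This is an assumption-laden illustration, not ConHost's renderer: frames are modelled as plain rows of characters, render_diff is an invented name, styling is ignored, and rows that shrink are not erased. The only VT sequence used is the standard cursor-position command, ESC [ row ; col H, with 1-based coordinates.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Emits VT for every run of cells that differs between two frames:
// a cursor move to the start of the run, followed by the new text.
std::string render_diff(const std::vector<std::string>& previous,
                        const std::vector<std::string>& current) {
    std::string vt;
    for (std::size_t row = 0; row < current.size(); ++row) {
        const std::string& now = current[row];
        const std::string old = row < previous.size() ? previous[row] : "";

        // A cell is unchanged when it exists in both frames with equal content.
        auto same = [&](std::size_t col) {
            return col < old.size() && old[col] == now[col];
        };

        std::size_t col = 0;
        while (col < now.size()) {
            if (same(col)) {
                ++col;
                continue;
            }
            // Extend the run until the next unchanged cell.
            std::size_t end = col;
            while (end < now.size() && !same(end)) ++end;

            vt += "\x1b[" + std::to_string(row + 1) + ";" +
                  std::to_string(col + 1) + "H";
            vt.append(now, col, end - col);
            col = end;
        }
    }
    return vt;
}
```

If only two characters on the second row change, the output is just two short cursor-move-plus-text fragments, and identical frames produce no output at all, which is why a terminal on the other end of the channel receives a compact stream rather than a full screen dump.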

Using the new ConPTY infrastructure, third-party consoles can now interact directly with both modern and traditional command-line applications, exchanging data with all of them as text / VT.

Remote interaction with Windows command line applications

The mechanism described above works fine on a single computer, but also helps in interacting, for example, with a PowerShell instance on a remote Windows computer or in a container.

When you run a command-line application remotely (that is, on remote computers, servers, or in containers), a problem arises. Command-line applications on a remote machine communicate with a ConHost instance local to that machine, because IOCTL messages are not designed to be sent over the network. How do you send input from your local console to the remote machine, and how do you get back the output of the application running there? Moreover, what about Mac and Linux machines, which have terminals but no Windows-compatible consoles?

Thus, in order to remotely control a Windows machine, we need some kind of communication broker that can transparently serialize data across the network, control the lifetime of the application instance, etc.

Maybe something like ssh ?

Fortunately, OpenSSH was recently ported to Windows and added to Windows 10 as an optional feature. PowerShell Core also uses ssh as one of the protocols supported by PowerShell Core Remoting. And for those who use Windows PowerShell, Windows PowerShell Remoting is still a viable option.

Let's take a look at how OpenSSH for Windows now allows you to remotely control Windows shells and Windows command line applications:

Currently, OpenSSH involves some unwanted complexity:

User:
1. Starts the ssh client, and Windows attaches a console instance as usual
2. Types text into the console, which sends the keystrokes to the ssh client
ssh client:
1. Reads input as bytes of text data
2. Sends the text data over the network to the listening sshd service
The sshd service goes through several stages:
1. Launches the default shell (for example, Cmd), which causes Windows to create and attach a new console instance
2. Finds and attaches to the Cmd instance's console
3. Moves the console off-screen (and / or hides it)
4. Sends the input received from the ssh client to the off-screen console as input
The cmd instance works as always:
1. Collects input from the sshd service
2. Performs work
3. Calls the Console API to output / style text, move the cursor, etc.
Attached [off-screen] console:
1. Performs the API calls, updating the output buffer
sshd service:
1. Scrapes the off-screen console's output buffer, finds the differences, encodes them as text / VT and sends them back to ...
The ssh client, which sends the text to ...
The console, which displays the text

Fun, right? Not at all! A lot can go wrong in this setup, especially in the process of simulating and sending user input and scraping the off-screen console's output buffer. This leads to instability, crashes, data corruption, excessive power consumption, etc. In addition, not all applications scrape the text's properties as well as the text itself, so formatting and color are lost!

Remote operation using modern ConHost and ConPTY

Surely we can improve the situation? Yes, of course, we can - let's make a few architectural changes and apply our new ConPTY:

The diagram shows that the flow has changed as follows:

User:
1. Starts the ssh client, and Windows attaches a console instance as usual
2. Types text into the console, which sends the keystrokes to the ssh client
ssh client:
1. Reads input as bytes of text data
2. Sends the text data over the network to the listening sshd service
sshd service:
1. Creates stdin / stdout channels
2. Calls the ConPTY API to create a ConPTY
3. Launches a Cmd instance connected to the other end of the ConPTY. Windows starts and attaches a new ConHost instance
The cmd instance works as always:
1. Collects input from the sshd service
2. Performs work
3. Calls the Console API to output / style text, move the cursor, etc.
ConPTY ConHost instance:
1. Performs the API calls, updating the output buffer
2. Renders the modified output buffer regions as UTF-8 encoded text / VT, which is sent back to the console / terminal via ssh

This approach with ConPTY is clearly cleaner and simpler for the sshd service. Windows Console API calls are handled entirely within the command-line application's ConHost instance, which converts all visible changes to text / VT. Whatever connects to ConHost does not need to know that the application is calling the Console API rather than generating text / VT!

We hope you agree that this new ConPTY remoting mechanism results in an elegant, consistent and simple architecture. Combined with the powerful features built into ConHost, support for older applications, and the rendering of changes made by applications that call the Console API as text / VT, the new ConHost and ConPTY infrastructure helps us carry the past into the future.

ConPTY API and how to use it

The ConPTY API is available in the current Windows 10 Insider Preview SDK.

By now, I’m sure that you’re looking forward to seeing some code;)

Take a look at the API declarations:

    // Creates a "Pseudo Console" (ConPTY).
    HRESULT WINAPI CreatePseudoConsole(
        _In_ COORD size,        // ConPty Dimensions
        _In_ HANDLE hInput,     // ConPty Input
        _In_ HANDLE hOutput,    // ConPty Output
        _In_ DWORD dwFlags,     // ConPty Flags
        _Out_ HPCON* phPC);     // ConPty Reference

    // Resizes the given ConPTY to the specified size, in characters.
    HRESULT WINAPI ResizePseudoConsole(_In_ HPCON hPC, _In_ COORD size);

    // Closes the ConPTY and all associated handles. Client applications attached
    // to the ConPTY will also be terminated.
    VOID WINAPI ClosePseudoConsole(_In_ HPCON hPC);

The ConPTY API above exposes three new functions:

CreatePseudoConsole(size, hInput, hOutput, dwFlags, phPC)

Creates a PTY w columns wide and h rows high, using channels created by the caller:
- size: width and height (in characters) of the ConPTY buffer
- hInput: for writing input to the PTY as UTF-8 encoded text / VT sequences
- hOutput: for reading output from the PTY as UTF-8 encoded text / VT sequences
- dwFlags: possible values:
  - PSEUDOCONSOLE_INHERIT_CURSOR: the created ConPTY will attempt to inherit the cursor position of the parent terminal application
- phPC: console handle for the created ConPTY
Returns: success / failure. On success, phPC contains a handle to the new ConPTY.

ResizePseudoConsole(hPC, size)

Resizes the ConPTY's internal buffer to match the desired display width and height.

ClosePseudoConsole(hPC)

Closes the ConPTY and all associated handles. Client applications attached to the ConPTY will also be terminated.

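Using the ConPTY API

The following is an introduction to using the ConPTY API to create a ConPTY instance.

Note: complete samples are to be published on GitHub.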

    // Note: Most error checking removed for brevity.
    // Requires <windows.h> from a Windows SDK that includes the ConPTY API.

    // ...

    // Initializes the specified startup info struct with the required properties and
    // updates its thread attribute list with the specified ConPTY handle
    HRESULT InitializeStartupInfoAttachedToConPTY(STARTUPINFOEX* siEx, HPCON hPC)
    {
        HRESULT hr = E_UNEXPECTED;
        SIZE_T size = 0;

        siEx->StartupInfo.cb = sizeof(STARTUPINFOEX);

        // The first call only reports the required size of the attribute list
        InitializeProcThreadAttributeList(NULL, 1, 0, &size);

        // Allocate the list on the heap: it must outlive this function. The caller
        // releases it with DeleteProcThreadAttributeList() and HeapFree().
        siEx->lpAttributeList = reinterpret_cast<PPROC_THREAD_ATTRIBUTE_LIST>(
            HeapAlloc(GetProcessHeap(), 0, size));
        if (!siEx->lpAttributeList)
        {
            return E_OUTOFMEMORY;
        }

        // Initialize the thread attribute list
        BOOL fSuccess = InitializeProcThreadAttributeList(
            siEx->lpAttributeList, 1, 0, &size);

        if (fSuccess)
        {
            // Set thread attribute list's Pseudo Console to the specified ConPTY
            fSuccess = UpdateProcThreadAttribute(
                siEx->lpAttributeList,
                0,
                PROC_THREAD_ATTRIBUTE_PSEUDOCONSOLE,
                hPC,
                sizeof(HPCON),
                NULL,
                NULL);
            hr = fSuccess ? S_OK : HRESULT_FROM_WIN32(GetLastError());
        }
        else
        {
            hr = HRESULT_FROM_WIN32(GetLastError());
        }
        return hr;
    }

    // ...

    HANDLE hOut, hIn;
    HANDLE outPipeOurSide, inPipeOurSide;
    HANDLE outPipePseudoConsoleSide, inPipePseudoConsoleSide;
    HPCON hPC = 0;

    // Create the in/out pipes:
    CreatePipe(&inPipePseudoConsoleSide, &inPipeOurSide, NULL, 0);
    CreatePipe(&outPipeOurSide, &outPipePseudoConsoleSide, NULL, 0);

    // Our ends of the pipes: write input to hIn, read output from hOut
    hIn = inPipeOurSide;
    hOut = outPipeOurSide;

    // Create the Pseudo Console, using the pipes
    CreatePseudoConsole(
        {80, 32},
        inPipePseudoConsoleSide,
        outPipePseudoConsoleSide,
        0,
        &hPC);

    // Prepare the StartupInfoEx structure attached to the ConPTY.
    STARTUPINFOEX siEx{};
    InitializeStartupInfoAttachedToConPTY(&siEx, hPC);

    // Create the client application, using startup info containing ConPTY info.
    // CreateProcessW may modify the command line, so it must be a writable buffer.
    wchar_t commandline[] = L"c:\\windows\\system32\\cmd.exe";
    PROCESS_INFORMATION piClient{};
    BOOL fSuccess = CreateProcessW(
        nullptr,
        commandline,
        nullptr,
        nullptr,
        TRUE,
        EXTENDED_STARTUPINFO_PRESENT,
        nullptr,
        nullptr,
        &siEx.StartupInfo,
        &piClient);

    // ...

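Running the code above launches cmd.exe attached to the ConPTY created by CreatePseudoConsole(). The terminal then exchanges input / output with Cmd through the ConPTY's pipes. The ConPTY can be resized at any time with ResizePseudoConsole() and, when it is no longer needed, closed with ClosePseudoConsole().

Writing input to the ConPTY:

    // Input "echo Hello, World!", press enter to have cmd process the command,
    // input an up arrow (to get the previous command), and enter again to execute.
    std::string helloWorld = "echo Hello, World!\n\x1b[A\n";
    DWORD dwWritten;
    WriteFile(hIn, helloWorld.c_str(), (DWORD)helloWorld.length(), &dwWritten, nullptr);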
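The input string above mixes plain text with a VT sequence for the up-arrow key. As a small portable sketch of how a terminal might assemble such input, the helper below encodes arrow keys using the standard cursor-key sequences (ESC [ A through ESC [ D). The names ArrowKey, encode_arrow and build_replay_input are invented for this example, and the "\n" used for Enter simply mirrors the snippet above.

```cpp
#include <cassert>
#include <string>

enum class ArrowKey { Up, Down, Right, Left };

// Returns the VT sequence a terminal sends for an arrow key
// (normal cursor-key mode).
std::string encode_arrow(ArrowKey key) {
    switch (key) {
        case ArrowKey::Up:    return "\x1b[A";
        case ArrowKey::Down:  return "\x1b[B";
        case ArrowKey::Right: return "\x1b[C";
        case ArrowKey::Left:  return "\x1b[D";
    }
    return {};
}

// Builds the bytes for: type a command, press Enter, recall it from
// history with the up arrow, then press Enter again.
std::string build_replay_input(const std::string& command) {
    return command + "\n" + encode_arrow(ArrowKey::Up) + "\n";
}
```

Calling build_replay_input("echo Hello, World!") produces exactly the bytes written to the input pipe in the snippet above.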

Resizing the ConPTY:

    // Suppose some other async callback triggered us to resize.
    // This call will update the Terminal with the size we received.
    HRESULT hr = ResizePseudoConsole(hPC, {120, 30});

Closing the ConPTY:

    ClosePseudoConsole(hPC);

Note: closing the ConPTY also terminates its ConHost instance and any attached client applications.

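Call to action!

The ConPTY API is perhaps one of the most fundamental and liberating changes to the Windows command line in years ... if not decades!

The ConPTY API was built by Microsoft and is being used by teams across Microsoft (Windows Subsystem for Linux (WSL), Windows Containers, VSCode, Visual Studio, etc.) as well as by third-party developers, including @ConEmuMaximus5, creator of the ConEmu console for Windows.

But we need your help to drive adoption of the ConPTY API.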

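If you own a command-line application, you have little to do: ConHost does the heavy lifting for you, and your application can keep calling the Console API as before.

That said, if you are writing new command-line code for Windows, we recommend emitting UTF-8 encoded text / VT rather than calling the Console API: “speaking VT” gives you access to features the Console API does not offer (for example, 16M-color RGB True Color).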
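As an illustration of what emitting text / VT with 16M-color RGB looks like, the sketch below builds the widely supported SGR sequence for a 24-bit foreground color (ESC [ 38 ; 2 ; r ; g ; b m) and resets attributes afterwards (ESC [ 0 m). The function names are invented for this example, and whether the colors actually appear depends on the console or terminal that receives the text.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Returns the VT (SGR) sequence selecting a 24-bit RGB foreground color.
std::string foreground_rgb(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return "\x1b[38;2;" + std::to_string(r) + ";" + std::to_string(g) + ";" +
           std::to_string(b) + "m";
}

// Wraps text in a color sequence and resets all attributes afterwards.
std::string colorize(const std::string& text,
                     std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return foreground_rgb(r, g, b) + text + "\x1b[0m";
}
```

Writing colorize("warning", 255, 128, 0) to stdout is all an application has to do; no Console API call is involved, so the same bytes work over ssh or in any VT-capable terminal.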

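Third-party console / service developers

If you work on a third-party console / terminal or service, we encourage you to adopt the ConPTY API.

The VSCode team is already working on adopting it (see GitHub #45693) for its terminal on Windows.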

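Detecting the ConPTY API

The new ConPTY API will be available in the autumn / winter 2018 release of Windows 10.

If you plan to support earlier versions of Windows, you will need to detect whether ConPTY is present. Because ConPTY is a Win32 API, you can check for it with Runtime Dynamic Linking, using LoadLibrary() and GetProcAddress().

On versions of Windows without ConPTY, you will have to fall back to the mechanism you used before the ConPTY API existed.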
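A detection check along these lines might look like the sketch below. It assumes that CreatePseudoConsole is exported from kernel32.dll and looks it up with LoadLibraryW() and GetProcAddress(); the Backend enum and the function names are hypothetical, and on non-Windows builds the check simply reports that ConPTY is unavailable so that the file still compiles.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#endif

// Reports whether the running system exposes the ConPTY API.
bool is_conpty_available() {
#ifdef _WIN32
    HMODULE kernel32 = LoadLibraryW(L"kernel32.dll");
    if (!kernel32) {
        return false;
    }
    // The export is only present on Windows builds that ship ConPTY.
    const bool found =
        GetProcAddress(kernel32, "CreatePseudoConsole") != nullptr;
    FreeLibrary(kernel32);
    return found;
#else
    return false;  // not a Windows build
#endif
}

// Hypothetical choice a terminal or service could make at startup.
enum class Backend { ConPty, LegacyScraping };

Backend choose_backend(bool conpty_available) {
    return conpty_available ? Backend::ConPty : Backend::LegacyScraping;
}
```

A terminal would call choose_backend(is_conpty_available()) once at startup and keep its existing off-screen console code path for the LegacyScraping case.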


Questions and feedback are welcome on the Windows Console GitHub repository.

Source: https://habr.com/ru/post/420853/
