Selenium under Windows: everything from the beginning

I present to you the translation of my article on Medium.com.

First released over 30 years ago, Microsoft Windows is today the undisputed leader among desktop operating systems. This simply cannot be ignored when developing web applications. In this article, I would like to discuss some of the features of using Selenium under Windows and propose a simple and proven solution that makes life much easier.

How is Windows different from Linux?

In my previous articles (first, second, third), I described open-source approaches and tools for building a scalable Selenium cluster. We also talked about how to use the same tools to run tests efficiently on a developer's machine. All of those articles used Linux as the operating system. How does Windows differ from Linux where Selenium is concerned?

Browsers that do not exist on other platforms. Depending on the version, Windows comes preinstalled with Internet Explorer (IE) or Microsoft Edge. Only one version of each browser can be installed at a time. Both browsers have ready-made web driver executables (IEDriverServer and EdgeDriver, respectively), which use Windows API calls to launch and control the browser. In this respect, browsers on Windows are architecturally no different from browsers on Linux.
A graphical interface is built into the operating system. Most versions of Windows (except the latest versions of Windows Server) have a built-in graphical user interface, which can neither be disabled nor replaced by another graphics server. The interface starts automatically with the operating system and constantly consumes resources. In addition, by default the Windows graphical user interface displays all open windows (including browser windows) on the same desktop, and only one of these windows can have focus at any given time. Because of this, attempts to run several IE or Edge instances in parallel often lead to window focus problems: different CSS styles (for example, when hovering over links), DOM events that do not fire, and so on. This problem is very difficult to work around.
Almost complete lack of Docker support. The latest versions of Windows Server support most Docker features natively, but the desktop versions we are interested in have no such support. The only way to run Docker on these versions is inside a Linux virtual machine with Docker installed.

As you can see, many modern approaches to running Selenium, such as using an X server without a monitor or launching browsers in containers, do not work on Windows. But is it possible to achieve Linux-like performance and get around the known limitations of Windows? Yes, and it's easier than you might think! In the following sections I will explain how to do this.

Create order from chaos

We will approach the goal step by step. To start, let's make the setup as simple as possible. As you know, a typical Selenium installation on Windows looks like this:

The setup consists of a Selenium server running on the Java runtime (JRE), then an IEDriverServer or EdgeDriver executable and, finally, the browser itself: IE or Edge. There is at least one weak link in this chain: the Selenium server and Java. Selenium acts here as a simple proxy server that starts the driver process on a random port and then forwards all requests to that port. Proxying network traffic is one of the simplest tasks in any programming language, because the main work is done by the operating system's network subsystem. That is why installing Java (50 MB or more) and downloading the Selenium server (20 MB or more) just to proxy requests is rather cumbersome. Moreover, the Selenium server does not work well under load:

It consumes too much memory and sometimes even leaks it.
Proxying is performed "manually": for each request, a new instance of the HTTP client is created and the incoming request is copied into it. This approach is very inefficient and in some cases causes strange timeouts during proxying.
We can significantly improve the situation by simply replacing the heavy Selenium server with the lightweight Selenoid.
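
To show how little the proxy role described above actually involves, here is a minimal Python sketch of the bookkeeping: pick a free port, build the driver command line, and map an incoming request path to the driver's address. The function names and the `/wd/hub` prefix handling are illustrative assumptions, not code taken from Selenium or Selenoid; `--port` is the command-line option IEDriverServer accepts for choosing its port.

```python
import socket
from typing import List


def free_port() -> int:
    """Ask the OS for a currently unused TCP port on the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS choose
        return sock.getsockname()[1]


def driver_command(driver_path: str, port: int) -> List[str]:
    """Command line that starts the driver executable on the chosen port."""
    return [driver_path, f"--port={port}"]


def upstream_url(port: int, request_path: str, prefix: str = "/wd/hub") -> str:
    """Translate a path received by the proxy into the driver's own URL.

    Assumption: the hub prefix is stripped before forwarding.
    """
    if request_path.startswith(prefix):
        request_path = request_path[len(prefix):]
    return f"http://127.0.0.1:{port}{request_path}"


if __name__ == "__main__":
    port = free_port()
    print(driver_command(r"C:\IEDriverServer.exe", port))
    print(upstream_url(port, "/wd/hub/session"))
```

Everything else in a proxy of this kind is copying bytes between two sockets, which is why a small native binary is enough for the job.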

How to replace Selenium server with Selenoid

Selenoid is a lightweight replacement for the Selenium server, written in Go. Selenoid comes as one small (about 7 MB) executable file and has no external dependencies. To start using it, you just need to download and run this file. In my previous article I briefly described how convenient Selenoid is for launching browsers in Docker containers, which is its main purpose. The second supported mode is launching executable files instead of containers and proxying network traffic to them, just as the Selenium server does with IEDriverServer and EdgeDriver. Replacing the Selenium server with Selenoid is very simple. As an example, let's start Internet Explorer using Selenoid:

Download the Selenoid executable file from the releases page. The executable file is usually called selenoid_windows_386.exe for 32-bit Windows and selenoid_windows_amd64.exe for 64-bit Windows. As far as I know, desktop versions of Windows do not have a built-in console program for downloading files. But if you have Cygwin and curl installed, you can download the file like this:
```
 # -L follows the redirect that GitHub issues for release downloads
 $ curl -L -o selenoid.exe https://github.com/aerokube/selenoid/releases/download/1.2.1/selenoid_windows_386.exe
```
Download and unpack the archive with IEDriverServer.exe from the Selenium download page. For example, save IEDriverServer.exe in C:\.
Configure Internet Explorer as described on the wiki.

Create a simple configuration file for Selenoid, browsers.json:

    {
        "internet explorer": {
            "default": "11",
            "versions": {
                "11": {
                    "image": ["C:\\IEDriverServer.exe"]
                }
            }
        }
    }

Run Selenoid instead of the Selenium server (port 4444 must be free), for example from a selenoid.bat file like this:
```
 C:\selenoid.exe -conf C:\browsers.json -disable-docker -limit 4 > C:\selenoid.log 2>&1 
```
Here we assume that all the files from the previous steps have been saved in C:\. Selenoid logs will be saved in C:\selenoid.log. Pay attention to the -limit parameter: it determines how many sessions can run simultaneously. Once that many sessions are running, new requests are queued, just as in the Selenium server.
Done! You can now run tests against the same URL as before:
```
 http://localhost:4444/wd/hub 
```
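
As a quick smoke test, a session can be requested with nothing but the Python standard library. The sketch below assumes Selenoid is listening on localhost:4444 as configured above and speaks the usual WebDriver protocol (a POST to `/session` under the hub URL). The helper names are made up for illustration, and a real test suite would normally use a Selenium client library instead.

```python
import json
import urllib.request

HUB_URL = "http://localhost:4444/wd/hub"


def new_session_request(browser_name, version="", hub_url=HUB_URL):
    """Build (but do not send) the HTTP request that opens a browser session."""
    legacy_caps = {"browserName": browser_name}
    w3c_caps = {"browserName": browser_name}
    if version:
        legacy_caps["version"] = version
        w3c_caps["browserVersion"] = version
    # Both the legacy and the W3C capability formats are included,
    # so that the server can pick whichever one it understands.
    body = json.dumps({
        "desiredCapabilities": legacy_caps,
        "capabilities": {"alwaysMatch": w3c_caps},
    }).encode("utf-8")
    return urllib.request.Request(
        hub_url + "/session",
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def session_id(response):
    """Extract the session ID from either response format."""
    value = response.get("value")
    nested = value.get("sessionId") if isinstance(value, dict) else None
    return response.get("sessionId") or nested


if __name__ == "__main__":
    request = new_session_request("internet explorer", "11")
    with urllib.request.urlopen(request, timeout=120) as resp:
        sid = session_id(json.loads(resp.read().decode("utf-8")))
    print("session:", sid)
    # Close the session so that it stops counting against -limit.
    delete = urllib.request.Request(f"{HUB_URL}/session/{sid}", method="DELETE")
    urllib.request.urlopen(delete, timeout=60).close()
```

If the script prints a session ID and an IE window appears briefly, the chain from Selenoid through the driver to the browser is working.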
To remain lightweight, Selenoid does not have a built-in graphical interface. The UI ships as a separate executable file: Selenoid UI. Just download the compiled file from the releases page and run it, then open http://localhost:8080/ in the browser.

Run tests on multiple desktops

After replacing the Selenium server with Selenoid, you will see a significant reduction in memory and CPU consumption. This simple step may even allow you to launch more browsers in parallel. However, the replacement alone does not cure the problems with opening multiple browser windows at the same time. The windows still appear on the same desktop and continue to lose focus. To get around this obstacle, we need to run browsers on separate desktops. The good news is that the internal Windows APIs, even in desktop versions, support virtual desktops: you can switch between desktops and launch windows on them independently of each other. But there is even better news: you do not need to dive into the internals of Windows to get this behavior for Selenium, because the necessary functionality is already implemented in the headless-selenium-for-win project. After downloading the release, you will get an archive with two executable files: desktop_utils.exe and headless_ie_selenium.exe.

The first one is a console utility for manual switching between desktops. The command looks like this:

 C:\> desktop_utils.exe -s desktop1

To work with Selenium, we need the second utility, headless_ie_selenium.exe. It is a wrapper around IEDriverServer.exe that handles session requests and automatically launches IEDriverServer.exe on a new desktop. headless_ie_selenium.exe should be in the same directory as IEDriverServer.exe. To use the utility with Selenoid, you just need to replace the path to the executable file in browsers.json and restart Selenoid:

    {
        "internet explorer": {
            "default": "11",
            "versions": {
                "11": {
                    "image": ["C:\\headless_ie_selenium.exe"]
                }
            }
        }
    }

Now all the window focus problems should go away.
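
Because the two configurations differ only in the executable path, the file can be generated rather than edited by hand. The following Python sketch builds the same structure as the browsers.json shown in this article. The function names and defaults are illustrative assumptions, and only the keys that appear in the article are used.

```python
import json


def browsers_config(executable, browser="internet explorer", version="11"):
    """Return the browsers.json structure for a single browser version."""
    return {
        browser: {
            "default": version,
            "versions": {version: {"image": [executable]}},
        }
    }


def write_config(path, executable):
    """Write the configuration to disk; json.dump escapes the backslashes."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(browsers_config(executable), handle, indent=4)


if __name__ == "__main__":
    # Print the headless variant; swap the path to go back to plain IEDriverServer.
    print(json.dumps(browsers_config(r"C:\headless_ie_selenium.exe"), indent=4))
```

Selenoid has to be restarted after the file changes, as noted above.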

A bit of magic with Selenium capabilities

By simply replacing Selenium with Selenoid and IEDriverServer.exe with headless_ie_selenium.exe, we solved the most acute problems of Selenium under Windows. Let's polish the result further by setting some useful capabilities in the tests.

By default, Internet Explorer uses the system HTTP proxy settings. As a result, proxy settings applied in one session leak into other sessions. To fix this, set:
```
 ie.usePerProcessProxy = true 
```
Your web application may use cookies to store important information. In Windows, cookies are stored separately for each user, and by default cookies that have been set are shared between parallel sessions. This can lead to flaky tests. To avoid reusing cookies, you can start IE in private mode:
```
 ie.browserCommandLineSwitches = "-private" 
```
Also do not forget to set:
```
 ie.ensureCleanSession = true 
```
To avoid strange window focus errors, also make sure that the following capability is either not set at all or set to false:
```
 requireWindowFocus = false 
```
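
Put together, the capabilities from this section can be collected in one place. The Python sketch below only builds a plain dictionary using the capability names given above. The function name is an illustrative assumption, and how the dictionary is passed to the driver depends on the Selenium client library and version in use. One addition that is not in the text above: according to the IE driver documentation, command-line switches take effect only when `ie.forceCreateProcessApi` is enabled, so the sketch sets it together with `-private`.

```python
def ie_capabilities(private_mode=True):
    """Capabilities for isolated, parallel IE sessions (sketch)."""
    caps = {
        "browserName": "internet explorer",
        # Keep proxy settings local to this browser process.
        "ie.usePerProcessProxy": True,
        # Clear cache, cookies and history before the session starts.
        "ie.ensureCleanSession": True,
        # Do not require the IE window to hold the focus.
        "requireWindowFocus": False,
    }
    if private_mode:
        # Assumption based on the IE driver documentation: switches are
        # honoured only when the CreateProcess API is used to launch IE.
        caps["ie.forceCreateProcessApi"] = True
        caps["ie.browserCommandLineSwitches"] = "-private"
    return caps


if __name__ == "__main__":
    for name, value in sorted(ie_capabilities().items()):
        print(f"{name} = {value!r}")
```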

Conclusion

In this article I briefly described the main problems that you may encounter when running Selenium tests under Windows and offered a simple solution. I keep saying it: Selenium tests do not have to be painful. You just need to know how to cook them.

Source: https://habr.com/ru/post/329256/
