This article is a continuation of the more general article “What is Selenium?”, which explains the position that Selenium WebDriver occupies among other web application automation tools.
Here I will try to explain in more detail what Selenium WebDriver is and why it makes no sense to compare it with TestComplete, QuickTest Pro and other test automation tools. And it is not just that Selenium WebDriver is free and open source: it is just as pointless to compare it with other free tools, such as Sahi or Robot Framework.
Why?
Because Selenium WebDriver is not a test automation tool.
Then what is it?
This question has several different answers. First I will give the short ones, and then the more detailed ones.
In addition, I will explain why Selenium WebDriver has such a sparse and inconvenient interface (command set), why it does not generate beautiful reports, and why, despite all this, it is so popular :)
Just in case, let me note that although this article is about WebDriver, many of the arguments also apply to Selenium RC. I will not say anything specific about that outdated version, though, because its place is in the dustbin of history.
So what is Selenium WebDriver?
By design, Selenium WebDriver is a browser driver, that is, a software library that lets you write programs that control the behavior of the browser.
In essence, Selenium WebDriver is:
- a specification of the programming interface for browser control,
- reference implementations of this interface for several browsers,
- a set of client libraries for this interface in several programming languages.
Is it clear now why it is pointless to compare Selenium WebDriver with “other testing tools”? Not yet? Then let's add some details.
Selenium WebDriver is a browser driver
Surely everyone who has dealt with computers, even someone who is not an IT specialist, knows the word “driver”. It is a small program, or more precisely a software library, that allows other programs to interact with some device. A printer driver lets you print on a printer. A disk driver lets you read and write data. A network card driver lets you exchange data with other computers over the network.
Users do not work with a driver directly. They work with application programs that interact with devices through drivers. A driver has no user interface. Wait, but sometimes there is a user interface for configuring a driver? There is. But that is the interface of the driver configuration program, not of the driver itself. A driver has only a programming interface; its purpose is to let application programs interact with the device.
So Selenium WebDriver, or simply WebDriver, is a browser driver: a software library without a user interface that allows other programs to interact with the browser, control its behavior, receive data from it and make it execute commands.
Based on this definition, it is clear that WebDriver is not directly related to testing. It only gives automated tests access to the browser. That is where its job ends.
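To make this division of labor concrete, here is a minimal sketch in Python. The `FakeDriver` class is a hypothetical in-memory stand-in, not Selenium itself; its `get`, `title` and `quit` names merely mirror what the Python client library offers, so that the example runs without a browser. Everything that looks like "testing" (grouping, setup, assertions, the report) comes from the standard `unittest` framework, not from the driver.

```python
import unittest


class FakeDriver:
    """Hypothetical stand-in for a browser driver; it only 'controls a browser'."""

    def __init__(self, pages):
        self._pages = pages  # maps URL -> page title
        self.title = ""

    def get(self, url):
        # The driver's whole job: make the browser do something.
        self.title = self._pages[url]

    def quit(self):
        self.title = ""


class HomePageTest(unittest.TestCase):
    # Structure, setup/teardown, assertions and reporting belong to the framework.
    def setUp(self):
        self.driver = FakeDriver({"http://example.test/": "Home"})

    def tearDown(self):
        self.driver.quit()

    def test_title(self):
        self.driver.get("http://example.test/")
        self.assertEqual(self.driver.title, "Home")
```

Replacing `FakeDriver` with a real driver object should not change the test class at all, which is exactly the point: the driver knows nothing about tests.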
Structuring, grouping and running tests, and generating test reports, are the job of a testing framework, such as JUnit or TestNG for Java, NUnit or Gallio for .NET, RSpec or Cucumber for Ruby, and so on. Tests are developed in Eclipse, IntelliJ IDEA, Visual Studio, RubyMine, and so on. Builds are handled by Maven, Gradle, Ant, NAnt, Rake, and so on. Running tests on a schedule and publishing reports is done by a continuous integration server: Jenkins, CruiseControl, Bamboo, TeamCity, and so on. All of these are independent tools that are not part of the Selenium project.
However, the Selenium project develops not only the driver but also several related products. Selenium Server lets you launch a browser remotely, and Selenium Grid lets you build a cluster of Selenium servers. They belong alongside the tools and frameworks listed above, because they too take part in building a test execution system. In addition, there is a "recorder" called Selenium IDE, which can record user actions and generate code that replays them through the WebDriver interface.
But the core of the Selenium project is WebDriver; it is the key element of the Selenium ecosystem.
Are there other drivers? Of course.
Every commercial “integrated” tool contains browser drivers, but they usually cannot be used separately, outside that tool. There are also free open-source drivers: Watir provides access to the main browsers, WatiN has a good driver for Internet Explorer, and Sahi can work with the “big five” browsers.
How to compare Selenium WebDriver with other tools?
From all of the above, we can conclude that comparing WebDriver with some kind of testing tool like TestComplete or Sahi is pointless. They are in different weight categories. It's like comparing a printer driver with a text editor.
So what can you compare it with?
You can compare WebDriver with the drivers that are built into various tools. For example, you can compare:
- which browsers and browser versions are supported, including mobile ones,
- which operating systems are supported, including mobile ones,
- whether you can control several browsers on one machine at the same time without conflicts,
- whether you can control a browser on a remote machine,
- what actions can be performed in the browser,
- what data can be obtained from the browser,
- how accurately the driver emulates user actions, that is, whether it generates all the same browser events that occur when a real user works,
- whether you can work with dialog boxes (alert, prompt),
- whether you can work with “native” windows (the file upload dialog),
- whether you can work with HTTPS and certificates,
- and so on.
And here WebDriver is the undisputed leader. However, an actual comparison of WebDriver with anything else is beyond the scope of this article.
As for a comparison with “integrated” tools like TestComplete or Sahi, for that you need to take a full stack, not WebDriver alone.
For example, a stack for Java might be: Jenkins + Maven + Thucydides + JUnit + WebDriver. Add to this all the features of the Java programming language, plus a lot of plug-ins for Maven and Jenkins, and, for good measure, you can run tests in the cloud using a service such as SauceLabs.
That is when the comparison becomes interesting. But this is not WebDriver's merit alone; the whole stack matters, not just the browser driver. As for WebDriver itself, it is worth noting that it fits well into almost any stack, which is one of its advantages as an “independent” driver.
Of course, WebDriver can be used for more than testing. It does not care who wants to control the browser or why. You can automate routine tasks. You can make bots that flood forums. You can write a script that automatically takes screenshots for documentation. Anything at all. The driver does not care; it only provides access to the browser.
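As an illustration of such non-testing use, here is a small sketch of the "screenshots for documentation" script. It assumes only that the driver object has `get(url)` and `save_screenshot(path)` methods, which is how the Selenium Python client names them; the function name `capture_pages` and the file naming scheme are my own inventions for this example.

```python
from pathlib import Path


def capture_pages(driver, urls, out_dir):
    """Open each URL in turn and save a screenshot; return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls, start=1):
        driver.get(url)  # navigate the browser
        target = out / f"page-{index:02d}.png"  # hypothetical naming scheme
        driver.save_screenshot(str(target))  # ask the browser for a PNG
        saved.append(target)
    return saved
```

With Selenium installed, you would presumably pass in a real driver object (for example, one created for Firefox) and call `quit()` on it afterwards. Note that nothing in the script is a test.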
In addition, whatever tool you use, it is quite possible that WebDriver can be plugged into it, since WebDriver has implementations in many languages: Java, C#, Ruby, Python. Then, on top of all the features of your favorite tool, you get all the advantages of WebDriver. It is worth the effort, because at the moment it is the best of the drivers.
Well, yes, I have already repeated several times that “it is the best”, yet I have not actually compared it with other drivers. And I will not, because there is an argument that in the long run matters more than any comparison.
Selenium WebDriver is a browser control interface specification
The most important difference between WebDriver and all other drivers is that it is the “standard” driver, while all the others are “non-standard”.
And this is not just a figure of speech.
The W3C has in fact taken WebDriver as the basis for developing a standard browser control interface. The draft is currently under public review.
In about a year and a half, the standard is expected to be approved. After that, implementing the WebDriver interface will be up to the browser vendors, and WebDriver as an independent driver may eventually disappear altogether, because it will be built directly into the browsers.
Thus, you could say that Selenium WebDriver is not a tool at all but a specification, a document, a standard that describes what interface browsers must expose so that they can be controlled through it.
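To show what "an interface that browsers expose" means in practice, here is a sketch that builds (but does not send) the HTTP requests for a few commands. The endpoint paths follow the W3C WebDriver specification as it was eventually published, so they may differ from the draft discussed here; the function name `build_request` and its command names are hypothetical and exist only for this example.

```python
import json


def build_request(session_id, command, **params):
    """Return (HTTP method, path, JSON body or None) for a few WebDriver commands."""
    base = f"/session/{session_id}"
    if command == "navigate":
        return "POST", f"{base}/url", json.dumps({"url": params["url"]})
    if command == "get_title":
        return "GET", f"{base}/title", None
    if command == "find_element":
        # "using" names a location strategy, e.g. "css selector" or "xpath".
        body = {"using": params["using"], "value": params["value"]}
        return "POST", f"{base}/element", json.dumps(body)
    if command == "click":
        path = f"{base}/element/{params['element_id']}/click"
        return "POST", path, json.dumps({})
    raise ValueError(f"unknown command: {command}")
```

Any program in any language that can send such requests can control a conforming browser, which is why the same interface can have so many client libraries.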
While the standard is being discussed, browser vendors are already at work. Several reference implementations for different browsers were developed within the Selenium project, but this work is gradually being handed over to the browser vendors. The driver for Chrome is developed as part of the Chromium project, by the same team that develops the browser itself. The driver for Opera is developed by Opera Software. The driver for Firefox is still being developed by Selenium project members, but inside Mozilla a replacement is being prepared under the code name Marionette. This new Firefox driver is already available in developer builds of the browser. Internet Explorer and Safari are next in line; employees of the respective companies have not yet joined the development of those drivers, but there is some progress in this direction, because a standard (even a future one) obliges.
In general, we can say that Selenium is the only browser automation project in which the companies that develop browsers participate directly. This is one of the key reasons for its success.
And what will happen once the standard has been implemented in all browsers?
It would be logical to expect that vendors of testing tools will not reinvent the wheel but will control the browser through the standard interface. In other words, all tools will use WebDriver to interact with the browser. But it will be Selenium WebDriver as an interface specification, not Selenium WebDriver as an independent driver.
So why does it have such a primitive interface?
Simply because WebDriver is:
- a browser driver, that is, a fairly low-level library,
- a standard for the browser control interface, that is, the minimal set of commands that must be implemented in every browser.
From the start, Selenium WebDriver was developed with one goal in mind: not to include anything superfluous. A standard browser control interface should be simple and stable.
The command set was reduced step by step; “convenience” commands such as check and uncheck (for checkboxes) and select (for drop-down lists) were thrown out. They all boil down to the simpler click command and are therefore redundant. Only one redundant command is left in the WebDriver interface now, submit, and it too may be removed someday.
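Here is a sketch of why check and uncheck are redundant. `FakeCheckbox` is a hypothetical stand-in for a checkbox element; its `click()` and `is_selected()` methods are named after the ones in the Selenium Python client. The two helper functions need nothing beyond those primitives.

```python
class FakeCheckbox:
    """Hypothetical stand-in for a checkbox element in a page."""

    def __init__(self, selected=False):
        self._selected = selected

    def click(self):
        # Clicking a checkbox toggles its state.
        self._selected = not self._selected

    def is_selected(self):
        return self._selected


def check(element):
    """Make sure the checkbox is ticked, clicking only if necessary."""
    if not element.is_selected():
        element.click()


def uncheck(element):
    """Make sure the checkbox is cleared, clicking only if necessary."""
    if element.is_selected():
        element.click()
```

Both helpers are idempotent: calling `check` twice leaves the box ticked, which is the only thing the removed commands added over a bare click.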
In addition, the interface was designed so that it could be described in IDL (as is done in the W3C standard) and implemented in various programming languages. Hence a minimum of language idioms, a minimum of "hidden" variables, and a deliberately blunt, straightforward interface.
But thanks to this primitive interface, client libraries for WebDriver now exist in Java, C#, Ruby, Python, JavaScript, PHP, Perl, and even Haskell!
And thanks to the same simplicity, WebDriver integrates seamlessly with any other tools and fits into any stack. This is the secret of its popularity and rapid spread - it does not try to “conquer” other tools; instead, it integrates with them.
But what about usability?
This problem should be solved by extensions built on top of Selenium WebDriver. They should provide an extended set of commands, implementing them through the primitive WebDriver interface. The Selenium distribution includes a Select class for working with drop-down lists, which is a good demonstration of how such extensions should be built.
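The idea behind such an extension can be sketched in a few lines. This is not the Select class from the Selenium distribution but a simplified, hypothetical `SimpleSelect` of my own, run against fake elements so that it works without a browser. It assumes elements offer `find_elements(by, value)`, `text`, `click()` and `is_selected()`, named as in the Python client.

```python
class FakeOption:
    """Hypothetical stand-in for an <option> element."""

    def __init__(self, text):
        self.text = text
        self._selected = False

    def click(self):
        self._selected = True

    def is_selected(self):
        return self._selected


class FakeSelectElement:
    """Hypothetical stand-in for a <select> element holding options."""

    def __init__(self, options):
        self._options = options

    def find_elements(self, by, value):
        if (by, value) == ("tag name", "option"):
            return list(self._options)
        return []


class SimpleSelect:
    """Convenience wrapper built only from find_elements, text and click."""

    def __init__(self, element):
        self._element = element

    def select_by_visible_text(self, text):
        for option in self._element.find_elements("tag name", "option"):
            if option.text == text:
                if not option.is_selected():
                    option.click()
                return
        raise LookupError(f"no option with text {text!r}")
```

The fake does not model a real browser deselecting the other options of a single-choice list; the point is only that the convenience command lives outside the driver and is expressed through its primitives.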
Libraries that are built on top of Selenium WebDriver and provide a higher level of abstraction are gradually appearing: Selenide, fluent-selenium, watir-webdriver, Thucydides. Popular test design frameworks allow you to use WebDriver alongside other drivers; among them are Robot Framework, Capybara and, again, Thucydides.
Sooner or later, helper libraries should appear that make it easier to work with various widget sets: jQuery, Prototype, ExtJS, GWT, and others.
The number of such extensions and tools will grow, and so will their complexity. So it may soon happen that you run tests with some tool without even knowing that you are interacting with the browser through Selenium WebDriver.
Is it then worth learning Selenium?
Might it be better to learn these higher-level libraries and tools instead?
To answer this question, I will put it differently: who should learn Selenium and why, and who should use higher-level libraries and tools?
- Whatever tool you use, you need to choose a driver that controls the browser. To choose one, you need to know the driver's capabilities: what it can and cannot do. At this level, every automation specialist needs to know Selenium. Unless you work with it directly, though, there is no need to study the specific WebDriver interface.
- A simple command set is easier to learn than an “advanced” one, so Selenium is easier to master than its extensions. There is a flip side to this: if you have learned an extended command set, it will turn out that you have also learned the WebDriver commands along the way.
- Extensions are usually language-dependent, because added convenience relies on language idioms and the typical ways of organizing code in a particular programming language. The basic WebDriver interface is simple, so once you have mastered it you can use it in any language, and it will look almost the same.
- Most libraries that aim to make the interface more convenient improve the element lookup facilities: additional locator types, a more convenient way of describing locators, and so on. The primitives that correspond to user actions are already quite good in WebDriver. Of course, libraries will also implement typical "combos", that is, sequences of these actions, the same way it is done in the Select class for drop-down lists.
- If you describe tests with “keywords” (as in Robot Framework) or with a domain-specific language (DSL), you do not need to know the WebDriver primitives. But if you implement “fixtures” for tests, describe the actions that test tables can refer to, or implement the DSL itself, you will have to work directly with WebDriver or with one of its extensions, though not a very high-level one.
- And the very last argument, which I hope will become less and less relevant over time: alas, good extensions are still sorely lacking. They will definitely appear. Maybe it is you who will implement one of them. To do that, you need to study the WebDriver interface. And those who enjoy the fruits of your work will be able to work with a higher-level library. In the meantime, you have to use WebDriver directly with small add-ons on top of it.
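The split between those who write tests in keywords and those who implement the keywords can be sketched like this. Everything here is hypothetical: `RecordingBrowser` is a fake with three made-up primitives (`open`, `type`, `click`) that just logs calls, and the keyword names and locators are invented for the example; none of it is Robot Framework or Selenium API.

```python
class RecordingBrowser:
    """Hypothetical fake with low-level primitives; it only records calls."""

    def __init__(self):
        self.log = []

    def open(self, url):
        self.log.append(("open", url))

    def type(self, locator, text):
        self.log.append(("type", locator, text))

    def click(self, locator):
        self.log.append(("click", locator))


def log_in(browser, user, password):
    # The keyword implementer works with the primitives directly.
    browser.type("#user", user)
    browser.type("#password", password)
    browser.click("#submit")


KEYWORDS = {
    "open page": lambda browser, url: browser.open(url),
    "log in as": log_in,
}


def run_table(browser, rows):
    """Execute a test written as rows of (keyword, arguments...)."""
    for keyword, *args in rows:
        KEYWORDS[keyword.lower()](browser, *args)
```

A test author only writes rows such as `("Log In As", "alice", "secret")` and never sees a locator; whoever maintains `KEYWORDS` is the one who needs to know the driver-level commands.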
I hope all of the above helps you better understand where Selenium WebDriver fits in the overall picture and how it relates to other tools. If anything is still unclear, ask questions in the comments and I will try to clarify.