
Automating the desktop GUI in Python + pywinauto: how to make friends with MS UI Automation

The pywinauto Python library is an open source project for automating desktop GUI applications on Windows. Over the past two years it has gained several large new features.



We will also briefly survey the open source tools for desktop automation (with no claim to being a serious comparison).


This article is partly a transcript of a talk from the SQA Days 20 conference in Minsk (video and slides), and partly a Russian version of the Getting Started Guide for pywinauto.




Let's start with a brief overview of open source tools in this area. For desktop GUI applications, things are somewhat more complicated than for the web, which has Selenium. Here are the main approaches:


Coordinate method


Hardcode the click points and hope they hit the target.
[+] Cross-platform, easy to implement.
[+] Easy to build "record-replay" tests.
[-] The least stable approach: it breaks when the screen resolution, theme, fonts, window sizes, etc. change.
[-] Requires a huge maintenance effort; it is often easier to regenerate the tests from scratch or to test manually.
[-] Automates actions only; verifying results and extracting data require other methods.


Tools (cross-platform): autopy, PyAutoGUI, PyUserInput and many others. As a rule, more complex tools include this functionality as well (not always cross-platform).


It is worth noting that the coordinate method can complement the other approaches. For example, for custom-drawn graphics you can click at relative coordinates (measured from the upper left corner of the window or element rather than of the whole screen). This is usually quite reliable, especially if you take the width and height of the whole element into account (then a different screen resolution does no harm).
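The idea of relative coordinates can be sketched in a few lines. This is a minimal illustration, not code from any of the tools listed; `Rect` and `relative_point` are hypothetical names, and the rectangle is assumed to come from whatever tool reports the element's position on screen.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    """Screen rectangle of a window or element, in pixels."""
    left: int
    top: int
    right: int
    bottom: int


def relative_point(rect: Rect, fx: float, fy: float) -> Tuple[int, int]:
    """Map fractions (0.0 to 1.0) of the element's width and height to screen coordinates."""
    if not (0.0 <= fx <= 1.0 and 0.0 <= fy <= 1.0):
        raise ValueError("fractions must be within [0, 1]")
    width = rect.right - rect.left
    height = rect.bottom - rect.top
    return rect.left + round(width * fx), rect.top + round(height * fy)


# The same fractions point at the same spot of the element
# even after the window has been moved or resized.
small = Rect(100, 100, 300, 200)
large = Rect(0, 0, 800, 400)
print(relative_point(small, 0.5, 0.5))  # centre of the small element
print(relative_point(large, 0.5, 0.5))  # centre of the large element
```

The resulting point can then be handed to any coordinate-based click function.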


Another option is to dedicate a single machine with stable settings to testing (not cross-platform, but suitable in some cases).


Recognition of reference images


[+] Cross-platform.
[+ -] Relatively reliable (better than the coordinate method), but still requires some tricks.
[- +] Relatively slow, because the recognition algorithms need CPU resources.
[-] Text recognition (OCR) is usually out of the question => text data cannot be obtained. As far as I know, existing OCR solutions are not very reliable for this kind of task and are not widely used (let me know in the comments if this is no longer the case).


Tools: Sikuli, Lackey (Sikuli-compatible, in pure Python), PyAutoGUI.


Accessibility technology


[+] The most reliable method, because it lets you search by text regardless of how the system or framework draws it.
[+] Lets you extract text data => test results are easier to verify.
[+] As a rule, the fastest, because it consumes almost no CPU resources.
[-] It is hard to build a cross-platform tool: every open-source library supports only one or two accessibility technologies. No tool fully supports Windows, Linux and MacOS at once, apart from paid ones such as TestComplete, UFT or Squish.
[-] The technology is not always available at all. For example, when testing the boot screen inside VirtualBox you cannot do without image recognition. But in many classic cases the accessibility approach is applicable, and it is the subject of the rest of this article.


Tools: TestStack.White in C#, Winium.Desktop in C# (Selenium compatible), MS WinAppDriver in C# (Appium compatible), pywinauto, pyatom (compatible with LDTP), Python-UIAutomation-for-Windows, RAutomation in Ruby, LDTP (Linux Desktop Testing Project) and its Windows version, Cobra.


LDTP is perhaps the only cross-platform open-source tool (more precisely, a family of libraries) based on accessibility technologies. However, it is not very popular. I have not used it myself, but according to reviews its interface is not the most convenient. If you have positive experience with it, please share it in the comments.


Test backdoor (aka reinventing the wheel)


For cross-platform applications, the developers themselves often make an internal mechanism to ensure testability. For example, they create a service TCP server in the application, tests connect to it and send text commands: what to click, where to get the data, etc. Reliable, but not universal.
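A test backdoor of this kind might look roughly like the sketch below. Everything in it is hypothetical: the command names, the `APP_STATE` dictionary that stands in for the real application, and the one-line-per-command protocol are assumptions made for illustration only.

```python
import socket
import socketserver
import threading

# Hypothetical application state that the backdoor exposes to tests.
APP_STATE = {"title": "Untitled", "clicks": 0}


class BackdoorHandler(socketserver.StreamRequestHandler):
    """Reads one text command per line and answers with one line."""

    def handle(self):
        for raw in self.rfile:
            command = raw.decode("utf-8").strip()
            if command == "click ok":
                APP_STATE["clicks"] += 1
                reply = "done"
            elif command == "get title":
                reply = APP_STATE["title"]
            else:
                reply = "error: unknown command"
            self.wfile.write((reply + "\n").encode("utf-8"))


def send_command(address, command):
    """Test-side helper: send one command and return the reply line."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall((command + "\n").encode("utf-8"))
        with conn.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()


# Port 0 asks the OS for any free port; a real application would use a fixed one.
server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), BackdoorHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

print(send_command(server.server_address, "click ok"))
print(send_command(server.server_address, "get title"))

server.shutdown()
server.server_close()
```

The test code never touches the GUI here, which is why the approach is reliable, and also why it only works for applications that were built with such a server inside.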


Key desktop accessibility technologies


Good old Win32 API


Most Windows applications written before WPF, and later the Windows Store, appeared are built on the Win32 API one way or another. MFC, WTL, C++ Builder, Delphi and VB6 all use the Win32 API. Even Windows Forms is largely Win32 API compatible.


Tools: AutoIt (a VB-like language) and its Python wrapper pyautoit, AutoHotkey (its own language, IDispatch COM interface), pywinauto (Python), RAutomation (Ruby), win32-autogui (Ruby).


Microsoft UI Automation


The main advantage: MS UI Automation supports the vast majority of GUI applications on Windows, with rare exceptions. The problem: it is not much easier to learn than the Win32 API. Otherwise, nobody would have written wrappers over it.


In fact, it is a set of custom COM interfaces (mostly in UIAutomationCore.dll) that also has a .NET wrapper in the form of the System.Windows.Automation namespace. That wrapper, by the way, introduces a bug that causes some UI elements to be skipped. Therefore it is better to use UIAutomationCore.dll directly (if you have heard of UiaComWrapper in C#, this is it).


Varieties of COM interfaces:


(1) The basic IUnknown, "the root of all evil". The lowest level, never user-friendly.
(2) IDispatch and its derivatives (for example, Excel.Application), which can be used from Python with the win32com.client package (included in pyWin32). The most convenient and elegant option.
(3) Custom interfaces, which the third-party Python package comtypes can work with.


Tools: TestStack.White in C#, pywinauto 0.6.0+, Winium.Desktop in C#, Python-UIAutomation-for-Windows (the source code of its C-language wrappers over UIAutomationCore.dll is not published), RAutomation in Ruby.


AT-SPI


Although almost all operating systems of the Linux family are built on the X Window System (Fedora 25 replaced X with Wayland), X only lets you work with top-level windows and the mouse / keyboard. For detailed access to buttons, list boxes and so on there is the AT-SPI technology. The most popular window managers have a so-called AT-SPI registry daemon, which makes application GUIs available for automation (at least Qt and GTK are supported).


Tools: pyatspi2 .


pyatspi2, in my opinion, has too many dependencies, such as PyGObject. The technology itself is available as an ordinary dynamic library, libatspi.so. It has a Reference Manual. For pywinauto, we plan to implement AT-SPI support by loading libatspi.so with the ctypes module. The only small problem is picking the correct version, because the versions for GTK+ and Qt applications differ slightly. A release of pywinauto 0.7.0 with full Linux support can probably be expected in the first half of 2018.
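Loading the library through ctypes could start out like the sketch below. It is not pywinauto code: `load_atspi` and `desktop_count` are hypothetical helpers, and the only libatspi entry points used are `atspi_init` and `atspi_get_desktop_count` from its C API.

```python
import ctypes
import ctypes.util


def load_atspi():
    """Return a ctypes handle to libatspi, or None if it is not installed."""
    name = ctypes.util.find_library("atspi")
    if name is None:
        return None
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def desktop_count(lib):
    """Initialise AT-SPI and return the number of desktops it reports."""
    lib.atspi_init.restype = ctypes.c_int
    lib.atspi_get_desktop_count.restype = ctypes.c_int
    lib.atspi_init()
    return lib.atspi_get_desktop_count()


atspi = load_atspi()
if atspi is None:
    print("libatspi is not available on this machine")
else:
    print("libatspi loaded:", atspi)
```

Calling `desktop_count(atspi)` only makes sense inside a desktop session where the accessibility bus is running, which is why the sketch does not call it automatically.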


Apple Accessibility API


MacOS has its own automation language, AppleScript. To implement something similar in Python you need, of course, to call Objective-C functions. Starting with MacOS 10.6, it seems, the pyobjc package is included in the pre-installed Python. This will also shorten the dependency list for future support in pywinauto.


Tools: besides the AppleScript language, take a look at ATOMac, also known as pyatom. Its interface is compatible with LDTP, but it is also a standalone library. There is an example of automating iTunes on MacOS with it, written by my student. There is one known problem: flexible timings (the waitFor* methods) do not work. But on the whole it is not bad.




How to start working with pywinauto


The first step is to arm yourself with a GUI object inspector (a so-called Spy tool). It helps you study the application from the inside: how the element hierarchy is organized and which properties are available. The most famous object inspectors:



Having looked inside the application, choose the backend to use. It is enough to pass the backend name when creating the Application object.



Entry points for automation


The application has been studied well enough. It is time to create an Application object and either start the application or connect to one that is already running. This object is not just a clone of the standard subprocess.Popen class: it is an entry point that confines all your actions to a single process. That is very useful if several instances of the application are running and you do not want to touch the others.


    from pywinauto.application import Application

    app = Application(backend="uia").start('notepad.exe')

    # describe the window inside the Notepad.exe process
    dlg_spec = app.UntitledNotepad
    # wait until the window has really opened
    actionable_dlg = dlg_spec.wait('visible')
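Connecting to an application that is already running could be sketched as follows. `attach_to_notepad` is a hypothetical helper; it assumes pywinauto is installed, that a Notepad process is running, and that its window title ends with " - Notepad". The import sits inside the function because pywinauto works on Windows only.

```python
def attach_to_notepad(timeout=10):
    """Attach to a running Notepad instead of starting a new one (Windows only)."""
    # Imported lazily: pywinauto cannot be imported on other platforms.
    from pywinauto.application import Application

    # connect() looks up an existing process by the name of its executable
    app = Application(backend="uia").connect(path="notepad.exe", timeout=timeout)
    # return a specification; the actual window search happens later
    return app.window(title_re=".* - Notepad$")
```

On a Windows machine with Notepad open, `attach_to_notepad().wait('visible')` would then play the same role as the `dlg_spec.wait('visible')` call in the article's example.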

If you want to control several applications at once, the Desktop class will help. For example, in the Win10 calculator the element hierarchy is spread over several processes (not just calc.exe), so you cannot do without a Desktop object.


    from subprocess import Popen
    from pywinauto import Desktop

    Popen('calc.exe', shell=True)
    dlg = Desktop(backend="uia").Calculator
    dlg.wait('visible')

The root object (Application or Desktop) is the only place where you need to specify the backend. Everything else fits transparently into the "specification -> wrapper" concept, which is described next.


Specifications of windows / elements


This is the basic concept on which the pywinauto interface is built. You can describe a window / element approximately or in more detail, even if it does not exist yet or is already closed. The window specification (a WindowSpecification object) stores the criteria by which the real window or element will be searched for.


An example of a detailed window specification:


    >>> dlg_spec = app.window(title='Untitled - Notepad')

    >>> dlg_spec
    <pywinauto.application.WindowSpecification object at 0x0568B790>

    >>> dlg_spec.wrapper_object()
    <pywinauto.controls.win32_controls.DialogWrapper object at 0x05639B70>

The window search itself is performed by the .wrapper_object() method. It returns a "wrapper" for the real window / element or raises ElementNotFoundError (or sometimes ElementAmbiguousError, if several elements were found and the search criteria need to be refined). The wrapper already knows how to perform actions on the element or get data from it.


Python can hide the .wrapper_object() call, so the final code gets shorter. We recommend calling it explicitly only for debugging. The following two lines do exactly the same thing:


    dlg_spec.wrapper_object().minimize()  # debugging
    dlg_spec.minimize()  # production
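The "specification -> wrapper" idea can be modelled in pure Python. The sketch below is a simplified toy, not pywinauto's implementation: `ToySpecification` and `ToyWrapper` are hypothetical classes, a plain list of titles stands in for the real desktop, and `__getattr__` is used for brevity.

```python
class ElementNotFoundError(Exception):
    """Raised when no live element matches the stored criteria."""


class ToyWrapper:
    """Stands in for a wrapper around a real, existing window."""

    def __init__(self, title):
        self.title = title

    def minimize(self):
        return "minimized " + self.title


class ToySpecification:
    """Stores search criteria; the search itself is deferred."""

    def __init__(self, live_windows, **criteria):
        self._live_windows = live_windows
        self._criteria = criteria

    def wrapper_object(self):
        # The lookup happens at call time, not when the spec is created.
        for title in self._live_windows:
            if title == self._criteria.get("title"):
                return ToyWrapper(title)
        raise ElementNotFoundError(self._criteria)

    def __getattr__(self, name):
        # Called only for attributes the spec itself lacks:
        # resolve the wrapper and forward the attribute to it.
        return getattr(self.wrapper_object(), name)


windows = []                                  # the window does not exist yet
spec = ToySpecification(windows, title="Untitled - Notepad")
windows.append("Untitled - Notepad")          # now it has appeared
print(spec.wrapper_object().minimize())       # explicit form
print(spec.minimize())                        # same call, lookup hidden
```

Because the specification is only a description, it can be created before the window exists and reused after the window has been closed and reopened.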

There are many search criteria for the window specification. Here are just a few examples:


    # criteria can be multi-level
    app.window(title_re='.* - Notepad$').window(class_name='Edit')

    # criteria can be combined (as a logical AND)
    dlg = Desktop(backend="uia").Calculator
    dlg.window(auto_id='num8Button', control_type='Button')

A list of all possible criteria is in the docs for the pywinauto.findwindows.find_elements(...) function.


Magic access by attribute and by key


Python makes it easy to create window specifications and resolve object attributes dynamically (the __getattribute__ method is overridden). Of course, an attribute name is subject to the same restrictions as any variable name (no spaces, commas or other special characters). Fortunately, pywinauto uses a so-called "best match" search algorithm, which tolerates typos and small variations.


    app.UntitledNotepad
    # is the same as
    app.window(best_match='UntitledNotepad')

If you still need Unicode strings (for example, for the Russian language), spaces, etc., you can make access by key (as if it is a regular dictionary):


    app['Untitled - Notepad']
    # is the same as
    app.window(best_match='Untitled - Notepad')
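To give a feel for why a "best match" lookup tolerates typos, here is a small sketch based on the standard difflib module. It illustrates the general idea only and should not be read as pywinauto's actual algorithm; `KNOWN_NAMES`, `best_match` and the 0.6 cutoff are assumptions made for this example.

```python
import difflib

# Hypothetical reference names collected from the windows of one application.
KNOWN_NAMES = {
    "Untitled - Notepad": "main window",
    "UntitledNotepad": "main window",
    "Open": "open dialog",
    "OpenDialog": "open dialog",
}


def best_match(requested, names=KNOWN_NAMES, cutoff=0.6):
    """Return the object whose reference name is closest to `requested`."""
    candidates = difflib.get_close_matches(requested, list(names), n=1, cutoff=cutoff)
    if not candidates:
        raise LookupError("no name is close enough to %r" % requested)
    return names[candidates[0]]


print(best_match("UntitledNotepad"))  # exact reference name
print(best_match("UntitledNotepd"))   # typo, still resolves to the same window
```

The cutoff keeps the lookup from silently returning something unrelated: a request that resembles none of the known names fails instead.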

Five rules for magic names


How do you find out the reference magic names, that is, the names assigned to each element before the search? If you specify a name similar to a reference one, the element will be found.


  1. By title (text, name): app.Properties.OK.click()
  2. By text and by item type: app.Properties.OKButton.click()
  3. By type and number: app.Properties.Button3.click() (for historical reasons, the names Button0 and Button1 are both bound to the first element found, Button2 to the second, and so on in order)
  4. By static text (left or top) and by type: app.OpenDialog.FileNameEdit.set_text("") (useful for elements with dynamic text)
  5. By type and text inside: app.Properties.TabControlSharing.select("General")
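The first three rules can be imitated with a toy name generator. This is a simplified model written for illustration, not the library's code: `magic_names` is a hypothetical function, and rules 4 and 5 (neighbouring static text and inner text) are left out.

```python
def magic_names(controls):
    """controls: list of (control_type, title) in tree order -> {name: index}."""
    names = {}
    counters = {}
    for index, (ctrl_type, title) in enumerate(controls):
        number = counters.get(ctrl_type, 0) + 1
        counters[ctrl_type] = number
        candidates = [ctrl_type + str(number)]           # rule 3: type and number
        if number == 1:
            # the first element of a type also answers to Type and Type0
            candidates += [ctrl_type, ctrl_type + "0"]
        if title:
            candidates += [title, title + ctrl_type]     # rules 1 and 2
        for name in candidates:
            names.setdefault(name, index)                # the first owner keeps the name
    return names


buttons = [("Button", "Advanced..."), ("Button", "OK"), ("Button", "Cancel")]
for name, index in sorted(magic_names(buttons).items()):
    print(name, "->", buttons[index][1])
```

With this input, Button0 and Button1 both point at "Advanced...", while Button2 and OKButton point at "OK", which mirrors the numbering quirk described in rule 3.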

Usually two or three rules apply at the same time, rarely more. You can use the print_control_identifiers() method to check which names are available for each element. It can print the element tree either to the screen or to a file. For each element, its reference magic names are printed. You can also copy more detailed specifications of child elements from that output. In a script the result will look like this:


 app.Properties.child_window(title="Contains:", auto_id="13087", control_type="Edit") 

The element tree itself is usually a rather long listing.

    >>> app.Properties.print_control_identifiers()

    Control Identifiers:

    Dialog - 'Windows NT Properties' (L688, T518, R1065, B1006)
    [u'Windows NT PropertiesDialog', u'Dialog', u'Windows NT Properties']
    child_window(title="Windows NT Properties", control_type="Window")
       |
       | Image - '' (L717, T589, R749, B622)
       | [u'', u'0', u'Image1', u'Image0', 'Image', u'1']
       | child_window(auto_id="13057", control_type="Image")
       |
       | Image - '' (L717, T630, R1035, B632)
       | ['Image2', u'2']
       | child_window(auto_id="13095", control_type="Image")
       |
       | Edit - 'Folder name:' (L790, T596, R1036, B619)
       | [u'3', 'Edit', u'Edit1', u'Edit0']
       | child_window(title="Folder name:", auto_id="13156", control_type="Edit")
       |
       | Static - 'Type:' (L717, T643, R780, B658)
       | [u'Type:Static', u'Static', u'Static1', u'Static0', u'Type:']
       | child_window(title="Type:", auto_id="13080", control_type="Text")
       |
       | Edit - 'Type:' (L790, T643, R1036, B666)
       | [u'4', 'Edit2', u'Type:Edit']
       | child_window(title="Type:", auto_id="13059", control_type="Edit")
       |
       | Static - 'Location:' (L717, T669, R780, B684)
       | [u'Location:Static', u'Location:', u'Static2']
       | child_window(title="Location:", auto_id="13089", control_type="Text")
       |
       | Edit - 'Location:' (L790, T669, R1036, B692)
       | ['Edit3', u'Location:Edit', u'5']
       | child_window(title="Location:", auto_id="13065", control_type="Edit")
       |
       | Static - 'Size:' (L717, T695, R780, B710)
       | [u'Size:Static', u'Size:', u'Static3']
       | child_window(title="Size:", auto_id="13081", control_type="Text")
       |
       | Edit - 'Size:' (L790, T695, R1036, B718)
       | ['Edit4', u'6', u'Size:Edit']
       | child_window(title="Size:", auto_id="13064", control_type="Edit")
       |
       | Static - 'Size on disk:' (L717, T721, R780, B736)
       | [u'Size on disk:', u'Size on disk:Static', u'Static4']
       | child_window(title="Size on disk:", auto_id="13107", control_type="Text")
       |
       | Edit - 'Size on disk:' (L790, T721, R1036, B744)
       | ['Edit5', u'7', u'Size on disk:Edit']
       | child_window(title="Size on disk:", auto_id="13106", control_type="Edit")
       |
       | Static - 'Contains:' (L717, T747, R780, B762)
       | [u'Contains:1', u'Contains:0', u'Contains:Static', u'Static5', u'Contains:']
       | child_window(title="Contains:", auto_id="13088", control_type="Text")
       |
       | Edit - 'Contains:' (L790, T747, R1036, B770)
       | [u'8', 'Edit6', u'Contains:Edit']
       | child_window(title="Contains:", auto_id="13087", control_type="Edit")
       |
       | Image - 'Contains:' (L717, T773, R1035, B775)
       | [u'Contains:Image', 'Image3', u'Contains:2']
       | child_window(title="Contains:", auto_id="13096", control_type="Image")
       |
       | Static - 'Created:' (L717, T786, R780, B801)
       | [u'Created:', u'Created:Static', u'Static6', u'Created:1', u'Created:0']
       | child_window(title="Created:", auto_id="13092", control_type="Text")
       |
       | Edit - 'Created:' (L790, T786, R1036, B809)
       | [u'Created:Edit', 'Edit7', u'9']
       | child_window(title="Created:", auto_id="13072", control_type="Edit")
       |
       | Image - 'Created:' (L717, T812, R1035, B814)
       | [u'Created:Image', 'Image4', u'Created:2']
       | child_window(title="Created:", auto_id="13097", control_type="Image")
       |
       | Static - 'Attributes:' (L717, T825, R780, B840)
       | [u'Attributes:Static', u'Static7', u'Attributes:']
       | child_window(title="Attributes:", auto_id="13091", control_type="Text")
       |
       | CheckBox - 'Read-only (Only applies to files in folder)' (L790, T825, R1035, B841)
       | [u'CheckBox0', u'CheckBox1', 'CheckBox', u'Read-only (Only applies to files in folder)CheckBox', u'Read-only (Only applies to files in folder)']
       | child_window(title="Read-only (Only applies to files in folder)", auto_id="13075", control_type="CheckBox")
       |
       | CheckBox - 'Hidden' (L790, T848, R865, B864)
       | ['CheckBox2', u'HiddenCheckBox', u'Hidden']
       | child_window(title="Hidden", auto_id="13076", control_type="CheckBox")
       |
       | Button - 'Advanced...' (L930, T845, R1035, B868)
       | [u'Advanced...', u'Advanced...Button', 'Button', u'Button1', u'Button0']
       | child_window(title="Advanced...", auto_id="13154", control_type="Button")
       |
       | Button - 'OK' (L814, T968, R889, B991)
       | ['Button2', u'OK', u'OKButton']
       | child_window(title="OK", auto_id="1", control_type="Button")
       |
       | Button - 'Cancel' (L895, T968, R970, B991)
       | ['Button3', u'CancelButton', u'Cancel']
       | child_window(title="Cancel", auto_id="2", control_type="Button")
       |
       | Button - 'Apply' (L976, T968, R1051, B991)
       | ['Button4', u'ApplyButton', u'Apply']
       | child_window(title="Apply", auto_id="12321", control_type="Button")
       |
       | TabControl - '' (L702, T556, R1051, B962)
       | [u'10', u'TabControlSharing', u'TabControlPrevious Versions', u'TabControlSecurity', u'TabControl', u'TabControlCustomize']
       | child_window(auto_id="12320", control_type="Tab")
       |    |
       |    | TabItem - 'General' (L704, T558, R753, B576)
       |    | [u'GeneralTabItem', 'TabItem', u'General', u'TabItem0', u'TabItem1']
       |    | child_window(title="General", control_type="TabItem")
       |    |
       |    | TabItem - 'Sharing' (L753, T558, R801, B576)
       |    | [u'Sharing', u'SharingTabItem', 'TabItem2']
       |    | child_window(title="Sharing", control_type="TabItem")
       |    |
       |    | TabItem - 'Security' (L801, T558, R851, B576)
       |    | [u'Security', 'TabItem3', u'SecurityTabItem']
       |    | child_window(title="Security", control_type="TabItem")
       |    |
       |    | TabItem - 'Previous Versions' (L851, T558, R947, B576)
       |    | [u'Previous VersionsTabItem', u'Previous Versions', 'TabItem4']
       |    | child_window(title="Previous Versions", control_type="TabItem")
       |    |
       |    | TabItem - 'Customize' (L947, T558, R1007, B576)
       |    | [u'CustomizeTabItem', 'TabItem5', u'Customize']
       |    | child_window(title="Customize", control_type="TabItem")
       |
       | TitleBar - 'None' (L712, T521, R1057, B549)
       | ['TitleBar', u'11']
       |    |
       |    | Menu - 'System' (L696, T526, R718, B548)
       |    | [u'System0', u'System', u'System1', u'Menu', u'SystemMenu']
       |    | child_window(title="System", auto_id="MenuBar", control_type="MenuBar")
       |    |    |
       |    |    | MenuItem - 'System' (L696, T526, R718, B548)
       |    |    | [u'System2', u'MenuItem', u'SystemMenuItem']
       |    |    | child_window(title="System", control_type="MenuItem")
       |    |
       |    | Button - 'Close' (L1024, T519, R1058, B549)
       |    | [u'CloseButton', u'Close', 'Button5']
       |    | child_window(title="Close", control_type="Button")

In some cases printing the entire tree can be slow (for example, iTunes has as many as three thousand elements on a single tab!), but you can limit it with the depth parameter: depth=1 means the element itself, depth=2 adds only its immediate children, and so on. It can also be specified in a specification when creating a child_window.
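A depth-limited dump to a file could be wrapped in a small helper like the one below. `dump_top_levels` and the default file name are hypothetical; the helper only assumes that the object passed in has a `print_control_identifiers` method accepting `depth` and `filename`, as a pywinauto window specification does.

```python
def dump_top_levels(dlg_spec, path="controls.txt"):
    """Write the dialog and its immediate children (depth=2) to a text file."""
    # A shallow dump stays fast even when the full tree has thousands of elements.
    dlg_spec.print_control_identifiers(depth=2, filename=path)
```

For the dialog from the article this would be called as `dump_top_levels(app.Properties)`.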


Examples


We are constantly updating the list of examples in the repository. Among the recent ones, the automation of the WireShark network analyzer is worth noting (it is a good example of a Qt5 application, although this particular task can be solved without a GUI, since there is a scapy.Sniffer in the scapy Python package). There is also an example of automating MS Paint with its Ribbon toolbar.


Another great example, written by my student: dragging a file from explorer.exe onto the Google Drive page in Chrome (it will move to the main repository a little later).


And, of course, an example of subscribing to keyboard (hot keys) and mouse events: hook_and_listen.py.


Thanks


Special thanks to those who constantly help develop the project. For Valentine and me it is an ongoing hobby. Two of my students from UNN recently defended their bachelor's theses on this subject. Alexander made a great contribution to MS UI Automation support and recently started working on an automatic code generator based on the "record-playback" principle that relies on text properties (this is the most complicated feature), so far only for the "uia" backend. Work on Linux AT-SPI support is also under way (the mouse and keyboard modules based on python-xlib are already in the 0.6.x releases).


Code quality is tracked with static analyzers (QuantifiedCode, Codacy, Landscape), and test coverage (measured on AppVeyor) is about 95%.





Questions are welcome on StackOverflow (SO) and in the Gitter chat.


Other open-source GUI automation tools deserve a mention too: Autohotkey and PyAutoGUI (by Al Sweigart, author of "Automate the Boring Stuff with Python").



Source: https://habr.com/ru/post/323962/

