Using Microsoft Active Accessibility technology to access browser content

Let's work out a solution to a seemingly simple task.
Given: a browser (IE, Chrome or Firefox) that the user has already launched.
Required: write a program that obtains the URL currently shown in the address bar.

First, let's go through the approaches that will NOT work:

1. FindWindow + GetWindowText

Why it won't work

The first idea is to find the browser window, locate the child window of the address bar inside it, and read the URL from there. In practice, only IE has a separate child window for the address bar. FF and Chrome are cross-platform, so they prefer to draw all their content themselves.

2. A browser extension that passes the URL to our program (for example, through a request to localhost)

Why it won't work

Actually, it can. But first, three browsers mean three different extensions to write, and second, for FF and Chrome we would be forced to distribute them only through their extension stores. Writing a program whose fate depends on a store moderator's mood that day? No, thank you.

3. Let's write a sniffer and see what the user has opened

Why it won't work

Suppose we do. What next? Even if we isolate the browser's traffic and decode the HTTP protocol from the stream, we still won't know the current URL (the stream will contain many links). Besides, HTTPS connections, HTTP/2, links to locally opened files, links to internal pages (like chrome://settings) and so on are out of reach entirely.

4. Let's use the Remote Debugging Protocol or something like Selenium

Why it won't work

It violates the conditions of the original task: the browser is already running, so we cannot launch a new instance under our control; we have to interact with the existing one.

5. Maybe hooks?

Why it won't work

Fine, we can inject ourselves into the browser process. But what do we hook? For IE it is clear: SetWindowText for the address bar window (but there the simpler method #1 already works). FF and Chrome have no clearly defined objects or interfaces to hook into. Something can be done for a specific browser version, but there is no universal solution.

6. Take a screenshot of the browser window, locate the address bar, and recognize the text in the image!

Why it won't work

This is starting to look like desperation, isn't it? Consider all the OS color schemes, resolutions and scaling factors, browser plug-ins, browser themes, non-standard element layouts, right-to-left locales, and finally the case where the address bar is too narrow to show the whole URL.

7. Your option

Write in the comments what other solutions come to mind, and we will think about whether they would work.
And now one of the correct answers: we will use Microsoft Active Accessibility, a technology that is old but very stable and supported by all browsers on every Windows version from Win95 to Win10. It lets us not only get the current URL (in the same way for all browsers) but access all browser content in general: from the parent window with its title, menu, toolbar and tabs down to the contents of the open web page, to the very last element.

Introduction

Microsoft Active Accessibility (MSAA) was introduced back in 1997 and made it possible to write screen magnifiers, screen readers and other programs that make computers easier to use for people with disabilities (impaired vision, hearing, etc.). IE has supported the technology for a long time; support in FF and Chrome was added somewhat later. With the release of Vista an improved successor appeared, the Windows Automation API, but good old MSAA has not gone away and works fine with the latest OS versions and browsers.

Code

There is nothing difficult in the code. The entry point is the browser's top-level window, which can be found by its window class name:

    FindWindow(L"IEFrame", NULL);            // IE
    FindWindow(L"MozillaWindowClass", NULL); // Firefox
    FindWindow(L"Chrome_WidgetWin_1", NULL); // Chrome

Chrome has a caveat here; see the Chromium accessibility documentation (http://www.chromium.org/developers/design-documents/accessibility). The full example below does not rely on FindWindow for Chrome: it enumerates the top-level windows and picks one with this class name and a non-empty title.

Next you need to get a pointer to the IAccessible COM interface from this window.

 ::AccessibleObjectFromWindow(hWndChrome, OBJID_CLIENT, IID_IAccessible, (void**)(&pAccMain));

And don't forget to:

Include the header file: #include "oleacc.h"
Link against Oleacc.lib
Initialize COM by calling ::CoInitialize(NULL);
This last step is very important! Without it some things may appear to work, but at unexpected moments you will get strange errors. It is also possible that there are no errors at all and you simply don't receive some of the data. All in all, a very nasty class of bug that is nearly impossible to debug.

So, we have a pointer to IAccessible. What is it? It is the root node of a tree describing the entire browser: window, title, menu, toolbars, address bar, page content, status bar. How can we see all this visually? Nothing could be easier! Microsoft provides the inspect.exe utility for this (it ships with the Windows SDK; on my machine it is in the C:\Program Files (x86)\Windows Kits\8.0\bin\x64 folder). The Chromium developers recommend the aViewer utility.

Let's see what the trees of accessible elements look like in each browser:

[Figures: accessible element trees for IE, Chrome, and Firefox]

As we can see, the address bar is reachable through the IAccessible interface in every browser. The element names and their positions in the tree differ between browsers, but overall we need only a couple of operations to reach the address bar: getting the name and value of the current element, and getting the children of the current tree element.
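The search logic itself does not depend on COM. Here is a minimal sketch of the same depth-first walk, using a hypothetical `Node` struct in place of `IAccessible`. The struct and the function name `FindValueByName` are illustrative assumptions, not part of MSAA.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an accessible element: a name, a value, children.
struct Node {
    std::wstring name;
    std::wstring value;
    std::vector<Node> children;
};

// Depth-first search. If some node's name contains `needle`, stores that
// node's value in `out` and returns true; otherwise returns false and
// leaves `out` untouched.
bool FindValueByName(const Node& node, const std::wstring& needle, std::wstring& out) {
    if (node.name.find(needle) != std::wstring::npos) {
        out = node.value;
        return true;
    }
    for (const Node& child : node.children) {
        if (FindValueByName(child, needle, out))
            return true;  // stop at the first match
    }
    return false;
}
```

With a real accessibility tree, the name and value would come from get_accName and get_accValue, and the children from AccessibleChildren.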

Both are easy to write; here is the final code that gets the current URL for Chrome.

    #include <windows.h>
    #include <oleacc.h>
    #include <atlbase.h>
    #include <iostream>
    #include <string>
    #include <vector>

    #pragma comment(lib, "oleacc.lib")

    // Accessible name of Chrome's address bar. It is locale-dependent; the
    // literal in the original listing was lost, and "Address and search bar"
    // is assumed here for an English-locale Chrome. Check the actual name
    // with inspect.exe.
    const wchar_t kAddressBarName[] = L"Address and search bar";

    // Returns the accessible name of the element, or an empty string on failure.
    std::wstring GetName(IAccessible *pAcc)
    {
        CComBSTR bstrName;
        if (!pAcc || FAILED(pAcc->get_accName(CComVariant((int)CHILDID_SELF), &bstrName)) || !bstrName.m_str)
            return L"";
        return bstrName.m_str;
    }

    // Depth-first search of the accessibility tree.
    // Returns S_FALSE once the address bar has been found, S_OK otherwise.
    HRESULT WalkTreeWithAccessibleChildren(CComPtr<IAccessible> pAcc)
    {
        long childCount = 0;
        HRESULT hr = pAcc->get_accChildCount(&childCount);
        if (FAILED(hr))
            return hr;
        if (childCount <= 0)
            return S_OK;

        // std::vector releases the variants on every return path.
        std::vector<CComVariant> children(childCount);
        long returnCount = 0;
        hr = ::AccessibleChildren(pAcc, 0L, childCount, children.data(), &returnCount);
        if (FAILED(hr))
            return hr;

        for (long i = 0; i < returnCount; i++)
        {
            const CComVariant &vtChild = children[i];
            if (vtChild.vt != VT_DISPATCH || !vtChild.pdispVal)
                continue;
            CComQIPtr<IAccessible> pAccChild(vtChild.pdispVal);
            if (!pAccChild)
                continue;

            if (GetName(pAccChild).find(kAddressBarName) != std::wstring::npos)
            {
                CComBSTR bstrValue;
                if (SUCCEEDED(pAccChild->get_accValue(CComVariant((int)CHILDID_SELF), &bstrValue)) && bstrValue.m_str)
                    std::wcout << bstrValue.m_str << std::endl;
                return S_FALSE;
            }
            if (WalkTreeWithAccessibleChildren(pAccChild) == S_FALSE)
                return S_FALSE;
        }
        return S_OK;
    }

    HWND hWndChrome = NULL;

    // Picks the first top-level Chrome window that has a non-empty title.
    BOOL CALLBACK FindChromeWindowProc(HWND hwnd, LPARAM lParam)
    {
        wchar_t className[100];
        if (GetClassNameW(hwnd, className, 100) == 0 || wcscmp(className, L"Chrome_WidgetWin_1") != 0)
            return TRUE;    // not a Chrome window, keep enumerating
        wchar_t title[1000];
        if (GetWindowTextW(hwnd, title, 1000) == 0)
            return TRUE;    // window without a title, keep enumerating
        hWndChrome = hwnd;
        return FALSE;       // found, stop enumerating
    }

    int main()
    {
        if (FAILED(::CoInitialize(NULL)))
            return 1;

        EnumWindows(FindChromeWindowProc, 0);
        if (hWndChrome != NULL)
        {
            // Object ID 1 is the custom ID Chrome watches for: requesting it
            // tells Chrome that an MSAA client is present (see "Small nuance").
            CComPtr<IAccessible> pAccMain;
            ::AccessibleObjectFromWindow(hWndChrome, 1, IID_IAccessible, (void**)(&pAccMain));

            CComPtr<IAccessible> pAccMain2;
            if (SUCCEEDED(::AccessibleObjectFromWindow(hWndChrome, OBJID_CLIENT, IID_IAccessible, (void**)(&pAccMain2))) && pAccMain2)
                WalkTreeWithAccessibleChildren(pAccMain2);
        }   // the COM pointers are released here, before CoUninitialize

        ::CoUninitialize();
        return 0;
    }

Result:

[Figure: program output]

For the other browsers everything works the same way.
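One way to isolate the browser-specific part is a small lookup of top-level window class names. This is a sketch under assumptions: the `Browser` enum and the `WindowClassName` helper are hypothetical, and the class names are the ones used in the FindWindow calls earlier in the article. The accessible name of the address bar also differs between browsers; those names are not listed in the article, so they are left out here.

```cpp
#include <string>

// Hypothetical identifiers for the three browsers discussed in the article.
enum class Browser { InternetExplorer, Firefox, Chrome };

// Maps a browser to the window class name of its top-level window.
// Returns an empty string for an unknown value.
inline const wchar_t* WindowClassName(Browser browser) {
    switch (browser) {
        case Browser::InternetExplorer: return L"IEFrame";
        case Browser::Firefox:          return L"MozillaWindowClass";
        case Browser::Chrome:           return L"Chrome_WidgetWin_1";
    }
    return L"";
}
```

The returned string can be compared against the result of GetClassName inside an EnumWindows callback, in the same way the Chrome example does.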

Small nuance

MSAA support in Chrome is disabled by default. This stems from Chrome's architecture: because it is split into processes, no single process holds the entire element tree that MSAA requires. The Chrome developers are no fools and provided a way to collect this information and cache it in the main process. But since this is resource-intensive and MSAA is needed by relatively few people, it is turned off by default. You can enable it in two ways:

Manually: open chrome://accessibility in Chrome and enable it.
Programmatically: Chrome sets up a special "trap" that lets a program signal that an application using MSAA is present in the system. You can trigger this trap like this:
```
// hwnd is the Chrome window handle; 1 is the custom object ID that Chrome watches for.
CComPtr<IAccessible> pAccMain;
HRESULT hr = ::AccessibleObjectFromWindow(hwnd, 1, IID_IAccessible, (void**)(&pAccMain));
```

Source: https://habr.com/ru/post/253729/
