So we write an application and, being proper and experienced developers, we do not forget to build a crash reporter into it. The first reports arrive. We open the stack trace, look at the environment, try to reproduce the problem, fail, and ask ourselves: "What? How did that even happen? What did the user do to make the application crash?"
This is where the application's debug logs would really help, or at least a log of the user's last actions: what they opened, where they clicked, and so on. This article is about logging user actions in a WinForms application and including that log in the crash report.
The first thing that comes to mind is the naive approach: add a handler for every user action and write a log record in it. But that is a lot of work and a lot of dumb code. Fine, maybe we make a base form or control that subscribes to the necessary events of everything inside it? Oh, but it turns out child controls can be added and removed at run time, the application starts leaking, and on top of that everything slows down because of our code. So how do we get the same result without base forms, without tracking changes on the form, without subscribing to a pile of events, and without slowing anything down?
It turns out everything has already been invented for us: hooks! But wait, what kind of WinForms is that, this is the plain Win32 API?! Well, what can you do. Do we want it to look pretty, or do we want it to work?
So, we want to know what the user did to bring the application down: where and how they clicked, which keys they pressed, what they typed, which windows they opened, and how the focus moved. That is enough for a start.
Mouse hooks, regular and low-level, cover the mouse. Keyboard hooks, likewise regular and low-level, cover the keyboard. We only need to monitor input in a single application, so global low-level hooks would be overkill for us.
Lyrical digression: low-level hooks are just the thing for specific tasks, such as adding mouse control to a console application that knows nothing about the mouse. And not just any console application, but a DOS one. Once upon a time, back in his FidoNet days, the author taught GoldED to accept mouse input with exactly such a low-level mouse hook. As it turned out, it worked in many other DOS programs too.
And since regular hooks are enough for us, we can go straight to a message hook, which gives us both the mouse and the keyboard in one place. Window activation and focus changes are convenient to catch in a CBT hook.
It is clear where to dig. Let's go! First we declare the interop:
    // Signature of a Win32 hook procedure.
    delegate int HookProc(int ncode, IntPtr wParam, IntPtr lParam);

    [DllImport("kernel32.dll", ExactSpelling = true, CharSet = CharSet.Auto)]
    static extern int GetCurrentThreadId();

    [DllImport("USER32.dll", CharSet = CharSet.Auto)]
    static extern IntPtr SetWindowsHookEx(int idHook, HookProc lpfn, IntPtr hMod, int dwThreadId);

    [DllImport("USER32.dll", CharSet = CharSet.Auto)]
    static extern int CallNextHookEx(IntPtr hhk, int nCode, IntPtr wParam, IntPtr lParam);

    [DllImport("USER32.dll", CharSet = CharSet.Auto)]
    static extern bool UnhookWindowsHookEx(IntPtr hhk);
Subscribe:
    const int WH_GETMESSAGE = 3;
    const int WH_CBT = 5;

    IntPtr getMsgHookHandle = SetWindowsHookEx(WH_GETMESSAGE, GetMsgProc, IntPtr.Zero, threadId);
    IntPtr cbtHookHandle = SetWindowsHookEx(WH_CBT, CbtProc, IntPtr.Zero, threadId);

    static int GetMsgProc(int nCode, IntPtr wParam, IntPtr lParam) {
        return CallNextHookEx(IntPtr.Zero, nCode, wParam, lParam);
    }

    static int CbtProc(int nCode, IntPtr wParam, IntPtr lParam) {
        return CallNextHookEx(IntPtr.Zero, nCode, wParam, lParam);
    }
A hook has to be installed separately for each thread (see threadId), so we keep a dictionary:

    static readonly Dictionary<int, HookInfo> hooks = new Dictionary<int, HookInfo>();

where

    class HookInfo {
        internal IntPtr GetMsgHookHandle;
        internal IntPtr CbtHookHandle;
        internal bool InHook;
    }

On subscribing we add an entry to the dictionary, on unsubscribing we remove it, everything neat and orderly.
Bang, bang, and off to production! Yeah, right. This is .NET, baby. It crashes, if not immediately then after one or two hook subscribe/unsubscribe cycles, right on the call to CallNextHookEx. Intuitively we have a dangling pointer somewhere, but a quick look at the code does not show why. Let's look carefully at SetWindowsHookEx. The second parameter is a pointer to the handler function, that is, the handler's address; in C# terms that would be an IntPtr. So what if we replace HookProc lpfn with IntPtr lpfn? Then we would have to call Marshal.GetFunctionPointerForDelegate to get the address from a delegate. But the original code never creates a delegate explicitly. Let's look at it in ILSpy:

    SetWindowsHookEx(3, new HookProc(Win32HookManagerBase.GetMsgProc), IntPtr.Zero, threadId);
So that's what you are! Thank you, syntactic sugar, for hiding an important detail from us: an intermediate delegate object is created, and the address it wraps is passed to unmanaged code. Nothing references that delegate afterwards, so the garbage collector happily collects it, and the unmanaged code crashes when it tries to transfer control to nowhere.
OK, let's store the delegates explicitly:

    // Static fields keep the delegates reachable while the hooks are installed.
    static HookProc procGetMsg = Win32HookManager.GetMsgProc;
    static HookProc procCbt = Win32HookManager.CbtProc;

    SetWindowsHookEx(WH_GETMESSAGE, procGetMsg, IntPtr.Zero, threadId);
    SetWindowsHookEx(WH_CBT, procCbt, IntPtr.Zero, threadId);

Compile, check: it works.
Half the job is done: we have learned how to receive event notifications. Now we need to collect and record information that lets us present the events in a human-readable form.
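The handlers that follow work with a Message m, but the step from the raw hook arguments to that message is not shown. For a WH_GETMESSAGE hook, Win32 passes a pointer to a MSG structure in lParam, and wParam says whether the message is being removed from the queue. A minimal sketch of that step; the HookMessageReader class and the TryRead name are made up for illustration and are not part of the article's code:

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
struct POINT {
    public int X;
    public int Y;
}

// Layout of the Win32 MSG structure that lParam points to in a WH_GETMESSAGE hook.
[StructLayout(LayoutKind.Sequential)]
struct MSG {
    public IntPtr hwnd;
    public uint message;
    public IntPtr wParam;
    public IntPtr lParam;
    public uint time;
    public POINT pt;
}

// Hypothetical helper, named here for illustration only.
static class HookMessageReader {
    const int HC_ACTION = 0;
    const int PM_REMOVE = 0x0001;

    // Returns true and fills msg only for messages that are really being
    // removed from the queue, so each message is logged once.
    public static bool TryRead(int nCode, IntPtr wParam, IntPtr lParam, out MSG msg) {
        msg = default(MSG);
        if (nCode != HC_ACTION || lParam == IntPtr.Zero)
            return false;
        if (wParam.ToInt64() != PM_REMOVE)
            return false; // the message was only peeked at and will be seen again
        msg = (MSG)Marshal.PtrToStructure(lParam, typeof(MSG));
        return true;
    }
}
```

Inside GetMsgProc the result could then be turned into a WinForms message with Message.Create(msg.hwnd, (int)msg.message, msg.wParam, msg.lParam) before calling CallNextHookEx as usual.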
Take mouse messages, for example:

    Win32.WindowsMessages message = (Win32.WindowsMessages)m.Msg;
    if (message == Win32.WindowsMessages.WM_LBUTTONUP)
        ProcessMouseMessage(ref m, MouseButtons.Left, true);
    else if (message == Win32.WindowsMessages.WM_RBUTTONUP)
        ProcessMouseMessage(ref m, MouseButtons.Right, true);
    else if (message == Win32.WindowsMessages.WM_MBUTTONUP)
        ProcessMouseMessage(ref m, MouseButtons.Middle, true);
    else if (message == Win32.WindowsMessages.WM_LBUTTONDOWN)
        ProcessMouseMessage(ref m, MouseButtons.Left, false);
    else if (message == Win32.WindowsMessages.WM_RBUTTONDOWN)
        ProcessMouseMessage(ref m, MouseButtons.Right, false);
    else if (message == Win32.WindowsMessages.WM_MBUTTONDOWN)
        ProcessMouseMessage(ref m, MouseButtons.Middle, false);

    void ProcessMouseMessage(ref Message m, MouseButtons button, bool isUp) {
        Dictionary<string, string> data = new Dictionary<string, string>();
        data["mouseButton"] = button.ToString();
        data["action"] = isUp ? "up" : "down";

        Breadcrumb item = new Breadcrumb();
        item.Event = isUp ? BreadcrumbEvent.MouseUp : BreadcrumbEvent.MouseDown;
        item.CustomData = data;
        AddBreadcrumb(item);
    }
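AddBreadcrumb itself is not shown. Since a crash report only needs the user's last actions, one plausible shape for it is a bounded queue that drops the oldest records. The sketch below is only an assumption about how this could be done: BreadcrumbBuffer, its capacity parameter, and the simplified Breadcrumb type are hypothetical, not the Logify implementation.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the article's Breadcrumb type (hypothetical).
class Breadcrumb {
    public string Event;
    public Dictionary<string, string> CustomData;
}

// Keeps only the most recent `capacity` breadcrumbs.
class BreadcrumbBuffer {
    readonly object sync = new object();
    readonly Queue<Breadcrumb> items = new Queue<Breadcrumb>();
    readonly int capacity;

    public BreadcrumbBuffer(int capacity) {
        if (capacity <= 0)
            throw new ArgumentOutOfRangeException("capacity");
        this.capacity = capacity;
    }

    // Hooks are installed per thread, so several threads may add records.
    public void AddBreadcrumb(Breadcrumb item) {
        lock (sync) {
            if (items.Count == capacity)
                items.Dequeue(); // drop the oldest record
            items.Enqueue(item);
        }
    }

    // Snapshot in chronological order, ready to attach to a crash report.
    public Breadcrumb[] ToArray() {
        lock (sync) {
            return items.ToArray();
        }
    }
}
```

A fixed capacity keeps memory use constant no matter how long the application runs, at the price of losing older history.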
Next we build a simple application, enable recording of user actions in it, and crash it in the click handler of one of the buttons, so that a crash report with "steps to reproduce" is sent to the server.

    static void Main() {
        LogifyAlert client = LogifyAlert.Instance;
        client.ApiKey = "<my-api-key>";
        client.CollectBreadcrumbs = true;
        client.StartExceptionsHandling();
        // ...
    }
On the server side, we generate descriptions of user actions based on the information collected, something like this:
    string GenerateMouseEventMessage(Breadcrumb breadcrumb) {
        return GetBreadcrumbCustomField(breadcrumb, "mouseButton") +
            " mouse button " +
            GetBreadcrumbCustomField(breadcrumb, "action");
    }
We open the report and look at the records that were sent.

What is clear is that nothing is clear. Yes, the user clicked the mouse, but where exactly? We neither recorded nor sent that. What else of interest do mouse messages carry? Ah, hWnd, just what we need.
    // CharSet.Auto selects the Unicode version, so captions are not mangled.
    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    static extern int GetWindowText(IntPtr hWnd, StringBuilder lpString, int nMaxCount);

    string GetWindowText(ref Message m) {
        StringBuilder windowText = new StringBuilder(1024);
        GetWindowText(m.HWnd, windowText, windowText.Capacity);
        return windowText.ToString();
    }
At the same time, let's write to the log what keys the user pressed.
Client:

    if (message == Win32.WindowsMessages.WM_KEYDOWN)
        ProcessKeyMessage(ref m, false);
    else if (message == Win32.WindowsMessages.WM_KEYUP)
        ProcessKeyMessage(ref m, true);
    else if (message == Win32.WindowsMessages.WM_CHAR)
        ProcessKeyCharMessage(ref m);

    void ProcessKeyMessage(ref Message m, bool isUp) {
        Dictionary<string, string> data = new Dictionary<string, string>();
        data["key"] = ((Keys)m.WParam).ToString();
        data["action"] = isUp ? "up" : "down";
        data["windowCaption"] = GetWindowText(ref m);

        Breadcrumb item = new Breadcrumb();
        item.Event = isUp ? BreadcrumbEvent.KeyUp : BreadcrumbEvent.KeyDown;
        item.CustomData = data;
        AddBreadcrumb(item);
    }

    void ProcessKeyCharMessage(ref Message m) {
        Dictionary<string, string> data = new Dictionary<string, string>();
        data["char"] = new string((char)m.WParam, 1);
        data["action"] = "press";
        data["windowCaption"] = GetWindowText(ref m);

        Breadcrumb item = new Breadcrumb();
        item.Event = BreadcrumbEvent.KeyPress;
        item.CustomData = data;
        AddBreadcrumb(item);
    }
Server:

    string GenerateMouseEventMessage(Breadcrumb breadcrumb) {
        return GetBreadcrumbCustomField(breadcrumb, "mouseButton") +
            " mouse button " +
            GetBreadcrumbCustomField(breadcrumb, "action") +
            " over " +
            GetBreadcrumbCustomField(breadcrumb, "windowCaption");
    }

    string GenerateKeyEventMessage(Breadcrumb breadcrumb) {
        return GetBreadcrumbCustomField(breadcrumb, "key") +
            " " +
            GetBreadcrumbCustomField(breadcrumb, "action");
    }

    string GenerateKeyCharEventMessage(Breadcrumb breadcrumb) {
        string character = GetBreadcrumbCustomField(breadcrumb, "char");
        if (!String.IsNullOrEmpty(character))
            return "Type " + character;
        else
            return GetBreadcrumbCustomField(breadcrumb, "key") + " press";
    }
Rebuild, run, trigger the crash, and look at the new crash report.

Not bad already. The keystrokes show up properly too, so we could start harvesting passwords. Speaking of passwords: sending private information without the user's explicit consent is probably not a good idea. Let's replace them with asterisks:

    bool ProcessKeyMessage(ref Message m, bool isUp) {
        string key = ((Keys)m.WParam).ToString();
        bool maskKey = IsPasswordBox(m.HWnd);
        if (maskKey)
            key = Keys.Multiply.ToString();

        Dictionary<string, string> data = new Dictionary<string, string>();
        data["key"] = key;
        // ...
    }
Now we need to implement the IsPasswordBox method, and all will be well. How does a password field differ from a regular input field? Digging once more into the depths of our WinAPI memories, we find this example. It shows that we need an edit control with the ES_PASSWORD style and a non-zero password char. Let's implement it:
    const int GWL_STYLE = -16;
    const int EM_GETPASSWORDCHAR = 0x00D2;

    [DllImport("USER32.dll")]
    static extern int GetWindowLong(IntPtr hwnd, int flags);

    [DllImport("USER32.dll")]
    static extern int SendMessage(IntPtr hWnd, int Msg, IntPtr wParam, IntPtr lParam);

    static bool IsPasswordBox(IntPtr hWnd) {
        const int ES_PASSWORD = 32;
        int style = GetWindowLong(hWnd, GWL_STYLE);
        if ((style & ES_PASSWORD) == 0)
            return false;

        // A zero result means the control shows the typed text as is.
        int result = SendMessage(hWnd, EM_GETPASSWORDCHAR, IntPtr.Zero, IntPtr.Zero);
        return result != 0;
    }
We get:

With buttons and input fields everything works fine. What if we make the task harder? Let's build the logger into a demo application with complex controls: grids and ribbons.

Run it, click around the grid and the ribbon buttons, and look at the result:

A sad sight. The whole ribbon has one single hWnd, and the grid has as many as two: one for the grid itself and one for the active editor. What do we do with all this? Digging into the internals of specific controls is not an option: there are too many of them from different vendors, and their internals can change quite regularly. Then we remember Section 508 and start finding out what it is and how to work with it. As a result we learn about IAccessible, and a little later about the functions AccessibleObjectFromWindow and AccessibleObjectFromPoint. The first is just right for keyboard events, the second for mouse events.
Another bit of code:

    // POINT is the usual Win32 structure with int X and Y fields.
    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool ClientToScreen(IntPtr hwnd, ref POINT point);

    [DllImport("oleacc.dll")]
    static extern IntPtr AccessibleObjectFromPoint(POINT pt, [Out, MarshalAs(UnmanagedType.Interface)] out IAccessible accObj, [Out] out object ChildID);

    [DllImport("oleacc.dll")]
    static extern int AccessibleObjectFromWindow(IntPtr hwnd, int id, ref Guid iid, [In, Out, MarshalAs(UnmanagedType.IUnknown)] ref object ppvObject);

    static IAccessible GetAccessibleObject(IntPtr hWnd) {
        // IID_IAccessible
        Guid guid = new Guid("618736e0-3c3d-11cf-810c-00aa00389b71");
        Object instance = null;
        const int OBJID_WINDOW = 0x0;
        int hResult = AccessibleObjectFromWindow(hWnd, OBJID_WINDOW, ref guid, ref instance);
        if (hResult != 0 || instance == null)
            return null;
        return instance as IAccessible;
    }

    static IAccessible GetAccessibleObject(IntPtr hWnd, Point point) {
        IAccessible accObj;
        object obj = new object();
        POINT pt = new POINT();
        pt.X = point.X;
        pt.Y = point.Y;
        // AccessibleObjectFromPoint expects screen coordinates.
        ClientToScreen(hWnd, ref pt);
        if (AccessibleObjectFromPoint(pt, out accObj, out obj) != IntPtr.Zero)
            return null;
        return accObj;
    }
The rest is a matter of technique. If we get a non-null IAccessible, we analyze its AccessibleRole and record the accName of the control and, if necessary, of its parent.
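For mouse events, GetAccessibleObject(hWnd, point) needs the click position in client coordinates. For the button messages used earlier, Win32 packs it into lParam: x in the low 16 bits, y in the next 16 bits, both signed. A small sketch of the decoding; the MouseLParam helper is a made-up name, not part of the article's code:

```csharp
using System;
using System.Drawing;

// Hypothetical helper, named here for illustration only.
static class MouseLParam {
    // Equivalent of the GET_X_LPARAM / GET_Y_LPARAM macros.
    public static Point ToClientPoint(IntPtr lParam) {
        int value = unchecked((int)lParam.ToInt64());
        // Cast through short so that negative coordinates keep their sign.
        int x = unchecked((short)(value & 0xFFFF));
        int y = unchecked((short)((value >> 16) & 0xFFFF));
        return new Point(x, y);
    }
}
```

With it, the mouse branch could call GetAccessibleObject(m.HWnd, MouseLParam.ToClientPoint(m.LParam)). The signed cast matters when the mouse is captured and the cursor is above or to the left of the window.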
Another lyrical digression: there is at least one more way to find out what the user clicked on. More precisely, what text the user clicked on. But the method is so unorthodox that the author will refrain from using it, so as not to be deservedly accused of performing a tonsillectomy per rectum. The essence is as follows. We learn to intercept the WinAPI text rendering functions, as in this example. Then, when we need to find the text under the mouse cursor, we turn the interception on and start noting down which text was drawn where. We force the window under the cursor to redraw and turn the interception off. Then we calmly read our notes and determine the text that was drawn right under the cursor. According to indirect evidence (and not only indirect), this is how the Quick Lookup feature from the ABBYY Lingvo team works.
Next, we finish the server part and get the following picture:

This is much more informative: it is quite clear which control (or which part of it) each record refers to. This is already workable, but now we would like to see the same user actions in an even more compact form. That is what we will work on next.
P.S. WinForms client sources.
If anyone is interested: the project site and documentation.
There is also an introductory article about Logify here on Habr.
Collecting user activity on other platforms: WPF, JavaScript, ASP.NET.