Microsoft Office Security: Embedded Objects

From the outset, the Microsoft Office architecture was built on the concept of compound documents, also known as OLE documents, which Microsoft actively promoted at the dawn of 32-bit Windows. In those days, the idea of seamlessly merging data of various formats in a single document seemed attractive and fascinating, and by the time the first problems were identified, it had already become firmly rooted in many large-scale products.

The “bad news” was that a universal way of adding data (and the code that processes that data) to documents became a universal way of introducing vulnerabilities into the product, one that even today keeps surprising ~~malware authors~~ security researchers.
Later, the applications in the suite acquired a fairly rich set of built-in tools for adding images, charts and diagrams to documents: controls that are created and processed by the application itself and are part of it. From a security point of view, these elements are somewhat less interesting than the ones this article (mostly) discusses: elements added to documents through OLE that rely on external application code.

On disk, a compound document is represented as a CFBF (Compound File Binary Format) file. This article looks at embedding objects in Microsoft Office documents (or rather, only at the security aspect of it) in terms of the data and code loaded into memory at runtime.

Formally, embedded objects in Microsoft Office documents can be divided into the following groups:

ActiveX controls
OLE embedded objects
Embedded files (Packages)
Non-OLE embedded elements

ActiveX controls

ActiveX controls can be thought of as elements of a program window (buttons, switches, lists, input fields and other form elements) designed to raise events or respond to them. At one time it seemed a good idea to make such controls universal, usable in any application, and to package them as COM components for that purpose.

ActiveX controls embedded in web pages were a well-known security hole in Internet Explorer, and the security measures around them tightened over time. Browsers from other vendors abandoned ActiveX support almost immediately, and the new Microsoft Edge browser finally broke with this relic of the past. Embedding in Office documents, however, is still possible.

ActiveX controls in documents are intended to be used together with Visual Basic for Applications (VBA). However, VBA is not required to load and activate them, and no user permission is required to load controls from the whitelist.

Vulnerabilities in whitelisted controls are especially dangerous: the default settings applied when the application is installed provide neither protection against loading these controls nor a warning to the user. The administrator has to tighten the settings explicitly by prohibiting the loading of any ActiveX controls (note, however, that ActiveX is not loaded in Protected View).

Example: CVE-2012-0158

One of the most dangerous vulnerabilities in Office documents in 2012 was CVE-2012-0158. The code that loads the Microsoft ListView Control 6.0 from the MSCOMCTL.OCX library contained a buffer overflow that made it possible to overwrite the return address and execute arbitrary code. Since the control was on the ActiveX whitelist, loading started as soon as the document was opened. The vulnerability has since been fixed; the ListView Control is still considered “safe”.

Adding ActiveX to a document

To add a control to a Microsoft Office document (take Word for simplicity) through the user interface, open the Developer tab (its visibility is configured in Word Options) and select Controls -> Legacy Tools -> ActiveX Controls. The menu shows a set of icons corresponding to Microsoft Forms controls, as well as an option to pick an ActiveX control from a list of controls available on the system, filtered by a number of criteria.

The displayed list does not match the set of controls that can actually be loaded into a document, so it cannot be relied on when searching for vulnerable controls. The verification of a loadable ActiveX control is complicated and has several stages; it differs between Office versions and changes from update to update. The most accurate way to check whether a control can be loaded is therefore to compose a document file containing the control of interest “manually” and try to open it in Office. Possible document formats are described below.

Programmatic representation

Each ActiveX control is essentially an object of a COM class that meets certain requirements. The control is loaded through the COM subsystem, and its executable code resides in a module that is, as a rule, “external” to the container application. Like any COM object, an ActiveX control can be implemented either as a DLL or as an EXE file. In the first case the library is loaded into the address space of the container; in the second, the control runs in a separate process, and data is passed between the container and the object via COM marshaling.

Like any COM object, an ActiveX control has interfaces, properties and methods.

Interfaces are, first of all, the set of standard interfaces that an ActiveX class must implement to be fully loaded and to interact with a container, in particular IOleControl and IOleObject.

The absence of a required interface can reduce the functionality of the control or interrupt its loading at some stage.

Example: CVE-2015-2424
The vulnerability CVE-2015-2424 was related to the TaskSymbol Class control from the mmcndmgr.dll library. The control was not intended for use in documents and did not export the IDispatch interface. While the control was being loaded, the procedure that requested this interface received an error and destroyed the internal structure of the control, which led to a use-after-free condition. Loading of the control is now blocked (although it can still be found in the list offered on the Developer tab). The vulnerability itself has not been fixed.

In addition to the standard interfaces, each ActiveX class exports a “main” interface that represents its own unique functionality. For the Forms.CommandButton.1 class, for example, this is ICommandButton.

You can view the interfaces of an ActiveX control with the OleView tool that ships with Microsoft Visual Studio.

The interface of a control defines its methods and properties. Properties represent data that determines the appearance and behavior of the control. The developer of the control assigns each property a name, say BackColor or GridLineWidth, and a type, for example a string, an integer or a double-precision floating-point number. For bitmaps and icons there is a dedicated picture property type. A client program can set individual properties of the control by specifying their integer indices and values.

At the level of low-level implementation, the division into methods and properties is formal, since properties are represented by pairs of get/set methods. There is, however, a significant difference. The methods of a control (of its main interface) can only be called programmatically, which for Office documents means only from a running VBA program. From a security point of view this is of little interest, since VBA execution already means the operating system is compromised. Properties, on the other hand, are stored in the document, and when the document is opened they are processed and loaded into in-memory structures even if VBA execution is prohibited.

On the programmatic side, the container provides the control with the IStream, IStorage and IPropertyBag interfaces for saving its properties and state in the document. How these are implemented and how the data is represented in the file on disk is no longer the concern of the ActiveX control; it depends entirely on the container and the document format. Note that the set and format of the stored data may correspond to the “publicly” exported property set, or may be completely different. Consider the implementations used by Microsoft Office.

Compound file (CFBF)

This is the legacy Office document format, in which ActiveX data is kept in the low-level ObjectPool storage, with a separate substorage for each control. The "\001CompObj" stream contains the class identifier, which ultimately determines the class of the loaded object. Replacing the identifier directly in a hex editor results in an attempt to load an object of a completely different class.
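
To find a class identifier in a hex editor, it helps to know what its bytes look like on disk. The following is a minimal sketch, assuming the identifier is stored in the usual Windows GUID binary layout (the first three fields little-endian); the helper names are illustrative and not part of any Office tooling.

```python
import uuid


def clsid_to_disk_bytes(clsid: str) -> bytes:
    """Return the 16 bytes of a CLSID in the mixed-endian Windows GUID layout."""
    return uuid.UUID(clsid).bytes_le


def disk_bytes_to_clsid(raw: bytes) -> str:
    """Inverse conversion: 16 raw bytes back to the braced registry form."""
    return "{" + str(uuid.UUID(bytes_le=raw)).upper() + "}"


# CLSID of Microsoft Forms 2.0 CommandButton, as it appears in this article.
needle = clsid_to_disk_bytes("{D7053240-CE69-11CD-A777-00DD01143C57}")
```

Searching a document for `needle` (or for the bytes of a replacement class) is then an ordinary byte-string search.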

Office Open XML

This is the modern XML-based document format. The file is a zip archive. ActiveX control data is stored in the activeX subdirectory, in files with simple names like activeX1.xml.

Sample file:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<ax:ocx ax:classid="{D7053240-CE69-11CD-A777-00DD01143C57}" ax:persistence="persistPropertyBag" xmlns:ax="http://schemas.microsoft.com/office/2006/activeX" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<ax:ocxPr ax:name="Caption" ax:value="CommandButton1"/>
<ax:ocxPr ax:name="Size" ax:value="2540;847"/>
<ax:ocxPr ax:name="FontName" ax:value="Calibri"/>
<ax:ocxPr ax:name="FontHeight" ax:value="225"/>
<ax:ocxPr ax:name="FontCharSet" ax:value="204"/>
<ax:ocxPr ax:name="FontPitchAndFamily" ax:value="2"/>
<ax:ocxPr ax:name="ParagraphAlign" ax:value="3"/>
</ax:ocx>

These files identify the class in text form. Replacing the identifier here likewise causes an attempt to load a control of another class.

The file also specifies the type of storage the object uses: persistPropertyBag, persistStorage or persistStream. If the control supports persistPropertyBag storage, its properties can be saved in the same text file, as in the example above. If it needs a storage or a binary stream, the data is saved in a file with a name like activeX1.bin, which is a CFBF file.
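
The class identifier, the persistence type and the property list can be pulled out of such a file with a few lines of standard-library Python. This is a sketch only: the function name and the shape of the returned dictionary are assumptions made for illustration, and the namespace URI is the one from the sample file.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the sample activeX1.xml shown in this article.
AX_NS = "http://schemas.microsoft.com/office/2006/activeX"


def describe_activex_part(xml_bytes: bytes) -> dict:
    """Extract the class id, persistence type and properties of one control."""
    root = ET.fromstring(xml_bytes)
    properties = {
        pr.get(f"{{{AX_NS}}}name"): pr.get(f"{{{AX_NS}}}value")
        for pr in root.findall(f"{{{AX_NS}}}ocxPr")
    }
    return {
        "classid": root.get(f"{{{AX_NS}}}classid"),
        "persistence": root.get(f"{{{AX_NS}}}persistence"),
        "properties": properties,
    }
```

For a persistStorage or persistStream control the property dictionary will normally be empty, since the data lives in the accompanying .bin file.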

RTF

In an RTF document, an ActiveX control is defined by the \object\objocx control words. The \objdata destination contains the control's property storage as a hex representation of a CFBF file.

{\object\objocx\f37\objsetsize\objw1440\objh480{\*\objclass Forms.CommandButton.1}
{\*\objdata 010500000200000016000000
466f726d732e436f6d6d616e64427574746f6e2e31000000000000000000000e0000
d0cf11e0a1b11ae1000000000000000000000000000000003e000300feff090006
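
The bytes in front of the CFBF signature (d0cf11e0a1b11ae1) can be decoded as well. The sketch below assumes they follow the OLE 1.0 embedded-object header layout (version, format id, three length-prefixed strings, data size), which is not stated in this article; the function names are hypothetical, and real-world malicious RTF is often obfuscated in ways this does not handle.

```python
import struct

CFBF_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def _read_ansi_string(buf: bytes, pos: int):
    """Read a 4-byte little-endian length followed by a NUL-terminated string."""
    (length,) = struct.unpack_from("<I", buf, pos)
    start = pos + 4
    text = buf[start:start + length].rstrip(b"\x00").decode("latin-1")
    return text, start + length


def parse_objdata(hex_payload: str) -> dict:
    """Decode the hex payload of an \\objdata group (whitespace is ignored)."""
    buf = bytes.fromhex("".join(hex_payload.split()))
    ole_version, format_id = struct.unpack_from("<II", buf, 0)
    class_name, pos = _read_ansi_string(buf, 8)
    _topic, pos = _read_ansi_string(buf, pos)
    _item, pos = _read_ansi_string(buf, pos)
    (declared_size,) = struct.unpack_from("<I", buf, pos)
    # A truncated sample simply yields fewer bytes than declared.
    native_data = buf[pos + 4:pos + 4 + declared_size]
    return {
        "ole_version": ole_version,
        "format_id": format_id,
        "class_name": class_name,
        "declared_size": declared_size,
        "native_data": native_data,
    }
```

Checking that `native_data` starts with `CFBF_MAGIC` confirms that the payload carries a compound file, as described above.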

Loading from a file

The process of loading an ActiveX control from a document is, on the whole, quite simple. The container application creates a clean object of the specified class, queries it for the specified persistence interface, and hands it a pointer to the storage, stream or property bag.

Filtering of the controls that may be loaded has many steps. First, classes on the blacklist known as the Office COM Kill Bit are rejected (registry key *OFFICE_KEY*\Common\COM Compatibility). Classes such as Microsoft Scriptlet Component and Microsoft Web Browser, for example, carry flags that prevent loading.

The remaining classes pass the initial load. This means that the DLL is loaded into the container application, or the COM server process implemented in an EXE file is launched. Only then are the other checks performed, including the most basic one: whether the object is an ActiveX control at all.
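
A kill-bit check for a single class can be sketched as follows. The value name "Compatibility Flags" and the 0x400 flag are assumptions based on Microsoft's published kill-bit guidance rather than on this article; the registry helper is Windows-only, and the key path must be supplied by the caller because it depends on the Office version and bitness.

```python
KILL_BIT = 0x00000400  # assumed "do not load" flag in Compatibility Flags


def is_kill_bit_set(compat_flags: int) -> bool:
    """Return True if the kill bit is present in a Compatibility Flags value."""
    return bool(compat_flags & KILL_BIT)


def read_compat_flags(compat_key_path: str, clsid: str):
    """Read Compatibility Flags for a CLSID from HKLM, or None if absent.

    compat_key_path is the full path of the COM Compatibility key
    (hypothetical parameter; not resolved automatically).
    """
    import winreg  # Windows only, imported lazily on purpose

    try:
        subkey = compat_key_path + "\\" + clsid
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey) as key:
            value, _value_type = winreg.QueryValueEx(key, "Compatibility Flags")
            return value
    except FileNotFoundError:
        return None
```

A class with no entry under the key (the helper returns None) is simply not on this particular blacklist; it may still be rejected by the later checks.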

Example: CVE-2015-6128
In 2015, a researcher discovered that this preloading of COM modules can be used to bypass ASLR and to execute arbitrary code by loading bogus dynamic libraries. The description of CVE-2015-6128, released later, does not say a word about Microsoft Office.

If the identifier really does identify an ActiveX control, it goes through several more checks against several blacklists and whitelists.

List of ActiveX controls loaded from .docx on a clean Windows 7 with Office 2016 and default settings:

{00024522-0000-0000-C000-000000000046} RefEdit.Ctrl
{02AF6DD2-77E6-44DF-B3E1-57CF1476D8EA} Microsoft Forms 2.0 OptionButton
{04082FC6-E032-49F2-A263-FE64E9DA1FA3} Microsoft Forms 2.0 HTML TEXT
{0B314611-2C19-4AB4-8513-A6EEA569D3C4} Microsoft Slider Control, version 6.0
{13D557B6-A469-4362-BEAF-52BFD0F180E2} Microsoft Forms 2.0 HTML TextAREA
{19FED08E-EFD1-45da-B524-7BE4774A6AEE} Microsoft Forms 2.0 ListBox
{20DD1B9E-87C4-11D1-8BE3-0000F8754DA1} Microsoft Date and Time Picker Control 6.0 (SP4)
{227B1F3B-C276-4DE0-9FAA-C0AD42ADDCF0} Microsoft Forms 2.0 HTML RESET
{232E456A-87C3-11D1-8BE3-0000F8754DA1} Microsoft MonthView Control 6.0 (SP4)
{3D0FD779-0C2D-4708-A9BA-62F7458A5A53} Microsoft Forms 2.0 ToggleButton
{444D2D27-02E8-486B-9018-3644958EF8A9} FieldListCtrl.2 Object
{4C599241-6926-101B-9992-00000B65C6F9} Microsoft Forms 2.0 Image
{5052A832-2C0F-46c7-B67C-1F1FEC37B280} Microsoft Forms 2.0 Label
{5512D110-5CC6-11CF-8D67-00AA00BDCE1D} Microsoft Forms 2.0 HTML SUBMIT
{5512D112-5CC6-11CF-8D67-00AA00BDCE1D} Microsoft Forms 2.0 HTML IMAGE
{5512D114-5CC6-11CF-8D67-00AA00BDCE1D} Microsoft Forms 2.0 HTML RESET
{5512D116-5CC6-11CF-8D67-00AA00BDCE1D} Microsoft Forms 2.0 HTML CHECKBOX
{5512D118-5CC6-11CF-8D67-00AA00BDCE1D} Microsoft Forms 2.0 HTML OPTION
{5512D11A-5CC6-11CF-8D67-00AA00BDCE1D} Microsoft Forms 2.0 HTML TEXT
{5512D11C-5CC6-11CF-8D67-00AA00BDCE1D} Microsoft Forms 2.0 HTML Hidden
{5512D11E-5CC6-11CF-8D67-00AA00BDCE1D} Microsoft Forms 2.0 HTML Password
{5512D122-5CC6-11CF-8D67-00AA00BDCE1D} Microsoft Forms 2.0 HTML SELECT
{5512D124-5CC6-11CF-8D67-00AA00BDCE1D} Microsoft Forms 2.0 HTML TextAREA
{556C2772-F1AD-4DE1-8456-BD6E8F66113B} Microsoft ImageList Control 6.0 (SP6)
{585AA280-ED8B-46B2-93AE-132ECFA1DAFC} Microsoft StatusBar Control 6.0 (SP6)
{5CBA34AE-E344-40CF-B61D-FBA4D0D1FF54} Microsoft Forms 2.0 HTML CHECKBOX
{5E90CC8B-E402-4350-82D7-996E92010608} Microsoft Forms 2.0 HTML OPTION
{603C7E80-87C2-11D1-8BE3-0000F8754DA1} Microsoft UpDown Control 6.0 (SP4)
{6240EF28-7EAB-4dc7-A5E3-7CFB35EFB34D} Microsoft Forms 2.0 ScrollBar
{65BCBEE4-7728-41A0-97BE-14E1CAE36AAE} Microsoft Office List 16.0
{6C177EBD-C42D-4728-A04B-4131892EDBF6} Microsoft Forms 2.0 ComboBox
{787A2D6B-EF66-488D-A303-513C9C75C344} Microsoft Forms 2.0 HTML Password
{79176FB0-B7F2-11CE-97EF-00AA006D2776} Microsoft Forms 2.0 SpinButton
{86F56B7F-A81B-478d-B231-50FD37CBE761} Microsoft Forms 2.0 CommandButton
{87DACC48-F1C5-4AF3-84BA-A2A72C2AB959} Microsoft ImageComboBox Control, version 6.0
{8B2ADD10-33B7-4506-9569-0A1E1DBBEBAE} Microsoft Toolbar Control 6.0 (SP6)
{8BD21D10-EC42-11CE-9E0D-00AA006002F3} Microsoft Forms 2.0 TextBox
{8BD21D20-EC42-11CE-9E0D-00AA006002F3} Microsoft Forms 2.0 ListBox
{8BD21D30-EC42-11CE-9E0D-00AA006002F3} Microsoft Forms 2.0 ComboBox
{8BD21D40-EC42-11CE-9E0D-00AA006002F3} Microsoft Forms 2.0 CheckBox
{8BD21D50-EC42-11CE-9E0D-00AA006002F3} Microsoft Forms 2.0 OptionButton
{8BD21D60-EC42-11CE-9E0D-00AA006002F3} Microsoft Forms 2.0 ToggleButton
{9432194C-DF54-4824-8E24-B013BF2B90E3} Microsoft Forms 2.0 HTML SUBMIT
{95F0B3BE-E8AC-4995-9DCA-419849E06410} Microsoft TreeView Control 6.0 (SP6)
{978C9E23-D4B0-11CE-BF2D-00AA003F40D0} Microsoft Forms 2.0 Label
{9A948063-66C3-4F63-AB46-582EDAA35047} Microsoft TabStrip Control 6.0 (SP6)
{9BDAC276-BE24-4F04-BB22-11469B28A496} Microsoft Forms 2.0 HTML IMAGE
{A0E7BF67-8D30-4620-8825-7111714C7CAB} Microsoft ProgressBar Control, version 6.0
{CCDB0DF2-FD1A-4856-80BC-32929D8359B7} Microsoft ListView Control 6.0 (SP6)
{D7053240-CE69-11CD-A777-00DD01143C57} Microsoft Forms 2.0 CommandButton
{DCA0ED3C-B95D-490f-9C60-0FF3726C789A} Microsoft Forms 2.0 Image
{DD4CB8C5-F540-47ff-84D7-67390D2743CA} Microsoft Forms 2.0 TextBox
{DFD181E0-5E2F-11CE-A449-00AA004A803D} Microsoft Forms 2.0 ScrollBar
{E9729012-8271-4e1f-BC56-CF85F914915A} Microsoft Forms 2.0 CheckBox
{EA778DB4-CE69-4da5-BC1D-34E2168D5EED} Microsoft Forms 2.0 SpinButton
{EAE50EB0-4A62-11CE-BED6-00AA00611080} Microsoft Forms 2.0 TabStrip
{F14E8B03-D080-4D3A-AEBA-355E77B20F3D} Microsoft Forms 2.0 HTML SELECT
{F8CF7A98-2C45-4c8d-9151-2D716989DDAB} Microsoft Visio Document
{FB453AD8-2EF4-44D3-98A8-8C6474E63CE4} Microsoft Forms 2.0 HTML Hidden
{FDEA20DB-AC7A-42f8-90EE-82208B9B4FC0} Microsoft Forms 2.0 TabStrip
{FE38753A-44A3-11D1-B5B7-0000C09000C4} Microsoft Flat Scrollbar Control 6.0 (SP4)

A significant part of the list is taken up by components of the Microsoft Forms group. This is a set of controls that ships with Office; they are the ones shown in the “ActiveX controls” panel. Initially all of them were registered as “safe”, but over time it turned out that this was not true of individual controls. The Frame control, for example, loaded any other ActiveX control without checking any lists (in recent versions this is “fixed”, but Frame's own blacklist differs from the general Office one). For this reason, some Microsoft Forms controls can be loaded into a document only with the user's permission. Microsoft Forms Frame also requires user consent (with default settings), but it allows some controls from the Kill Bit list to be loaded that could not be loaded otherwise.

Therefore, if the attacker manages to convince the user to allow ActiveX to load, Frame significantly expands the attacker's “arsenal” with controls such as the Web Browser.

The storage format of Microsoft Forms properties is partially documented in the [MS-OFORMS] specification.

Scanning for loadable ActiveX controls showed that the set of classes differs between doc, docx and rtf, and that the lists of available controls also differ between an application started in the usual way and one running in automation mode.

Many popular applications add their own ActiveX controls to these lists. If a vulnerability is discovered in such a control, the bulletin reports it as affecting the application the control belongs to, even though Office documents may be the only way to exploit it.

Example: Flash ActiveX
Flash ActiveX is especially popular with malware authors thanks to its steady stream of newly discovered vulnerabilities and its permanent place on the whitelists of IE and Office. The first known vulnerabilities in this component appeared in 2008; one of the latest, CVE-2018-4878, was patched in February 2018. With the decline in IE's popularity, Office documents have become the main delivery vehicle for Flash exploits.

OLE embedded objects

OLE embedded objects implement the “document within a document” concept, with the ability to edit in place data of various formats handled by other applications. Like ActiveX, OLE documents are built on COM.

You can add an OLE object to a Word document as follows: open the Insert tab and select Text -> Object. The program displays a list of document types for which OLE handlers are registered. As with ActiveX, this list has little to do with the set of classes that can actually be loaded as OLE documents.

Programmatic representation

As with ActiveX, the implementation of any OLE document is a corresponding COM class packaged as a DLL or an EXE. The component exports the necessary service interfaces, and its state is saved in the container document through the IPersist* interfaces.

In a CFBF document, OLE object data is stored in second-level storages inside ObjectPool. The set of streams is generally similar to that of ActiveX controls.

In Office Open XML documents, OLE object data is stored in the embeddings subdirectory, in a CFBF storage file with a name like oleObject1.bin.

In RTF documents, information about the object is stored under the \object\objemb control words. The group also contains the storage, encoded as a hex representation of a CFBF file.
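
Since an Office Open XML document is a zip archive, the parts that hold embedded object data can be enumerated without opening the document in Office. A minimal sketch, assuming the subdirectories are named activeX and embeddings as described in this article (matched case-insensitively); the function name is illustrative.

```python
import io
import zipfile


def list_embedded_parts(document_bytes: bytes) -> dict:
    """Group archive members by the embedded-object subdirectory they live in."""
    found = {"activex": [], "embeddings": []}
    with zipfile.ZipFile(io.BytesIO(document_bytes)) as archive:
        for name in archive.namelist():
            # Look only at directory components, not at the file name itself.
            directories = name.lower().split("/")[:-1]
            if "activex" in directories:
                found["activex"].append(name)
            elif "embeddings" in directories:
                found["embeddings"].append(name)
    return found
```

The .bin members returned here are the CFBF files whose contents, including the class identifier, decide what actually gets loaded.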

{\object\objemb\objw8307\objh553{\*\objclass WordPad.Document.1}
{\*\objdata 010500000200000013000000576f72645061642e446f63756d656e742e31000000000000000000000a0000
d0cf11e0a1b11ae100000000

The RTF format stands out in that it supports the \objupdate control word, which causes the object to be activated automatically, whereas by default OLE objects are inactive when loaded.
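
A quick triage of an RTF file can look for these control words directly. The sketch below is a plain regular-expression scan, not an RTF parser: it assumes the control words appear unobfuscated, which is frequently not the case in real malicious samples, and the function name is hypothetical.

```python
import re

# Object kinds defined by RTF: embedded, ActiveX (OCX), linked, auto-linked.
OBJECT_KIND = re.compile(rb"\\(objemb|objocx|objlink|objautlink)(?![a-z])")
OBJ_UPDATE = re.compile(rb"\\objupdate(?![a-z])")
OBJ_CLASS = re.compile(rb"\{\\\*\\objclass\s+([^{}]+)\}")


def summarize_rtf_objects(rtf: bytes) -> dict:
    """Report object kinds, declared class names and use of \\objupdate."""
    kinds = {m.group(1).decode("ascii") for m in OBJECT_KIND.finditer(rtf)}
    classes = [m.group(1).decode("latin-1").strip() for m in OBJ_CLASS.finditer(rtf)]
    return {
        "kinds": sorted(kinds),
        "classes": classes,
        "auto_update": OBJ_UPDATE.search(rtf) is not None,
    }
```

An object that combines \objemb with \objupdate deserves a closer look, since it will be activated without any action from the user.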

Example: CVE-2017-11882
The CVE-2017-11882 vulnerability in the Equation Editor OLE component lent itself to reliable and universal exploitation because the object is processed in a separate process. The \objupdate control word forced Word to load the vulnerable component as soon as the document was opened.

Example: Embedded Excel documents with macros
Researchers have found malicious RTF documents that do not exploit any new vulnerabilities. The documents contain several Excel documents with macros as embedded objects. The bet is that a user who, after opening the document, has to decline running macros several times in a row will eventually give in and allow execution. At the time of writing, the technique still works.

A significant difference from ActiveX is that, for embedded OLE objects, the class identifier is written directly into the storage file by the WriteClassStg function. This technique is inherited from very old times, when Microsoft was enthusiastically developing the concept of “serialization” and of storing objects together with their state in the CFBF format. The container document also stores the class identifier of the embedded object, but it is the class specified in the storage that gets loaded. This identifier can be replaced, forcing the application to load an object that was never intended for this purpose.
It is also possible to edit the object's data, which in certain cases leads to the discovery of vulnerabilities.
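
The identifier stored in the storage file can be read without any COM machinery. The sketch below assumes the published compound-file layout (sector shift at header offset 30, first directory sector at offset 48, CLSID at offset 80 of the 128-byte root directory entry); these offsets are not given in this article, and the function name is illustrative.

```python
import struct
import uuid

CFBF_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def root_storage_clsid(cfbf: bytes) -> str:
    """Return the CLSID recorded in the root directory entry of a CFBF file."""
    if cfbf[:8] != CFBF_MAGIC:
        raise ValueError("not a CFBF file")
    (sector_shift,) = struct.unpack_from("<H", cfbf, 30)
    (first_dir_sector,) = struct.unpack_from("<I", cfbf, 48)
    sector_size = 1 << sector_shift
    # Sector N starts at (N + 1) * sector_size; the header fills the first slot.
    root_entry = (first_dir_sector + 1) * sector_size
    raw = cfbf[root_entry + 80:root_entry + 96]
    if len(raw) != 16:
        raise ValueError("truncated directory entry")
    return "{" + str(uuid.UUID(bytes_le=raw)).upper() + "}"
```

Comparing this value with the class the container document declares for the same object is a cheap way to spot a substituted identifier.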

OLE objects also undergo numerous checks before loading, which makes it difficult to obtain a complete list of potentially loadable objects. The set of classes that can be loaded as OLE objects differs from the list of loadable ActiveX controls. In particular, they are checked against the Kill Bit list that belongs not to Office but to Internet Explorer (HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Internet Explorer\ActiveX Compatibility).

OLE by reference

OLE distinguishes between two mechanisms for placing content in a document: embedding an OLE document directly, and creating a link in the main document to another document. When an OLE object is linked, the main document contains the path to the file of the linked document. The path can be a local path, a network path or an Internet address. The OLE handler is determined by the file extension, and the corresponding handler must be registered in the operating system.

Example: CVE-2017-0199
The CVE-2017-0199 vulnerability consisted in the ability to add a document in the hta format as a linked object. An hta file is HTML with the ability to execute code, which in effect makes it an executable file. The handler automatically downloaded and ran the hta, allowing arbitrary code to be executed when the document was opened.

Before updating a linked object, the application asks the user for permission. However, the file is downloaded before that, which can be used to disclose information about the user.
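
In Office Open XML documents, links of this kind can be spotted by inspecting the relationship (.rels) parts of the archive. The sketch below assumes that a linked object is recorded as a relationship with TargetMode="External", which follows the Open Packaging Conventions rather than anything stated in this article; the function name and the example URL are made up for illustration.

```python
import xml.etree.ElementTree as ET

RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def external_targets(rels_xml: bytes) -> list:
    """Return (relationship kind, target) pairs that point outside the package."""
    root = ET.fromstring(rels_xml)
    result = []
    for rel in root.findall(f"{{{RELS_NS}}}Relationship"):
        if rel.get("TargetMode") == "External":
            # The last path segment of the Type URI names the kind of part.
            kind = rel.get("Type", "").rsplit("/", 1)[-1]
            result.append((kind, rel.get("Target")))
    return result
```

Any external target that is a network path or an Internet address means the document will try to reach out when the object is loaded or updated.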

Embedded files (Packages)

Office documents support adding an arbitrary file (Object -> Create from File, or simply dragging the file icon into the editing area). Technically, this is implemented by adding an Object Packager embedded object to the document, which writes the file into its own data. Object Packager allows the icon and caption of the file to be replaced and a command line for opening it to be set. It can also include files “by reference”, in which case the file is opened not from the object's own storage but from the specified path, including a network path.

Recently the functionality of Object Packager has been cut back significantly, but originally the object could store any file, including executables, shortcuts and even a command line. All the user had to do to launch the content was double-click the icon in the body of the document.

Example: Files in the Outlook Message Body
Outlook messages, which are also compound documents, allow Object Packager objects to be added to the message body. To the user, the object looks like an image chosen arbitrarily by the attacker. Double-clicking the image opens the packaged file. All that is left for the attacker is to pick a data type that has not yet been caught by tightened security policies.

Non-OLE embedded elements

At present, the non-OLE elements of greatest interest (and threat) are probably images added to a document by reference. When a document is opened in unprotected mode, such images are downloaded automatically, which can reveal the location and identity of a user who downloaded the document through anonymizing proxies or obtained a confidential document from third parties. This technique was implemented, in particular, in the Scribbles tool used by US intelligence.
On a Windows local network, automatic downloading of linked images makes NTLM relay attacks possible. The mechanism of linked images is incompatible with the security requirements of Active Directory networks, since an administrator who receives such a document essentially executes the attacker's code with full administrative privileges.

Protection methods

What can be done? By and large, not much.

At present, the most effective protection against vulnerabilities in objects embedded in Office documents is Protected View. In this mode, neither objects nor data from external sources are loaded. Unfortunately, switching to full-featured mode requires only a trivial action from the user, one that is easily provoked by social engineering.