
OpenOffice COM Automation: Reading Clipboard Content

Part One (I hope not the last)


For a long time OpenOffice remained a thing-in-itself for me. I knew that it can be automated perfectly well from Python and BASIC, but I could not find a suitable tool for PHP. Quite by chance I discovered an interesting feature of OpenOffice: access to the contents of the Windows clipboard. At the time I badly missed the ability to write simple CLI scripts in PHP that process text from the clipboard. So I decided to work out properly how to drive OpenOffice from PHP on Windows.


The solution


<?php
// PHP + OpenOffice: reading the clipboard via COM automation
$oo = new COM("com.sun.star.ServiceManager");
$clipboard = $oo->CreateInstance("com.sun.star.datatransfer.clipboard.SystemClipboard");
$converter = $oo->CreateInstance("com.sun.star.script.Converter");

// The clipboard content and the list of formats (flavors) it is offered in
$contents = $clipboard->getContents();
$flavors = $contents->getTransferDataFlavors();

$result = false;
foreach ($flavors as $mm) {
    $mime = $mm->MimeType;
    // echo "$mime\r\n"; // DEBUG
    if ($mime == "text/plain;charset=utf-16") {
        $data = $contents->getTransferData($mm);
        // "com.sun.star.uno.TypeClass.STRING" ==> 12
        $result = $converter->convertToSimpleType($data, 12);
        break;
    }
}
echo $result;


How it all works


First, the service manager component, "com.sun.star.ServiceManager", is created. It is needed to obtain the clipboard and converter components, since the clipboard component "com.sun.star.datatransfer.clipboard.SystemClipboard" cannot be created directly. The manager is responsible for dispatching calls to UNO functions. As a result, in response to CreateInstance() requests to this higher "authority", we get full-fledged instances of the COM components we need.
The contents of the clipboard are retrieved with the getContents() method. The content is cleverly arranged: it is offered in several different formats (for every taste and color). The full set of formats (flavors) is returned by the getTransferDataFlavors() method. The result is a composite object whose elements can be iterated with foreach (..as..).
Each element is no less tricky in itself. Its MimeType property gives the content type, returned as a regular string. We are interested only in "text/plain;charset=utf-16".
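The selection step described here can be sketched as a small helper. This is only an illustration: find_flavor() is a hypothetical name, not part of OpenOffice or PHP, and the stand-in objects at the bottom exist only so that the sketch runs without OpenOffice installed.

```php
<?php
// Hypothetical helper (not part of OpenOffice or PHP): returns the first
// flavor whose MimeType matches $wantedMime, or null if there is none.
// $flavors is anything foreach can walk: the COM collection returned by
// getTransferDataFlavors(), or a plain array of objects.
function find_flavor($flavors, string $wantedMime)
{
    foreach ($flavors as $flavor) {
        // MIME types are compared case-insensitively here (an assumption;
        // the original script uses a plain == comparison).
        if (strcasecmp((string) $flavor->MimeType, $wantedMime) === 0) {
            return $flavor;
        }
    }
    return null;
}

// Stand-in objects so the sketch runs without OpenOffice installed.
$fakeFlavors = [
    (object) ['MimeType' => 'text/html'],
    (object) ['MimeType' => 'text/plain;charset=utf-16'],
];
$found = find_flavor($fakeFlavors, 'text/plain;charset=utf-16');
echo $found === null ? "no text flavor\n" : $found->MimeType . "\n";
```

With the real clipboard the call would presumably look like find_flavor($contents->getTransferDataFlavors(), "text/plain;charset=utf-16"), and a null result would mean that the clipboard currently holds no text.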

To get the data itself, use the getTransferData() method.

And here the first bummer awaits us:


Unlike MimeType, which is a simple text value, the result of this method is returned not as a string (which you could then simply transcode into the desired encoding with the iconv() function), but as a variant type that is not so easy to make friends with in PHP.

Most likely it is done this way because, besides text, the clipboard may contain pictures, music and other multimedia, and it is not always appropriate to represent those as a string.

Conversion


This problem is solved by a special converter component, "com.sun.star.script.Converter", which is also created by the manager.

The converter has a method, convertToSimpleType(), that converts variant values to simple types. You feed it the variant itself and the "magic" constant 12 ("com.sun.star.uno.TypeClass.STRING"), which corresponds to ordinary strings.
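To make the "magic" number easier to read, it can be given a name. The sketch below is only an illustration: the constant name UNO_TYPECLASS_STRING and the wrapper variant_to_string() are hypothetical, while the value 12 and the convertToSimpleType() call are the ones used in the script above.

```php
<?php
// Hypothetical constant name; the value 12 is the one the article gives
// for com.sun.star.uno.TypeClass.STRING.
const UNO_TYPECLASS_STRING = 12;

// Hypothetical wrapper: $converter is expected to be the
// "com.sun.star.script.Converter" instance, $variant the value returned
// by getTransferData().
function variant_to_string($converter, $variant)
{
    return $converter->convertToSimpleType($variant, UNO_TYPECLASS_STRING);
}
```

In the main script the conversion line would then read $result = variant_to_string($converter, $data);.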

But here comes the second bummer:


The result is a string encoded in Windows-1251, which can distort or lose original Unicode characters that do not fit into the Procrustean bed of the Windows code page.
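If the rest of a script works in UTF-8, the Windows-1251 result can at least be transcoded with iconv(). A minimal sketch follows, assuming the string really is in Windows-1251 as described above; to_utf8() is a hypothetical helper name. Note that this cannot bring back characters that were already lost when the string was squeezed into the code page.

```php
<?php
// Hypothetical helper: transcodes a byte string from a single-byte
// Windows code page (Windows-1251 by default, as in the article) to UTF-8.
function to_utf8(string $bytes, string $from = 'Windows-1251'): string
{
    $converted = iconv($from, 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $from to UTF-8");
    }
    return $converted;
}

// Stand-in for the clipboard result: these bytes spell "Привет"
// in Windows-1251.
$result = "\xCF\xF0\xE8\xE2\xE5\xF2";
echo to_utf8($result), "\n";
```

PHP's COM constructor also accepts a code page as its third argument, so new COM("com.sun.star.ServiceManager", null, CP_UTF8) might avoid the lossy conversion altogether. That variant has not been tried here and should be treated as an assumption.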

Disclaimer


In my opinion, the solution turned out quite elegant, but I expect the opposite reaction from real programming gurus: here comes, they will say, a new generation of sloppy PHP coders who sit at the Windows command line, write Hello World programs for Habr, and drag in office COM automation just to read text from the clipboard.

Strictly speaking, this topic belongs in the Q&A blog, and its content was artificially inflated to the size of a "full-length" article.

Unfortunately, the Habr Sandbox interface does not let you specify a preferred blog for a post.
Also, a read-only account cannot write directly to any Habr user to ask them to post a question on the Q&A blog.

Here are the questions themselves:


1. Are there other ways to access the contents of the Windows clipboard via COM automation, similar to the OpenOffice one, without installing additional PHP extensions: for example, through MS Office or even Internet Explorer? The article would be more interesting if it offered the reader several different ways to solve the problem, and the ability to access the clipboard automatically by whichever way is available, depending on what additional software is installed on the system. That is, to provide some kind of cross-platform (or "cross-office", if you like) support.

2. Well, we have somehow learned to read the clipboard, but how do we write something to it? I must warn right away that a solution for writing to the clipboard will not look as elegant and transparent. At least, I did not manage to "reinvent the wheel" for this task on my own. And, of course, a full-fledged article is simply obliged to describe the inverse operation. Although, I repeat, what I needed first of all was to read the text contents of the clipboard, and the Windows-1251 encoding fully satisfied my appetite.

3. If everything is clear with the text contents of the clipboard, what about graphics? I would very much like to get the graphic contents of the clipboard as, for example, a GD2 object, and also to be able to "draw" directly in the clipboard, that is, to synchronize the contents of the clipboard with the state of the GD2 object. I remember how, back in the days of Windows 98, a friend of mine made an indelible impression on me by pasting a MOVIE, copied from Media Player during playback, from the clipboard into MS Paint. I was simply shocked to see a moving image against the background of an open picture. At that time I did not yet understand how Windows worked, and I took it for real magic.

PS


The article would, of course, be more useful if it described universal read-write access to content of any type, so that, for example, you could export content of a particular type to a file of the corresponding format.

I really hope to find answers to all these questions in the comments to the article, or at least to see the questions themselves kindly reposted to the Q&A blog by interested readers, if this topic does not manage to leave the Habr sandbox.

I regret that I have not given references to reputable sources of inspiration; I hope this flaw will be made up for in the comments.

Source: https://habr.com/ru/post/133990/

