
How we take screenshots of pages

Hello!

This is our first article on Habr. In it we would like to tell you how we make good-looking screenshots of pages for our service, how we arrived at the current solution, and which rakes we stepped on along the way.

The screenshot service is extremely important to us, because we use it to display updates. The faster it works and the closer its output is to reality (read: with Flash support), the more pleasant it is for users.



After studying the Internet, the following options were found:

Having looked through them and found nothing ideal, we made our first attempt at truly authentic screenshots: launch Firefox under Xvfb and capture the entire screen after 30 seconds.
Xvfb is an indispensable utility for this kind of task. It is a virtual framebuffer that lets you avoid installing a full X setup on the server and run "displays" of any resolution. And, well, applications inside them :)
The method was, of course, imperfect. First, Firefox turned out to be stubborn and kept trying to slip in a window offering to update itself or its plug-ins. Second, sometimes the screenshot simply came out broken (a monochrome picture) or the whole thing hung. Third, taking the screenshot strictly after 30 seconds is clearly suboptimal: if the site has loaded earlier, there is nothing to wait for :) And, of course, it ate memory.
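
To make that first attempt more concrete, here is a minimal sketch of the "browser under Xvfb, grab the whole screen after a fixed delay" approach. It is not our original script. The display number `:99`, the screen size, and the use of ImageMagick's `import` tool for the capture are assumptions made for illustration; the function names are hypothetical.

```python
import os
import subprocess
import time


def xvfb_command(display=":99", width=1280, height=1024, depth=24):
    """Build the command that starts a virtual display of the given size."""
    return ["Xvfb", display, "-screen", "0", f"{width}x{height}x{depth}"]


def capture_command(display, out_path):
    """Build the command that grabs the root window (the whole screen).

    Assumption: ImageMagick's `import` is installed; the article does not
    say which capture tool was actually used.
    """
    return ["import", "-display", display, "-window", "root", out_path]


def screenshot(url, out_path, display=":99", wait_seconds=30):
    """Start Xvfb, open the URL in Firefox, wait, then capture the screen."""
    xvfb = subprocess.Popen(xvfb_command(display))
    try:
        time.sleep(1)  # crude pause so the X server can come up
        env = dict(os.environ, DISPLAY=display)
        browser = subprocess.Popen(["firefox", url], env=env)
        try:
            # Fixed delay: the weakness described above, since a page
            # that loads in 2 seconds still costs the full wait.
            time.sleep(wait_seconds)
            subprocess.run(capture_command(display, out_path), check=True)
        finally:
            browser.terminate()
    finally:
        xvfb.terminate()
```

Calling `screenshot("http://example.com", "shot.png")` would run the whole sequence once. Giving each worker its own display number keeps parallel captures from drawing on the same screen.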

Our next step was to write our own utility in Qt, so that we could adapt it to our needs; after all, the screenshot service really is very important to us. It is no accident that the authors of most such programs use Qt: it comes with a modern engine and a convenient API. There were fewer problems, but they did not go away: sometimes the utility hung, sometimes it produced a broken screenshot... And for some reason Flash did not work with it on the production servers.

At some point the developer of this utility suggested that we switch to PhantomJS, which we did. The situation improved somewhat again, but it was still not perfect. PhantomJS also hung occasionally, some links always caused an error, and there were sporadic failures as well. Flash did not work either: it showed up as black rectangles, which, in principle, more or less suited us.
But because PhantomJS can run JS on the page, it let us pull the page description from the DOM (from meta tags, or by a secret algorithm straight from the text) and the title, and potentially do all sorts of interesting things, for example detect pages that forbid being shown in a frame, which is also quite a critical feature for us. HTML can be such a mess that working with it on the server is a pain, whereas browser engines have spent decades learning to endure any abuse.
We would have lived with it for a long time, but once a fair number of links started being added to our service, Phantom treated us to something new: several copies of the program running in parallel and processing links with Flash SUDDENLY began to render it, but, say, one screenshot got a video from a different link, and another showed two videos blended together. In short, blood, guts, and dismemberment.
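
As for detecting pages that forbid being shown in a frame: this article does not describe the actual check, which ran as JS inside the page. One signal that can be read without a browser is the response headers, so here is a hedged sketch of that part only. The function name `forbids_framing` is hypothetical, and frame-busting scripts in the page itself are not covered by it.

```python
def forbids_framing(headers):
    """Return True if the response headers restrict rendering in a frame.

    `headers` is a plain dict of response header names to values.
    Only two signals are checked: X-Frame-Options and the
    frame-ancestors directive of Content-Security-Policy.
    """
    normalized = {name.lower(): value for name, value in headers.items()}

    xfo = normalized.get("x-frame-options", "").strip().lower()
    if xfo in ("deny", "sameorigin"):
        return True

    csp = normalized.get("content-security-policy", "")
    for directive in csp.split(";"):
        parts = directive.strip().split()
        if parts and parts[0].lower() == "frame-ancestors":
            sources = [p.lower() for p in parts[1:]]
            # "*" lets anyone embed the page; anything narrower
            # ('none', 'self', a specific origin) shuts out a third party.
            return "*" not in sources

    return False
```

For example, `forbids_framing({"X-Frame-Options": "DENY"})` returns `True`, while a response with neither header returns `False`.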

To hell with it, I thought, and remembered that Google Chrome has a built-in Flash player; by then it was already clear that all the home-grown programs suffered from a lack of reliability. Chrome is hard to accuse of that :) Besides, it is extremely easy to write extensions for it in pure JS, and they can do everything PhantomJS did and more.
It also has all the magic command-line switches that let it run without an interface and without any annoying prompts. It can write to STDERR too, and work in incognito mode, which gives the most reproducible screenshots. All the power of CSS selectors became available to us, and Flash finally worked.
In general, Google Chrome fulfilled all our erotic dreams ;)
The screenshots themselves are taken by capturing the entire screen on which the browser runs, plus a few special cases for special, magical sites.
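
The article does not list the exact switches, so the following is only a sketch of how such a Chrome command line could be assembled. The switches themselves exist in Chrome, but picking these particular ones, the binary name `google-chrome`, and the helper name `chrome_command` are assumptions.

```python
def chrome_command(url, profile_dir, binary="google-chrome"):
    """Build a Chrome command line for unattended screenshot runs."""
    return [
        binary,
        "--incognito",                  # no state carried between runs
        "--no-first-run",               # skip the first-run dialogs
        "--no-default-browser-check",   # no "make Chrome default?" prompt
        "--kiosk",                      # full-screen page, no browser UI
        "--enable-logging=stderr",      # send the browser log to STDERR
        f"--user-data-dir={profile_dir}",  # separate profile per worker
        url,
    ]
```

Started with `DISPLAY` pointing at an Xvfb display, a browser launched this way fills the virtual screen with the page, and that screen can then be captured as a whole. A separate `profile_dir` per worker is a design choice of this sketch, meant to keep parallel instances from sharing state.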

In a sense, history has come full spiral: we started with a browser and ended with a browser. It is just that working on this task with Chrome turned out to be much easier. Yes, you could write a Firefox extension that would probably do the same things we do with Chrome, but it would be harder. Having written a toolbar for Firefox, I guarantee it :)
And Chrome handles memory more carefully. By the way, Chrome can even take screenshots without external utilities: they can be obtained as a data URI, but, ironically, Flash is not captured in that mode :)
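
If you do go the data URI route (for instance through the extension API `chrome.tabs.captureVisibleTab`, which is our guess at what fits here, not something this article spells out), the string still has to be turned into image bytes somewhere. A minimal sketch, with a hypothetical helper name, handling only base64-encoded data URIs:

```python
import base64


def image_bytes_from_data_uri(data_uri):
    """Decode a base64 data URI such as 'data:image/png;base64,....'."""
    header, sep, payload = data_uri.partition(",")
    if not sep or not header.startswith("data:"):
        raise ValueError("not a data URI")
    if not header.endswith(";base64"):
        raise ValueError("only base64-encoded data URIs are handled here")
    return base64.b64decode(payload)
```

The returned bytes can be written straight to a `.png` file opened in binary mode.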


So, here is a small checklist for building your own screenshot service while avoiding long trial and error:



Do not forget the steps "..." and "Profit!" ;)

In the end, I think we got a solution that surpasses all the commercial analogues I have seen.

Source: https://habr.com/ru/post/137847/

