
A little bit about WebKit internals

I set myself the task of getting acquainted with how WebKit works, the engine that in one form or another underlies many modern browsers: how the resource loading process takes place and what can be done with it. In principle, there is enough documentation on the subject:

* structured documentation from Apple, which does not cover even a tenth of the system;
* scattered articles on the wiki, which vary in level of detail and breadth of coverage.

The purpose of this article is not a bird's-eye view of the system but a focused, detailed analysis of one of the processes inside it, which in my opinion sometimes gives a better idea of the system as a whole than an abstract overview. Or perhaps it will simply be one small piece that a developer needs when assembling a picture of the system from a scattered mosaic of information.

Structure


Document loading process, loader

programm1.cpp
mainWidget = new QWebView(parent);
// mainWidget->setHtml("<html><body>HEIL</body></html>");
mainWidget->page()->mainFrame()->setHtml("<html><body>HEIL</body></html>", QUrl());


QWebFrame::setHtml()
→ QWebFrameAdapter::setHtml   // qt/qtwebkit/Source/WebKit/qt/WebCoreSupport/QWebFrameAdapter.cpp:284
  ...
→ WebCore::FrameLoader::load  // frame->loader()->load(WebCore::FrameLoadRequest(frame, request, substituteData));
  // description of the "request"
  WebCore::FrameLoadRequest(
      WebCore::Frame
      WebCore::ResourceRequest    // request description
      WebCore::SubstituteData(    // data description: qt/qtwebkit/Source/WebCore/loader/SubstituteData.h
          // data: WTF::RefPtr<WebCore::SharedBuffer>
          // mime-type: "text/html"
          // encoding: "utf-8"
          // failingURL
      )
  )
→ FrameLoaderClientQt::createDocumentLoader()  // RefPtr<DocumentLoader> loader = m_client->createDocumentLoader(request, substitute_data)
                                               // WebCore::FrameLoaderClient()
→ FrameLoader::load(DocumentLoader* newDocumentLoader)  // FrameLoader::load(loader.get())
  newDocumentLoader.m_frame = 0;
→ FrameLoader::loadWithDocumentLoader(newDocumentLoader, type, 0)
→ FrameLoader::loadWithDocumentLoader(DocumentLoader* loader, FrameLoadType type, PassRefPtr<FormState> prpFormState)
  ...
  setPolicyDocumentLoader(loader);
  // pass m_frame to the loader
  → loader->setFrame(m_frame);
  → m_writer.setFrame(frame);
  ...
  // Check the policies; the callback jumps to callContinueLoadAfterNavigationPolicy via a static jumper
  FrameLoader::callContinueLoadAfterNavigationPolicy(const ResourceRequest&, PassRefPtr<FormState> formState, bool shouldContinue)
  // the data is here: m_policyDocumentLoader.get()->substituteData().content()->data();
→ setProvisionalDocumentLoader(m_policyDocumentLoader.get());  // m_provisionalDocumentLoader = m_policyDocumentLoader.get();
  FrameLoader::continueLoadAfterWillSubmitForm
  // RefPtr<DocumentLoader> m_provisionalDocumentLoader;
  m_provisionalDocumentLoader->startLoadingMainResource
  // DocumentLoader::m_mainResource - CachedResourceHandle<CachedRawResource>
  // DocumentLoader::ResourceRequest m_request;
  // the data is in m_substituteData.content()->data()
→ handleSubstituteDataLoadSoon
→ handleSubstituteDataLoadNow
  WebCore::ResourceResponse response(url, m_substituteData.mimeType(), m_substituteData.content()->size(), m_substituteData.textEncoding(), "");
→ DocumentLoader::responseReceived(0, response)  // responseReceived(CachedResource* resource, const ResourceResponse& response)
  m_response = response;
  // m_identifierForLoadWithoutResourceLoader = 1
  // notifier() reports to the View
  → frameLoader()->notifier()->dispatchDidReceiveResponse(this, m_identifierForLoadWithoutResourceLoader, m_response, 0);
→ DocumentLoader::continueAfterContentPolicy(PolicyUse);  // PolicyUse is a value of enum PolicyAction { PolicyUse, PolicyDownload, PolicyIgnore };
  // m_response.isHTTP()=0 isLoadingMainResource()=1 isStopping()=0
  ...
→ DocumentLoader::dataReceived(0, m_substituteData.content()->data(), m_substituteData.content()->size());
  → frameLoader()->notifier()->dispatchDidReceiveData(this, m_identifierForLoadWithoutResourceLoader, data, length, -1);
→ DocumentLoader::commitLoad(const char* data, int length)  // data: our html data
→ frameLoader->client()->committedLoad(this, data, length);  // FrameLoaderClient::committedLoad
→ FrameLoaderClientQt::committedLoad
→ void DocumentLoader::commitData(const char* bytes, size_t length)  // loader->commitData(data, length): hands over to the DocumentWriter, which uses the documentLoader's m_frame
  m_writer.begin(documentURL(), false);
  m_writer.setDocumentWasLoadedAsPartOfNavigation();
  → void m_writer.addData(bytes, length);  // DocumentWriter::addData
  ...
→ finishLoading(0)
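The trace is easier to keep in your head if you reduce it to its three stages: the loader is first parked as the "policy" loader, then promoted to "provisional", and only then does the substitute data reach the writer. Below is a toy model of that order, not WebKit code: every type here (ToySubstituteData, ToyWriter, ToyDocumentLoader, ToyFrameLoader) is a hypothetical stand-in, and the policy check is reduced to a boolean.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins only: none of these types are the real WebCore classes.
struct ToySubstituteData {
    std::string content;   // plays the role of the SharedBuffer
    std::string mimeType;  // e.g. "text/html"
    std::string encoding;  // e.g. "utf-8"
};

// Mirrors how the trace drives the writer: begin, addData, then the end of loading.
class ToyWriter {
public:
    void begin() { m_text.clear(); m_open = true; }
    void addData(const char* bytes, std::size_t length) {
        if (m_open)
            m_text.append(bytes, length);
    }
    void end() { m_open = false; }
    const std::string& text() const { return m_text; }

private:
    std::string m_text;
    bool m_open = false;
};

class ToyDocumentLoader {
public:
    explicit ToyDocumentLoader(ToySubstituteData data) : m_data(std::move(data)) {}

    // Substitute data needs no network: response, data and commit follow at once.
    void startLoadingMainResource(std::vector<std::string>& log) {
        log.push_back("responseReceived");
        log.push_back("dataReceived");
        log.push_back("commitData");
        m_writer.begin();
        m_writer.addData(m_data.content.data(), m_data.content.size());
        log.push_back("finishLoading");
        m_writer.end();
    }
    const std::string& committedText() const { return m_writer.text(); }

private:
    ToySubstituteData m_data;
    ToyWriter m_writer;
};

class ToyFrameLoader {
public:
    // Returns false when the (toy) navigation policy rejects the load.
    bool load(std::unique_ptr<ToyDocumentLoader> loader, bool policyAllows = true) {
        assert(loader && "load() needs a document loader");
        m_policyLoader = std::move(loader);               // setPolicyDocumentLoader
        m_log.push_back("policy");
        if (!policyAllows) {
            m_policyLoader.reset();
            return false;
        }
        m_provisionalLoader = std::move(m_policyLoader);  // setProvisionalDocumentLoader
        m_log.push_back("provisional");
        m_provisionalLoader->startLoadingMainResource(m_log);
        return true;
    }
    const std::vector<std::string>& log() const { return m_log; }
    std::string committedText() const {
        return m_provisionalLoader ? m_provisionalLoader->committedText() : std::string();
    }

private:
    std::unique_ptr<ToyDocumentLoader> m_policyLoader;
    std::unique_ptr<ToyDocumentLoader> m_provisionalLoader;
    std::vector<std::string> m_log;
};
```

Loading a ToyDocumentLoader through ToyFrameLoader::load records the stages in the same order as the trace, and committedText() then returns exactly the bytes that were passed in as substitute data. A rejected policy check leaves nothing committed.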


The call stack above is the chain of calls from setHtml down to the interaction with the Document. If you are interested in how policies are applied or in the different kinds of data loading, that is where to dig. I left out some of the transitions, keeping only those that, in my opinion, reflect a meaningful operation.
The part of the call stack that interests us is the start of writing to the document and what it requires. By looking at exactly how DocumentLoader::commitData prepares the DocumentWriter, we can “simplify” programm1.cpp (not in the sense of less code, but in the sense of bringing it closer “to the ground”).

programm2.cpp

// html (const char*) and url are assumed to be defined by the caller
QWebPage *page = mainWidget->page();
QWebFrame *qtWebFrame = mainWidget->page()->mainFrame();
QWebFramePrivate *qtWebFramePrivate = qtWebFrame->d;
WebCore::Frame *frame = qtWebFramePrivate->frame;
WebCore::DocumentWriter m_writer(frame);
m_writer.setFrame(frame);
m_writer.begin(url, false);
m_writer.setDocumentWasLoadedAsPartOfNavigation();
m_writer.setEncoding("utf-8", true);
m_writer.addData(html, strlen(html));
m_writer.end();


In the same way we can go one level lower, down to parsing (the WebCore::DocumentParser family of classes). The writer layer cannot be removed completely: the appendBytes call takes a writer, and the writer is also responsible for creating the decoder and for communicating with the View: DocumentWriter::reportDataReceived calls m_frame->document()->recalcStyle(Node::Force).

void DecodedDataDocumentParser::appendBytes(DocumentWriter* writer, const char* data, size_t length)
{
    ...
    String decoded = writer->createDecoderIfNeeded()->decode(data, length);
    ...
    writer->reportDataReceived();
    ...
}
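The division of labour in appendBytes (the parser asks the writer for a decoder, decodes, then tells the writer that data arrived) can be tried out in isolation. The sketch below is only an illustration under my own assumptions: SketchDecoder, SketchWriter and SketchParser are hypothetical names, the decoder handles nothing but Latin-1, and the early return on empty input is a choice of this sketch rather than something taken from the WebKit source.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical decoder for the sketch: converts Latin-1 bytes to UTF-8.
class SketchDecoder {
public:
    std::string decode(const char* data, std::size_t length) const {
        std::string out;
        for (std::size_t i = 0; i < length; ++i) {
            unsigned char c = static_cast<unsigned char>(data[i]);
            if (c < 0x80) {
                out.push_back(static_cast<char>(c));
            } else {
                // Code points 0x80..0xFF become two UTF-8 bytes.
                out.push_back(static_cast<char>(0xC0 | (c >> 6)));
                out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
            }
        }
        return out;
    }
};

// The writer owns the decoder and creates it only on first use.
class SketchWriter {
public:
    SketchDecoder* createDecoderIfNeeded() {
        if (!m_decoder) {
            m_decoder = std::make_unique<SketchDecoder>();
            ++m_decodersCreated;
        }
        return m_decoder.get();
    }
    // Stands in for the notification that ends in a style recalculation.
    void reportDataReceived() { ++m_reports; }

    int decodersCreated() const { return m_decodersCreated; }
    int reports() const { return m_reports; }

private:
    std::unique_ptr<SketchDecoder> m_decoder;
    int m_decodersCreated = 0;
    int m_reports = 0;
};

// The parser never decodes on its own; it always goes through the writer.
class SketchParser {
public:
    void appendBytes(SketchWriter* writer, const char* data, std::size_t length) {
        assert(writer && "appendBytes needs a writer");
        if (!length)
            return;
        m_source += writer->createDecoderIfNeeded()->decode(data, length);
        writer->reportDataReceived();
    }
    const std::string& source() const { return m_source; }

private:
    std::string m_source;
};
```

However many chunks are appended, the writer creates a single decoder and is notified once per non-empty chunk, which is why the parser cannot be driven without some writer object behind it.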


Inlining the initialization steps from DocumentWriter::begin, we get programm3.cpp:

programm3.cpp

// frame and url are obtained as in programm2.cpp
const char *html = "<html><body>HEIL<b>IGO</b><script>document.write(1234);</script></body></html>";
size_t htmlen = strlen(html);
RefPtr<WebCore::Document> document = WebCore::DOMImplementation::createDocument("text/html", frame, url, false);
document->createDOMWindow();
frame->setDocument(document);
document->implicitOpen();
frame->script()->updatePlatformScriptObjects();
RefPtr<WebCore::DocumentParser> parser = document->parser();
WebCore::DocumentWriter writer(frame);
parser->appendBytes(&writer, html, htmlen);
parser->finish();


PS

This is where my journey into WebKit pauses for now. I hope that for someone this text will lower the barrier to entry into the project.

The last section contains recipes for quickly setting up an environment for experimentation.

Compilation

I used the qt-everywhere-opensource-src-5.3.1 package as the base. You do not need everything from it: qtbase and qtwebkit are enough. When building, I ran into the following problem: a static build with the -debug flag enabled ( ./configure -static -debug ) produces libraries that I could not link into a working example on an 8 GB machine. And even if the link does finish eventually, waiting several minutes to rebuild the simplest examples is not practical. Without symbol information, linking takes a couple of seconds, and libWebkit.a itself is about 54 MB.

The shared library has a different problem: Qt does not export the WebKit API but hides it behind its own. The cure is to build with " -fvisibility=default " instead of hidden . For a hack this is enough, since there are no clashing names in the library; for a proper project you would have to rewrite, long and tediously, the definitions of the exported functions using the WTF_EXPORT directives. There is some information about exporting, but it did not help me.
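For readers who have not dealt with symbol visibility before, here is a minimal sketch of what a per-function export annotation looks like with GCC or Clang. SKETCH_EXPORT and SKETCH_HIDDEN are hypothetical macros invented for this example; they only illustrate the mechanism and are not WebKit's own export macros.

```cpp
#include <cassert>
#include <string>

// Hypothetical macros for this sketch; on other compilers they expand to nothing.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_EXPORT __attribute__((visibility("default")))
#define SKETCH_HIDDEN __attribute__((visibility("hidden")))
#else
#define SKETCH_EXPORT
#define SKETCH_HIDDEN
#endif

// Usable inside the library, but not exported from the resulting .so.
SKETCH_HIDDEN std::string internalPrefix() { return "hello, "; }

// Stays in the dynamic symbol table even when building with -fvisibility=hidden.
SKETCH_EXPORT std::string exportedGreeting(const char* name) {
    assert(name && "name must not be null");
    return internalPrefix() + name;
}
```

If this file is built as a shared library with -fvisibility=hidden, only exportedGreeting should show up in the output of nm -D; building with -fvisibility=default exports both, which is the shortcut described above.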

You also need to open up access to WebCore from QWebFrame ( qtwebkit/Source/WebKit/qt/WidgetApi/qwebframe.h ): the implementation is hidden behind the private member QWebFramePrivate* QWebFrame::d , which has to be made public. After that, WebCore::Frame can be reached through
WebCore::Frame *frame = [ QWebView ]->page()->mainFrame()->d->frame;


For experiments I nevertheless recommend a dynamic build; otherwise neither gdb nor linking will give you any pleasure. Here are my approximate settings:

cd ./qtbase
./configure -opensource -confirm-license -release \
    -nomake tools -nomake examples -no-compile-examples \
    -no-opengl -no-openvg -no-egl -no-eglfs -no-sql-sqlite2 \
    -D QT_NO_GRAPHICSVIEW -D QT_NO_GRAPHICSEFFECT -D QT_NO_STYLESHEET \
    -D QT_NO_STYLE_CDE -D QT_NO_STYLE_CLEANLOOKS \
    -no-kms -no-libudev -no-linuxfb -no-mtdev -no-nis -no-pulseaudio \
    -no-sm -no-xinerama -no-xinput2 -no-xkb -no-xrender \
    -openssl -openssl-linked -icu -fontconfig \
    -system-freetype -system-libpng -system-libjpeg -system-zlib \
    -qt-pcre -qt-harfbuzz -qt-sql-sqlite -qt-xcb -debug
make -j 8

cd ../qtwebkit
../qtbase/bin/qmake WEBKIT_CONFIG-=use_glib\ use_gstreamer\ use_gstreamer010\ use_native_fullscreen_video\ legacy_web_audio\ web_audio\ video\ gamepad -o Makefile.WebCore.Target WebKit.WebCore.Target

# in the generated Makefile, change -fvisibility=hidden to -fvisibility=default; this can also be done with an option.

make -f Makefile.WebCore.Target -j 8


All that remains is to collect the libraries from ../lib/ and link the project. The minimum required set was:

 libQt5Core.so libQt5Gui.so libQt5PrintSupport.so libQt5WebKit.so libQt5WebKitWidgets.so libQt5Widgets.so libQt5Network.so
 libQt5Core.so.5 libQt5Gui.so.5 libQt5PrintSupport.so.5 libQt5WebKit.so.5 libQt5WebKitWidgets.so.5 libQt5Widgets.so.5 libQt5Network.so.5


Build issues

If you see:
This application failed to start because it could not find or load the Qt platform plugin "xcb".
then for a static build you forgot:
Q_IMPORT_PLUGIN(QXcbIntegrationPlugin)

For a dynamic build, either libxcb.so is missing or, if you built qtbase with the -qt-xcb option, you need to create a ./platforms folder in the project and copy the contents of qt/qtbase/plugins/platforms into it (the single file libqxcb.so is enough).

When compiling the test module, you need to either disable function inlining ( -fno-inline-small-functions ) or use the entire list of defines with which the library was compiled. Otherwise, since you include the same headers as the compiled WebKit, you have to keep track of every define that affects those headers. Because of this mistake, in my client code the m_parser member of WebCore::Document ended up at an offset that was 8 bytes off, and the resulting errors showed up in a wide variety of places.
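The offset problem is easy to reproduce without WebKit. The sketch below shows the same struct as two builds would see it: one with a feature define set (the member is present) and one without. The two namespaces stand in for two translation units compiled with different defines; the Document structs and the m_featureState member are hypothetical and have nothing to do with the real layout of WebCore::Document.

```cpp
#include <cstddef>
#include <cstdint>

// The "library" view: the header was compiled with a (hypothetical) feature
// define set, so the class carries an extra member.
namespace library_build {
struct Document {
    void* m_frame;
    std::uint64_t m_featureState;  // exists only when the feature define is set
    void* m_parser;
};
}

// The "client" view: the same header, but the define was forgotten.
namespace client_build {
struct Document {
    void* m_frame;
    void* m_parser;
};
}

// How far the client's idea of m_parser is from where the library put it.
constexpr std::size_t parserOffsetShift() {
    return offsetof(library_build::Document, m_parser)
         - offsetof(client_build::Document, m_parser);
}

// A mismatch like this is invisible to the compiler and the linker:
// both sides compile cleanly, each against its own layout.
static_assert(parserOffsetShift() > 0,
              "the two builds disagree about where m_parser lives");
```

On a typical 64-bit platform the shift here is 8 bytes: client code that reads m_parser through its own layout actually reads m_featureState. Inline functions in headers make it worse, because they are compiled into the client with the client's offsets.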

Source: https://habr.com/ru/post/237771/

