
Improving a forum engine on the client side

Many of us know that a website's appearance can be changed locally, without modifying any files on the server. There are several ways to do this; the most popular are custom scripts and styles that the browser applies automatically to each loaded page. Of course, the possibilities of such editing are quite limited: a serious problem that requires a non-standard query to the database cannot be solved this way.

However, even in this case things may turn out to be less hopeless than they seem. In this article I want to describe my own experience of creating a local add-on for the phpBB2 forum engine that corrects the display of the read status for topics and posts. The attempt produced a fully workable tool that I now use all the time, but the purpose of this article is not to present a product; it is to describe the approach to solving the problem. I am publishing the code of the resulting program, but because of the specifics of the task it is not universal and cannot be used as is, without prior (and rather painstaking) tuning for the engine of a particular site. I am saying this up front so that nobody is disappointed.

Now, about the problem statement in more detail.

Although the third version of phpBB has long been the current one, the second is still found quite often on the Internet. Here is a list of the phpBB2 flaws that my tool is meant to fight:
  1. After logging in again or restarting the browser, as well as after a certain period of inactivity, all unread topics are automatically marked as read.
  2. When you open a topic, the whole topic becomes read, no matter which page you opened.
  3. The read status is stored in a cookie as a serialized array. The maximum size of a cookie is usually limited to 4 KB, which allows storing the status of only about 140 topics. For an active forum this is very little.
Some may shrug: so what is the problem? The problem is that if you want to keep up with everything that happens, the orange leaf icon (in the base skin) that marks unread topics and messages helps you see quickly which topics have been updated and which have not. Suppose you visit the forum just to reply quickly to one message and postpone reading the other updated topics until later. On your next visit the unread status will have been reset, and you will have to look for new messages manually, recalling when you last visited each specific topic and checking post dates or texts to find what has appeared since then. Even if you always read all the threads in one sitting, that is no guarantee: browsers sometimes crash, computers fail, and the power can go out (and not everyone has a UPS).
Of course, the most appropriate solution would be to modify the forum engine itself, but in practice this is not always possible. So I decided to try to solve the problem on my own, on the client side. Naturally, a complete, ideal solution is out of the question without direct access to the database on the server. In particular, in most cases it is impossible to determine unambiguously whether a given sub-forum contains unread topics in order to display its icon correctly on the main page (unless you download and analyze every page of its topic list, which takes a long time). Still, some significant improvements are quite realistic.

The overall architecture of the add-on is as follows. A local database on the client machine stores the timestamp of the last visit for each topic. In addition, a proxy server runs locally and passes all forum pages through itself, correcting the marks on topics and messages according to the actual read status taken from the local database (and, of course, updating this database as needed).

As the database format I chose a text file consisting of lines like "00000000000000". Each line corresponds to one topic: the line number (counted from zero) is the topic identifier, and the content of the line is the timestamp of the last visit in UNIX time format. Topic zero does not exist, so line zero is used in a special way: it stores the date and time for the forum as a whole, so that all forum topics can be marked as read at once. Thus, for each topic, the effective time of the last visit is the maximum of two values: the forum-wide one and the one for this specific topic.
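The "maximum of two values" rule can be sketched in a few lines. This is only an illustration in Python (the author's tool is written in Perl), and the function names `parse_db`, `effective_last_visit` and `is_unread` are hypothetical, not taken from the published code.

```python
def parse_db(text):
    """Turn the text database into a list of UNIX timestamps.

    Index 0 is the forum-wide mark, index N is the record for topic N.
    """
    return [int(line) for line in text.splitlines() if line.strip()]


def effective_last_visit(records, topic_id):
    """Return the later of the forum-wide mark and the per-topic mark."""
    forum_wide = records[0] if records else 0
    # A topic that has no line yet is treated as never visited.
    per_topic = records[topic_id] if 0 < topic_id < len(records) else 0
    return max(forum_wide, per_topic)


def is_unread(records, topic_id, last_post_time):
    """A topic is unread if its newest post is later than the last visit."""
    return last_post_time > effective_last_visit(records, topic_id)
```

With this layout, "mark the whole forum as read" only needs to overwrite line zero with the current time; the per-topic lines can stay as they are.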

Why did I settle on a text format? First, it makes it easy to correct errors when necessary, without resorting to a binary editor. Second, my proxy server is written in Perl, where text files are more convenient to work with than binary ones. Parsing a number from a text string is not a resource-intensive operation, and thanks to the fixed line length you can jump directly to the desired record by its index, without reading the whole file line by line.
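Direct access by index could look roughly like the sketch below. It is written in Python rather than the author's Perl, and it assumes 14-digit records terminated by a single "\n" byte; the helper names `read_timestamp` and `write_timestamp` are hypothetical.

```python
import os

DIGITS = 14              # width of one timestamp, as in "00000000000000"
RECORD_LEN = DIGITS + 1  # assumption: each line ends with a single "\n"


def read_timestamp(path, topic_id):
    """Read one record by seeking straight to its byte offset."""
    with open(path, "rb") as f:
        f.seek(topic_id * RECORD_LEN)
        raw = f.read(DIGITS)
    # A short read means the file has no line for this topic yet.
    return int(raw.decode("ascii")) if len(raw) == DIGITS else 0


def write_timestamp(path, topic_id, timestamp):
    """Overwrite one record in place, padding the file with zero lines."""
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        f.seek(0, os.SEEK_END)
        count = f.tell() // RECORD_LEN
        while count <= topic_id:
            f.write(b"0" * DIGITS + b"\n")
            count += 1
        f.seek(topic_id * RECORD_LEN)
        f.write(f"{timestamp:0{DIGITS}d}\n".encode("ascii"))
```

The file is opened in binary mode so that byte offsets are not affected by platform-specific newline translation.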

As for the proxy server, I chose Perl because it is very convenient for working with text (and HTML parsing is, of course, the key part). The resulting speed turned out to be quite acceptable by my standards (in any case, the potential performance gain from switching to another language seems to me not worth the time and effort it would cost). The proxy server listens on its port and, upon receiving a request from the browser, forwards it to the target server, reads the response and sends the content of the received page back to the browser. On its way, the page passes through a filter whose behavior depends on the target address. If the address belongs to one of the forums we are interested in, and specifically to one of the scripts viewtopic.php, viewforum.php or index.php (or simply the root URL of the forum), the filter parses the page and replaces the marks on topics and messages based on the date and time of the last visit taken from the local database. Otherwise the filter does nothing and the content is sent to the browser unmodified.
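The dispatch step of the filter might be sketched as follows. This is a Python illustration, not the author's Perl code; `pick_script`, `apply_filter`, the `forum_roots` list and the host `forum.example.org` are all hypothetical names chosen for the example.

```python
from urllib.parse import urlsplit

# Scripts whose pages get rewritten; "" stands for the forum root URL.
FILTERED_SCRIPTS = {"viewtopic.php", "viewforum.php", "index.php", ""}


def pick_script(url, forum_roots):
    """Return the phpBB2 script name to filter, or None to pass through.

    forum_roots is a list of (host, root_path) pairs; root_path ends in "/".
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    for host, root in forum_roots:
        if parts.hostname == host and path.startswith(root):
            script = path[len(root):]
            if script in FILTERED_SCRIPTS:
                return script or "index.php"
    return None


def apply_filter(url, body, forum_roots, rewrite):
    """Rewrite pages of known forum scripts; leave everything else as is."""
    script = pick_script(url, forum_roots)
    return rewrite(script, body) if script else body
```

Here `rewrite` stands for whatever parsing routine handles a given script; everything that does not match a known forum goes through untouched.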

The most difficult and unpredictable part is parsing. The problem is that each specific forum may have its own themes and extensions installed, so the HTML code of the pages received from the server varies widely. To account for the peculiarities of different forums, I rely only on the key structural features of the phpBB2 engine, while the specific marker strings are moved into separate modules that the proxy server loads at startup, so that each forum is processed according to its own set of rules. Of course, to add support for a new forum you will have to select and tune the marker strings manually, by analyzing the HTML page received from the server. If the changes to the engine are small, that will be all. But it may also happen that the engine has been seriously reworked and the HTML structure is completely different. Then the entire main module will have to be redone, and there is no guarantee that support for multiple forums can be kept at all. It may be easier to maintain a separate version of the proxy server tailored specifically to one such "clever" forum.
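One possible shape for such per-forum rule sets is sketched below in Python (the author's modules are in Perl and are not reproduced here). The class `ForumRules`, the host name and the icon file names are purely hypothetical; the real strings have to be taken from the HTML of the particular forum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForumRules:
    """Strings that the parser looks for on one particular forum."""
    unread_icon: str  # hypothetical image name used for unread topics
    read_icon: str    # hypothetical image name used for read topics


# One entry per supported forum, filled in when the proxy starts.
RULES = {
    "forum.example.org": ForumRules("topic_unread.gif", "topic_read.gif"),
}


def set_topic_icon(row_html, rules, unread):
    """Force the icon in one topic row to match the locally stored status."""
    if unread:
        return row_html.replace(rules.read_icon, rules.unread_icon)
    return row_html.replace(rules.unread_icon, rules.read_icon)
```

Keeping the strings in data rather than in the parser means that a forum with a slightly different skin only needs a new entry, not new code.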

For parsing to work properly, you will also have to change a setting in your forum profile. By default the engine outputs the date and time of messages without seconds, and since there is nowhere else to get the time from, the marks would be placed with an error of up to one minute, which is of course too much. So you need to choose a fuller time format, with seconds, in your profile settings. I settled on the "D d.m.Y, G:i:s" format, which displays the date as "Mon 14.02.2011, 10:57:44" (the proxy server does not need the day of the week, of course, but one should not forget about one's own convenience either). If you prefer a different format, you will need to adjust the function that computes the timestamp accordingly. Also keep in mind that some phpBB modifications insert the words "today" or "yesterday" instead of the date. Unfortunately, handling all this will also require further work on the code.
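A timestamp function for such a format might look like the sketch below. It is a Python illustration rather than the author's Perl function, and it assumes that dates are rendered as day.month.year followed by 24-hour time with seconds, for example "Mon 14.02.2011, 10:57:44"; the exact separators, the name `parse_post_time` and the `utc_offset_hours` parameter are assumptions.

```python
import calendar
import re
from datetime import datetime

# Assumed rendering: "Mon 14.02.2011, 10:57:44" (the weekday is ignored).
POST_TIME = re.compile(r"\b(\d{2}\.\d{2}\.\d{4}), (\d{1,2}:\d{2}:\d{2})\b")


def parse_post_time(text, utc_offset_hours=0):
    """Extract a post date from page text and return it as UNIX time.

    utc_offset_hours is the time zone selected in the forum profile.
    Returns None when no date in the expected format is found.
    """
    match = POST_TIME.search(text)
    if match is None:
        return None
    stamp = datetime.strptime(
        f"{match.group(1)} {match.group(2)}", "%d.%m.%Y %H:%M:%S"
    )
    # timegm treats the fields as UTC, so shift by the profile offset.
    return calendar.timegm(stamp.timetuple()) - int(utc_offset_hours * 3600)
```

A forum that prints "today" or "yesterday" instead of the date would return None here, which makes it easy to notice that an extra rule is needed.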

Well, it remains to mention a few features.

If all this complexity has not scared you off yet, the server itself can be downloaded from here (archive, 5 KB). It includes a set of rules for the official Total Commander forum, which I used as the basis for my experiments and for whose sake, essentially, this whole business was started.

Source: https://habr.com/ru/post/113177/

