
Automatic filtering of comments in LiveJournal using XML-RPC

In this article I describe how to use XML-RPC to retrieve information about the comments on posts in LJ, and even to delete some of them.

The script was originally written in response to the situation in Navalny's LJ, where an unknown bot posted the same message many thousands of times, making discussion in the comments impossible. I do not fully share his point of view (and I am certainly not one of his fans), but freedom of speech on the Internet is dear to me, so I spent some time studying how comments can be processed with a script.

Each entry in LiveJournal has a unique id (jitemid) that is not repeated within the same journal. It is an ordinary auto-increment identifier, but it does not appear directly in the entry URL. Instead, the URL uses ditemid, calculated by the following formula:

ditemid = 256 * jitemid + anum

Here anum is a random number from 0 to 255, assigned once when the post is created and stored along with the post's data.

The same technique applies to comments, except that a comment has no anum of its own: it takes the anum of its post. The corresponding comment identifiers are jtalkid and dtalkid.
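The formula is easy to check in both directions. The following is a minimal Python sketch, not part of the original script; the function names and the sample numbers are hypothetical and chosen only for illustration.

```python
def make_display_id(internal_id, anum):
    """Combine an internal id (jitemid or jtalkid) with the post's anum."""
    if not 0 <= anum <= 255:
        raise ValueError("anum must be in the range 0..255")
    return 256 * internal_id + anum


def split_display_id(display_id):
    """Recover (internal_id, anum) from a ditemid or dtalkid."""
    return divmod(display_id, 256)


# Hypothetical values, for illustration only.
ditemid = make_display_id(internal_id=1234, anum=56)
print(ditemid)                    # 315960
print(split_display_id(ditemid))  # (1234, 56)
```

Because anum always fits in one byte, integer division by 256 recovers the internal id and the remainder recovers anum.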

The XML-RPC protocol, as applied to LJ, is described here. Using the getevents method, we can fetch entries from a journal (I take the last n entries; to do this, specify the parameters selecttype = lastn and howmany = n). The entry information obtained this way contains itemid (= jitemid) and anum, which is exactly what we need. It also includes reply_count, the number of comments on the entry. Incidentally, by explicitly specifying usejournal, you can fetch the entries available to you from any journal (including friends-only entries, if you are on the friends list).
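A getevents call along these lines could be sketched in Python with the standard xmlrpc.client module. This is an illustration, not the author's PHP script. Several details are assumptions that should be checked against the protocol description: the endpoint URL, the LJ.XMLRPC method namespace, clear-text password authentication, the ver field, the usejournal key, and the "events" key in the response. Only selecttype, howmany, itemid, anum and reply_count come from the text above.

```python
import xmlrpc.client

# Assumed endpoint; verify it against the protocol documentation.
LJ_ENDPOINT = "https://www.livejournal.com/interface/xmlrpc"


def build_getevents_request(username, password, howmany, journal=None):
    """Build the struct passed to getevents for the last `howmany` entries."""
    request = {
        "username": username,
        "password": password,  # assumption: clear-text auth, used for brevity
        "ver": 1,              # assumption: protocol version field
        "selecttype": "lastn",
        "howmany": howmany,
    }
    if journal is not None:
        request["usejournal"] = journal  # assumption: key name for another journal
    return request


def summarize_events(response):
    """Return (ditemid, reply_count) for every entry in a getevents response."""
    return [
        (256 * event["itemid"] + event["anum"], event.get("reply_count", 0))
        for event in response.get("events", [])  # assumption: "events" key
    ]


def fetch_last_entries(username, password, howmany, journal=None):
    """Perform the real call; needs network access and valid credentials."""
    server = xmlrpc.client.ServerProxy(LJ_ENDPOINT)
    request = build_getevents_request(username, password, howmany, journal)
    # Assumption: methods live under the LJ.XMLRPC namespace.
    return summarize_events(server.LJ.XMLRPC.getevents(request))
```

Keeping the request builder and the response parser separate from the network call makes both easy to test offline.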

Now to the comments themselves. Here a disappointment awaits us: judging by the protocol description, there are no methods for working with comments.

Fortunately, reading the source allows us to find the necessary methods in the code of the protocol handler (use the force, read the source!). It turns out that there are many interesting things there:

# Dispatch table: XML-RPC method name => handler subroutine
my %HANDLERS = (
    login => \&login,
    getfriendgroups => \&getfriendgroups,
    getfriends => \&getfriends,
    friendof => \&friendof,
    checkfriends => \&checkfriends,
    getdaycounts => \&getdaycounts,
    postevent => \&postevent,
    editevent => \&editevent,
    syncitems => \&syncitems,
    getevents => \&getevents,
    editfriends => \&editfriends,
    editfriendgroups => \&editfriendgroups,
    consolecommand => \&consolecommand,
    getchallenge => \&getchallenge,
    sessiongenerate => \&sessiongenerate,
    sessionexpire => \&sessionexpire,
    getusertags => \&getusertags,
    getfriendspage => \&getfriendspage,
    getinbox => \&getinbox,
    sendmessage => \&sendmessage,
    setmessageread => \&setmessageread,
    addcomment => \&addcomment,
    checksession => \&checksession,
    getrecentcomments => \&getrecentcomments,
    getcomments => \&getcomments,
    delcomments => \&delcomments,
    screencomments => \&screencomments,
    unscreencomments => \&unscreencomments,
    freezecomments => \&freezecomments,
    unfreezecomments => \&unfreezecomments,
    editcomment => \&editcomment,
);

For our purposes, getcomments and delcomments will be useful.

To get the comments on a post with getcomments, we pass it the following: ditemid (the post ID), journal (the journal that hosts the post, for example navalny), and page (the number of the comments page). If we do not want threads expanded, we specify expand_strategy = mobile_thread. As with getevents, we can fetch any comments that are available to us. The response is an array of top-level comments; if a comment has replies, they are attached to it as an array.
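Since replies are nested inside their parent comments, checking every comment means walking a tree. Here is a minimal Python sketch of building the request and flattening the result. The helper names are hypothetical, and the field names "children" and "dtalkid" in the response are assumptions, because the method is undocumented; check them against a real response before relying on them.

```python
def build_getcomments_request(auth, journal, ditemid, page, expand=False):
    """Build the getcomments struct; `auth` holds the login fields."""
    request = dict(auth)
    request.update({"journal": journal, "ditemid": ditemid, "page": page})
    if not expand:
        # Per the description above: ask the server not to expand threads.
        request["expand_strategy"] = "mobile_thread"
    return request


def iter_comments(comments, children_key="children"):
    """Yield every comment in a reply tree, each parent before its replies."""
    for comment in comments:
        yield comment
        # Assumption: replies sit under `children_key`; missing means no replies.
        yield from iter_comments(comment.get(children_key) or [], children_key)


# Hypothetical response fragment, for illustration only.
tree = [
    {"dtalkid": 1, "children": [{"dtalkid": 2, "children": [{"dtalkid": 3}]}]},
    {"dtalkid": 4},
]
print([c["dtalkid"] for c in iter_comments(tree)])  # [1, 2, 3, 4]
```

The generator visits comments depth-first, so a reply is always seen after the comment it answers.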

Then we just have to walk through all the comment pages and, on each page, through all the available comments, checking each one. I used a simple stop word, but you can check anything you like, down to the author's registration date (which needs a separate request for each author). The selected dtalkids are passed to the delcomments method (the optional recursive parameter deletes the entire thread). Voila! We have a clean journal!
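The stop-word check and the delcomments request might look like the Python sketch below. It works on an already flattened list of comments and is not the author's PHP code. The "body" field name, the "dtalkid" request key, and passing the ids as a list are all assumptions about an undocumented method, so verify them against the handler source first.

```python
def select_by_stop_word(comments, stop_word, body_key="body", id_key="dtalkid"):
    """Return the dtalkids of comments whose text contains stop_word."""
    needle = stop_word.casefold()
    return [
        comment[id_key]
        for comment in comments
        # Assumption: the comment text is stored under `body_key`.
        if needle in str(comment.get(body_key, "")).casefold()
    ]


def build_delcomments_request(auth, journal, ditemid, dtalkids, recursive=False):
    """Build the delcomments struct; `auth` holds the login fields."""
    request = dict(auth)
    request.update({
        "journal": journal,
        "ditemid": ditemid,
        "dtalkid": list(dtalkids),  # assumption: key name and list format
    })
    if recursive:
        request["recursive"] = 1  # delete the whole thread under each comment
    return request


# Hypothetical comments, for illustration only.
comments = [
    {"dtalkid": 10, "body": "Buy CHEAP pills"},
    {"dtalkid": 11, "body": "Good point"},
]
print(select_by_stop_word(comments, "cheap"))  # [10]
```

The match is case-insensitive here, which is a design choice of this sketch; an exact match would miss trivially altered copies of the same spam message.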

The script's PHP source can be viewed on GitHub.

I am grateful to the authors of this and this for valuable information.

Source: https://habr.com/ru/post/114385/

