
Web Private Detective 1.0

Introduction




Last time we told you how to search for people in text. This turns out to be very useful for analysts who follow the news every day and have to monitor how often a particular person appears in the media. The biggest problems begin when the person is also of local importance and a lot of information about them can be found on the Internet. Even then, that information is widely dispersed and unstructured. Who is this person connected with? Who do they interact with most often? In the context of which topics does the person come up most often on the Internet?

The object of the search does not have to be a person. It would be nice, for example, to enter the title of the series “X-Files” and get information about who plays in it, which characters are the main ones, which key entities appear in it (the FBI, for example ;)), and so on.
Developing the idea further, it would be very nice to be able to monitor all the connections of the object of interest in real time. Why? An internal security specialist would surely appreciate coming to work and seeing a notification that one of the employees under their supervision has suddenly become active on a competitor's forum under their “personal” email address.
And although that last paragraph is, so to speak, still on our roadmap, the first two can already be considered implemented, to some extent.

A bit of history


By the way, before the first stable version the project was called MadWin (I, II). Only now, having reached a stable release, has it gotten a second wind, and with it a new icon and a new name. The project is distributed under a commercial license and will eventually become paid. In fact, support and help with integration into third-party software are already paid services for our main and regular customers. For now, though, this is the first stable release, and it will be available for free download for at least a month, for evaluation purposes only.

Functionality


So, here is what has already been implemented and what the program can do:

The program can perform all of these functions on one of the possible sources of information:

Unfortunately, it is not yet possible to specify several sources at once. Nor can you make the program analyze several different folders, a couple of files, and several different pages of a site in a single run. That will appear in version 1.1, along with many other goodies.

After specifying the source, you only need to tell the program where to save the result, which is an HTML report. That's all. Sometimes, though, if the links in the text are heavily tangled, you may need to specify a smaller link-analysis depth so that the algorithm finishes faster.
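To illustrate why a smaller link-analysis depth speeds things up, here is a minimal sketch of a depth-limited, breadth-first walk over links. It is not the program's actual algorithm: the `crawl` function, the `max_depth` parameter, and the in-memory `site` dictionary (standing in for real page fetching and link extraction) are all assumptions made for the example.

```python
from collections import deque


def crawl(links, start, max_depth):
    """Visit pages breadth-first, following links at most max_depth hops.

    `links` is a hypothetical mapping page -> list of linked pages; a real
    tool would download each page and extract its links instead.
    """
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        order.append(page)
        if depth == max_depth:
            continue  # depth limit reached: do not follow links from here
        for target in links.get(page, []):
            if target not in seen:  # protects against link cycles
                seen.add(target)
                queue.append((target, depth + 1))
    return order


# Toy site with a cycle (about -> index), purely illustrative.
site = {
    "index": ["about", "news"],
    "about": ["index", "team"],
    "news": ["archive"],
    "archive": ["old"],
}

shallow = crawl(site, "index", 1)  # only the start page and its direct links
deep = crawl(site, "index", 3)     # reaches every page of the toy site
print(shallow)
print(deep)
```

With a depth of 1 only three pages are visited, while a depth of 3 visits all six. On a real site with tangled links the number of pages usually grows much faster with depth, which is why lowering the limit shortens the run.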

Results


Now, to give readers an idea of how it works, here is an example of the algorithm in action: an analysis of the site kde.org. The following is a link to the resulting report.

The project will be distributed as builds for the most common platforms, in the form of deb and rpm binary packages for the 32-bit architecture. There is also a version for 32-bit Windows with an installer. Program updates can always be found here, or on the official page of the project. For more detailed instructions with screenshots and a step-by-step description of the workflow, see here.

And then what?


If you are interested in the project, here is a brief list of the innovations planned for the upcoming release 1.1:
But the most appealing innovation, in addition to those listed above, will be the ability to automatically assign tags to files in KDE. The program will take a folder path as input, analyze each TXT or HTML file in it, and automatically tag each file with the persons mentioned in it. This functionality may also become available to Windows 7 users, but most likely not.
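The planned tagging step could look roughly like the sketch below. It is only an illustration under stated assumptions: the `tag_folder` and `plain_text` helpers are hypothetical, a fixed list of known names stands in for the program's own person detection, and the function merely returns the tags instead of writing them into KDE's file metadata.

```python
import re
import tempfile
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def plain_text(path):
    """Return the file's text; for .html files the markup is stripped."""
    raw = path.read_text(encoding="utf-8", errors="replace")
    if path.suffix.lower() == ".html":
        parser = _TextExtractor()
        parser.feed(raw)
        parser.close()
        raw = " ".join(parser.chunks)
    return " ".join(raw.split())  # normalize whitespace


def tag_folder(folder, known_persons):
    """Map each TXT/HTML file name to the known persons mentioned in it.

    `known_persons` is an assumption of this sketch: a simple name list
    replacing real person recognition.
    """
    tags = {}
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.suffix.lower() not in {".txt", ".html"}:
            continue  # only TXT and HTML files are analyzed
        text = plain_text(path)
        tags[path.name] = [
            name for name in known_persons
            if re.search(r"\b" + re.escape(name) + r"\b", text)
        ]
    return tags


# Illustrative run on a temporary folder with made-up files.
with tempfile.TemporaryDirectory() as folder:
    root = Path(folder)
    (root / "a.txt").write_text("Fox Mulder met Dana Scully at the FBI.", encoding="utf-8")
    (root / "b.html").write_text("<p>Agent <b>Dana Scully</b> filed a report.</p>", encoding="utf-8")
    (root / "c.log").write_text("Fox Mulder", encoding="utf-8")  # ignored: wrong extension
    demo_tags = tag_folder(root, ["Fox Mulder", "Dana Scully"])

print(demo_tags)
```

Returning a plain dictionary keeps the sketch independent of any desktop environment; a real integration would pass each file's list of names to whatever tagging mechanism the platform provides.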

Links


Project website
Opendesktop
Author's blog
Twitter

Source: https://habr.com/ru/post/117198/

