
Habrastats

Habrastatistika


It all started when, after the rather interesting and popular Habrakamp topic appeared, comrade opium asked a question in which he suggested writing a statistics script.

Despite work, personal life, and other factors that delayed the script, I did get it to a certain working state.

Along the way I ran into problems I had never seen before.
Useful experience, as always.

Problem one


Initially, on seeing the topic, I wanted a client-side script, i.e. one written in JavaScript.
But a rake was lying in wait for me in the form of the browser's security policy, so all my attempts with AJAX requests and iframes failed. Subconsciously I had expected JavaScript to have some nasty catch like this, so be it.

Problem two, tricky


Well, okay, I thought, I will write it in PHP (I really wanted to write it in Python, which I am slowly learning, but that would have taken a long time). I thought through the details and started writing code.
I fetched the ratings of the first-level comments using the DOM tools. The thought spinning in my head was "I must cast them to integers". The values can be 0, or carry a plus or a minus. The minus can stay, the plus I strip off, then I cast to integer and check the result with print_r. I knew I should see a rating of -34, but it was not there! I fell into such a stupor that I decided to go for a short walk to clear my head, which was boiling with surprise.
It would take too long to describe the bug hunt, so I will cut to the chase: the problem was the minus sign, which for some reason arrived as a different character. var_dump reports string(5) for a value that is three characters long on screen, so the sign is a multi-byte UTF-8 character rather than the ASCII hyphen-minus. The fix looks like a crutch; I really hope there are people on Habr who know the correct solution (note that I did not find the answer on Stack Overflow or in the PHP manual; maybe I did not search well enough?):
    /**
     * Here is awesome "-"
     * Parsed num is -34
     * var_dump($num); string(5) "–34"
     * TODO: FIX THIS UTF-8 SHIT
     */
    if (strlen($int) !== strlen($num)) {
        preg_match('/\d+/', $num, $m);
        $int = intval('-' . $m[0]);
    }
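
A possible alternative to the crutch above, as a minimal sketch: replace the typographic dash characters with an ASCII hyphen-minus before casting. The function name parseRating is hypothetical and not part of the habrastats code. The "\u{...}" escapes assume PHP 7 or newer, and the list of dash characters is a guess at what the page may contain.

```php
<?php
/**
 * Convert a rating string such as "+12", "0" or "–34" to an integer.
 * Hypothetical helper for illustration only.
 */
function parseRating(string $raw): int
{
    // Map the Unicode minus sign, en dash and em dash to an ASCII hyphen-minus.
    $normalized = str_replace(
        ["\u{2212}", "\u{2013}", "\u{2014}"],
        '-',
        trim($raw)
    );

    // intval() accepts an optional leading "+" or "-" followed by digits.
    return intval($normalized);
}

var_dump(parseRating("\u{2013}34")); // int(-34)
var_dump(parseRating("+12"));        // int(12)
var_dump(parseRating("0"));          // int(0)
```

With the sign normalized first, no length comparison or regular expression is needed.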


Some useful information



libxml errors


When I loaded the data into a DOMDocument object, some entities were parsed badly and I naturally got an E_WARNING. Luckily this was not my first experience with DOM, so I wrapped the call like this:
    libxml_use_internal_errors(true);
    $dom->loadHTML('html content');
    libxml_clear_errors();

I will not describe the functions; the documentation covers them perfectly well.
Strictly speaking, you should call libxml_use_internal_errors again afterwards with the parameter false, but since I have nothing more to parse, I decided to skip that step.
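
For completeness, here is a minimal sketch of the full pattern, including the restore step skipped above. It relies on libxml_use_internal_errors returning the previous setting. The sample markup is made up for illustration, and whether the parser reports any problems for it depends on the libxml2 version.

```php
<?php
// Made-up markup with an unescaped "&" and an unknown tag.
$html = '<div id="comments"><span class="score">+5</span> Tom & Jerry <foo></foo></div>';

$dom = new DOMDocument();

// Switch to internal error collection and remember the previous setting.
$previous = libxml_use_internal_errors(true);

$dom->loadHTML($html);

// Collected problems are LibXMLError objects; inspect or log them if needed.
$errors = libxml_get_errors();
libxml_clear_errors();

// Restore whatever error handling was active before.
libxml_use_internal_errors($previous);

printf("parse problems collected: %d\n", count($errors));
```

Restoring the saved value instead of hard-coding false keeps the snippet safe to use inside code that already manages libxml errors itself.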

DOMXpath


As you might have guessed, the DOMXPath class now enters the fray; it simplifies finding the right elements in the document.
The root element for my purposes is div#comments, so I saved it for later use and encapsulated the XPath query itself.
DOMXPath->query returns a DOMNodeList. I added DOMElement[] to the return annotation so that the IDE keeps autocompleting while I iterate over the result in a foreach loop.
I also added the ability to pass a custom context, which may be needed in the future. I am now thinking about how to count the answers to questions (to be honest, the algorithm for collecting questions also needs fixing so that it walks the comment tree recursively), and the query context will come in handy there.
    /**
     * Execute xpath query
     *
     * @param string $query XPath query
     * @param DOMNode $context [Optional] Context
     * @return DOMNodeList|DOMElement[]
     */
    private function query($query, DOMNode $context = null)
    {
        if ($context === null) {
            $context = $this->context;
        }
        return $this->xpath->query($query, $context);
    }
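
To illustrate why the context argument matters, here is a small standalone sketch of relative XPath queries. The markup and the class names comment and score are hypothetical; the real Habr page structure may differ.

```php
<?php
// Hypothetical comment tree: two top-level comments, one with a reply.
$html = <<<'HTML'
<div id="comments">
  <div class="comment"><span class="score">+5</span>
    <div class="comment"><span class="score">0</span></div>
  </div>
  <div class="comment"><span class="score">-2</span></div>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Root of the comment tree, used as the default context.
$root = $xpath->query('//div[@id="comments"]')->item(0);

// Relative query: only direct children of the root, i.e. first-level comments.
$topLevel = $xpath->query('./div[@class="comment"]', $root);

foreach ($topLevel as $comment) {
    // The same relative query with another context finds direct replies.
    $replies = $xpath->query('./div[@class="comment"]', $comment);
    $score = $xpath->query('./span[@class="score"]', $comment)->item(0)->textContent;
    printf("score %s, direct replies: %d\n", $score, $replies->length);
}
```

Because the queries start with "./", each one is evaluated relative to the node passed as context, which is the kind of step a recursive walk over the comment tree would repeat at every level.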


FINISH HIM



Link to the project:
github.com/miraage/habrastats
For local tests I saved the topic as habratopic.htm so that I don't have to wait for it to load every time.
The default topic is Habrakamp. You can pass a different one via habrastats.php?id=XXXX



There is no place to host a demo at the moment.
UPD.
Comrade Anonym posted a demo:
habrastats.m.tom.ru

Update


I completely forgot to say: this is only an intermediate sketch.
Once I clear my backlog, I will bring the script up to scratch.

The end



I will be glad to receive your remarks, comments, constructive criticism, and pull requests.

PS
If you do decide to send a pull request, please make it from a separate feature branch.
This article explains working with branches in git quite clearly.

Source: https://habr.com/ru/post/148939/

