This article was inspired by the publication “Getting members of the vk.com community in seconds.” It is written by a beginner and describes my experience of solving one particular problem. My main goal in writing it is to collect opinions, feedback, and criticism of my approach from more experienced colleagues. I also hope the information here will be useful to someone.
Not long ago, in a test assignment for a junior PHP programmer position, I came across a task that was simple but interesting to me.
“Write a PHP script that returns a newline-separated list of ids of VKontakte users who are men over 25 years old and belong to the group vk.com/habr.”
Information from the VKontakte database is accessed through the VK API. The best place to start learning the VK API is the official documentation. To call a VK API method, you make a POST or GET request over HTTPS to a URL of the following form:
api.vk.com/method/METHOD_NAME?PARAMETERS&access_token=ACCESS_TOKEN, where METHOD_NAME is the name of a method from the API method list, PARAMETERS are that method's parameters, and ACCESS_TOKEN is the access key.
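The URL pattern above can be assembled in PHP roughly as follows. This is a minimal sketch: the helper name `vk_method_url` is hypothetical (it is not part of the VK API or of the original script), and the access token is simply passed as one more parameter when a method needs it.

```php
<?php
// Hypothetical helper (not from the original script): builds a request URL
// following the pattern api.vk.com/method/METHOD_NAME?PARAMETERS.
function vk_method_url(string $method, array $params): string
{
    // http_build_query() URL-encodes each value and joins the pairs;
    // the separator is passed explicitly so php.ini settings do not matter.
    return 'https://api.vk.com/method/' . $method . '?'
        . http_build_query($params, '', '&');
}
```

For example, `vk_method_url('groups.getMembers', ['group_id' => 'habr'])` yields `https://api.vk.com/method/groups.getMembers?group_id=habr`.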
For this task we use the groups.getMembers method, which returns a list of community members. All of its parameters are described in the documentation. The method does not require an access key. By default, the response comes back as JSON. A single request returns data for at most 1000 users. To see the method's output live, just enter the simplest query in your browser's address bar:
api.vk.com/method/groups.getMembers?group_id=habr
The result is a JSON structure with the total number of members of the vk.com/habr community and the first thousand ids in the default order, sorted in ascending order.
The task requires us to output the ids of users of a certain gender and age. The obvious way is to fetch the group's users together with their gender and age data through VK requests, then analyze them in PHP code and output only the ones that match. Another possibility is the execute method, which lets you send a script in a special language, VKScript, in a single request, so that the data is processed on the server and returned already filtered. I'll say right away that I failed to solve the problem using the execute method. Maybe someone will point out such a solution in the comments.
Let's take the first path. With the sex value in the fields parameter, groups.getMembers can return each user's gender, but it does not return age. Instead, the fields parameter accepts bdate, the date of birth. In addition, users are fetched a thousand at a time, so each subsequent request has to return the next thousand. For this there is the offset parameter, which specifies the position to start the selection from. We also specify the API version in the request.
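The paging described above comes down to stepping offset by the page size until the total member count is covered. A small sketch, assuming the total count has already been read from the first response; the function name `page_offsets` is hypothetical.

```php
<?php
// Hypothetical helper: lists the offset values needed to walk through
// $total members when each request returns at most $pageSize of them.
function page_offsets(int $total, int $pageSize = 1000): array
{
    $offsets = [];
    for ($offset = 0; $offset < $total; $offset += $pageSize) {
        $offsets[] = $offset;
    }
    return $offsets;
}
```

For a group of 2500 members this gives the offsets 0, 1000, and 2000, that is, three requests.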
As a result, the request will look something like this:
https://api.vk.com/method/groups.getMembers?group_id=habr&offset=0&fields=sex,bdate&version=5.27
To fetch a file by URL, PHP has the file_get_contents() function. It retrieves the content at the given URL and returns it as a string. Note that for file_get_contents() to handle HTTPS, OpenSSL support must be enabled in PHP (the openssl extension).
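A minimal sketch of this download step is shown below. The wrapper name `fetch_body` is hypothetical, and turning a failed download into an exception is my own choice rather than something the original script is known to do.

```php
<?php
// Hypothetical wrapper: downloads a URL (or any path that
// file_get_contents() accepts) and returns the body as a string.
function fetch_body(string $url): string
{
    // file_get_contents() returns false on failure and emits a warning;
    // "@" silences the warning so the failure is reported by the exception.
    $body = @file_get_contents($url);
    if ($body === false) {
        throw new RuntimeException("Could not read $url");
    }
    return $body;
}
```

It would be called with the request URL shown above, for example `$json = fetch_body($url);`.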
The resulting JSON can then be converted to an array with the json_decode() function (with its second argument set to true). The array will contain both the id and the gender. The date of birth may not be specified.
If the date of birth is specified, what remains is to derive the age from it.
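The decoding step might look like the sketch below. The response shape used here (a `response` object holding an `items` list whose entries have `id`, `sex`, and an optional `bdate`) is an assumption for illustration; the actual key names depend on the API version, so check them against a real response. The function name `decode_members` is hypothetical.

```php
<?php
// Hypothetical helper: turns the JSON text of one response into a PHP
// array of member records. The "response" -> "items" layout is ASSUMED.
function decode_members(string $json): array
{
    // The second argument "true" makes json_decode() return associative
    // arrays instead of stdClass objects.
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['response']['items'])) {
        return []; // malformed JSON or an unexpected structure
    }
    return $data['response']['items'];
}
```

Because bdate may be missing, later code should test for it with `isset($member['bdate'])` before using it.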
Birthdates in bdate are stored as strings in DD.MM.YYYY format if the year of birth is specified, or DD.MM if it is not. To find out which format a string is actually in, I used the first thing that came to mind: checking whether count(explode('.', $user_array['bdate'])) is 2 or 3. This works, and I don't think it is the script's bottleneck.
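Wrapped in a function, the check described above could look like this. The function name `bdate_has_year` is hypothetical.

```php
<?php
// Hypothetical helper: true when a bdate string has three dot-separated
// parts (day, month, year), false when the year is missing.
function bdate_has_year(string $bdate): bool
{
    return count(explode('.', $bdate)) === 3;
}
```

Only strings for which this returns true carry enough information to compute an age.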
To calculate the age from the date of birth, I found a formula at hashcode.ru/questions/137939#137940. The strtotime() function understands the format of the bdate field.
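One way to compute the age with strtotime() is sketched below. It is not necessarily the formula from the linked answer: it subtracts the birth year from the current year and takes one off if this year's birthday has not happened yet. The function name `age_from_bdate` and the explicit `$now` parameter (which makes the function easy to test) are my own assumptions.

```php
<?php
// Hypothetical helper: age in full years at the moment $now (a Unix
// timestamp), or null when the year of birth is unknown or unparsable.
function age_from_bdate(string $bdate, int $now): ?int
{
    if (count(explode('.', $bdate)) !== 3) {
        return null; // DD.MM only, so the age cannot be computed
    }
    $birth = strtotime($bdate); // parses dotted day.month.year strings
    if ($birth === false) {
        return null;
    }
    $age = (int) date('Y', $now) - (int) date('Y', $birth);
    // "md" is month and day as four digits, so string order equals date order.
    if (strcmp(date('md', $now), date('md', $birth)) < 0) {
        $age--; // the birthday has not occurred yet this year
    }
    return $age;
}
```

In the script itself the current time would be passed in, for example `age_from_bdate($member['bdate'], time())`.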
Then we check gender and age, and if they satisfy the condition, output the id.
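The final check can be sketched as a simple filter. To keep the example self-contained, it assumes each member record already carries a precomputed `age` key (null when the year of birth is unknown); that key and the function name `male_ids_over` are hypothetical. The value 2 for male follows the sex field as described in the VK documentation, but it is worth verifying against the current docs.

```php
<?php
// Hypothetical helper: ids of members who are male and older than $minAge.
// ASSUMES each record has 'id', 'sex', and a precomputed 'age' (or null).
function male_ids_over(array $members, int $minAge): array
{
    $ids = [];
    foreach ($members as $member) {
        $isMale = ($member['sex'] ?? 0) === 2; // 2 = male (assumed per VK docs)
        $age = $member['age'] ?? null;
        if ($isMale && $age !== null && $age > $minAge) {
            $ids[] = $member['id'];
        }
    }
    return $ids;
}
```

The task's output format is then a matter of `echo implode("\n", male_ids_over($members, 25));`.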
This version works fine on relatively small groups, but on groups with more than 100 thousand subscribers the script does not run to completion: at some point, for some reason, the error “file_get_contents(...): failed to open stream: Connection timed out in ... on line ...” appears. Increasing the script's execution time and the web server's timeout did not help. I could not find a pattern.
Then another option came up: fetching the response with cURL. To use it, the libcurl library must be installed in the OS, for example on Ubuntu:
sudo apt-get install libcurl3
and cURL must be enabled in PHP, for example on Ubuntu:
sudo apt-get install php5-curl
Now, in the PHP script, you can open a cURL session with curl_init(), set the connection options (including the URL) with curl_setopt(), and download the JSON content into a string with curl_exec(). Then close the session with curl_close(). The rest of the code remains unchanged.
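The cURL sequence described above can be sketched as follows. The function name `curl_fetch`, the 30-second default timeout, and the exception on failure are assumptions of this sketch, not details from the original script.

```php
<?php
// Hypothetical helper: downloads $url with cURL and returns the body.
function curl_fetch(string $url, int $timeoutSeconds = 30): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    // Return the body from curl_exec() instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Upper limit for the whole transfer; the value is an assumed default.
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
    $body = curl_exec($ch);
    $error = curl_error($ch); // read the error text before closing
    curl_close($ch);
    if ($body === false) {
        throw new RuntimeException("cURL request failed: $error");
    }
    return $body;
}
```

It is a drop-in replacement for the file_get_contents() call: the returned string goes to json_decode() as before.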
As I have already said, I think an approach with the execute method is possible, but so far I have not managed to get a satisfactory result in this direction.
P.S. Please don't think I am trying to get the Habr audience to solve my test assignment for me. I submitted the solutions above long ago and have already received a response. I simply spent a lot of time on this task and would like to know whether I was heading in the right direction and what other approaches could have been used.