Recently I witnessed an interesting dispute about how you should really determine the IP address of the end user from PHP scripts.
Actually, every word of the title reflects the real situation. It was a holy war, fueled by wonderful spring weather, in which, I believe, nobody was right or wrong, but which pushed me into a mini-study and, luckily, finally settled my understanding of this divisive yet actually very simple question.
For those who, like me, were sure they understood everything but were afraid to ask, or were too lazy to dig into the details: read on below the cut.
Background
While developing a VOD service for the Samsung SmartTV platform, we certainly need to know the user's country, so as not to inadvertently show a happy user a movie in a place where the copyright holder prohibits it ... missteps).
[As noted in the comments, this is a legal question and fraud is possible, but this article is not about how to prevent such fraud; it is about how to make PHP and nginx get along.]
On the server we have the following: php-fpm + nginx
How do we determine the country? Through the user's IP address and the MaxMind GeoIP database, of course.
"Pfff ....", we all thought, "what could be simpler." And so as not to reinvent the wheel, I googled Stack Overflow, even read through every line of what I found, bolted it on, and left it there as the code grew:
public function getUserHostAddress(){ if (!empty($_SERVER['HTTP_X_REAL_IP']))
And everything worked! For almost a year ... until something unexpected happened. Unexpected for this code, naturally ...
How to confuse PHP, or the proxy chain (still part of the backstory)
It broke! It happened when we had to integrate one of the payment systems: all this code collapsed because HTTP_X_FORWARDED_FOR contained not a single address but a comma-separated list of addresses (which is, strictly speaking, legitimate and acceptable, and is not even covered by the PHP documentation).
And no one would have noticed if HTTP_X_REAL_IP or HTTP_CLIENT_IP (which the documentation does not cover either) had contained the IP we were looking for, but alas, they were empty :(
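To make the failure concrete, here is a minimal sketch of a "first non-empty header wins" lookup of the kind described above. It is not the original code: the function name `naiveClientIp`, the lookup order and the sample addresses are assumptions made for illustration.

```php
<?php
// Hypothetical reconstruction of a "first non-empty header wins" lookup.
// Not the original code: the name and the lookup order are assumptions.
function naiveClientIp(array $server): string
{
    foreach (['HTTP_X_REAL_IP', 'HTTP_CLIENT_IP', 'HTTP_X_FORWARDED_FOR'] as $key) {
        if (!empty($server[$key])) {
            return $server[$key]; // returned verbatim, no parsing or validation
        }
    }
    return $server['REMOTE_ADDR'] ?? '';
}

// Simulated request that passed through an extra proxy
// (addresses are from the documentation ranges).
$server = [
    'HTTP_X_FORWARDED_FOR' => '203.0.113.7, 198.51.100.20',
    'REMOTE_ADDR'          => '127.0.0.1',
];

$ip = naiveClientIp($server);

// The result is the whole comma-separated list, which is not a valid
// IP address, so a GeoIP lookup keyed on it cannot succeed.
$isValid = filter_var($ip, FILTER_VALIDATE_IP) !== false;
```

With HTTP_X_REAL_IP and HTTP_CLIENT_IP empty, the function falls through to HTTP_X_FORWARDED_FOR and hands the unparsed list to the caller.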
"Well, okay", we thought (by now I was no longer alone), "we will rewrite everything and ask the admins to put the user's IP into the REMOTE_ADDR variable":
public function getUserHostAddress(){ $ip=$_SERVER['REMOTE_ADDR']; return $ip; }
And everything worked! For almost a month ... until something unexpected happened. Unexpected for this code, naturally ...
A spring dispute between tough men (no irony here: they really are cool)
It broke! This happened because we had to update nginx, so we turned to the professionals in this business: our admins.
They, in turn, decided to update the config and get rid of our "crutch / not a crutch" (we had not figured out which yet) that forwarded the address into REMOTE_ADDR.
REMOTE_ADDR was left unchanged, i.e. it now showed something like "127.0.0.1",
and the user's IP was passed in HTTP_X_FORWARDED_FOR (which, incidentally, was easily overridden by sending the header `x-forwarded-for: 999.999.999.999` from the browser)
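The `999.999.999.999` value is not even a syntactically valid address, which PHP's built-in `filter_var` can detect. A minimal sketch follows; the helper name `isValidIp` is hypothetical. Note that this is only a syntax check: a well-formed but forged address still passes, so validation alone does not make the header trustworthy.

```php
<?php
// Hypothetical helper: syntactic validation only, not a trust decision.
function isValidIp(string $value): bool
{
    return filter_var(trim($value), FILTER_VALIDATE_IP) !== false;
}

$malformed = isValidIp('999.999.999.999'); // false: octets are out of range
$forged    = isValidIp('203.0.113.7');     // true: well-formed, yet could still be forged
```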
And then it started (R = Razrab, i.e. the developer; A = Admin):
A: You broke everything. Since nginx is our proxy, the address you need is in HTTP_X_FORWARDED_FOR, and REMOTE_ADDR will hold the real address of the client connecting to php-fpm (i.e. 127.0.0.1).
R: But we cannot trust HTTP_X_FORWARDED_FOR, because it can easily be overridden by a header sent to the server (citing a very interesting article).
A: No, we will make sure it contains the real IP of the end user, and REMOTE_ADDR will contain the real address of the client connecting to PHP.
R: Then we lose track of the proxy chain. Besides, these configs may not hold on another server (say, one with no proxy), so for portability push everything into REMOTE_ADDR, which will work in any case.
... this is the short version, with the swearing removed ...
In the end, of course, everything got going again ... and we settled on transparent proxying, where PHP thinks that clients connect to it directly without any proxies, and all variables (or rather, the one we care about) are in the state we need.
However, this is not quite feng shui, because in reality we do have a proxy, and maybe more than one.
Who is to blame and who is right
That is for you to judge, but the answer is: no one!
If clients really do connect directly to PHP, or we use transparent proxying, then everything is simple: use REMOTE_ADDR and enjoy.
But what about feng shui? Where should the address live if we use normal proxying and want PHP to know about it?
The recipe ... but not a panacea:
- REMOTE_ADDR - contains the IP address of whatever connects to PHP directly, in our case nginx, i.e. 127.0.0.1
- HTTP_X_FORWARDED_FOR - contains a chain of proxy addresses, the last of which is the IP of the direct client that accessed the proxy server. Here we consider two special cases:
  - Non-cascading proxying. In HTTP_X_FORWARDED_FOR, the last or only IP address (depending on what the user did or did not send in the x-forwarded-for header) will be the real user address we are after.
    It would seem, what is the problem with parsing this variable and taking the last element? But in our case the settings were not quite correct: the entire HTTP_X_FORWARDED_FOR was replaced with the x-forwarded-for header from the browser, whereas the real IP of the direct client should have been appended to it.
    For example, as checked on a production VPS hosting:
    Trusting such data is also scary, but if everything is configured correctly, the last IP will be the user's address regardless of what arrives in the headers.
  - Cascading proxying. In this case HTTP_X_FORWARDED_FOR really is a chain of proxy addresses, and the last is the IP of the direct client that accessed the proxy server. But that is not the real IP of the user, only the IP of the previous proxy in the chain.
    It would seem, what is the problem with parsing this variable and taking the first element? But, as shown in the figure above, this is certainly not reliable data, and the user can trivially mislead us by sending whatever IP they like as the first element of x-forwarded-for.
- HTTP_X_REAL_IP (or any other variable that Admin and Razrab agree on) - contains the IP of the user accessing PHP, or of the first untrusted proxy counting from the server (which for us is equivalent to the client's address).
  For convenience, you can use a special nginx module (http_realip) that removes the need to distinguish cascading from non-cascading proxying. However, "in the standard builds for CentOS, Debian and Fedora, nginx is for some reason compiled without the --with-http_realip_module parameter" (c) Admin. The module also requires that the chain in HTTP_X_FORWARDED_FOR be formed correctly and that the addresses of the trusted proxy servers be configured, since only from those can we accept the last element of HTTP_X_FORWARDED_FOR.
  However, again, in the general case HTTP_X_REAL_IP is not the real IP of the end user, but only the first IP in the list of proxies under cascading proxying.
  If the proxying is not cascading, though, it may well hold the address of the end user.
  And if the proxying is cascading and the http_realip module is configured correctly, it should hold either the end user's IP or the correct IP of the first untrusted proxy counting from the PHP server, which suits us as well.
- HTTP_CLIENT_IP (or any other variable that Admin and Razrab agree on) - contains the first IP from HTTP_X_FORWARDED_FOR for any type of proxying and, in the absence of proxying, the contents of the http client-ip header. It can be used for reference only, and in no case to determine the real IP of the user.
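The recipe above can be sketched in PHP as a right-to-left walk over HTTP_X_FORWARDED_FOR that stops at the first address not belonging to a trusted proxy. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `resolveClientIp` and the exact-match `$trustedProxies` list are hypothetical, and the sketch assumes every trusted proxy appends the address of its direct client to the header. The idea is similar in spirit to what the http_realip module does on the nginx side.

```php
<?php
// Sketch: resolve the client IP by walking X-Forwarded-For from right to left.
// $trustedProxies is a hypothetical exact-match list agreed with the admins.
function resolveClientIp(array $server, array $trustedProxies): string
{
    $remote = $server['REMOTE_ADDR'] ?? '';

    // Headers only mean something if the direct peer is one of our own proxies.
    if (!in_array($remote, $trustedProxies, true)) {
        return $remote;
    }

    $chain = array_map('trim', explode(',', $server['HTTP_X_FORWARDED_FOR'] ?? ''));

    // The rightmost entries were appended by the proxies closest to us.
    foreach (array_reverse($chain) as $candidate) {
        if (filter_var($candidate, FILTER_VALIDATE_IP) === false) {
            break; // malformed or empty entry: stop, do not look further left
        }
        if (!in_array($candidate, $trustedProxies, true)) {
            return $candidate; // first untrusted hop is treated as the client
        }
        $remote = $candidate; // trusted hop: keep walking left
    }

    return $remote;
}

// Non-cascading case: the forged leftmost entry sent by the browser is never reached.
$client = resolveClientIp(
    ['REMOTE_ADDR' => '127.0.0.1', 'HTTP_X_FORWARDED_FOR' => '999.999.999.999, 203.0.113.7'],
    ['127.0.0.1']
);
```

Anything to the left of the first untrusted address is ignored, because it could have been supplied by the user.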
In conclusion
There are several proxying options for PHP + nginx:
- Transparent - the contents of the variables in _SERVER (including REMOTE_ADDR) are the same as if we were working with PHP directly
- Non-transparent, non-cascading - here Admin and Razrab need to agree on where the real IP address of the user will be stored :)
- Non-transparent, cascading - the same as non-transparent non-cascading, plus a correctly configured nginx module. Also remember that cascading proxying is possible, and that the user is evil and can send very wrong data into _SERVER["HTTP_xxxx"]
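One way to express the "agree on a variable" idea in code is a single lookup whose source key comes from configuration. This is only a sketch: the function name `clientIpFor` and the `$agreedKey` setting are hypothetical. It also assumes that, in the non-transparent setups, the proxy always overwrites the agreed variable itself, so that a value sent by the client never reaches PHP.

```php
<?php
// Sketch: one lookup for the setups listed above.
// $agreedKey is a hypothetical setting; null means transparent or no proxying.
function clientIpFor(array $server, ?string $agreedKey = null): ?string
{
    $key   = $agreedKey ?? 'REMOTE_ADDR';
    $value = trim($server[$key] ?? '');

    // filter_var returns the address on success and false on failure.
    $valid = filter_var($value, FILTER_VALIDATE_IP);

    return $valid === false ? null : $valid;
}

// Transparent proxying: REMOTE_ADDR already holds the user's address.
$direct = clientIpFor(['REMOTE_ADDR' => '203.0.113.7']);

// Non-transparent proxying: Admin and Razrab agreed on HTTP_X_REAL_IP.
$proxied = clientIpFor(
    ['REMOTE_ADDR' => '127.0.0.1', 'HTTP_X_REAL_IP' => '203.0.113.7'],
    'HTTP_X_REAL_IP'
);
```

Returning `null` when the agreed variable is missing or malformed makes a misconfigured proxy visible instead of silently falling back to 127.0.0.1.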
PS
Later we will bring the settings in line with feng shui and get rid of transparent proxying, and also write a universal function that determines the IP for both kinds of proxying.
PPS
For fun, if anyone is interested: if someone writes this function and the nginx config for us in the comments and we use them, then, word of honor, they will get 100 rubles on their phone.
But the function and the config must be truly canonical and take everything into account :) All the clues are in the article.
The main thing is Zen: take your time, because the first answers may contain errors that you can learn from; but do not dawdle either, because the first correct answer may be posted before yours.
Thanks to all. Have a nice spring! Negotiate with your colleagues and love them! :)
UPD:
Our own implementation:
public function getUserHostAddress( $ip_param_name = null, $allow_non_trusted = false, array $non_trusted_param_names = array('HTTP_X_REAL_IP','HTTP_CLIENT_IP','HTTP_X_FORWARDED_FOR','REMOTE_ADDR') ){ if(empty($ip_param_name) || !is_string($ip_param_name)){