I know that the topic of SQL injection has already been harassed by everyone.
However, the topic is very exciting. They constantly talk about it and fan the fire of distrust towards themselves, scare panic and it becomes even scary to those who were confident in their code. ')
There was already a mass of articles about how to prevent injections - I will not repeat it - it all boils down to several banal points of practice:
All data from any external sources (even if this data is taken from XML, retrieved from a super-secret and secure site) should be checked.
Be sure to check the type and lead to what you expect. If you were waiting for an integer number, and letters came and (or) a dot, we report an error somewhere, and give the client 500/404 / or the code you need in response.
At the input, it receives the variable to be examined, and, at the output, it returns the correct value, or null if the value is incorrect or empty. This is only an example, but the essence is clear - you can make your super function with blackjack and fairies of not heavy behavior.
We must remember and take into account the possibility of spaces in the input parameters of the forms. For example, this function correctly discards spaces by returning only a number. Thus, "123" and "123" for this function are correct values, but it will always return "123"
In the SQL query that you write, EVERYTHING should be allocated, even what seems to be unnecessary (for example, numeric values)
Those. it is necessary to arrange the correct quotes, as for the names of fields, tables, etc., and for any values.
Firstly, it will save you from unnecessary pauses for thinking: “to put a quote or not”, and secondly, it will leave less chance to write a potentially vulnerable query like
"select * from`table`where`id`={$id}"
(I think for many, when it comes to id it is written that way).
All data (and this is not only what has come from the outside but also what has been formed inside the engine function) should be screened.
You can use mysql_real_escape_string, you can use your own.
Quotation marks inside the parameters for it are not needed - it substitutes them itself - it will save from the fact that you can accidentally forget about n2 :)
Replacement data is taken from the mysql manual from here.
Now about mysql_real_escape_string, which is already on the edge of the teeth , one of the parameters of which is connection id. For example, this is not very convenient for me, because not for all uses I have it But why does he need this function for some reason? They wrote that without it, it works incorrectly :) If you look into the mysql sources, then this function is in mysys / charset.c and uses the identifier to get the CHARSET_INFO by it, and in turn to determine only the multibyte encoding or not. And how much is the length of one character. In any case, I understood it from listing. Maybe there is a guru C? Correct me if I'm wrong.
Since I work with UTF when connecting to the base or, in the worst case, with WIN1251, prepareStr can be completely dispensed with without encoding with the encoding, so if you have it really exotic, then it is better to use mysql_real_escape_string.
The prepareStr function , like mysql_real_escape_string, has one vulnerability - the fact is that both of them do not escape the "%" and "_" that are used in the LIKE construction. Inside LIKE, the "\%" and "\ _" type of shielding works, but outside - no - returns two characters. Why? - unknown, but true. Therefore, if you want to get rid of this ambiguity, I see two ways - either use a different version of the function for LIKE or use one function, but for universality you have to replace all the characters “%” and “_” with something like
"CONCAT( '%_', '\%\_', '%_ ')"
Perhaps there is a better solution, but I don’t know about it.
Use placeholders / prepared statements (or integrate their PDO / ActiveRecord / ORM, etc.)
By the way, these are completely different things. Briefly, placeholders are an assistant that forms a plain text query to the database, but escaping all the parameters itself, i.e. it eliminates the need for "...". mysql_real_escape_string ($ param). "...". And prepared statements is a way to ask for a database to prepare a query, for example, for a multiple call.
From my point of view, using a prepared statement for a regular site and a regular CRUD is like shooting a bazooka at sparrows. They have many pitfalls with caching, etc. If you know them or you know that this is necessary for you - well, then the matter is different.
And the most important thing to learn about SQL injection is that there is no silver bullet.
When writing queries you need to use the brain and ass.BRAIN - because he must understand what he is writing, and the ASS - because she feels that something is wrong;)
And if without humor - it is funny to me when people using PDO or a prepared statement believe that everything, SQL injections in their projects are impossible - this is not so, because all of this is just TOOLS that reduce the number of places where you need to think, but not clean their. And if a person has made a vulnerable request, it doesn’t matter what he throws him into the database through - he will be vulnerable.
An example encountered in one of the projects that I analyzed (such as this):
"select * from`{$table}`where`col`LIKE ?"
And the system correctly escaped everything and substituted it, except that in the $ table variable our encoder was very sure - because it was taken from the plug-in configuration file for the engine. And this file was in the directory where upload was allowed ... And so, because of one vulnerable plug-in bad uncles leaked the entire base ...
Somehow like this. I hope I helped someone with porridge in my head after heaps of articles and readings, how will I protect myself, and I have more people who are confident in their code. I look forward to your comments and additions, and thank you for reading to the end.
PS About errors, if you find it - please in PM.
Interesting from the comments
As zapimir rightly noted , to use mysql_real_escape_string is CORRECT, a locale setting via mysql_set_charset is required. SET NAMES does not affect it (will be taken by default)