MIT course "Computer Systems Security". Lecture 9: "Web Application Security", part 3
Massachusetts Institute of Technology. Lecture course # 6.858. "Security of computer systems". Nikolai Zeldovich, James Mykens. year 2014
Computer Systems Security is a course on the development and implementation of secure computer systems. Lectures cover threat models, attacks that compromise security, and security methods based on the latest scientific work. Topics include operating system (OS) security, capabilities, information flow control, language security, network protocols, hardware protection and security in web applications.
Audience: so what prevents an attacker from finding the key? Where is this secret key? ')
Professor: yes, that is a good question. In most cases, the AWS client is not a browser, but some virtual machines running in the cloud. Thus, you only see the communication between the virtual machines. You can also imagine that users can somehow distribute these links or embed them somehow in HTML. If you have something like HTML on the board inside HTML or JavaScript source code, you will also have code to create such a query. So if I give you one of these things, you can make a request on my behalf.
Audience: Is it possible to use MAC for normal clients?
Professor: for normal - do you mean browsers?
Audience: for ordinary users.
Professor: the fact is that the question of where the key actually lives is extremely important. Because if a key can be stolen as easily as a cookie, then we don’t win anything. Therefore, in many cases, all these things are stored somewhere in the cloud and serve to exchange data between virtual machines and are transferred from server to server in the cloud too. In this way, the developer of the application starts a VM, which uses the outsourcing of a heap of things stored in AWS.
Audience: Is there a problem with network latency, so an attacker can send the same request immediately after the user and also get access?
Professor: yes, suffice it to say that several people defended dissertations on the topic of timestamp security. But you are absolutely right, because we have considered a rather rough example.
Imagine that here, in this example, the String To Sign in the DATE line we will have the value "Monday, June 4th." Then, if in some way the attacker can access all of this, then he will be able to repeat the user's request. The fact is that AWS allows you to use the expiration date of these things. So one thing you can do is add an Expires field here, and assume that the expiration date has been set.
Then I can give this link to a bunch of different people, and the server will check if their requests have expired.
Audience: even if the expiration date is only 200 milliseconds, but the attacker is watching the network, he can send just in case several copies of the request instead of one.
Professor: Exactly, if an attacker attacks a network and sees how these things are transmitted by wire, and there is enough room for maneuver on the expiration date, then he can definitely make this attack.
So this was an overview of how stateless cookies work. Here an interesting question arises: what does it mean to log out of the system, having cookies of this type? The answer is that in reality you are not leaving. I mean that you have this key, and whenever you want to send a request, you just send it. But the server could revoke your key.
Suppose the server has revoked the key. But you can create one of these GET things, and when you send a message to the server, it will say: "Yeah, I already know your User ID, User ID, the key has been revoked, so I will not fulfill your request." However, there are nuances, and if we talk about such things as SSL, then it is much easier to empower a person than to recall them.
Thus, there are several other things that you can use if you want to avoid traditional cookies to implement authentication. One of them is the use of the DOM repository, which contains information about the authentication of the client side. You could use the DOM repository to store some session states that are usually placed inside cookies.
If you remember from the last lecture, the DOM repository is a key interface of the values ​​that the browser provides to each source, that is, the browser takes them from there and inserts them into a string.
The good thing is, the DOM does not have such stupid rules regarding the same policy of the same origin. So, if these were ordinary cookies, you could do all these tricks with subdomains and the like. The DOM repository is actually strictly bound to one source, so you will not be able to extend any subdomain. Therefore, frameworks such as Meteor use this repository.
But note that if you want to store the authentication information in the DOM repository, you will need to write your own JavaScript code in order to transfer this information to the server, encrypt it, and so on. Here is what you need to do in this case. You could use client-side certificates, for example, the x.509 format, which contains information about the owner, the public key, the certification center information, and the digital signature. In these certificates, it is good that JavaScript does not have an explicit interface for accessing these things. So, unlike cookies, where there is always an “arms race” to find errors of a policy of the same origin, there is no explicit JavaScript interface for this in the certificates. So it is very good in terms of security.
One of the problems I mentioned briefly and which we will look at in detail in subsequent lectures is the recall of certificates. If a user leaves your organization, how can you take a certificate from him? It is quite difficult.
In addition, these things are not very convenient to use, because no one wants to install a bunch of certificates for each site that you visit. Therefore, authentication certificates are not very popular, with the exception of companies or organizations that relate to security with great responsibility. This concludes the discussion of cookies.
Let's talk now about protocol vulnerabilities in the web stack. One of the interesting types of attacks is to use errors in browser components, for example, when parsing URLs. So how can URL parsing bring trouble to us?
Suppose we have a URL of this kind, where for some reason strange characters are embedded at the end:
The question is, what is the origin of this particular URL? Flash would have thought that the hostname is example.com. But when the address analyzes the browser, it will think that the origin of the host in this case is foo.com.
This is very bad, because when we have two different entities that are confused in the origin of the origin of the same resource, this is fraught with unpleasant problems. For example, a flash code can be malicious and download some material from example.com. If the exploit was built into the page from foo.com, he could also do some evil things there. And then it takes some code from example.com and runs it with the authority of foo.com. Many complicated parsing rules like these make life very difficult. It happens all the time.
We have just considered content disinfection, the basic idea of ​​which is often much better when there are simpler parsing rules for this kind of thing. However, in retrospect, this is difficult to do, because HTML is already there. And now let's talk about my most favorite security vulnerability - files with the .jar extension, which are a ZIP archive with a part of a Java program. The object of the attack are browser JAR files, mainly Java applets. Around 2007, one great site called lifehacker.com explained how to embed zip files into images. It’s not quite clear who you are trying to hide from by doing this, but lifehacker.com convinces you that this can be done.
They mainly use the fact that if you look at image formats, such as GIF, then, as a rule, the parser runs from top to bottom. He first finds the information in the header, and then considers the remaining bits located at the bottom.
As it turned out, programs that usually manipulate ZIP files work from the bottom up, that is, opposite to the direction of parsing images. First, they find information in the file footer and unpack what is contained inside the archive. Thus, if you place an image file containing a ZIP archive, it will go through all the checks, even Flickr, like any other image, and it will even appear as an image in your browser.
But only you will know the hidden truth. Only you will be aware that if you take this file, you can unpack it and use the information enclosed there. It seems like a cheap trick, but hackers never sleep, they constantly want to ruin our lives. So how do they implement this idea?
They understand that JAR files are derived from the .zip format. This means that you can create a GIF animation or a static image that would have a JAR file, that is, executable JavaScript code, at the very bottom.
Later, people called this attack method GIFAR, half GIF, half JAR, and both half evil. It was awesome. When people first discovered this opportunity, they found it amazing, but they didn’t understand at all how to use it. But as it turned out, the following things can be done on its basis.
So how can you do this? You just use CAD. Take .gif, take .jar, use self-extracting archive - boom, and GIFAR attacked you!
So as soon as you got this, what can you do? There are some sensitive sites that allow users to provide data, but not arbitrary data types. So Flickr or something like this may not allow you to send a custom ActiveX or any other custom HTML. But you will be allowed to send images. So you could build one of these things and present it on one of these confidential sites, which allows you to send images. What do you need to do to make a successful attack in this case?
First, send this “stuffed” image to one of these sites. Secondly, use the XSS cross-site scripting attack method, exploiting the existing vulnerabilities. To do this, you need to insert an applet by writing the following expression in JavaScript:
This code exploits a cross-site scripting vulnerability, so it will be run in site content. GIFAR will pass origin checks because it comes from a site with a common source of origin, despite the fact that this code was inserted by an attacker.
So, now the attacker gets the opportunity to run this Java applet in the context of the victim's site with all the authority of origin. And one of these things will actually be correctly identified as a GIF image. But there is a hidden code. Let me remind you that first the browser unpacks the archived files, so first of all it will launch the JAR part, ignore the top part of the GIF. So it is actually quite amazing.
There are some pretty simple ways to fix this. For example, you can use an applet loader that understands that there should not be any random garbage. In many cases, metadata information is used that indicates the length of this resource. In this case, the loader will start, as expected, from the top, analyze its length, see that the applet ends at the top, and stop. It doesn’t care about the lower part, it is possible that it is even 0. In our case, such a loader will not help, as it will start processing the request from the lower, archived part, and stop in front of the upper one, ignoring it.
What I like about this is that it really shows how wide the stack of software for the Internet is. Taking only these two formats, GIF and JAR, you can create a really unpleasant attack.
You can do this with PDF files. You can put PDF instead of GIF and call this attack something harm PDFAR. But in the end, people have dealt with this problem, and vulnerabilities of this kind have now been eliminated.
Audience: What can you do with this kind of attack, which you can't do with a regular XSS cross-site scripting attack?
Professor: this is a good question. So, what's good about this is that Java can often be a more powerful tool than regular JavaScript, because it uses slightly different rules, the same origin policy, and the like. But you are right that if you can execute cross-site scripts, the launch of JavaScript itself can cause quite a lot of damage. But the main advantage of this method is that this attack technology works inside the applet and can do what the usual malicious script code is unable to do.
So, as I said, this is my favorite attack of all time, mainly only because it made respectable people in the field of computer security come up with a word such as GIFAR.
Another interesting thing is the use of attacks that are based on time. Usually, people do not think of time as a resource that can be a vector for attacks. But as I noted a few minutes ago, that time can actually be a means of allowing an exploit to be introduced into the system.
The specific attack that I'm going to talk about with you is an attack on a hidden channel, Covert channel attack. The idea of ​​this attack is that the attacker finds a way to exchange information between the two applications, and this exchange operation is not authorized. An attacker is somehow using some part of the system to transfer bits of information between two different resources.
A good example of this is the sniff-based CSS attack. What is such an attack?
Suppose an attacker has a website that a user can visit. Getting a user to visit a website is actually quite simple. You create an advertisement or send a phishing email.
Thus, the attacker has a website that the user visits. And the attacker's goal is to find out what other sites the user has visited. An attacker may want to find out for several reasons. Perhaps he’s interested in the user's search queries, or he is trying to figure out where this person works, or maybe he wants to know if this person enters some “shameful” sites and so on.
How is the attacker going to do this if the only thing he controls is the website he wants to convince a user to log in on? A possible way is to use link colors. As you know, if you clicked on a link once, the next time it will appear in your browser of a different color, indicating that you have already clicked on this link. This is actually a security vulnerability.
Because it means that an attacker on his site can generate a huge list of possible URLs that you could visit, and then use JavaScript to see what color these URLs acquired. And if the URL color is purple, it means that you have visited this site. So this is quite a subtle trick.
Interestingly, in many cases you don’t even need to display URLs. You can arrange links in the form of headings on the screen like a domino, and these headings will change color if the user used this link. Perhaps you will think it is not too tedious to scan all these URLs of sites visited by the user? But after all, this process can be optimized by using several filtered passes on the list of addresses. For example, you can first see if the user has visited the top-level URLs — cnn.com, Facebook.com, and so on and so forth. If the answer is yes, you can select the most visited top-level pages. Thus, you can really limit the amount of search.
So the harmless function that browsers support to help the user, saying, “hey, buddy, that's where you visited!”, Can be used by the attacker as compromising evidence on you.
How can I prevent Covert channel attack? In practice, it is done so that the browser simply deceives JavaScript about the correct color of the links. When JavaScript tries to look at the link and its styling, the browser always says that the link has never been visited. And although this does not seem to be a very good solution, it prevents this kind of attack. I think we can come to terms with the fact that JavaScript will not be able to read the colors of the links, this is not the end of the world. But does this eliminate the problem of an attacker who wants to find out which sites you visited? Of course not.
Thus, the next attack that an attacker can organize is a cache-based attack. Its purpose is the same - the hacker wants to know which sites you visit.
As an exploit, it uses cached information, which serves to provide the user with quick access to the previously visited site. In fact, the ability to quickly access a page is the reason why you first cache its address.
, — , , , , , . , , , , . ?
, . . , .
, Google Map Tiles. , «» Google Map, , , , . .
? , . , , , . , .
, , , JavaScript . , , DNS. : , , DNS , . , , -, , -. , , DNS . , , DNS , .