MIT course "Computer Systems Security". Lecture 14: "SSL and HTTPS", part 2

Massachusetts Institute of Technology. Course 6.858, "Computer Systems Security". Nickolai Zeldovich, James Mickens. 2014.

Computer Systems Security is a course on the development and implementation of secure computer systems. Lectures cover threat models, attacks that compromise security, and security methods based on the latest scientific work. Topics include operating system (OS) security, capabilities, information flow control, language security, network protocols, hardware protection and security in web applications.

Lecture 1: "Introduction: threat models" Part 1 / Part 2 / Part 3
Lecture 2: "Control Hijacking Attacks" Part 1 / Part 2 / Part 3
Lecture 3: "Buffer overflow: exploits and protection" Part 1 / Part 2 / Part 3
Lecture 4: "Privilege Separation" Part 1 / Part 2 / Part 3
Lecture 5: "Where Security Errors Come From" Part 1 / Part 2
Lecture 6: "Capabilities" Part 1 / Part 2 / Part 3
Lecture 7: "Sandboxing Native Client" Part 1 / Part 2 / Part 3
Lecture 8: "Web Security Model" Part 1 / Part 2 / Part 3
Lecture 9: "Web Application Security" Part 1 / Part 2 / Part 3
Lecture 10: "Symbolic execution" Part 1 / Part 2 / Part 3
Lecture 11: "Ur / Web programming language" Part 1 / Part 2 / Part 3
Lecture 12: "Network Security" Part 1 / Part 2 / Part 3
Lecture 13: "Network Protocols" Part 1 / Part 2 / Part 3
Lecture 14: "SSL and HTTPS" Part 1 / Part 2 / Part 3

The first reason is that OCSP adds latency to every request you make. Every time you want to connect to a server, you first have to contact the OCSP server, wait for its answer, and only then proceed. That added connection latency does nothing for the protocol's popularity.

The second reason is that you do not want OCSP to affect your ability to browse the web. Suppose the OCSP server goes down. If clients insisted on an answer, you would effectively lose the Internet altogether, because the protocol would conclude that it cannot verify anyone's certificate, so every site might be bad and you should not be allowed to visit any of them. Nobody wants that, so most clients treat a failure to reach the OCSP server as a positive answer.

This is really bad in terms of security. Because if you are an attacker and want to convince someone that you have a legitimate certificate, but in fact this certificate has been revoked, all you have to do is somehow prevent the client from communicating with the OCSP server.

The client will say this: “I am trying to request verification of the certificate of the site I need, but this OCSP does not seem to be around, so I’ll just go to this site.” So using OCSP is not a good plan.
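The client behavior described above can be sketched in a few lines. This is only an illustration of the decision logic, not a real OCSP client: the names `OcspResult` and `accept_certificate` and the `soft_fail` parameter are assumptions made up for this example.

```python
from enum import Enum


class OcspResult(Enum):
    GOOD = "good"                # the OCSP server says the certificate is valid
    REVOKED = "revoked"          # the OCSP server says the certificate was revoked
    UNREACHABLE = "unreachable"  # no answer: an outage, or an attacker blocking traffic


def accept_certificate(result: OcspResult, soft_fail: bool = True) -> bool:
    """Decide whether to proceed with a connection after a revocation check."""
    if result is OcspResult.GOOD:
        return True
    if result is OcspResult.REVOKED:
        return False
    # No answer at all: a soft-fail client proceeds anyway,
    # a hard-fail client refuses to connect.
    return soft_fail


# An attacker holding a revoked certificate only needs to turn
# REVOKED into UNREACHABLE by blocking the OCSP traffic.
print(accept_certificate(OcspResult.REVOKED))      # False
print(accept_certificate(OcspResult.UNREACHABLE))  # True
```

With `soft_fail=False` the last call returns `False`, which is the safer choice but also the one that breaks browsing whenever the OCSP server is down.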

In practice, people try to build alternatives, because clients tend to get revocation checking badly wrong. For example, the Chrome web browser ships with a built-in list of certificates that Google really wants to revoke. So if someone mistakenly issues a certificate for Gmail or another important site, such as Facebook or Amazon, the next release of Chrome will already carry that information in its built-in list. You then do not have to contact a CRL server or talk to OCSP. If the browser finds that the certificate is no longer valid, the client rejects it.

Student: Let's say I stole the secret key of a CA. What happens then?

Professor: Yes, that has bad consequences. I do not think there is any real solution to this problem. There certainly have been cases where certificate authorities were compromised. In 2011, for example, two CAs were compromised and ended up issuing fraudulent certificates for Gmail, Facebook, and so on. It is not entirely clear how this happened; perhaps someone really did steal their secret keys. But whatever the cause, these CAs were removed from the list of trusted certificate authorities built into browsers, so in the next release of Chrome they were no longer there.

This did cause trouble for the legitimate holders of certificates issued by those CAs, because their existing certificates became invalid and they had to get new ones. So in practice, all this business with certificates is a rather messy matter.

So, that is the general plan for how certificates work. They are better than Kerberos in the sense that you no longer need this guy to be online all the time. They are also more scalable, because you can have several CAs (where Kerberos has KDCs) and you do not need to talk to them every time you connect.

Another interesting feature of this protocol is that, unlike Kerberos, it does not require both sides of the connection to authenticate. You can connect to a web server without having a certificate yourself, and this happens all the time. If you visit amazon.com, you are going to check that Amazon is the right site, but Amazon has no idea who you are and will not know until you log in to the site. So at the level of the cryptographic protocol, you have no certificate, and Amazon has one.

This is much better than Kerberos, where you must have an entry in the Kerberos database in order to connect to Kerberos services. The only disadvantage of this protocol is that the server must have a certificate. So you cannot connect to a server and say, “hey, let's just encrypt our stuff. I have no idea who you are, and you have no idea who I am, but let's encrypt it anyway.” This is called opportunistic encryption, and of course it is vulnerable to man-in-the-middle attacks, because you are encrypting with someone without knowing who it is. Still, as long as nobody is actively mounting an attack against you, your packets are at least encrypted and protected from eavesdropping.

So it is a pity that these protocols we are considering here - SSL, TLS - do not offer this kind of opportunistic encryption. But such is life.

Student: I'm just curious. Let's say they create new key pairs once a year. Why couldn't you spend that whole year trying to break that particular key?

Professor: I think you could. And things do go wrong with this scheme. Here, as with Kerberos, people start out with strong cryptography, but over time it gets weaker and weaker. Computers become faster, and new algorithms are developed that break the cryptography. If people do not take care to keep strengthening it, the problems grow. This was the case, for example, with the way a large number of certificates were signed.

There are two pieces here. One is the public-key signature scheme itself. The other follows from the fact that public-key cryptography has limits on what it can process: when you sign a message, you actually sign only a hash of that message, because signing a giant message is hard but signing a compact hash is easy.

The problem arose because people used MD5 as the hash function, reducing a huge message to a 128-bit value that was then signed. Perhaps 20 years ago MD5 was good, but over time people discovered weaknesses in it that an attacker can exploit.
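To make the "sign the hash, not the message" point concrete, here is a small sketch using Python's standard `hashlib`. It only computes the digests that a signature scheme would then sign; the signing step is omitted, and the helper name `digest_to_sign` and the sample message are assumptions made up for this example.

```python
import hashlib


def digest_to_sign(message: bytes, algorithm: str) -> bytes:
    """Return the fixed-size digest that would be handed to the signature scheme."""
    return hashlib.new(algorithm, message).digest()


# A hypothetical, deliberately large "certificate body".
message = b"name=www.example.com; public key and validity period go here; " * 10000

md5_digest = digest_to_sign(message, "md5")
sha256_digest = digest_to_sign(message, "sha256")

print(len(message))            # size of the original message in bytes
print(len(md5_digest) * 8)     # 128
print(len(sha256_digest) * 8)  # 256
```

Because the signature covers only the digest, any two messages that hash to the same value share the same valid signature. That is why collisions in the hash function translate directly into forged certificates.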

Suppose at some point someone asks for a certificate with a specific MD5 hash, and has carefully constructed another message that hashes to the same MD5 value. As a result, he holds the CA's signature over that hash, and along with it a different message, or a different key, or a different name, and can now convince someone that it was signed by the right CA. And this really happens. Likewise, if you spend a lot of time trying to crack one key, you will eventually succeed. If a certificate relies on cryptography that is open to brute force, it can be broken.

Another example of cryptography that ages badly is RSA. We did not talk about RSA, but it is one of those public-key cryptosystems that let you encrypt and sign messages. These days you can spend a lot of money and, in the end, break 1000-bit RSA keys. It may take a huge amount of work, but it is quite feasible within a year. You can ask a certificate authority to sign a message, or you can simply take someone's existing public key and try to brute-force the corresponding private key. So you have to keep up with the attacker: use larger RSA keys, or use a different cryptographic scheme.

For example, people no longer use MD5 hashes in certificates. They use the SHA-1 cryptographic hash algorithm. For some time it provided the necessary security, but today it is a weak defense. Google is now actively pushing web and browser developers to abandon SHA-1 in favor of another hash function, because it is fairly clear that in perhaps 5 or 10 years SHA-1 will be easy to attack. Weaknesses in it have already been demonstrated.

So, I suppose, there is no silver bullet. You just have to keep pace with the attackers. The problem is real: everything we have talked about relies on the cryptography being correct, or at least very hard to break, so you must choose appropriate parameters. At least certificates have an expiration date, so you can choose parameters that need to hold up for 1 year rather than 10.

The CA's key is a more serious problem, since it has no mandatory expiration date. So there you should choose more aggressive parameters, for example 4000- or 6000-bit RSA keys, or a different scheme, or all of these together, and certainly not SHA-1.

And now let's see how we integrate this protocol into a specific application, namely into a web browser. If you want to communicate online or communicate with sites using cryptography, there are three things in the browser that we need to protect.

The first thing, A, is protecting data on the network. This is relatively easy, because we are just going to run a protocol very similar to the one I have described so far. We will encrypt all messages, sign them, make sure they have not been tampered with, and do all those wonderful things. That is how we protect the data.

But there are two more things in the web browser that we really have to worry about. The first, B, is the code that runs in the browser, such as JavaScript, and the important data stored in the browser, such as your cookies or local storage. All of this has to be protected somehow from attackers. In a second I will tell you how.

The last, C, which people often do not think about but which can be a real problem in practice, is protecting the user interface. The reason is that, ultimately, most of the confidential data we care about comes from the user. The user types data into some site, and probably has several tabs with different sites open at the same time, so the user needs to be able to tell which site they are actually interacting with at any given moment.

If the user accidentally types an Amazon password into some web forum, it may or may not be disastrous, depending on how much they care about that password, but it will still be unpleasant. So you really want a good user interface that helps users understand what they are doing: whether they are typing sensitive data into the correct website, and what will happen to that data after they submit it. This turns out to be a rather important issue for protecting web applications.

So let's talk about how modern browsers handle A, B, and C. As I mentioned earlier, to protect data on the network we will simply use this protocol, called SSL or TLS, which gives us encryption and authentication of the data.

This is very similar to what we discussed, and includes certificate authorities, and so on. And then, of course, there are many more details. For example, TLS is extremely complicated, but we will not consider it from this point of view. We will focus on browser protection, which is much more interesting. We need to make sure that any code or data delivered over unencrypted connections cannot change the code and data received from an encrypted connection, because our threat model is such that everything unencrypted can be faked by the attacker over the network.

So if we have some unencrypted JavaScript code running in our browser, we have to assume that an attacker could have tampered with it, because it was not encrypted and was not authenticated over the network. Therefore, we must prevent it from interfering with any page that was delivered over an encrypted connection.

So the general plan is to introduce a new URL scheme, which we will call HTTPS. You often see it in URLs. The point of the new scheme is that these URLs are simply different from HTTP addresses. If you have a URL that starts with https://, it has a different origin than ordinary HTTP URLs, because the latter go over unencrypted connections, while these go through SSL/TLS. So you will never confuse the two kinds of addresses, as long as the same-origin policy works correctly.
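The effect of the scheme on the origin can be illustrated with a short sketch. It treats an origin as the triple (scheme, host, port), which is the usual definition; the helper names `origin` and `same_origin` and the default-port table are assumptions for this example, not a browser's actual implementation.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Default ports, so that https://host/ and https://host:443/ compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Tuple[str, str, int]:
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme, -1)
    return (scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    return origin(a) == origin(b)


print(same_origin("http://amazon.com/cart", "https://amazon.com/cart"))  # False
print(same_origin("https://amazon.com/a", "https://amazon.com:443/b"))   # True
```

The first pair differs only in scheme, and that alone is enough to put the two pages in different origins.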

So this is one piece of the puzzle. But then you should also make sure that you correctly distinguish the encrypted sites from each other, as for historical reasons they use different cookie policies. So let's first talk about how we will distinguish different encrypted sites from each other.

The plan is that the hostname in the URL must be the name in the certificate. In effect, the CAs are going to sign the hostname that appears in the URL as the name bound to the web server's public key. So Amazon presumably has a certificate for www.amazon.com. That is the name in our table, paired with the public key that corresponds to their private key.

This is what the browser will look for. If it tries to connect to or fetch a URL at foo.com, the server had better present a valid certificate for exactly foo.com. Otherwise, say, we tried to contact one party and reached another, whose certificate carries a completely different name from the one we connected to. That is a certificate mismatch.
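The name check described above can be sketched as follows. This is a deliberately simplified illustration that only does an exact, case-insensitive comparison; real TLS clients also handle things like wildcard names, which are not covered here. The function name `hostname_matches` is an assumption for this example.

```python
from typing import Iterable
from urllib.parse import urlsplit


def hostname_matches(url: str, cert_names: Iterable[str]) -> bool:
    """Check that the host in the URL is one of the names in the certificate.

    Simplified: exact match only, ignoring case and a trailing dot.
    """
    host = (urlsplit(url).hostname or "").rstrip(".")
    if not host:
        return False
    return any(host == name.lower().rstrip(".") for name in cert_names)


print(hostname_matches("https://foo.com/login", ["foo.com"]))  # True
print(hostname_matches("https://foo.com/login", ["bar.com"]))  # False
```

Note that under this exact-match rule a certificate for www.amazon.com does not cover amazon.com; the two are different names.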

This is how we will distinguish sites from each other: we enlist the CAs to help tell them apart, because CAs promise to issue certificates only to the rightful owners of those names. So this is the same-origin policy part, by which we separate code. As you remember, cookies have a slightly different policy. It is almost the same-origin policy, but not quite. Cookies have a so-called Secure flag. The rule is that if a cookie has this flag, it is sent only in response to, or along with, HTTPS requests. Cookies with the Secure flag and cookies without it relate to each other the way HTTPS and HTTP requests do.

It is a bit difficult. It would be simpler if the cookie simply indicated that this is a cookie for the HTTPS host, and this is the cookie for the HTTP host, and they are completely different. This would be very clear in terms of isolating secure sites from unsafe sites. Unfortunately, for historical reasons, cookies use this strange kind of interaction.

Therefore, if a cookie is marked as secure, it applies only to HTTPS sites with the matching host. Secure cookies apply only to HTTPS URLs for that host, while non-secure cookies apply to both kinds of URL, https and http, and in just a second this will become a source of problems for us.
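The asymmetry in the cookie rule is easy to see in code. The sketch below models only the Secure flag and ignores host and path matching; the `Cookie` class and `cookies_to_send` function are assumptions invented for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cookie:
    name: str
    value: str
    secure: bool = False  # the Secure flag


def cookies_to_send(jar: List[Cookie], scheme: str) -> List[Cookie]:
    """Pick the cookies that accompany a request with the given URL scheme."""
    is_https = scheme.lower() == "https"
    # Secure cookies go only over HTTPS; non-secure cookies go over both.
    return [c for c in jar if is_https or not c.secure]


jar = [Cookie("session", "abc123", secure=True), Cookie("lang", "en")]

print([c.name for c in cookies_to_send(jar, "http")])   # ['lang']
print([c.name for c in cookies_to_send(jar, "https")])  # ['session', 'lang']
```

The second line of output is the troublesome part: a non-secure cookie, which a network attacker could have planted over plain HTTP, is still delivered to the HTTPS site.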

The final touch that web browsers add to help with this plan is a user interface element: they show a lock icon for users to see. So you are supposed to pay attention to the lock icon in your browser's address bar, and to the URL, to find out which site you are on.

Web browser developers expect that you will behave this way: when you hit a website, you first look at the URL and make sure that this is the name of the host you want to talk to, and then find the lock icon and understand that all is well. This is an aspect of the browser user interface.

However, this is not enough. It turns out that many phishing sites simply include an image of the lock icon in the page itself while using a different URL. If you do not know what the address of the site should be, you can be fooled. In this sense, the user interface side is a bit muddled, in part because users themselves are often confused. So it is hard to say what the right answer is here. We will therefore focus mainly on the second aspect, B, which is definitely much easier to discuss. Any questions about this?

Student: I noticed that some sites eventually turn from HTTP to HTTPS.

Professor: Yes, browsers evolve over time, and that shows in when they display the lock icon. Some browsers show the lock icon only if all of the content, all of the resources on your page, is also delivered over https. So one of the problems HTTPS enforcement tries to address is mixed content, that is, insecure content embedded in a secure page. Sometimes you will not get the lock icon because of this check. Likewise, if Chrome believes that a site's certificate is not good enough, for instance because it uses weak cryptography, it will not give you the lock icon. Browsers differ, though: where Chrome does not give you a lock icon, Firefox might. So, again, there is no clear definition of what the lock icon means.
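The mixed-content rule mentioned above, where the lock appears only if every resource on the page also came over https, can be sketched like this. It is one possible policy, not what any particular browser does; the function name `shows_lock` is hypothetical, and all URLs are assumed to be absolute.

```python
from typing import Iterable
from urllib.parse import urlsplit


def shows_lock(page_url: str, resource_urls: Iterable[str]) -> bool:
    """Show the lock only if the page and all of its resources use https."""
    urls = [page_url, *resource_urls]
    return all(urlsplit(u).scheme == "https" for u in urls)


page = "https://shop.example/checkout"
print(shows_lock(page, ["https://shop.example/app.js"]))                          # True
print(shows_lock(page, ["https://shop.example/app.js", "http://ads.example/a.js"]))  # False
```

A single script fetched over plain HTTP is enough to drop the lock, since an attacker on the network could have replaced that script.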

Let's see what problems may arise with this plan. In plain HTTP, we are used to relying on DNS to give us the correct IP address of the server. How much do we have to trust DNS for HTTPS URLs?

[Untranslated passage: a student and professor exchange on DNS and HTTPS, touching on IP addresses, certificates for amazon.com and www.amazon.com, CAs, denial of service (DoS), cookies, origins, and JavaScript.]


MIT course "Computer Systems Security". Lecture 14: "SSL and HTTPS", part 3


Source: https://habr.com/ru/post/427785/
