Some of these examples are old, some are new. The point is that all of them resulted from a failure to do proper validation. The Heartbleed bug is probably the best-known example of this, at least right now. It's important to understand exactly what Heartbleed was. When it came out, people were saying, "This means the death of the Internet." We've been hearing that since at least 1982, that various problems that are found will cause the Internet to become completely unusable and worthless, and it was no more true this time than it was back then. Heartbleed was a bug in the OpenSSL program. It was not a protocol bug; it was a bug in the implementation.

It arises because of the need to keep connections open. If the client and the server don't communicate for a period of time, one of them is going to wonder, well, gee, is the other one still working? In this case, the client sends a message to the server basically saying, "Hi, are you there?", and the server responds, "Yes, I am." This is called a heartbeat. The way the heartbeat worked in OpenSSL was that a packet would be sent from the client to the server. The packet consisted of the usual packet headers, and then a payload made up of a two-byte length followed by the actual payload data, which had to be at least 16 bytes long. When the server gets it, it responds: it takes a copy of the payload data it received, using the two-byte length field to decide how much data there is, and simply writes it back to the client. By the way, the server can also request a heartbeat from the client, and the client does exactly the same thing. So that's the protocol part.

Now, the implementation bug is described on the next slide. What happens if, say, the payload data is four bytes long? We'll see that in the header, but in the length stored in the payload we lie: we claim the length is very much bigger than the actual data stored in the packet body. The example here is 65535, but it can be anything up to two to the sixteenth minus one, because the length stored in the body of the packet is only 16 bits. So you send that over to the server, and here's where the bug arose. OpenSSL did not check the length. It should have, and it does now, because you can compare the length in the body of the packet with the length of the packet stored in the header; but it didn't do that. It trusted the value in the packet. So let's say I put 32 bytes of data in the packet and the length I gave was 65532. The OpenSSL server gets it and says, "Oh, gee, the length is 65532, so I'm just going to send back that many bytes." What happens is that when the packet comes in, the payload data is stored in a buffer of length 65535. The 32 bytes fill the first 32 bytes of the buffer, but the rest is never cleared; it still holds whatever data was there before. So when the server sends the heartbeat message back, it gives you the 32 bytes you sent plus, in this case, 65500 bytes from the rest of the buffer. Now you can pore through that and find all sorts of interesting things: cookies, decrypted passwords, stuff like that. That's why it's called Heartbleed: in essence, the buffer bleeds over into the client.
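To make the missing check concrete, here is a minimal sketch in C of the pattern just described. This is not the actual OpenSSL code; the function name, the record layout, and the omission of the padding bytes are simplifying assumptions based on the description above.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Simplified sketch of the heartbeat handling described above (not the
 * real OpenSSL code).  rec points at the heartbeat message: one type
 * byte, a two-byte payload length, then the payload itself; rec_len is
 * how many bytes actually arrived on the wire.
 */
unsigned char *build_heartbeat_response(const unsigned char *rec, size_t rec_len,
                                        size_t *resp_len)
{
    if (rec_len < 3)
        return NULL;                          /* not even a full header */

    uint16_t payload_len = (uint16_t)((rec[1] << 8) | rec[2]);   /* attacker-controlled */

    /*
     * The missing check: the claimed payload length must fit inside the
     * bytes that actually arrived.  The vulnerable code skipped this test
     * and trusted payload_len, so the memcpy() below read past the real
     * payload and leaked whatever followed it in memory.  Per the RFC,
     * a record whose lengths disagree is simply discarded.
     */
    if ((size_t)payload_len + 3 > rec_len)
        return NULL;

    unsigned char *resp = malloc(3 + (size_t)payload_len);
    if (resp == NULL)
        return NULL;
    resp[0] = 2;                              /* heartbeat response type */
    resp[1] = rec[1];
    resp[2] = rec[2];
    memcpy(resp + 3, rec + 3, payload_len);   /* echo back only real payload bytes */
    *resp_len = 3 + (size_t)payload_len;
    return resp;
}
```

The entire fix amounts to that one comparison between the claimed payload length and the number of bytes that actually arrived; the vulnerable version simply did not have it.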
Now, XKCD is a comic strip that I highly recommend. It had one of the best descriptions of the bug I've ever seen. It showed the user, Meg, talking to a server, and the conversation went like this. Meg says, "Server, are you there? If so, reply POTATO," and notes that's six letters. The server says, "Meg wants these six letters: POTATO," and sends them back. Meg then says, "Server, are you still there? If so, reply BIRD," and the server returns the four characters BIRD. Meg then says, "Server, are you still there? If so, reply HAT," but asks for 500 letters instead of just three. The server says, "Meg wants these 500 letters," and the first three are HAT; the rest are whatever happens to be in the buffer. In this case that includes a command to set the administrator password to "gleepsnork", so now Meg has the administrator password for the remote site. I highly encourage you to go to that website, xkcd.com; it's quite a bit of fun, it's very good on computer security, and it covers other computer science topics and other fun stuff too.

The point is that OpenSSL trusted the contents of the packet without verifying them; the rest of that slide simply says what I said earlier. If in fact the two lengths differ, the RFC that governs the heartbeat says you should ignore the heartbeat: just throw it away, don't respond. But OpenSSL responded.

Now, the effects of this. It sounds like a very arcane flaw that wouldn't bother most people or most places, and it's usually described as a flaw on the server. But as I said earlier, a server can request a heartbeat from a client, so clients were also vulnerable. It also turns out that OpenSSL is extremely widespread. It's freeware, written and maintained by a couple of people from Germany and England. So various people tested sites, and you can see a list there of sites on which the attack worked; others that used OpenSSL all immediately fixed the problem, but many other places had to scramble to fix it as well, because they used OpenSSL and OpenSSL had this severe problem. It turned out, by the way, that there was one good result. The people who were writing and maintaining OpenSSL were doing it as a volunteer effort; the code for the heartbeat was added basically in a day. As a result of this incident, people started giving them funds to hire others, so they could spend more time working on the program instead of doing OpenSSL in their spare time. So there was some social benefit from it.

Now, that's an example of an assumption: the server assumed that the length stored in the packet body was correct. We've already talked about the assumption that the client checks everything before sending it; clients may do that, but the server shouldn't trust it. Another common one is on Linux and Unix systems: you have to be root, the administrator, in order to send a message originating from a port with a number below 1024, that is, ports zero through 1023. Other systems don't share this convention. So any program that assumes a message coming from a port with one of those numbers is privileged, and therefore should be trusted, is making a big mistake, because without knowing the type of system and the way it is configured, you simply don't know that. Many other programs trust IP addresses for authentication. An IPv4 address is completely untrustworthy because it's very easy to forge. So if you're going to use an IP address for any kind of authentication, first of all, consider it very weak authentication, because it's not cryptographically protected. Secondly, when you get the IP address, do a lookup of the host name for that address, and then do a lookup of the IP addresses for that host name. This is sometimes called the double reverse lookup. The theory is that if someone is forging the IP address and there's anything in the message that indicates where it came from, that should match what's stored in the DNS. This assumption is not always valid, particularly with content delivery networks, which is why, if you're going to do it, you should be aware that you may get false negatives.
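Here is a minimal sketch of that double reverse lookup using the standard resolver calls. The function name is mine, and as just noted, a match is only weak evidence and a mismatch can be legitimate (content delivery networks, multi-homed hosts), so treat the result accordingly.

```c
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/*
 * "Double reverse" lookup: map the peer's address to a name, then map
 * that name back to addresses and make sure the original address is
 * among them.  Illustrative sketch only; a match is weak evidence, not
 * authentication, and false negatives are possible.
 */
bool double_reverse_ok(const struct sockaddr *peer, socklen_t peer_len)
{
    char host[1025];

    /* Reverse step: address -> name (NI_NAMEREQD fails if there is no PTR record). */
    if (getnameinfo(peer, peer_len, host, sizeof host, NULL, 0, NI_NAMEREQD) != 0)
        return false;

    /* Forward step: name -> addresses. */
    struct addrinfo hints, *res, *ai;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = peer->sa_family;
    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return false;

    bool match = false;
    for (ai = res; ai != NULL && !match; ai = ai->ai_next) {
        if (ai->ai_family != peer->sa_family)
            continue;
        if (peer->sa_family == AF_INET) {
            const struct sockaddr_in *a = (const struct sockaddr_in *)ai->ai_addr;
            const struct sockaddr_in *b = (const struct sockaddr_in *)peer;
            match = (a->sin_addr.s_addr == b->sin_addr.s_addr);
        } else if (peer->sa_family == AF_INET6) {
            const struct sockaddr_in6 *a = (const struct sockaddr_in6 *)ai->ai_addr;
            const struct sockaddr_in6 *b = (const struct sockaddr_in6 *)peer;
            match = (memcmp(&a->sin6_addr, &b->sin6_addr, sizeof a->sin6_addr) == 0);
        }
    }
    freeaddrinfo(res);
    return match;
}
```

getnameinfo() does the reverse (address to name) step and getaddrinfo() the forward (name to addresses) step; the check passes only when the original address shows up among the forward results.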
In general, if you don't have cryptographic protection, don't assume that a message is confidential or that it hasn't been changed in transit. With cryptography, that is generally a safe assumption. I emphasize "generally" because there are still problems that can arise from the way the cryptography is implemented, or from the infrastructure supporting it. But cryptography is a lot better than no cryptography for this purpose.

So when you're writing or analyzing a program, what should you look for? The first thing is the environment in which the program starts up. Is it one that meets the needs of the program? Is it one you can trust? Does your program trust it? Everything I said earlier applies to this. Secondly, if you're writing a server or a client, don't assume the other side will check anything. If you're writing a client for a server, do the checking at the client. If you're writing a server, do the checking at the server. Also, it's sometimes fun, if you've got a client that you're trying to improve or maintain, or even with your own client, to see what happens if the server sends an illegal response. See what the client does. What it should do is say something to the effect of "illegal response" and give some sort of error message, but in many cases it will do something else. See what you can do; it gets interesting.

The same goes for the server. With a server, be prepared to handle signals, such as a message coming in out of band or a connection being dropped, things like that. The reason is that most servers run with some privileges on the system: as daemon, or mail, or whatever. They normally should never run as root unless they need that access; sshd, the SSH daemon, would, for example. But in general, clients are unprivileged and servers have some privileges, even if it's only having an identity the system recognizes. So as a result, when a client talks to a server, it's in essence getting some sort of extra privilege, if nothing else the privilege of talking to an entity on that particular system to which the client's user does not have access. So you want to check everything very carefully there.
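As a small illustration of the dropped-connection point: on Unix-like systems, writing to a connection the peer has already closed raises SIGPIPE, which kills a server that has not prepared for it. The sketch below shows one common precaution; the function names are mine, and a real server would also log the event and clean up its per-connection state.

```c
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/*
 * Turn SIGPIPE into an ordinary error the server can check, rather than
 * a signal that terminates it.  Call once at startup.
 */
void install_signal_handling(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;      /* write() now returns -1 with errno == EPIPE instead */
    sigemptyset(&sa.sa_mask);
    sigaction(SIGPIPE, &sa, NULL);
}

/* Write the whole buffer, noticing a dropped connection instead of dying. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;         /* interrupted by some other signal: retry */
            return -1;            /* EPIPE and friends: the connection is gone */
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```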
Now, suppose you want to attack the client in order to give the user bogus information, or, even better, to cause the client to go to the wrong system; this is called a hijack attack. It's quite common with clients such as browsers, and there are a number of ways to do it, so there are a number of things your client has to guard against. The DNS, the Domain Name System, is the mechanism that, given a host name, returns the IP address, and, going the other way, given an IP address, returns the host name. So there's a trick called poisoning the DNS. If I have control over the DNS entry, what I can do is hand back an illegal name; in this example, the host name is going to be something like "knob;cp /bin/sh /etc/telnetd". Let's say I've poisoned the DNS so that the address 192.168.100.5 maps back to that bogus host name, and then I come in from host 192.168.100.5. The connection itself doesn't use the name; it only uses the IP address. But when the program goes out to resolve that address, to get a host name to put in the subject of the mail it sends to the administrator, the name it gets back is "knob;cp /bin/sh /etc/telnetd". What that does is send the message to the admin with the name "knob", but the semicolon ends that command, and the shell then executes what follows, which copies a command shell over the Telnet daemon. That basically means the next time I connect to the Telnet daemon, I immediately have root access. By the way, this of course has all been fixed now.

The next thing to do when you're writing a server is to ask yourself: what does the server trust? Here's another very good example, one that has long since been fixed. The ident protocol, which incidentally is the only one I've ever seen in which the Internet RFC, the Internet standard, says "not recommended", basically queries a host and asks, "Who made this connection?" The remote host then sends back information allowing you to identify the user, to some extent. We won't go into an analysis of this protocol; there are all sorts of problems with it. But one version of a mail daemon that accepted letters and delivered them would send an ident request back to the host from which it was getting the mail. You have no way of knowing what that host will return, which is why you need to check it, and this version of the mail server did not do the checking. So what the attacker could do is send a letter from a system she controls, and then, when the request for identity information comes in from ident, send back a very long reply that overflows sendmail's internal buffer for receiving it. At that point you can essentially control sendmail through a buffer overflow attack. It turned out, by the way, that there was another bug in the same program that did the same sort of thing, except that it would overflow a buffer in the logging daemon instead. So you had two points of attack, not just one. Incidentally, this ident mechanism was implemented to improve security. It just goes to show that even security mechanisms can have problems, and sometimes a badly implemented security mechanism is more dangerous than not using the mechanism at all.
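Here is a sketch of the kind of check that was missing: read the ident reply with an explicit bound and reject anything oversized, instead of copying whatever the remote host chooses to send into a fixed-size buffer. The function name, the 512-byte bound, and the framing are illustrative assumptions, not the actual sendmail code.

```c
#include <string.h>
#include <unistd.h>

/*
 * Read a one-line ident reply defensively.  The vulnerable mailer copied
 * the remote host's reply into a fixed-size buffer without checking its
 * length; here the read is bounded and anything oversized or
 * unterminated is rejected outright.  Protocol details are simplified.
 */
#define IDENT_REPLY_MAX 512      /* generous bound for a one-line reply */

int read_ident_reply(int sock, char *reply, size_t reply_size)
{
    char buf[IDENT_REPLY_MAX];
    size_t used = 0;

    while (used < sizeof buf - 1) {
        ssize_t n = read(sock, buf + used, sizeof buf - 1 - used);
        if (n <= 0)
            return -1;                       /* error, or connection closed early */
        used += (size_t)n;
        buf[used] = '\0';

        char *eol = strchr(buf, '\n');
        if (eol != NULL) {
            *eol = '\0';                     /* keep only the first line */
            if (strlen(buf) >= reply_size)
                return -1;                   /* still too big for the caller: reject */
            strcpy(reply, buf);              /* safe: length was just checked */
            return 0;
        }
    }
    return -1;                               /* no end of line within the bound: reject */
}
```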
Here's another example, this time on the system side. Let's take a look at how ps, the process status lister, works on Linux. Traditionally, what it does is look for the kernel file, open it, and look in the symbol table attached to the kernel for the address of the process table. Once it has that, it prints out what's in memory there. There are various other ways to do this as well; most Linux systems now will open the directory called /proc, look in there for the process ID, and get the status information from there. The point is, they pull the information out of a file. The question is: which file? On some systems you can control which file ps looks at, and this is useful when you're debugging kernels. But the problem is, how do I know that's the right file? The next slide shows an exploit. Essentially, what I do is supply a bogus kernel file and give it an address I'm really interested in, not the address of the real process table. Then I use ps to list the "processes" at that address. What's going to happen is that I'll get a bunch of information. It'll be badly formatted, so I'll need to play around with the exact address, but I'll get a bunch of information from inside the kernel that is quite probably confidential. So basically, given this attack, I can read anything in kernel memory.

This is a good example of the question: what can you trust? The bottom line is, if you didn't supply it, don't trust it; validate it. With ps here, if you're reading from a kernel file that's owned by root and is not world-writable, then you're fine, because if you can't trust root, don't worry about secure programming; that's the least of your problems. But if I'm reading from a file called "kernel" that is not owned by root, or that is world-writable, then anyone could have done anything to it, and therefore the information in it should be regarded with great suspicion. By the way, this doesn't apply just to the arguments you supply to a program like ps; it also includes the name of the program, because that's just treated as argument zero in a C program. Now, why is that important? Most programs tend to ignore it, right? Well, yes and no. Most programs do ignore it. But some of the critical ones, the ones that run with root privilege, tailor their behavior depending on how they're called. For example, the shell, if it's called as rsh or rbash, the restricted shell (the name depends on which system you're on), behaves very differently than if you call it as a plain shell. So that argument zero can play an important role. In particular, one of the mail daemons, which ran as root, had exactly this problem. When it caught a signal, it would jump to a signal handler, which would then re-execute the program with the appropriate flags and arguments to recover, and it would do so with the same privileges as the mail daemon; in this case, the mail daemon had administrative privileges. So what you did was change the name of the mail program, that argument zero, to be one of your own programs, run the mail program, and then send it a signal. What happens is that it then executes your program with the requisite privileges.

So that's one set of issues. Tied into it is the whole question of error messages and error handlers. The biggest problem we've seen with them is buffer overflow. The reason is that they often write error messages into a buffer, and then on return those messages are printed; sometimes the buffer is printed within the handler itself, or the buffer is handed to another program that will log it. Well, if I can somehow arrange for the error message to be longer than the buffer, I have a buffer overflow attack. Sometimes I can manipulate the environment to do this; more often I can manipulate file names or arguments or other things to do it. The bottom line is, again: check buffer lengths. Do the validation and verification.
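To make that last point concrete, here is a minimal sketch of the two versions of such an error handler; the function names and the 128-byte buffer are illustrative, not taken from any particular program.

```c
#include <stdio.h>
#include <string.h>

/*
 * Illustrative error handler.  The unsafe version assumes the file name
 * will fit in the buffer; anyone who controls the name (or another piece
 * of the environment that ends up in the message) controls how much gets
 * written, which is exactly the overflow described above.
 */
void log_open_error_unsafe(const char *filename)
{
    char msg[128];
    sprintf(msg, "cannot open %s", filename);    /* BUG: no length check */
    fputs(msg, stderr);
    fputc('\n', stderr);
}

/* Bounded version: the message is truncated rather than overflowing. */
void log_open_error(const char *filename)
{
    char msg[128];
    snprintf(msg, sizeof msg, "cannot open %s", filename);
    fputs(msg, stderr);
    fputc('\n', stderr);
}
```

The only difference between the two is the bounded write, and that one length check is exactly what the vulnerable handlers were missing. With that, let's go on to the next set of slides.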