We're going to look at other types of buffer overflows. These are more or less random ones that have done damage in the past, so you need to be aware of them. The reason we're going through these is that in many cases you won't believe how people do this, or that it's even possible, and yet it is. The next one shows you something very interesting about how an attack can work. It's called a selective buffer overflow. In this case, you don't want to overwrite all of the data. You just want to overwrite data in certain places. So, for example, if an array contains a file name that's 10 characters long, all you want to do is change characters five and seven. The principles are the same, but the problem is that you can't just overwrite everything; you have to aim for those two. You can't approximate here. You've got to know exactly where things are stored relative to where you're going to be doing the overflow. The best way to describe this one is to give you an example, and this is from quite a while ago. In every sendmail that I know of, this has been fixed. Sendmail is a message transport agent that Linux and Unix systems use, and it is hideously complex. The configuration file, in fact, is not meant for human eyes. There are a number of programs that will take a fairly readable configuration file about how to forward things and so forth, and turn it into a sendmail config file. So, sendmail provides an extensive debugging facility so that you can figure out what's going wrong with mail delivery. It has about 200 or so flags, and each flag indicates a different area of the program or the process that you want to check, and you give it a value to tell it how much you want to see. If the value is low, you're not going to get much output; if the value is high, you'll get more than you would ever want to deal with.
So, in the example here, I'm giving flag seven a value of 102, which is about middle of the road. Now, sendmail stores these flags in an array in the data segment. The array is simply indexed zero to 255, so there are 256 possible flags, and each element stores a number. In this case, the array element with index seven would store the integer 102. Now, the name of the default configuration file is also stored in memory, and that file is called sendmail.cf. The last thing you need to know is what you can do once you've exploited this. The answer to that lies in the sendmail configuration file. Look for something called Mlocal. What that does is tell sendmail, look, this is a local message, here's how I want you to deliver it, and then sendmail will run the program named in this Mlocal line. It's usually /bin/mail, which will just stick the message in your mailbox. The next slide shows this in pictures. The configuration file name begins at location 100, and you've got /etc/sendmail.cf, and it runs up to, in this example, location 128. Notice that the config file is in 'etc': e, t, c. I can't change that. It's a system area. But there's another area that I can change, also three letters: t, m, p. So, what I do is create my own sendmail configuration file in 'tmp', where the local mail delivery program becomes something else, like a shell or whatever. Then I run sendmail with the debug flags. If you notice where it says byte for flag zero, the debug flag array immediately follows the array with the configuration file name. So, what I would ideally like to do in this case is simply use negative flag numbers: set flag negative 27 to 't', and so forth. Flag negative 27 will overwrite location 101, negative 26 will overwrite 102, and negative 25 will overwrite 103. So, what this does, at least in theory, is change the 'etc' to 'tmp', which is what I want, and sendmail should then read my config file instead of the system one. But there's a slight wrinkle to this.
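The "in theory" overwrite can be sketched in C. This is only a toy model, not sendmail's code: the names `segment`, `PATH_LEN`, and `overwrite_config_path` are made up for illustration, and a single byte array stands in for the data segment so that the negative indexing stays well defined.

```c
#include <string.h>

#define PATH_LEN 28    /* stands in for locations 100..127 on the slide */
#define NFLAGS   256   /* flags 0..255 */

/* One block standing in for the data segment:
   the file name comes first, the flag array follows immediately. */
static unsigned char segment[PATH_LEN + NFLAGS];

/* Hypothetical helper: performs the three writes and returns the file name. */
const char *overwrite_config_path(void) {
    char *path = (char *)segment;               /* "location 100" */
    unsigned char *flags = segment + PATH_LEN;  /* "location 128", flag 0 */

    strcpy(path, "/etc/sendmail.cf");

    flags[7] = 102;     /* ordinary use: flag 7 gets the value 102 */

    flags[-27] = 't';   /* lands on "location 101", the 'e' */
    flags[-26] = 'm';   /* "location 102", the 't' */
    flags[-25] = 'p';   /* "location 103", the 'c' */

    return path;        /* now "/tmp/sendmail.cf" */
}
```

Calling `overwrite_config_path()` gives back `/tmp/sendmail.cf`: three single-byte writes through the flag array are enough to redirect the file name.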
Sendmail does not allow negative flag numbers, and it checks for them. So, if it sees flag minus 27, it says, "What's this?" Now, we go to a quirk in the C language that is a bit like type punning. Array indices in C are allowed to be signed, because sometimes it's useful to be able to go to an array element that's before the first one; there are complex situations where this happens. The key point, though, is that the array index here is signed. Now, the way sendmail checks for negative numbers is that it looks for a minus sign. If it sees one in the character string for the flag number, then it says, "No, I am not going to do that." So, when is a positive number equivalent to a negative number? On a 32-bit machine, what we can do is simply add two to the 32nd to it, and then we get a very large positive number. Minus 27 becomes 4294967269, and notice that the first character is a four, not a minus sign. Then minus 26 becomes 4294967270, and minus 25 becomes 4294967271. So, these will be read in as positive numbers, because there's no minus sign. But when they're used as an index into the array of flags, they will be treated as negative numbers, because the index is signed. As a result, I can go backwards, in essence, and change things. That, by the way, is a very useful lesson. When you write your programs, don't just check for the sign; check that the number itself is positive or zero if you want to avoid negative numbers. So, that's the data segment. Now, let's take a look at the heap. The heap is very much like the stack, except that what you do is find something in the heap which you can alter. Typically, this will be data from a function called malloc, which allocates storage space. There are a couple of interesting things about malloc. When you do a malloc, you get back a pointer to free space in the heap. But that pointer is not actually the beginning of the space. In the eight or 16 bytes before it, there's information that malloc uses, for example, the length of the segment and so forth.
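The idea of bookkeeping stored just before the returned pointer can be sketched with a toy wrapper around malloc. This is an assumption-laden illustration, not how any particular malloc is implemented: `toy_header`, `toy_alloc`, `toy_size`, and `toy_free` are hypothetical names, and real allocators keep their own private layouts.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header placed immediately before the caller's data. */
typedef union toy_header {
    size_t size;        /* length of the caller's segment in bytes */
    max_align_t align;  /* pads the header so the data stays aligned */
} toy_header;

/* Allocates size bytes and records the length in front of them. */
void *toy_alloc(size_t size) {
    if (size > SIZE_MAX - sizeof(toy_header)) return NULL;
    toy_header *h = malloc(sizeof *h + size);
    if (h == NULL) return NULL;
    h->size = size;
    return h + 1;       /* the caller only sees the bytes after the header */
}

/* Recovers the recorded length by stepping back over the header. */
size_t toy_size(const void *p) {
    const toy_header *h = (const toy_header *)p - 1;
    return h->size;
}

/* Releases the whole block, header included. */
void toy_free(void *p) {
    if (p != NULL) free((toy_header *)p - 1);
}
```

After `void *p = toy_alloc(40);`, the call `toy_size(p)` reads 40 back out of the bytes that sit in front of `p`. Anything that writes backwards past `p`, or forwards past the end of a neighbouring segment, would be writing into that kind of bookkeeping.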
That way, when you free the segment, it can be joined with other segments very easily. But what we can do is simply write data into the malloc space. If I can cause a branch into the heap, and some systems do allow this, then I can execute it. The other thing I can do is play games with malloc itself by changing the information that malloc uses to manage storage. Here's an example. Here I have two segments of malloc space next to each other; they're contiguous. Notice, though, the two different headers. So, what I can do is fill up the second malloc space and then overflow it to go right into the first one. This will cause a problem if I care about things being freed properly. Usually, when I'm attacking, I don't care about things being freed properly. I'll happily accept the overwrite of the "data one" header. Okay, so given this, what can I do? Basically, function pointers are probably the best thing to go for, because in the heap you very often store addresses of functions. This is also true on the stack. If I can change those while the program is executing, it will go to the wrong address, an address under my control. The pointers may be stored explicitly: you allocate an array and then store the addresses of functions in the array, or whatever. Or it can be implicit. The function atexit is a good example. It stores a set of function pointers. When you exit, the functions those pointers point to will be executed before you leave. So, if I can find that set of pointers, I can change them, and then when you leave the program, it'll execute something I want. The other thing that you can do is look at fault handlers. These are often stored at the beginning of the heap. So, what I can do is just keep overwriting things until I get to the end, and now I've corrupted the fault handler. Again, this requires knowing something about where allocations are performed and where they're placed and stored. Typically though, this is not an absolute address.
It's an address relative to something else. Again, you have to know what a good value is to be able to branch to the location you want and get the results that you want. So in the next slide, we have the general rule. Always assume input will cause a buffer overflow. Always check lengths, and design to prevent this, so that if something is too long, it will be handled properly. Don't trust it to be of the right form or the right length. If you're going to deal with arrays, which is usually where you store things, because an array after all is really just a buffer, then when you use functions on that array, make sure those functions understand buffer bounds. So strcat and strcpy, which concatenate two strings or copy a string from one location to another, are prime examples of things that do not check length. Strcat appends the characters in the second argument to the first, to make one long string. If that string happens to overflow the bounds of the first, too bad; it just goes ahead and does it. Same thing with strcpy. If you're copying from location A to location B, and the string in location A is longer than the space allocated for B, you're going to get a buffer overflow. So that's why you use fgets, or strncpy, or strncat, or snprintf, because those say, "Okay, we'll do the operation, but when we hit N characters," and that's a parameter, "we're stopping." So you don't get overflow. But there's a trick here again. If it stops when it reaches the end of the memory allocated to it, to prevent an overflow attack, it doesn't always put a NUL byte at the end of the string, and strncpy is the classic case, so you can go sailing right off the end. You don't want that. So always make sure in your programs that when you call those functions, there's a NUL byte at the end of the string. An easy way to do this, by the way, is simply to put a NUL byte at the end of the array representing the buffer.
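Putting a NUL byte in the last element of the array can be sketched as a small wrapper around strncpy. The name `copy_bounded` and its return convention are assumptions made for this illustration, not a standard library function.

```c
#include <string.h>

/* Hypothetical helper: copies at most dst_size - 1 characters from src
   and always leaves dst NUL-terminated.
   Returns 1 if all of src fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0) return 0;       /* no room even for the NUL byte */
    strncpy(dst, src, dst_size - 1);   /* may stop without writing a NUL */
    dst[dst_size - 1] = '\0';          /* so put one in the last element */
    return strlen(src) < dst_size;
}
```

With an 8-byte buffer, copying `"0123456789"` leaves `"0123456"` in the buffer and returns 0, so the caller can tell the input was too long instead of silently running off the end.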
To find these, look in the manual, or look at the description of the functions. Look for those which handle arrays and do not say that they check length. Many of these will say they handle arrays and look for a termination character. That's not enough, because if the termination character comes after the number of characters that the array can hold, you're still going to get overflow. So check that, and also check all array references, especially when they're in loops, but even when they're not, to make sure that you're actually referencing a legitimate element of the array. Now, here's a fun little puzzle, and the suggested activity is going to ask you to try this out. We have strncpy, which is going to copy n characters from one string to another. What happens when that n is negative? Say I ask it to copy minus five characters from string one to string two. The length argument to strncpy is unsigned. So that negative five is translated into two to the 32nd minus five, for example. I've got a huge string that I'm copying. The second one is a quirk of strncpy. If you read the manual, what it says is that the source array and the target array cannot overlap. They must be separate. If you use it and they do overlap, what happens? Well, the behavior is undefined, mainly because this copying is done by machine-language routines that handle things differently on different systems. So the behavior is undefined, and whenever you see the word undefined, that means you should avoid using it. Try to figure out how to be more precise. Now, buffer overflows also occur in unlikely places. For example, error handlers. When an error handler is invoked, it usually prints out some sort of error message. You need to be sure that that error message doesn't overflow buffers.
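One way to keep an error message inside its buffer is to build it with snprintf and look at the return value. This is a minimal sketch; the function name `format_open_error` and the message text are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: formats "cannot open <filename>" into buf.
   snprintf never writes more than buf_size bytes, including the NUL byte.
   Returns 1 if the whole message fit, 0 if it was cut short. */
int format_open_error(char *buf, size_t buf_size, const char *filename) {
    int n = snprintf(buf, buf_size, "cannot open %s", filename);
    return n >= 0 && (size_t)n < buf_size;
}
```

A caller with a 16-byte buffer and a very long file name gets back 0 and a message cut to 15 characters plus the NUL byte, rather than a write past the end of the buffer.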
I may be able to manipulate the environment in such a way that, first of all, the message you print is bogus, or secondly, that more characters are appended. For example, if the error-handling routine prints the file name, I make it a very long file name, longer than the buffer, and that allows me to do the overflow. Okay. So let's talk about a very well-known way to fix this, which was developed by Crispin Cowan and his colleagues. It's called canaries. Essentially, what you do is put a canary right before the address you want to protect. So for example, on this next slide, this, if you recall, was the original picture that I put up to show you how buffer overflows work. Notice it's the same picture except with the word canary before the return address of main. Now what happens is, when the program calls this function, it generates a random number and puts it in before the return address. That's called the canary. Then on return, the random number at the location marked canary is compared to the value that was put in there. If the two match, the theory is that there's no buffer overflow, which is usually true. If the two do not match, there is a huge problem, and the program will stop. Now, after this was presented, some smart cookies tried to figure out a way around it. One of them came up with a very ingenious way. This particular person, if you look at the next slide, did an analysis of how things were stored in memory. On this particular system, the exit address was stored essentially as a pointer, and it was stored on the stack. So what happened here was essentially what we did earlier, except on a larger scale. You're going to change that exit address to point into the buffer, to execute the shell code. Now, what's interesting here is that the canary works when you overflow in this way. On the return, the system pulls out the canary, pops the canary, and then compares it to the number that was generated, and the two are going to match.
I'm sorry, the two do not match. That immediately invokes a trap, because you now have to go to the signal handler and say, "Hey, look, this doesn't match, print an error message and we'll quit." Well, you go ahead and print the error message and then you quit. To quit, you call the exit address, but the exit address has been changed to be the address of the buffer where your shell code is stored. So that gets executed, and now you're in control of the system. What's really interesting about this one is that it depends on a failure and the exit. If you didn't do the exit, this wouldn't work.
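For reference, the canary check that this attack turns against itself can be modelled in a few lines of C. This is a toy simulation under stated assumptions, not compiler-generated protection: a flat byte array stands in for the stack frame, the names `toy_frame`, `frame_enter`, `frame_fill`, and `frame_check` are hypothetical, and the canary is passed in as a parameter where a real implementation would generate a random value.

```c
#include <stdint.h>
#include <string.h>

#define BUF_LEN 16

/* Hypothetical flat frame: buffer, then canary, then saved return address. */
typedef struct {
    unsigned char bytes[BUF_LEN + 2 * sizeof(uint32_t)];
} toy_frame;

/* On "call": place the canary just before the return address. */
void frame_enter(toy_frame *f, uint32_t canary, uint32_t ret_addr) {
    memset(f->bytes, 0, sizeof f->bytes);
    memcpy(f->bytes + BUF_LEN, &canary, sizeof canary);
    memcpy(f->bytes + BUF_LEN + sizeof canary, &ret_addr, sizeof ret_addr);
}

/* Copy into the buffer with no regard for BUF_LEN, standing in for
   strcpy-style code. The length is clamped to the frame only so that
   the sketch itself stays well defined. */
void frame_fill(toy_frame *f, const unsigned char *data, size_t len) {
    if (len > sizeof f->bytes) len = sizeof f->bytes;
    memcpy(f->bytes, data, len);
}

/* On "return": 1 if the canary is intact, 0 if it was overwritten. */
int frame_check(const toy_frame *f, uint32_t canary) {
    return memcmp(f->bytes + BUF_LEN, &canary, sizeof canary) == 0;
}
```

Filling the frame with 8 bytes leaves the canary intact, while filling it with 24 bytes runs through the canary on the way to the return address, so `frame_check` reports the mismatch. What the program does after that mismatch is exactly the step the exit-address attack relies on.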