We're going to look at other types of buffer overflows. These are more or less random ones
that have done damage in the past, so you need to be aware of them. The reason we're going
through these is that in many cases you won't believe that people do this, or that it's
possible, and yet it is. The next one is going to show you something very interesting about
how an attack can work. It's called a selective buffer overflow. In this case, you don't want
to overwrite all of the data. You just want to overwrite data in certain places. So, for
example, if an array contains a file name that's 10 characters long, all you want to do is
change characters five and seven. The principles here are the same, but the problem is that
you can't just overwrite everything; you have to aim for those two. In this case, you can't
approximate. You've got to know exactly where things are stored relative to where you're
going to be doing the attack, the overflow.

The best way to describe this one is to give you an example, and this is
from quite a while ago. Again, in every version of sendmail that I know of, this
has been fixed. Sendmail is a message transport agent that Linux and Unix systems use, and
it is hideously complex. The configuration file, in fact, is not meant for human eyes. There
are a number of programs that will take a fairly readable configuration file about how to
forward things and so forth, and then turn it into a sendmail config file. So, sendmail
provides an extensive debugging facility so that you can figure out what's going wrong with
mail delivery. Now, it has about 200 or so flags, and each flag indicates a different area of
the program or the process that you want to check, and you give it a value to tell it how
much you want to see. If the value is low, you're not going to get that much output; if the
value is high, you'll get more than you would ever want to deal with. So, in the example
here, I'm giving flag 7 a value of 102, which is about
middle of the road. Now, sendmail stores
these flags in an array in the data segment. The array is simply indexed 0 to 255, so there
are 256 possible flags, and each one stores a number. In this case, the array element with
index 7 would store the integer 102. Now, the name of the default configuration file is also
stored in memory, and it's called sendmail.cf. The last thing you need to know is what you
can do once you've exploited this. The answer to that lies in the sendmail configuration
file. Look for something called Mlocal. What that does is tell sendmail, "Look, this is a
local message. Here's how I want you to deliver it," and then sendmail will run the program
associated with this Mlocal. It's usually been mail, which will just stick the message in
your mailbox. The next slide shows this in pictures. The configuration file name begins at
location 100, and you've got /etc/sendmail.cf, and it goes all the way, in this example, to
location 128. So, if you notice, the config file is in 'etc,' e, t, c. I can't change that.
It's a system area. But there's another area that I can change, also three letters: t, m, p.
So, what I do is create my own sendmail configuration file there, but in it the local mail
delivery program becomes something else, like a shell or whatever. Then I run sendmail with
the debug flags. If you notice where it says "byte for flag 0," the debug flag array
immediately follows the array with the configuration file name. So, what I would ideally
like to do in this case is simply use negative flag numbers: set flag -27 to t, and so
forth. Flag -27 will overwrite location 101, flag -26 location 102, and flag -25 location
103. So, what this does, at least in theory, is change the 'etc' to 'tmp,' which is what I
want, and sendmail should then use my config file instead
of the system one. But there's a slight
wrinkle to this. Sendmail does not allow negative flag numbers, and it checks for them. So,
if it sees flag -27, it balks and says, "What's this?" Now, we go to a quirk in the C
language about signed and unsigned numbers. Array indices in C may be negative, because
sometimes it's useful to be able to go to an array element that's before the first one;
there are complex situations where this happens. The key point, though, is that the array
index here is signed. Now, the way sendmail checks for negative numbers is that it looks for
a minus sign. If it sees one in the character string for the flag, then it says, "No, I am
not going to do that." So, when is a positive number equivalent to a negative number? On a
32-bit machine, what we can do is simply add 2 to the 32nd to it, and then we get a very
large positive number: -27 becomes 4294967269. Notice that the first character is a 4, not a
minus sign. Then -26 becomes 4294967270, and -25 becomes 4294967271. So, these will be read
in as positive numbers because there's no minus sign. But when they're used to index the
array of flags, each will be treated as a negative number, because the index is signed. As a
result, I can go backwards, in essence, and change things. That lesson, by the way, is very
useful. When you write your programs, don't just check for the sign; check to be sure that
the number itself is positive or zero if you want to avoid
negative numbers.

So, that's the data segment. Now, let's take
a look at the heap. The heap is very much like the stack, except that what you do is find
something in the heap which you can alter. Typically, this will be data from a function
called malloc, which allocates storage space. There are a couple of interesting things about
malloc. When you do a malloc, you get back a pointer to free space in the heap. But that
pointer is not actually the beginning of the space. In the 8 or 16 bytes before it, there's
information that malloc uses, for example, the length of the segment and so forth. That way,
when you free the segment, it can be joined with other segments very easily. But what we can
do is simply write data into the malloc space. If I can cause a branch into the heap, and
some systems do allow this, then I can execute it. The other thing I can do is play games
with malloc itself by changing the information that malloc uses to manage storage. Here's an
example. Here I have two segments of malloc space next to each other; they're contiguous.
Notice, though, the two different headers. So, what I can do is fill up the second malloc
space and then overflow it to go right into the first one. This will cause a problem if I
care about things being freed properly. Usually, when I'm attacking, I don't care about
things being freed properly. I'll happily accept the overwrite
of the first one's header. Okay, so given this,
what can I do? Basically, function pointers are probably the best thing to go for, because
in the heap you very often store addresses of functions. This is also true on the stack. If
I can change those while the program is executing, it will go to the wrong address, an
address under my control. These may be stored explicitly: you allocate an array and then
store the addresses of functions in it, or whatever. Or it can be implicit. The function
atexit is a good example. It stores a set of function pointers. When you exit, the functions
that those pointers point to will be executed before you leave. So, if I can find that set
of pointers, I can change them, and then when you leave the program, it'll execute something
I want. The other thing that you can do is look at fault handlers. These are often stored at
the beginning of the heap. So, what I can do is just keep overwriting things until I get to
the end, and now I've corrupted the fault handler. Again, this requires knowing something
about where allocations are performed and where they're placed and stored. Typically,
though, this is not an absolute address. It's an address relative to something else. Again,
you have to know what a good value is to be able to branch to the location you want and get
the results that you want.

So in the next slide, we have the general rule. Always assume input will
cause a buffer overflow. Always check lengths, and design to prevent this, so that if
something is too long, it will be handled properly. Don't trust it to be of the right form
or of the right length. If you're going to deal with arrays, which is usually where you
store things, because an array after all is really just a buffer, then when you use
functions on that array, make sure those functions understand about buffer bounds. So strcat
and strcpy, which concatenate two strings or copy a string from one location to another, are
prime examples of things that do not check length. Strcat appends the characters in the
second argument to the first, to make one long string. If that string happens to overflow
the bounds of the first, too bad; it just goes ahead and does it. Same thing with strcpy. If
you're copying from location A to location B, and the string in location A is longer than
the space allocated for B, you're going to get a buffer overflow. So that's why you use
fgets, or strncpy, or strncat, or snprintf, because those say, "Okay, we'll do the
operation, but when we hit n characters," and that's a parameter, "we're stopping." So you
don't get overflow. But there's a trick here again. If a function stops when it reaches the
end of the memory allocated to it, to prevent an overflow, it may not put a NUL byte at the
end of the string (strncpy, for example, does not), so you can go sailing
right off the end. You don't want that.
So always make sure in your programs that when you call those functions, there's a NUL byte
at the end of the string. An easy way to do this, by the way, is simply to put a NUL byte at
the end of the array representing the buffer. To find these functions, look in the manual,
or look at the descriptions of the functions. Look for those which handle arrays and do not
say that they check length. Many of these will say, "We handle arrays; we look for a
termination character." That's not enough, because if the termination character comes after
more characters than the array can hold, you're still going to get overflow. So check that,
and also check all array references, especially when they're in loops, but even when
they're not, to make sure that you're actually referencing a legitimate
element of the array.

Now, here's a fun little puzzle, and the suggested activity is going to ask you
to try this out. We have strncpy, which is going to copy n characters from string one to
string two. What happens when that n is negative? I say, "Copy minus five characters from
string one to string two." On most systems, the length argument to strncpy is unsigned. So
that negative five is translated into 2 to the 32nd minus five, for example. I've got a huge
string that I'm copying. The second one is a quirk of strncpy. If you read the manual, what
it says is that the source array and the target array cannot overlap. They must be separate.
If you use it and the two do overlap, what happens? Well, the behavior's undefined, mainly
because this copying is done by machine-language routines that handle things differently. So
the behavior is undefined, and whenever you see the word undefined, that means you should
avoid using it. Try to figure out how
to be more precise.

Now, buffer overflows also
occur in unlikely places. For example, error handlers. When an error handler is invoked, it
usually prints out some sort of an error message. You need to be sure that that error
message doesn't overflow buffers. I may be able to manipulate the environment in such a way
that, first of all, the message you print is bogus, or secondly, that more characters are
appended. For example, if the error handling routine prints the filename, I make it a very
long filename, longer than the buffer length, and that allows me to do the overflow.

Okay. So let's talk about a very well-known way to fix this, and it was developed by Crispin
Cowan and his colleagues. It's called canaries. Essentially, what you do is put a canary
right before the address you want to protect. So for example, on this next slide, this, if
you will recall, was the original one that I put up to show you how buffer overflows work.
Notice it's the same picture except with the word canary before the return address of main.
Now what happens is, when the program calls this function, it generates a random number and
puts it in before the return address. That's called the canary. Then on return, the random
number at the location marked canary is compared to the value that was put in there. If the
two match, the theory is that there's no buffer overflow, which is usually true. If the two
do not match, there is a huge problem, and the program will stop.

Now, after this was presented, some smart cookies tried to
figure out a way around it. One of them came up with a very ingenious way. This particular
person, if you look at the next slide, did an analysis of how things were stored in memory.
On this particular system, the exit address was stored essentially as a pointer, and it was
stored on the stack. So what happened here was essentially what we did earlier, except on a
larger scale. You're going to change that exit address to point into the buffer to execute
the shellcode. Now, what's interesting here is that the canary works when you overflow in
this way. On the return, the system pops the canary and then compares it to the number that
was generated, and the two do not match. That immediately invokes a trap, because you now
have to go to the signal handler and say, "Hey, look, this doesn't match; print an error
message and we'll quit." Well, you go ahead and print the error message and then you quit.
To quit, you call the exit address, but the exit address has been changed to be the address
of the buffer where your shellcode is stored. So that gets executed, and now you're in
control of the system. What's really interesting about this one is that it depends on a
failure and the exit. If you didn't do the exit,
this wouldn't work.