Some of these examples are old, some are new. The point is that all of them resulted from a failure to do proper validation. The Heartbleed bug is probably the best-known example of this, at least right now. It's important to understand exactly what Heartbleed was. When it came out, people were saying, "This means the death of the Internet." We've been hearing that since at least 1982, that various problems that are found will cause the Internet to become completely unusable and worthless, and it was no more true this time than it was back then. Heartbleed was a bug in the OpenSSL program. It was not a protocol bug; it was a bug in the implementation.

It arises because of the need to keep connections open. If the client and the server don't communicate for a period of time, one of them is going to wonder, well, gee, is the other one still working? In this case, the client sends a message to the server basically saying, "Hi, are you there?", and the server responds, "Yes, I am." This is called a heartbeat. The way the heartbeat worked in OpenSSL was that a packet would be sent from the client to the server. The packet consisted of the usual packet headers, and then a payload made up of a two-byte length followed by the actual payload data, which had to be at least 16 bytes long. When the server gets it, it responds: it takes a copy of the payload data it received, using the two-byte length field to decide how much data there is, and simply writes it back to the client. By the way, the server can also request a heartbeat from the client, and the client does exactly the same thing. So that's the protocol part.

Now, the implementation bug is described on the next slide. What happens if, say, the payload data is four bytes long? We'll see that in the header, but in the length stored in the payload we lie: we claim the length is very much bigger than the actual data stored in the packet body. The example here is 65535, but it can be anything up to two to the sixteenth minus one, because the length stored in the body of the packet is only 16 bits. So you send that over to the server, and here's where the bug arose. OpenSSL did not check the length. It should have, and it does now, because you can compare the length in the body of the packet with the length of the packet stored in the header; but it didn't do that. It trusted the value in the packet. So let's say I put 32 bytes of data in the packet and the length I gave was 65532. The OpenSSL server gets it and says, "Oh, gee, the length is 65532, so I'm just going to send back that many bytes." What happens is that when the packet comes in, the payload data is stored in a buffer of length 65535. The 32 bytes fill the first 32 bytes of the buffer, but the rest is never cleared; it still holds whatever data was there before. So when the server sends the heartbeat message back, it gives you the 32 bytes you sent plus, in this case, 65500 bytes from the rest of the buffer. Now you can pore through that and find all sorts of interesting things: cookies, decrypted passwords, stuff like that. That's why it's called Heartbleed: in essence, the buffer bleeds over into the client.
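To make the missing check concrete, here is a minimal sketch in C of the pattern just described. This is not the actual OpenSSL code; the function name, the record layout, and the omission of the padding bytes are simplifying assumptions based on the description above.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Simplified sketch of the heartbeat handling described above (not the
 * real OpenSSL code).  rec points at the heartbeat message: one type
 * byte, a two-byte payload length, then the payload itself; rec_len is
 * how many bytes actually arrived on the wire.
 */
unsigned char *build_heartbeat_response(const unsigned char *rec, size_t rec_len,
                                        size_t *resp_len)
{
    if (rec_len < 3)
        return NULL;                          /* not even a full header */

    uint16_t payload_len = (uint16_t)((rec[1] << 8) | rec[2]);   /* attacker-controlled */

    /*
     * The missing check: the claimed payload length must fit inside the
     * bytes that actually arrived.  The vulnerable code skipped this test
     * and trusted payload_len, so the memcpy() below read past the real
     * payload and leaked whatever followed it in memory.  Per the RFC,
     * a record whose lengths disagree is simply discarded.
     */
    if ((size_t)payload_len + 3 > rec_len)
        return NULL;

    unsigned char *resp = malloc(3 + (size_t)payload_len);
    if (resp == NULL)
        return NULL;
    resp[0] = 2;                              /* heartbeat response type */
    resp[1] = rec[1];
    resp[2] = rec[2];
    memcpy(resp + 3, rec + 3, payload_len);   /* echo back only real payload bytes */
    *resp_len = 3 + (size_t)payload_len;
    return resp;
}
```

The entire fix amounts to that one comparison between the claimed payload length and the number of bytes that actually arrived; the vulnerable version simply did not have it.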
Now, XKCD is a comic strip that I highly recommend. It had one of the best descriptions of the bug I've ever seen. It showed the user, Meg, talking to a server, and the conversation went like this. Meg says, "Server, are you there? If so, reply POTATO," and notes that's six letters. The server says, "Meg wants these six letters: POTATO," and sends them back. Meg then says, "Server, are you still there? If so, reply BIRD," and the server returns the four characters BIRD. Meg then says, "Server, are you still there? If so, reply HAT," but asks for 500 letters instead of just three. The server says, "Meg wants these 500 letters," and the first three are HAT; the rest are whatever happens to be in the buffer. In this case that includes a command to set the administrator password to "gleepsnork", so now Meg has the administrator password for the remote site. I highly encourage you to go to that website, xkcd.com; it's quite a bit of fun, it's very good on computer security, and it covers other computer science topics and other fun stuff too.

The point is that OpenSSL trusted the contents of the packet without verifying them; the rest of that slide simply says what I said earlier. If in fact the two lengths differ, the RFC that governs the heartbeat says you should ignore the heartbeat: just throw it away, don't respond. But OpenSSL responded.

Now, the effects of this. It sounds like a very arcane flaw that wouldn't bother most people or most places, and it's usually described as a flaw on the server. But as I said earlier, a server can request a heartbeat from a client, so clients were also vulnerable. It also turns out that OpenSSL is extremely widespread. It's freeware, written and maintained by a couple of people from Germany and England. So various people tested sites, and you can see a list there of sites on which the attack worked; others that used OpenSSL all immediately fixed the problem, but many other places had to scramble to fix it as well, because they used OpenSSL and OpenSSL had this severe problem. It turned out, by the way, that there was one good result. The people who were writing and maintaining OpenSSL were doing it as a volunteer effort; the code for the heartbeat was added basically in a day. As a result of this incident, people started giving them funds to hire others, so they could spend more time working on the program instead of doing OpenSSL in their spare time. So there was some social benefit from it.

Now, that's an example of an assumption: the server assumed that the length stored in the packet body was correct. We've already talked about the assumption that the client checks everything before sending it; clients may do that, but the server shouldn't trust it. Another common one is on Linux and Unix systems: you have to be root, the administrator, in order to send a message originating from a port with a number below 1024, that is, ports zero through 1023. Other systems don't share this convention. So any program that assumes a message coming from a port with one of those numbers is privileged, and therefore should be trusted, is making a big mistake, because without knowing the type of system and the way it is configured, you simply don't know that. Many other programs trust IP addresses for authentication. An IPv4 address is completely untrustworthy because it's very easy to forge. So if you're going to use an IP address for any kind of authentication, first of all, consider it very weak authentication, because it's not cryptographically protected. Secondly, when you get the IP address, do a lookup of the host name for that address, and then do a lookup of the IP addresses for that host name. This is sometimes called the double reverse lookup. The theory is that if someone is forging the IP address and there's anything in the message that indicates where it came from, that should match what's stored in the DNS. This assumption is not always valid, particularly with content delivery networks, which is why, if you're going to do it, you should be aware that you may get false negatives.
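Here is a minimal sketch of that double reverse lookup using the standard resolver calls. The function name is mine, and as just noted, a match is only weak evidence and a mismatch can be legitimate (content delivery networks, multi-homed hosts), so treat the result accordingly.

```c
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/*
 * "Double reverse" lookup: map the peer's address to a name, then map
 * that name back to addresses and make sure the original address is
 * among them.  Illustrative sketch only; a match is weak evidence, not
 * authentication, and false negatives are possible.
 */
bool double_reverse_ok(const struct sockaddr *peer, socklen_t peer_len)
{
    char host[1025];

    /* Reverse step: address -> name (NI_NAMEREQD fails if there is no PTR record). */
    if (getnameinfo(peer, peer_len, host, sizeof host, NULL, 0, NI_NAMEREQD) != 0)
        return false;

    /* Forward step: name -> addresses. */
    struct addrinfo hints, *res, *ai;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = peer->sa_family;
    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return false;

    bool match = false;
    for (ai = res; ai != NULL && !match; ai = ai->ai_next) {
        if (ai->ai_family != peer->sa_family)
            continue;
        if (peer->sa_family == AF_INET) {
            const struct sockaddr_in *a = (const struct sockaddr_in *)ai->ai_addr;
            const struct sockaddr_in *b = (const struct sockaddr_in *)peer;
            match = (a->sin_addr.s_addr == b->sin_addr.s_addr);
        } else if (peer->sa_family == AF_INET6) {
            const struct sockaddr_in6 *a = (const struct sockaddr_in6 *)ai->ai_addr;
            const struct sockaddr_in6 *b = (const struct sockaddr_in6 *)peer;
            match = (memcmp(&a->sin6_addr, &b->sin6_addr, sizeof a->sin6_addr) == 0);
        }
    }
    freeaddrinfo(res);
    return match;
}
```

getnameinfo() does the reverse (address to name) step and getaddrinfo() the forward (name to addresses) step; the check passes only when the original address shows up among the forward results.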
In general, if you don't have cryptographic protection, don't assume that a message is confidential or that it hasn't been changed in transit. With cryptography, that is generally a safe assumption. I emphasize "generally" because there are still problems that can arise from the way the cryptography is implemented, or from the infrastructure supporting it. But cryptography is a lot better than no cryptography for this purpose.

So when you're writing or analyzing a program, what should you look for? The first thing is the environment in which the program starts up. Is it one that meets the needs of the program? Is it one you can trust? Does your program trust it? Everything I said earlier applies to this. Secondly, if you're writing a server or a client, don't assume the other side will check anything. If you're writing a client for a server, do the checking at the client. If you're writing a server, do the checking at the server. Also, it's sometimes fun, if you've got a client that you're trying to improve or maintain, or even with your own client, to see what happens if the server sends an illegal response. See what the client does. What it should do is say something to the effect of "illegal response" and give some sort of error message, but in many cases it will do something else. See what you can do; it gets interesting.

The same goes for the server. With a server, be prepared to handle signals, such as a message coming in out of band or a connection being dropped, things like that. The reason is that most servers run with some privileges on the system: as daemon, or mail, or whatever. They normally should never run as root unless they need that access; sshd, the SSH daemon, would, for example. But in general, clients are unprivileged and servers have some privileges, even if it's only having an identity the system recognizes. So as a result, when a client talks to a server, it's in essence getting some sort of extra privilege, if nothing else the privilege of talking to an entity on that particular system to which the client's user does not have access. So you want to check everything very carefully there.
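As a small illustration of the dropped-connection point: on Unix-like systems, writing to a connection the peer has already closed raises SIGPIPE, which kills a server that has not prepared for it. The sketch below shows one common precaution; the function names are mine, and a real server would also log the event and clean up its per-connection state.

```c
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/*
 * Turn SIGPIPE into an ordinary error the server can check, rather than
 * a signal that terminates it.  Call once at startup.
 */
void install_signal_handling(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;      /* write() now returns -1 with errno == EPIPE instead */
    sigemptyset(&sa.sa_mask);
    sigaction(SIGPIPE, &sa, NULL);
}

/* Write the whole buffer, noticing a dropped connection instead of dying. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;         /* interrupted by some other signal: retry */
            return -1;            /* EPIPE and friends: the connection is gone */
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```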
Now, suppose you want to attack the client in order to give the user bogus information, or, even better, to cause the client to go to the wrong system; this is called a hijack attack. It's quite common with clients such as browsers, and there are a number of ways to do it, so there are a number of things your client has to guard against. The DNS, the Domain Name System, is the mechanism that, given a host name, returns the IP address, and, going the other way, given an IP address, returns the host name. So there's a trick called poisoning the DNS. If I have control over the DNS entry, what I can do is hand back an illegal name; in this example, the host name is going to be something like "knob;cp /bin/sh /etc/telnetd". Let's say I've poisoned the DNS so that the address 192.168.100.5 maps back to that bogus host name, and then I come in from host 192.168.100.5. The connection itself doesn't use the name; it only uses the IP address. But when the program goes out to resolve that address, to get a host name to put in the subject of the mail it sends to the administrator, the name it gets back is "knob;cp /bin/sh /etc/telnetd". What that does is send the message to the admin with the name "knob", but the semicolon ends that command, and the shell then executes what follows, which copies a command shell over the Telnet daemon. That basically means the next time I connect to the Telnet daemon, I immediately have root access. By the way, this of course has all been fixed now.

The next thing to do when you're writing a server is to ask yourself: what does the server trust? Here's another very good example, one that has long since been fixed. The ident protocol, which incidentally is the only one I've ever seen in which the Internet RFC, the Internet standard, says "not recommended", basically queries a host and asks, "Who made this connection?" The remote host then sends back information allowing you to identify the user, to some extent. We won't go into an analysis of this protocol; there are all sorts of problems with it. But one version of a mail daemon that accepted letters and delivered them would send an ident request back to the host from which it was getting the mail. You have no way of knowing what that host will return, which is why you need to check it, and this version of the mail server did not do the checking. So what the attacker could do is send a letter from a system she controls, and then, when the request for identity information comes in from ident, send back a very long reply that overflows sendmail's internal buffer for receiving it. At that point you can essentially control sendmail through a buffer overflow attack. It turned out, by the way, that there was another bug in the same program that did the same sort of thing, except that it would overflow a buffer in the logging daemon instead. So you had two points of attack, not just one. Incidentally, this ident mechanism was implemented to improve security. It just goes to show that even security mechanisms can have problems, and sometimes a badly implemented security mechanism is more dangerous than not using the mechanism at all.
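Here is a sketch of the kind of check that was missing: read the ident reply with an explicit bound and reject anything oversized, instead of copying whatever the remote host chooses to send into a fixed-size buffer. The function name, the 512-byte bound, and the framing are illustrative assumptions, not the actual sendmail code.

```c
#include <string.h>
#include <unistd.h>

/*
 * Read a one-line ident reply defensively.  The vulnerable mailer copied
 * the remote host's reply into a fixed-size buffer without checking its
 * length; here the read is bounded and anything oversized or
 * unterminated is rejected outright.  Protocol details are simplified.
 */
#define IDENT_REPLY_MAX 512      /* generous bound for a one-line reply */

int read_ident_reply(int sock, char *reply, size_t reply_size)
{
    char buf[IDENT_REPLY_MAX];
    size_t used = 0;

    while (used < sizeof buf - 1) {
        ssize_t n = read(sock, buf + used, sizeof buf - 1 - used);
        if (n <= 0)
            return -1;                       /* error, or connection closed early */
        used += (size_t)n;
        buf[used] = '\0';

        char *eol = strchr(buf, '\n');
        if (eol != NULL) {
            *eol = '\0';                     /* keep only the first line */
            if (strlen(buf) >= reply_size)
                return -1;                   /* still too big for the caller: reject */
            strcpy(reply, buf);              /* safe: length was just checked */
            return 0;
        }
    }
    return -1;                               /* no end of line within the bound: reject */
}
```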
Here's another example, this time on the system side. Let's take a look at how ps, the process status lister, works on Linux. Traditionally, what it does is look for the kernel file, open it, and look in the symbol table attached to the kernel for the address of the process table. Once it has that, it prints out what's in memory there. There are various other ways to do this as well; most Linux systems now will open the directory called /proc, look in there for the process ID, and get the status information from there. The point is, they pull the information out of a file. The question is: which file? On some systems you can control which file ps looks at, and this is useful when you're debugging kernels. But the problem is, how do I know that's the right file? The next slide shows an exploit. Essentially, what I do is supply a bogus kernel file and give it an address I'm really interested in, not the address of the real process table. Then I use ps to list the "processes" at that address. What's going to happen is that I'll get a bunch of information. It'll be badly formatted, so I'll need to play around with the exact address, but I'll get a bunch of information from inside the kernel that is quite probably confidential. So basically, given this attack, I can read anything in kernel memory.

This is a good example of the question: what can you trust? The bottom line is, if you didn't supply it, don't trust it; validate it. With ps here, if you're reading from a kernel file that's owned by root and is not world-writable, then you're fine, because if you can't trust root, don't worry about secure programming; that's the least of your problems. But if I'm reading from a file called "kernel" that is not owned by root, or that is world-writable, then anyone could have done anything to it, and therefore the information in it should be regarded with great suspicion. By the way, this doesn't apply just to the arguments you supply to a program like ps; it also includes the name of the program, because that's just treated as argument zero in a C program. Now, why is that important? Most programs tend to ignore it, right? Well, yes and no. Most programs do ignore it. But some of the critical ones, the ones that run with root privilege, tailor their behavior depending on how they're called. For example, the shell, if it's called as rsh or rbash, the restricted shell (the name depends on which system you're on), behaves very differently than if you call it as a plain shell. So that argument zero can play an important role. In particular, one of the mail daemons, which ran as root, had exactly this problem. When it caught a signal, it would jump to a signal handler, which would then re-execute the program with the appropriate flags and arguments to recover, and it would do so with the same privileges as the mail daemon; in this case, the mail daemon had administrative privileges. So what you did was change the name of the mail program, that argument zero, to be one of your own programs, run the mail program, and then send it a signal. What happens is that it then executes your program with the requisite privileges.

So that's one set of issues. Tied into it is the whole question of error messages and error handlers. The biggest problem we've seen with them is buffer overflow. The reason is that they often write error messages into a buffer, and then on return those messages are printed; sometimes the buffer is printed within the handler itself, or the buffer is handed to another program that will log it. Well, if I can somehow arrange for the error message to be longer than the buffer, I have a buffer overflow attack. Sometimes I can manipulate the environment to do this; more often I can manipulate file names or arguments or other things to do it. The bottom line is, again: check buffer lengths. Do the validation and verification.
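To make that last point concrete, here is a minimal sketch of the two versions of such an error handler; the function names and the 128-byte buffer are illustrative, not taken from any particular program.

```c
#include <stdio.h>
#include <string.h>

/*
 * Illustrative error handler.  The unsafe version assumes the file name
 * will fit in the buffer; anyone who controls the name (or another piece
 * of the environment that ends up in the message) controls how much gets
 * written, which is exactly the overflow described above.
 */
void log_open_error_unsafe(const char *filename)
{
    char msg[128];
    sprintf(msg, "cannot open %s", filename);    /* BUG: no length check */
    fputs(msg, stderr);
    fputc('\n', stderr);
}

/* Bounded version: the message is truncated rather than overflowing. */
void log_open_error(const char *filename)
{
    char msg[128];
    snprintf(msg, sizeof msg, "cannot open %s", filename);
    fputs(msg, stderr);
    fputc('\n', stderr);
}
```

The only difference between the two is the bounded write, and that one length check is exactly what the vulnerable handlers were missing. With that, let's go on to the next set of slides.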