What we're going to do
now is explore some of the issues underlying
robust programming. Then we're going to
reconstruct that library but do it in a way that
is very, very robust. We mentioned the basic principles already but it's
good to review them. Paranoia says don't trust
what you don't generate. Stupidity says if it can be called or invoked
incorrectly, it will be. You saw some of
the consequences of that in the previous video. Dangerous implements simply
says if something is to remain constant across
function calls, make sure no one
else can access it. The queue library is
a very good example of this because that
ancillary information associated with
the queue, the head, the count, the size and
so forth has to remain constant or has to remain consistent across calls
to the library functions. Can't happen is to check
for a impossible errors because they maybe impossible now but maybe not in the future. A good way to think of this
is program defensively. If you're in the United States, one of the things you
learn when you're driving is to drive defensively. You always assume that
another driver is going to make a mistake and you have
to avoid the wreck. That's driving defensively. Programming defensively is exactly the same
mindset. All right. So now let's dive in and see exactly how this
affects our library. The first thing is
the queue structure. That's a dangerous implement
because its contents are supposed to remain consistent across calls to
the library functions. Yet the caller can
see inside there. So we want to avoid that. So how do we do this? What we do is we make
the queue structure static, within the functions so that only those functions
can access it. In C++ it will be private. Then we build an interface that gives the user
some sort of a token, and when the user wants
to refer to a queue, he or she passes in the token. The token is constructed
in such a way that we can analyze it to determine whether
or not it's valid token, and if it is valid, to which
queue it is going to refer. We'll talk about how
to do this later. This also solves the problem by the way of the double
free that we talked about earlier because
once I delete a queue, I can indicate that the token
that's no longer valid. So if they pass in
the token again to add something to that queue or delete something from the queue, I can immediately respond with, "There's no such queue, you already deleted it." Now, one question is: What should the value of these tokens be? Your immediate
instinct is to say, oh, an integer from
zero to whatever. The problem is that if you look at common areas and
calling things, zero and one are typically the most common problems or most common parameters
that are used in integers. So we'd like to avoid
that so that if the caller does make a mistake
we detect it immediately. So that's how we're going
to construct the token. If you look at the next slide,
here's an example. Let's say we have an array. We take the index that
we were interested in, we add x1221 or
some arbitrary x value to it. That's going to be
represent the new index. So if you pass in anything less than what you've
added to the index, it'll immediately
be rejected because it's not a valid part
of the token. The problem for preventing reuse of a token that corresponds to a deleted queue is to use
something called a nonce. This is a monotonic
increasing number. Every time you issue a nonce, it has to be at least one greater than the previous nonce. So what we're going to do is give each token in each queue a nonce. When we look at the token, once we find the queue
it points to, once we get the index, we then going to
compare the nonces and see whether or
not they match. If they don't, then that's
an old token that we ignored. If they do, then it's a
valid token and we proceed. So again, the problem
of zeros and ones arises that I
mentioned earlier. So what we're going to do is add another arbitrary number to the nonce and this one is x0502. Then what we're
going to do is store first the index with the offset, and then the nonce
with the offset. On this I'm assuming
a 32-bit machine, so we left shift the index by 16 and drop the nonce
right in there. Okay, the lesson here is don't return pointers
to internal structures like the fragile library
did because the caller can dereference that pointer and get access to
the internal structure. Here, they can't
dereference anything. They get no access to
the internal structure. All they can do is
pass back a token, and even if they know how
the token's constructed, it's not going to help
them because they can use the index but they
have no idea what the base is, where that array is
located in memory. Therefore, they can't
get access to it, and this is going
to prevent a lot of problems when people
make mistakes. Now, one question,
this assumes 32-bits, what happens if we did
this on a 64-bit machine? Well, the shifting would change, and the size of the index
and the nonce might change. Some of the mechanisms for extracting the parts of
the ticket would change, but that's it.
Conceptually, this works. It's just some of the
implementation details based on the increase from
32 bits to 64 bits. Okay, now, we have
to deal with errors, and it's good idea to
plan for this upfront. So, we need somehow to return a value that indicates whether or not
the routine succeeded. This means we have to
have something that will distinguish success from failure. The way we're going to do this is when a function is
called and it returns, if the number it returns is
non-negative, it succeeded. If the number it returns is negative, then
there was an error, and we're going to use
the specific negative number to indicate the error. So to do this, we're going to define some macros
to make life easier, and then a buffer, the purpose of which I'll get to in a moment. We defined the macro
QE as error, x, so that you just take
the return value and drop it into the macro, and the macro would return one if it's an error
and zero if it's not. So of course, none, zero mean no errors. But here's the other problem. What happens if
the caller wants to print out an informative message
saying what went wrong? Well, what are you going to do? Print minus three colon error? That violates one of
the principles of usability and specifically of least
astonishment because minus three, what the heck does that mean? With a caller would say
or a user would say. So instead, what we're
going to do is to associate an error message
with each error code. When that error occurs, we're going to copy it into a QE error buffer that is
visible to the caller. So when the caller
detects an error, the error code will tell
the caller what happened. If they want to print
out a detailed message, they simply print out
the value in that error buff. Now, notice that it's
256 characters long. The reason I can do this is because I control what is
written into that array, and so there will never
be a buffer overflow. If I were doing this another way, not for expository purposes
which is what this is, what it would have done was made error buff simply
something that's allocated so that
whenever an error occurs, the current contents
of error buff, the current memory
that error buff points to is de-allocated. Then enough new memory
is allocated to hold the error message. But that would complicate
things quite a bit, and for exposition I want to avoid that
additional complication. The next slide, error handling, goes into a little bit
more detail. You can see the earlier part
and then we've got this ERRBUF str. Those are the things that copy into ERRBUF, and everything, all the error messages are either a string or a string with one number
to be written into it, or a string with two numbers
to be written into it, hence, what you see there. Now, notice the str n copy. Sprintf doesn't have
something similar in general. Almost all systems are with
snprintf where the n is the number of characters
in the buffer. So if the system I
did this on had that, I would have used that instead. Now, there's another problem
with using snprintf, a format string attack. A format string attack essentially gives you
a doctored string, and from the printing that releases internal information or overwrites
internal information. So why then I worry
about it here? Why don't I check str to be
sure that error won't occur? Because I'm the one
who's doing str. All of the format strings
are within the library. Therefore, I don't
have to worry about anyone doing the substitution. If they can do the substitution, they can do a lot worse things, and the next slide shows this
in a little bit of detail. Now, the other question is, notice that there's
a qe errbuf 255 is null. The NUL byte, in ERR buff. Why do we need to add that? Well, the problem is strncpy does not guarantee it'll
have a NUL byte at the end. It guarantees that
it will copy up to the third argument number of
characters, and that's it. If that's less than ERRBUF size, then the null byte
will get copied. If it is exactly ERRBUF size, exactly the size of that ERRBUF, then the NUL byte
will not get copied. So in that case I have to add it, and that's what
the next line does, and the second slide
on says that as well.