What we're going to do now is explore some of the issues underlying robust programming. Then we're going to reconstruct that library but do it in a way that is very, very robust. We mentioned the basic principles already but it's good to review them. Paranoia says don't trust what you don't generate. Stupidity says if it can be called or invoked incorrectly, it will be. You saw some of the consequences of that in the previous video. Dangerous implements simply says if something is to remain constant across function calls, make sure no one else can access it. The queue library is a very good example of this because that ancillary information associated with the queue, the head, the count, the size and so forth has to remain constant or has to remain consistent across calls to the library functions. Can't happen is to check for a impossible errors because they maybe impossible now but maybe not in the future. A good way to think of this is program defensively. If you're in the United States, one of the things you learn when you're driving is to drive defensively. You always assume that another driver is going to make a mistake and you have to avoid the wreck. That's driving defensively. Programming defensively is exactly the same mindset. All right. So now let's dive in and see exactly how this affects our library. The first thing is the queue structure. That's a dangerous implement because its contents are supposed to remain consistent across calls to the library functions. Yet the caller can see inside there. So we want to avoid that. So how do we do this? What we do is we make the queue structure static, within the functions so that only those functions can access it. In C++ it will be private. Then we build an interface that gives the user some sort of a token, and when the user wants to refer to a queue, he or she passes in the token. The token is constructed in such a way that we can analyze it to determine whether or not it's valid token, and if it is valid, to which queue it is going to refer. We'll talk about how to do this later. This also solves the problem by the way of the double free that we talked about earlier because once I delete a queue, I can indicate that the token that's no longer valid. So if they pass in the token again to add something to that queue or delete something from the queue, I can immediately respond with, "There's no such queue, you already deleted it." Now, one question is: What should the value of these tokens be? Your immediate instinct is to say, oh, an integer from zero to whatever. The problem is that if you look at common areas and calling things, zero and one are typically the most common problems or most common parameters that are used in integers. So we'd like to avoid that so that if the caller does make a mistake we detect it immediately. So that's how we're going to construct the token. If you look at the next slide, here's an example. Let's say we have an array. We take the index that we were interested in, we add x1221 or some arbitrary x value to it. That's going to be represent the new index. So if you pass in anything less than what you've added to the index, it'll immediately be rejected because it's not a valid part of the token. The problem for preventing reuse of a token that corresponds to a deleted queue is to use something called a nonce. This is a monotonic increasing number. Every time you issue a nonce, it has to be at least one greater than the previous nonce. So what we're going to do is give each token in each queue a nonce. When we look at the token, once we find the queue it points to, once we get the index, we then going to compare the nonces and see whether or not they match. If they don't, then that's an old token that we ignored. If they do, then it's a valid token and we proceed. So again, the problem of zeros and ones arises that I mentioned earlier. So what we're going to do is add another arbitrary number to the nonce and this one is x0502. Then what we're going to do is store first the index with the offset, and then the nonce with the offset. On this I'm assuming a 32-bit machine, so we left shift the index by 16 and drop the nonce right in there. Okay, the lesson here is don't return pointers to internal structures like the fragile library did because the caller can dereference that pointer and get access to the internal structure. Here, they can't dereference anything. They get no access to the internal structure. All they can do is pass back a token, and even if they know how the token's constructed, it's not going to help them because they can use the index but they have no idea what the base is, where that array is located in memory. Therefore, they can't get access to it, and this is going to prevent a lot of problems when people make mistakes. Now, one question, this assumes 32-bits, what happens if we did this on a 64-bit machine? Well, the shifting would change, and the size of the index and the nonce might change. Some of the mechanisms for extracting the parts of the ticket would change, but that's it. Conceptually, this works. It's just some of the implementation details based on the increase from 32 bits to 64 bits. Okay, now, we have to deal with errors, and it's good idea to plan for this upfront. So, we need somehow to return a value that indicates whether or not the routine succeeded. This means we have to have something that will distinguish success from failure. The way we're going to do this is when a function is called and it returns, if the number it returns is non-negative, it succeeded. If the number it returns is negative, then there was an error, and we're going to use the specific negative number to indicate the error. So to do this, we're going to define some macros to make life easier, and then a buffer, the purpose of which I'll get to in a moment. We defined the macro QE as error, x, so that you just take the return value and drop it into the macro, and the macro would return one if it's an error and zero if it's not. So of course, none, zero mean no errors. But here's the other problem. What happens if the caller wants to print out an informative message saying what went wrong? Well, what are you going to do? Print minus three colon error? That violates one of the principles of usability and specifically of least astonishment because minus three, what the heck does that mean? With a caller would say or a user would say. So instead, what we're going to do is to associate an error message with each error code. When that error occurs, we're going to copy it into a QE error buffer that is visible to the caller. So when the caller detects an error, the error code will tell the caller what happened. If they want to print out a detailed message, they simply print out the value in that error buff. Now, notice that it's 256 characters long. The reason I can do this is because I control what is written into that array, and so there will never be a buffer overflow. If I were doing this another way, not for expository purposes which is what this is, what it would have done was made error buff simply something that's allocated so that whenever an error occurs, the current contents of error buff, the current memory that error buff points to is de-allocated. Then enough new memory is allocated to hold the error message. But that would complicate things quite a bit, and for exposition I want to avoid that additional complication. The next slide, error handling, goes into a little bit more detail. You can see the earlier part and then we've got this ERRBUF str. Those are the things that copy into ERRBUF, and everything, all the error messages are either a string or a string with one number to be written into it, or a string with two numbers to be written into it, hence, what you see there. Now, notice the str n copy. Sprintf doesn't have something similar in general. Almost all systems are with snprintf where the n is the number of characters in the buffer. So if the system I did this on had that, I would have used that instead. Now, there's another problem with using snprintf, a format string attack. A format string attack essentially gives you a doctored string, and from the printing that releases internal information or overwrites internal information. So why then I worry about it here? Why don't I check str to be sure that error won't occur? Because I'm the one who's doing str. All of the format strings are within the library. Therefore, I don't have to worry about anyone doing the substitution. If they can do the substitution, they can do a lot worse things, and the next slide shows this in a little bit of detail. Now, the other question is, notice that there's a qe errbuf 255 is null. The NUL byte, in ERR buff. Why do we need to add that? Well, the problem is strncpy does not guarantee it'll have a NUL byte at the end. It guarantees that it will copy up to the third argument number of characters, and that's it. If that's less than ERRBUF size, then the null byte will get copied. If it is exactly ERRBUF size, exactly the size of that ERRBUF, then the NUL byte will not get copied. So in that case I have to add it, and that's what the next line does, and the second slide on says that as well.