Let's take a look at
the login program now. What are the goals? Well, the first goal of
the login program is to only allow authorized users
onto the system. That's what the password and login prompts are supposed to do. They check that
the password belongs to the user whose name is given. The second thing is, you start
off as the user root. You don't want to stay that user, so as soon as you know
who the real user is, you drop privileges
because that prevents the user who's logging in from doing things
as the administrator. The third thing is, if something goes wrong we want
to know about it, so we log enough information to reconstruct
any unauthorized login. The only thing we don't
record is the password. Why? Because if
someone's going through the log files and
sees the password, they can easily impersonate the original user and
we want to avoid that. Okay, given these goals, what is the environment and
what are the assumptions? Well, the first thing is, since we are going to
authenticate the user, we absolutely need to know where the authentication data is and, given the password,
how we validate it. On Linux and Unix the usual way is through
a password file of some kind, /etc/passwd in the old days, /etc/shadow in the newer days. Some systems though use Kerberos, others use distributed
authentication tools like AgRom or OAuth2 interfaces
or things like that. We're going to assume that
all this data is up-to-date. One of the most common
errors a company makes is that when someone leaves, the company doesn't change that person's password, so the person can
still get back in. The third question
we have here is, does it use
environment variables? These are variables
describing aspects of the system or of the environment that affect
how the program runs. The login program also needs to set
the correct environment. This matters not so much
when you log in initially as when you log in
while someone else is using the terminal
or the connection. In the first case you get
a pristine environment, but in the second case,
where someone has a session going
and you type login and your name to
log in as yourself, the environment may be
inherited from that session, so you want to make
sure it's not. You want to make
sure login always cleans out the environment. Now here come a couple
of riddles and examples. If you look at the code here, first of all it gets the host name from the
environment and does something. It then checks for an error
and does something else. Once it has that name, it
compares it to host one, and if they match it uses a protocol called
S/Key to authenticate. If it's host two, it
uses Kerberos, and if it's host three, it
uses a shadow file. What is the problem
with this code? Well, the big problem
is getenv("HOST"). HOST is an environment
variable, which means it's under the control of
the person running the program. So if that person, for example, doesn't have
access to S/Key but is on the Kerberos list, they can force Kerberos to be used simply by setting
HOST to host two, whereas it really
should be host one. So this effectively lets the user decide how they
want to authenticate. If you are going to do this, don't pull the host name
from the environment; use something like gethostname or gethostbyname, whichever your system supports,
to get the host name, because that's done by
the kernel, not by the user. Now, here's another one that actually existed, and it recurred, which is
the reason I bring it up. This part of the code was in
the login program. First it said always
authenticate, but then it looked at
the options to login. If you gave it the -n flag, it would turn off
authentication, so it would immediately log you in
as whoever you claimed to be. Now, the reason this
was done was in the early days of
window systems, what used to happen
was they would have you log in to each window. That was painful, so
they called login -n as each window started so you'd automatically be logged in. The problem here is, of
course, a bad assumption: that -n would never
be invoked by the user. When you sit down to log
in as a regular user, you're presented with
the login prompt, and there's no way to give
the -n option. But once I log in normally, what I can do is
type login -n root. The -n says
don't authenticate, so I immediately get access to the system
administration account. That's why you don't
want to do this. This occurred if I remember correctly in the late eighties. Then sometime in
the late nineties, people were looking through the code of a particular system and noticed it happened to use the same code as was in
the earlier library. So, they tried this and it turned out it
was a different flag, I forget which one, but
exactly the same thing happened. So these errors do recur. Now, here's another problem, where you're going to mail a message to staff
saying, "Send help soon." The way you're
going to do this in this program is you're going to open a pipe to a shell, and the shell will execute
the command mail staff as though it were typed
at the terminal window. The w means open for writing, so then you're going to
write "Send help soon," close it, and when you close it, popen ends the shell, which will cause
the mail to be sent. There are at least three
problems in this code. The first one is, how do you know which mail
program it's getting? On this system you control
that with the PATH variable. It's an environment variable, so it's under the user's control. So the user could set it to a directory containing
his or her own mail program. The second one is,
how do you know the shell command
works as expected? It may be that when
the shell starts up, the default shell runs a small startup script that gives me whatever
program I want. If this is running with root privileges and privileges
aren't dropped, that means I can do
whatever I want as root. The third thing is, even
if I fix this to say /usr/bin/mail, or wherever the mail program
is located, how do you know that I can't somehow change
the interpretation of slashes from directory separator to blank? Then /bin/mail would be read by
the shell that you invoke in this popen as
bin with the argument mail. I can easily write
a script called bin to do all sorts of interesting things
when run from this. The way to do that is through a shell variable called the
Internal Field Separator. Many shells no longer
let you set that, specifically because of the type of attack I'm talking about. But if the Internal Field
Separator, or IFS, has a slash in it, that means a slash
is interpreted as a blank, or actually a word
separator, to be technical. One last puzzle. Here's the code. What I'm doing is resetting PATH to a known safe value. So, I am going through
an environment list called environ, which I
got from the call. I'm looking for
the first five characters of any variable being PATH=. If I find it, then I'm simply going to replace
it with my own path, presumably to make things safe. Then I'm going to
run a program which invokes the shell and sends "Hi there" to me. This looks good until
you start asking, "Why am I assuming there's
only one PATH there?" Because when I find
the PATH environment variable, I break at the first one. What happens if there's a second? It's actually fairly
easy to do that. All you do is pull in
the current environment list, create a new array
and copy it over, tack on a new value of PATH,
and then spawn the shell. It's very enlightening,
because some shells go through the list from zero and
stop at the first one. Others go through it
and stop at the last one. Which is which? It depends, so that's another thing you have to be somewhat careful of. So this brings us to
a checklist, and again, these checklists are not meant as "if you do this, you'll be safe." What checklists
are meant to do is simply help you focus attention. You'll find there are times when the checklist
should be ignored. That's fine; you should just
be able to justify it. But the big checklist
item here is, what will I be getting from the environment or
from the users?
How do I check for validity, how do I know if it's bogus, and what assumptions am I making? Ask yourself those
questions, and they will help you uncover potential problems
with your program. The second checklist item is, what am I assuming about
library functions? The big issue here
is not so much what the functions do
when they return, but what they
do as side effects. What assumptions do
the library functions make? What information do
they obtain from remote servers or
the environment? Many libraries use
environment variables as well, and some will go out
to remote servers and ask questions. What information do they get? The third one is, does the library function do
what the manual claims? Read the manual very carefully, because at times the wording
is very imprecise and other times it's very precise
and if you don't read it precisely,
you'll get messed up.
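
Going back to the host name riddle, here is a minimal sketch in Python of choosing the authentication method from the host name the system reports instead of from the environment. The host names and method labels in the table are hypothetical stand-ins for whatever the original code compared against, and refusing unknown hosts is my assumption, not something the original code is known to do.

```python
import socket

# Hypothetical mapping from short host name to authentication method.
AUTH_METHODS = {
    "host1": "skey",
    "host2": "kerberos",
    "host3": "shadow",
}


def choose_auth_method(hostname=None):
    """Pick the authentication method for this machine.

    By default the name comes from socket.gethostname(), which asks the
    system, never from an environment variable the user could have set.
    The hostname parameter exists only so the lookup can be exercised.
    """
    if hostname is None:
        hostname = socket.gethostname()
    # Compare on the short name so "host1.example.com" matches "host1".
    short_name = hostname.split(".", 1)[0]
    try:
        return AUTH_METHODS[short_name]
    except KeyError:
        # Assumption: an unknown host is refused rather than given a default.
        raise LookupError("no authentication method configured for " + short_name)
```

Nothing in here looks at HOST, so setting HOST to host two before running the program changes nothing.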
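
For the mail example, one way to sidestep all three problems is to not involve a shell at all. This is a rough Python sketch, assuming the mail program lives at /usr/bin/mail (check your own system) and that a two-entry environment is enough for it. The names send_message and SAFE_ENV are made up for illustration.

```python
import subprocess

# Assumed known-safe environment; nothing is inherited from the caller.
SAFE_ENV = {"PATH": "/usr/bin:/bin", "IFS": " \t\n"}


def send_message(argv, body):
    """Run a program directly, with no shell, and feed body to its stdin."""
    if not argv or not argv[0].startswith("/"):
        # A bare name like "mail" would be looked up through PATH.
        raise ValueError("program must be named by an absolute path")
    # A list argv with the default shell=False means no shell parses the
    # command, so IFS and shell startup files play no part, and the
    # absolute path means PATH is not searched.
    return subprocess.run(
        argv,
        input=body,
        env=SAFE_ENV,
        text=True,
        capture_output=True,
        check=True,
    )


# Intended use, assuming /usr/bin/mail exists on the system:
# send_message(["/usr/bin/mail", "staff"], "Send help soon.\n")
```

This still trusts whatever file sits at that absolute path, so the checklist question about what you are assuming does not go away. It just gets narrower.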
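
And for the PATH puzzle, rather than hunting for the first PATH entry and patching it, a safer pattern is to build a fresh environment and copy over only what you have decided to trust. A minimal sketch, assuming the raw environment arrives as a list of NAME=value strings the way environ does in C. The safe values and the allowlist are assumptions for illustration.

```python
# Assumed safe values; these always win over anything inherited.
SAFE_DEFAULTS = {"PATH": "/usr/bin:/bin", "IFS": " \t\n"}

# Hypothetical allowlist of variables worth carrying over.
ALLOWED = ("TERM", "TZ")


def clean_environment(raw_env):
    """Build a new environment from a list of 'NAME=value' strings.

    The result is a dict, so each name appears exactly once no matter
    how many duplicates the raw list contains.
    """
    clean = dict(SAFE_DEFAULTS)
    for entry in raw_env:
        name, sep, value = entry.partition("=")
        if not sep:
            continue  # malformed entry with no "="
        if name in ALLOWED and name not in clean:
            # First occurrence of an allowed name wins; later ones are ignored.
            clean[name] = value
    return clean
```

The result can be handed to whatever starts the child program, for example as the env argument of subprocess.run, so a second PATH tacked onto the end of the list never reaches the shell.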