June 16, 2015

A DOM-based XSS primer

Note: this post is a very simple primer aimed at helping people I know personally who have trouble chasing up DOM-based XSS findings in Burp Suite or similar tools due to the daunting nature of this vulnerability class.

Cross-site scripting comes in many different flavours: persistent, reflected and DOM-based. This blog post acts as a primer on DOM-based cross-site scripting and how to effectively locate and exploit such vulnerabilities.

The "DOM" (Document Object Model), as defined by the W3C, is "an application programming interface (API) for valid HTML and well-formed XML documents. It defines the logical structure of documents and the way a document is accessed and manipulated." [1]

The important part of the above paragraph is the fact that the DOM defines the way a document is accessed and manipulated. In modern web applications, the DOM is usually actively accessed and manipulated through JavaScript.

So, in order to find instances of DOM-based XSS in web applications, we audit the JavaScript running on the page under test to see whether the DOM is being written to (or, as the W3C says, "manipulated") in a way that lets user input end up executing as malicious JavaScript.

The above may not make much sense at first, so a classic yet still widely found example may help:

<html>  
    <div id="myname">
        Welcome back, <span id="dynamicName"></span>.
    </div>
    <script>
            dynamicNameElement = document.getElementById("dynamicName");
            dynamicNameElement.innerHTML = location.hash.split('#')[1];
    </script>
</html>  

Let's suppose that the above is a snippet of code from an application that shows a welcome message with your name every time you log in. Let's also suppose that the developers thought it would be a good idea to do this on the client side instead of processing it on the server side and reflecting the name in the response HTML.

An innocent URL for this could look like http://example.com/welcome#shubs.

Once the page above is visited and the JavaScript on the page is executed, the original HTML will be manipulated in the DOM to look like this:

<html><head></head><body><div id="myname">  
        Welcome back, <span id="dynamicName">shubs</span>.
    </div>
    <script>
            dynamicNameElement = document.getElementById("dynamicName");
            dynamicNameElement.innerHTML = location.hash.split('#')[1];
    </script>
</body></html>  

The extra tags such as <head></head> and <body></body> are automatically inserted by the browser when it parses the page. What we are interested in, however, is the <span> element with the ID dynamicName, which now contains the text shubs.

The line dynamicNameElement.innerHTML = location.hash.split('#')[1]; obtains the value after the # in the URL and writes it to dynamicNameElement.innerHTML, that is, the innerHTML property of the span element with the ID dynamicName.
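To see what that expression evaluates to for different URLs, the parsing step can be pulled out into a standalone function and run outside the browser. This is only a rough sketch: nameFromHash is a hypothetical helper name that does not appear in the application, and the hash strings are passed in by hand rather than read from location.hash.

```javascript
// Hypothetical helper mirroring the expression used on the page:
//     location.hash.split('#')[1]
function nameFromHash(hash) {
    // split('#') cuts the string at every '#'. Element [0] is the (empty)
    // text before the first '#', element [1] is whatever follows it.
    return hash.split('#')[1];
}

console.log(nameFromHash('#shubs'));   // prints: shubs
console.log(nameFromHash(''));         // no fragment, prints: undefined
console.log(nameFromHash('#<img src=x onerror="prompt(document.location)"/>'));
```

Note that nothing is filtered or encoded along the way: whatever follows the first # is handed back verbatim, markup included.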

Where's the XSS? The above code is vulnerable because it assigns attacker-controllable input to the innerHTML property of an element. This is dangerous as innerHTML parses whatever it is given as HTML, so arbitrary markup can be inserted. A <script> tag inserted this way is not executed by the browser, but event handler attributes such as onerror are, which is why the payload here uses an <img> tag. Therefore, if I were to be malicious, I could insert arbitrary script through a URL such as http://example.com/welcome#<img src=x onerror="prompt(document.location)"/>.

The above URL would result in the following HTML being loaded in the DOM:

<html><head></head><body><div id="myname">  
        Welcome back, <span id="dynamicName"><img src="x" onerror="prompt(document.location)"></span>.
    </div>
    <script>
            dynamicNameElement = document.getElementById("dynamicName");
            dynamicNameElement.innerHTML = location.hash.split('#')[1];
    </script>
</body></html>  
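For contrast, here is a minimal sketch of the same welcome logic written against textContent instead of innerHTML. It is not part of the original application: showName is a hypothetical function name, and the element is passed in as a parameter so the logic can be exercised without a browser. Because textContent treats its value as plain text rather than parsing it as HTML, the <img> payload would show up as literal characters instead of becoming an element.

```javascript
// Hypothetical variant of the welcome snippet (an assumption for
// illustration, not the application's actual code).
function showName(element, hash) {
    // Same parsing as before; fall back to an empty string when the
    // URL has no fragment.
    var name = hash.split('#')[1] || '';

    // textContent stores the value as text, so any markup in `name`
    // is displayed literally rather than parsed into DOM nodes.
    element.textContent = name;
}

// In the browser this would be called as:
//     showName(document.getElementById('dynamicName'), location.hash);
```

When auditing, an assignment like this one is usually not an interesting sink, whereas the innerHTML version of the same line is.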

There you go. That's a primer on DOM-based XSS. However, I have only covered a single method of achieving DOM-based XSS - through the innerHTML property.

There are quite a few more ways in which this type of XSS is possible, and for those I would highly recommend studying the following resources:

https://code.google.com/p/domxsswiki/wiki/LocationSources
https://code.google.com/p/domxsswiki/wiki/Sinks
https://code.google.com/p/domxsswiki/wiki/StringManipulation

Often, when testing for DOM-based XSS in fully-fledged web applications, you'll find yourself crawling through JavaScript sources (where user input enters the script), sinks (where it is written out or executed) and the string-manipulating functions in between. Some knowledge of JavaScript (i.e. a little bit of programming in the language) helps a great deal.
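As a rough triage aid for that crawling, something like the following can flag lines of JavaScript that mention a well-known source or sink. Treat it as a sketch only: flagLines, SOURCES and SINKS are hypothetical names, the pattern lists are deliberately short and incomplete, and a regex match says nothing about whether data actually flows from the source to the sink. It narrows down where to start reading and does not replace reading the code.

```javascript
// Illustrative, incomplete pattern lists (assumptions for this sketch).
var SOURCES = [
    /location\.(hash|search|href)/,
    /document\.(URL|referrer)/,
    /window\.name/
];
var SINKS = [
    /\.innerHTML\s*=/,
    /\.outerHTML\s*=/,
    /document\.write(ln)?\s*\(/,
    /\beval\s*\(/
];

// Returns one entry per line that matches at least one pattern.
function flagLines(jsSource) {
    var findings = [];
    jsSource.split('\n').forEach(function (line, index) {
        var hasSource = SOURCES.some(function (re) { return re.test(line); });
        var hasSink = SINKS.some(function (re) { return re.test(line); });
        if (hasSource || hasSink) {
            findings.push({
                line: index + 1,      // 1-based line number
                source: hasSource,
                sink: hasSink,
                text: line.trim()
            });
        }
    });
    return findings;
}

// The two script lines from the welcome example above.
var sample = [
    'dynamicNameElement = document.getElementById("dynamicName");',
    "dynamicNameElement.innerHTML = location.hash.split('#')[1];"
].join('\n');

console.log(flagLines(sample));
```

Run against the welcome example, only the second line is flagged, and it is flagged as both a source and a sink, which is exactly the pattern worth chasing up by hand.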

Good luck, and contact me if you have a tricky potential DOM-based XSS somewhere that looks worth checking out.

P.S. The example above is for vanilla JavaScript. Do check out how the DOM can be manipulated in a malicious way through popular libraries such as jQuery and AngularJS :)
