Same-Origin Policy Part 1: Why we’re stuck with things like XSS and XSRF/CSRF

justin February 8th, 2007

The last few years have seen a constant rise in vulnerabilities like cross-site scripting (XSS), HTTP response splitting, and cross-site request forgery (XSRF or CSRF). While the vectors and exploits for each of these vulnerability classes vary, they all share one common thread: each exploits trust shared between a user and a website by circumventing the same basic protection mechanism, the same-origin policy.

In my experience most developers, and even many security people, don’t really know what the same-origin policy is. Worse yet, the rise of AJAX and mash-ups seems to have turned same-origin into something developers are trying to break. Complicating the issue further are the weaknesses in most browsers’ implementations of same-origin, leaving open questions about the effectiveness of the policy itself. So, I’ve decided to try to capture all of the information surrounding same-origin in one place. I also have my own thoughts on the value of the model itself, but I’ll save those for the end.

Background

The same-origin policy (also called the single-origin or same-site policy) was originally released with Netscape Navigator 2.0 and has been incorporated into every major browser since. Put simply, same-origin prevents a document or script loaded from one site of origin from manipulating properties of or communicating with a document loaded from another site of origin. In this case the term origin refers to the domain name, port, and protocol of the site hosting the document. The following table is from the original Netscape documentation on same-origin, and demonstrates how the same-origin policy would handle document manipulation by a script originating from http://store.company.com/dir/page.html.

Document Manipulation from http://store.company.com/dir/page.html

#  Target URL                                        Outcome  Reason
1  http://store.company.com/dir2/other.html          Success
2  http://store.company.com/dir/inner/another.html   Success
3  https://store.company.com/secure.html             Failure  Different protocol
4  http://store.company.com:81/dir/etc.html          Failure  Different port
5  http://news.company.com/dir/other.html            Failure  Different host
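
The comparison in the table can be sketched in a few lines of JavaScript. This is only an illustration of the protocol/host/port check, not how any browser actually implements it; the function names originOf and originMismatch are made up for this example, and the default-port table is an assumption added so that an omitted port and an explicit :80 compare equal.

```javascript
// Default ports per scheme (assumption: only http and https matter here).
const DEFAULT_PORTS = { "http:": "80", "https:": "443" };

// Break a URL into the three parts the policy compares.
function originOf(url) {
  const u = new URL(url);
  return {
    protocol: u.protocol,                               // e.g. "http:"
    host: u.hostname.toLowerCase(),                     // host names compare case-insensitively
    port: u.port || DEFAULT_PORTS[u.protocol] || "",    // fill in the scheme's default port
  };
}

// Return null when the origins match, otherwise the first part that differs.
function originMismatch(sourceUrl, targetUrl) {
  const a = originOf(sourceUrl);
  const b = originOf(targetUrl);
  if (a.protocol !== b.protocol) return "Different protocol";
  if (a.host !== b.host) return "Different host";
  if (a.port !== b.port) return "Different port";
  return null;
}

const source = "http://store.company.com/dir/page.html";
console.log(originMismatch(source, "http://store.company.com/dir2/other.html")); // null
console.log(originMismatch(source, "http://store.company.com:81/dir/etc.html")); // Different port
```

Note that the path plays no part in the decision, which is why rows 1 and 2 succeed.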

The examples above show the relatively straightforward restrictions concerning ports and protocols. The following list covers the major actions that cause the browser to check access against the same-origin policy:

  • manipulating browser windows
  • requesting URLs via XMLHttpRequest
  • manipulating frames (including inline frames)
  • manipulating documents (including those embedded using the object tag)
  • manipulating cookies

The above restrictions don’t limit all interaction, however. There is no limitation on including documents from other sources in HTML tag elements. It’s fairly common for images, style sheets, and scripts to be included from other domains. In fact, the only time same-origin explicitly restricts document retrieval is when XMLHttpRequest is used. That’s important to keep in mind when you look at different attacks against same-origin.

The Threat

Before delving into more specifics, it helps to outline the threats that the same-origin policy is intended to help prevent. Same-origin violations typically involve either hijacking an existing user session, issuing HTTP requests in the context of a user’s web session, or impersonating a legitimate site to steal a user’s credentials or other sensitive information (phishing). To sum it up, there are really two basic threats:

Impersonation of a Legitimate User
This threat is associated with violating the trust a website places in a remote user, allowing the attacker to initiate HTTP requests in the context of the remote user or impersonate the remote user entirely.

Impersonation of a Legitimate Website
This threat is associated with violating the trust a user places in a remote site by impersonating the site in whole or in part.

Parent Domain Traversal

One interesting quirk about the same-origin policy is that it provides exceptions for domains that share the same .domain.tld portion of their domain names. For example, documents from taossa.com cannot manipulate documents in google.com; however, the restriction is slightly different between www.taossa.com and the parent domain taossa.com. HTTP responses from www.taossa.com can set and read cookies for taossa.com, as described in this section from RFC 2965:

Host names can be specified either as an IP address or a HDN string. Sometimes we compare one host name with another. (Such comparisons SHALL be case-insensitive.) Host A’s name domain-matches host B’s if

  • their host name strings string-compare equal; or
  • A is a HDN string and has the form NB, where N is a non-empty name string, B has the form .B’, and B’ is a HDN string. (So, x.y.com domain-matches .Y.com but not Y.com.)

Note that domain-match is not a commutative operation: a.b.c.com domain-matches .c.com, but not the reverse.

The reach R of a host name H is defined as follows:

  • If
    • H is the host domain name of a host; and,
    • H has the form A.B; and
    • A has no embedded (that is, interior) dots; and
    • B has at least one embedded dot, or B is the string "local".
    then the reach of H is .B.
  • Otherwise, the reach of H is H.
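
To make the quoted rules a little more concrete, here is a rough JavaScript sketch of domain-match and reach. The function names are invented for this example, and the IP-address test is a simplifying assumption (dotted-quad IPv4 only); everything else follows the wording of the quoted text as closely as I can manage.

```javascript
// Rough test for a dotted-quad IPv4 literal; anything else is treated as a
// host domain name (HDN). This is a simplification for illustration.
function isIpAddress(host) {
  return /^\d{1,3}(\.\d{1,3}){3}$/.test(host);
}

// Does host A's name domain-match host B's?
function domainMatches(a, b) {
  a = a.toLowerCase();                      // comparisons are case-insensitive
  b = b.toLowerCase();
  if (a === b) return true;                 // the strings compare equal
  if (isIpAddress(a)) return false;         // A must be an HDN for the second rule
  // B must have the form .B' where B' is a non-empty HDN
  if (!b.startsWith(".") || b.length < 2 || isIpAddress(b.slice(1))) return false;
  // A must have the form N + B with a non-empty N
  return a.length > b.length && a.endsWith(b);
}

// The reach of host name H.
function reach(h) {
  if (isIpAddress(h)) return h;             // H must be a host domain name
  const firstDot = h.indexOf(".");
  if (firstDot <= 0) return h;              // not of the form A.B
  const b = h.slice(firstDot + 1);          // A is everything before the first dot
  const hasInteriorDot = b.slice(1, -1).includes(".");
  if (hasInteriorDot || b.toLowerCase() === "local") return "." + b;
  return h;
}

console.log(domainMatches("x.y.com", ".Y.com")); // true
console.log(domainMatches("x.y.com", "Y.com"));  // false
console.log(reach("www.taossa.com"));            // .taossa.com
console.log(reach("taossa.com"));                // taossa.com
```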

Parent domain name traversal is also possible through script. If called from a subdomain of taossa.com (such as www.taossa.com), the following statement will successfully change the parent domain to taossa.com:

document.domain = "taossa.com"

While all browsers allow the above statement, they don’t necessarily allow the inverse. Historically, IE, Safari, and Konqueror have allowed document.domain to be set back to the original domain name, but Mozilla and Opera have not. Where it is allowed, a script can repeatedly switch its domain name to enable communication with different origins. Abe Fettig extended this notion with a novel method of enabling cross-domain communication between scripts in different frames by using an inline frame as an intermediary. The technique works in all browsers and was presented as a method of simplifying AJAX communications. However, the security implications may be more interesting, because the technique provides a viable method for executing a script in the context of any remote server by resolving arbitrary IP addresses to a subdomain of an attacker-controlled domain. The results and limitations are very similar to those of DNS pinning attacks, because the requests do not include cookies or credentials for the target server when the attacker-provided domain of origin does not match the real domain of the target server. The technique may, however, prove interesting for JavaScript network scanning and other forms of browser instrumentation.
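
The one-way nature of the document.domain change can be modeled with a small check. This is a simplified sketch of the stricter behavior (moving up to a parent domain only), not code from any browser; canRelaxDomain is a hypothetical name, and the requirement that the new value keep at least one dot is my assumption about how a bare TLD gets rejected.

```javascript
// Would a script running on currentHost be allowed to set
// document.domain to requested, under the "parent only" rule?
function canRelaxDomain(currentHost, requested) {
  const cur = currentHost.toLowerCase();
  const req = requested.toLowerCase();
  if (req === cur) return true;             // setting the same value is a no-op
  // The new value must be a suffix on a label boundary and must not be a bare TLD.
  return cur.endsWith("." + req) && req.includes(".");
}

console.log(canRelaxDomain("www.taossa.com", "taossa.com")); // true
console.log(canRelaxDomain("taossa.com", "www.taossa.com")); // false (the inverse)
console.log(canRelaxDomain("www.taossa.com", "google.com")); // false
console.log(canRelaxDomain("www.taossa.com", "com"));        // false
```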

International Domains

In many cases parent domain traversal is perfectly reasonable. However, the situation gets a lot more confusing when you consider how authority varies between country TLDs. Consider the co.uk domain name suffix, which is effectively the top level for commercial entities in the UK, but it’s not an actual TLD. If the browser only restricts access to the domain.tld portion of the host name, it’s leaving the whole of the co.uk domains open to potential origin violations. Michal Zalewski discussed how this issue affects cookie handling in his 2006 paper on Cross Site Cooking, and Gervase Markham of the Mozilla development team is currently seeking help to shore up Mozilla’s handling of country domains.
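
The co.uk problem is easy to reproduce. The sketch below contrasts a naive "last two labels" rule with a rule that consults a list of suffixes that behave like TLDs. The function names and the three-entry suffix list are hypothetical and exist only for illustration; a real list would have to be far larger and carefully maintained.

```javascript
// Naive rule: treat the last two labels as the site's registered domain.
function naiveSite(host) {
  return host.toLowerCase().split(".").slice(-2).join(".");
}

// Hypothetical, deliberately tiny list of suffixes that act like TLDs.
const EXAMPLE_SUFFIXES = new Set(["co.uk", "org.uk", "com.au"]);

// Suffix-aware rule: keep one extra label when the last two form a known suffix.
function suffixAwareSite(host) {
  const labels = host.toLowerCase().split(".");
  const lastTwo = labels.slice(-2).join(".");
  const keep = EXAMPLE_SUFFIXES.has(lastTwo) ? 3 : 2;
  return labels.slice(-keep).join(".");
}

// Two unrelated companies look like the same site under the naive rule...
console.log(naiveSite("shop.example.co.uk"), naiveSite("www.attacker.co.uk"));
// co.uk co.uk

// ...but not once the suffix is taken into account.
console.log(suffixAwareSite("shop.example.co.uk"), suffixAwareSite("www.attacker.co.uk"));
// example.co.uk attacker.co.uk
```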

There’s also something of a hanging question about the growing use of international domain names (IDN) and punycode encoding. Punycode allows non-ASCII characters to be encoded in DNS names. The existing implementations appear to effectively restrict encoded characters that could violate the same-origin policy; however, it’s unclear how thoroughly these implementations have been validated. As such, the only uniquely new vector is the homograph attack presented by Eric Johanson, which isn’t really a same-origin violation.

External Scripts

Same-origin handling of external scripts requires some additional discussion. For example, assume that a site served by taossa.com includes a script from google.com. In this case, the script will execute in the context of taossa.com. Thus, the script will have unrestricted access to the page it’s included from and will be able to issue requests against taossa.com, but it will be unable to make XMLHttpRequest calls against google.com or manipulate google.com documents in other windows and frames. These external scripts are really quite common, and the example google.com relationship is how Google Analytics is used on this site. Of course, the risk associated with external scripts is that they implicitly grant unrestricted page access to a script from a remote site. In other words, compromising the script at google.com would allow you to attack users at taossa.com.

DNS Pinning

Port and protocol are typically not that hard to control, so the most important part of the same-origin decisions is usually the host’s domain name. However, DNS is not static, and host names could potentially resolve to different addresses over the course of a browsing session. Browsers use DNS pinning to prevent attackers from manipulating DNS timeouts to their advantage. DNS pinning means that once an address is returned for a host name it is used for the duration of the browsing session, regardless of the DNS timeout associated with the domain.
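
The idea behind pinning can be shown with a toy resolver cache. This is a simulation only; PinningResolver and makeChangingLookup are invented names, the lookup function stands in for real DNS, and the addresses are placeholders.

```javascript
// A resolver that pins the first address returned for each host name
// and reuses it for the rest of the (simulated) browsing session.
class PinningResolver {
  constructor(lookup) {
    this.lookup = lookup;        // function(hostName) -> address string
    this.pinned = new Map();     // host name -> pinned address
  }

  resolve(hostName) {
    const key = hostName.toLowerCase();
    if (!this.pinned.has(key)) {
      this.pinned.set(key, this.lookup(key));   // only the first answer is used
    }
    return this.pinned.get(key);
  }
}

// Stand-in for a DNS server whose answer changes after a short timeout:
// each call returns the next address in the list, then keeps repeating the last.
function makeChangingLookup(answers) {
  let i = 0;
  return () => answers[Math.min(i++, answers.length - 1)];
}

const resolver = new PinningResolver(makeChangingLookup(["203.0.113.10", "10.0.0.5"]));
console.log(resolver.resolve("attacker.example")); // 203.0.113.10
console.log(resolver.resolve("attacker.example")); // 203.0.113.10 (still pinned)
```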

DNS pinning implementations are far from perfect, and Martin Johns posted an approach to circumventing DNS pinning by denying access to the remote server. However, circumventing DNS pinning has limited utility as part of an attack because the origin is restricted to the attacker-supplied DNS name, not the real DNS name of the remote system. As such, the unpinned script will not have access to cookies or credentials associated with the true origin site, which means the attacker will most likely be unable to impersonate the victim.

Same-Origin Workarounds

There are a number of workarounds to the same-origin policy, some intentional and some not. One of the most popular intentional workarounds involves the use of Adobe/Macromedia Flash to issue XMLHttpRequest-style cross-domain requests. The Flash browser plugin permits cross-domain requests if allowed by a rule in the crossdomain.xml file, present in the root of the target webserver. This can be handled entirely through Flash or in normal scripts by loading a small SWF file as an XMLHttpRequest proxy. One interesting side effect of this is that an overly permissive crossdomain.xml file can leave a site completely unprotected against cross-site request forgeries because it effectively disables the same-origin policy.
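
To see why a permissive policy file is dangerous, consider a simplified model of how a list of allowed domains might be evaluated. This is a sketch under my own assumptions, not Flash’s actual matching logic: policyAllows and isOverlyPermissive are hypothetical helpers, and the wildcard handling ("*" for everything, "*.name" for subdomains) is deliberately minimal.

```javascript
// Would a requester loaded from requestingHost be allowed by the listed domains?
function policyAllows(allowedDomains, requestingHost) {
  const host = requestingHost.toLowerCase();
  return allowedDomains.some((pattern) => {
    const p = pattern.toLowerCase();
    if (p === "*") return true;                              // anyone at all
    if (p.startsWith("*.")) return host.endsWith(p.slice(1)); // any subdomain
    return host === p;                                        // exact host only
  });
}

// A policy containing a bare "*" opens the site to every origin.
function isOverlyPermissive(allowedDomains) {
  return allowedDomains.includes("*");
}

const tightPolicy = ["www.taossa.com", "*.partner.example"];
const loosePolicy = ["*"];

console.log(policyAllows(tightPolicy, "attacker.example")); // false
console.log(policyAllows(loosePolicy, "attacker.example")); // true
console.log(isOverlyPermissive(loosePolicy));               // true
```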

In the unintentional workaround category, there have been numerous vulnerabilities in the same-origin model implementations in popular browsers. The IE XMLHttpRequest object has had recent same-origin vulnerabilities allowing header forgery and the issuing of HTTP requests to arbitrary domains. Michal Zalewski recently proposed several methods of leveraging these issues for web-based attacks. The Flash plugin has also had recent vulnerabilities allowing the addition of arbitrary headers and HTTP requests.

Same-Origin Attacks

Up to this point we’ve discussed some vulnerabilities in same-origin policy implementations, but the truth is that the most common web attacks are directed against a functional same-origin implementation. So, the following vulnerability classes are all completely viable within the existing same-origin policy.

Cross-Site Request Forgery

Cross-site request forgery (CSRF, XSRF, or cross-site reference forgery) has only recently been regarded as a significant vulnerability. The first instance of this vulnerability was found in Zope and reported to Bugtraq as a confused deputy vulnerability. The gist of the original post is that a form can be provided from a malicious site and the browser can be made to submit the form to a trusted site with which the user has an active set of credentials. The following is the sequence of steps involved in a basic cross-site request forgery:

  1. Attacker posts a link to a malicious site on the targeted site (or the link is provided through some other means)
  2. Victim browses to the malicious website
  3. Malicious website entices the victim to submit a form with the action pointing to the target site
  4. Form submission is accepted if the victim is already authenticated to the site (likely if the victim clicked through from a link provided by the attacker)
  5. Form submission modifies sensitive data such as the victim’s password

Cross-site request forgeries are particularly interesting because they are a user-targeted attack that exploits the trust a webserver places in a client with an established session (via HTTP authentication or through cookies); other similar attacks typically exploit the trust relationship from the user to the site. CGISecurity.com provides an up-to-date FAQ on XSRF/CSRF, along with links to quite a bit of additional information.

There’s also a bit of debate on how to best prevent this vulnerability class. It’s generally accepted that any requests generating side effects should be performed only in a POST, which reduces the threat of scripted attacks. However, there’s a split as to whether it’s better to use the Referer header to validate the originating site and page of a request or whether more complex methods of validation are required. The Referer header provides the simplest approach, but is less resilient to same-origin implementation vulnerabilities and fails when the Referer header is filtered. More complex approaches (such as attaching a nonce to each POST sequence) will function properly regardless of Referer filtering and are more resistant to some implementation vulnerabilities in the same-origin model, but they complicate development and testing.
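
As a rough illustration of the nonce approach, the sketch below issues a random token when a form is rendered and checks it when the POST comes back. It assumes Node.js and a per-user session object supplied by whatever framework is in use; issueFormToken, verifyFormToken, and the formToken property are hypothetical names, and the single-use behavior is one possible design rather than a requirement.

```javascript
const nodeCrypto = require("crypto");

// Create a fresh random token, remember it in the session, and return it
// so it can be embedded in the form as a hidden field.
function issueFormToken(session) {
  session.formToken = nodeCrypto.randomBytes(16).toString("hex");
  return session.formToken;
}

// Accept a POST only if it carries the token issued to this session.
// A token is cleared after one successful use.
function verifyFormToken(session, submitted) {
  const expected = session.formToken;
  if (typeof expected !== "string" || typeof submitted !== "string") return false;
  const a = Buffer.from(expected);
  const b = Buffer.from(submitted);
  // timingSafeEqual requires equal lengths, so check that first.
  const ok = a.length === b.length && nodeCrypto.timingSafeEqual(a, b);
  if (ok) session.formToken = null;
  return ok;
}

const session = {};                           // stand-in for a real session store
const token = issueFormToken(session);
console.log(verifyFormToken(session, "guess")); // false: a forged form cannot know the token
console.log(verifyFormToken(session, token));   // true: the legitimate form submits it
console.log(verifyFormToken(session, token));   // false: the token was already used
```

A form served by a malicious site has no way to read the token under a working same-origin policy, which is what makes the forged submission fail.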

Cross-Site Scripting

Cross-site scripting (XSS) is really the opposite of a cross-site request forgery, although the end result is essentially the same. A cross-site scripting attack exploits the trust a user places in a website, making it a common vector for phishing and related attacks. Cross-site scripting occurs in two basic forms. There’s reflected cross-site scripting (first order), which occurs when an attacker can embed script in data rendered immediately to the victim as part of a GET or POST request. Then there’s stored cross-site scripting (second order), in which the attacker-supplied script is retained in long-term storage before being rendered to the victim. Reflected cross-site scripting tends to be easier to detect and exploit, though it requires more direct victim interaction, making the attack less reliable. Stored cross-site scripting is often more difficult to detect and exploit, though the attack is more reliable because it typically occurs without any victim interaction.

Most cross-site scripting attacks attempt to hijack the victim’s session key and smuggle it out by embedding it in an image URL or similar link. To combat this particular attack, Microsoft introduced a special HTTP-only flag (HttpOnly) for cookies in Internet Explorer 6 SP1. The server can explicitly set a cookie as HTTP-only, and client script in IE6 SP1 or above will be unable to access it. (By default cookies are scriptable as normal.) While that approach does complicate the exploit process, it doesn’t prevent an attacker from simply scripting all the operations they choose to perform and executing them in the victim’s context (effectively turning the attack into a combination of XSS and XSRF).
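
On the server side, the flag is just an extra attribute on the Set-Cookie header. The helper below is a minimal sketch; buildSetCookie is a made-up function, and the cookie name, value, and Path attribute are placeholders rather than anything from a particular framework.

```javascript
// Build a Set-Cookie header line, optionally marking the cookie HttpOnly
// so that supporting browsers keep it out of reach of client script.
function buildSetCookie(name, value, { httpOnly = false } = {}) {
  const parts = [`${name}=${encodeURIComponent(value)}`, "Path=/"];
  if (httpOnly) parts.push("HttpOnly");
  return "Set-Cookie: " + parts.join("; ");
}

console.log(buildSetCookie("session", "abc123", { httpOnly: true }));
// Set-Cookie: session=abc123; Path=/; HttpOnly

console.log(buildSetCookie("theme", "dark"));
// Set-Cookie: theme=dark; Path=/
```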

One interesting form of cross-site scripting includes vulnerabilities that allow the browser itself to be attacked. This form of XSS leverages bugs that allow script from a remote web page to be treated as coming from a trusted origin such as the local filesystem, the local security zone in IE, or the chrome in Mozilla browsers. Michal Zalewski recently disclosed a vulnerability of this type in the Firefox pop-up blocker.

Cross-site scripting is well researched with a wealth of information available. The CGISecurity XSS FAQ is an extremely useful resource on cross-site scripting, along with the OWASP documentation. For testing purposes, RSnake’s XSS cheat sheet also comes in extremely handy.

Cross-Site Tracing

Cross-site tracing (XST) is one of many cross-site scripting variants; however, this particular variant relies on a server configuration option instead of an application vulnerability. The attack simply uses the HTTP TRACE method to echo back an attacker-controlled content body. More information on this vulnerability is available in a whitepaper from Jeremiah Grossman at WhiteHat Security.

Web Cache Poisoning

Web cache poisoning is a same-origin violation that can result from a number of attacks. The attacks target the local browser cache or (more often) a remote caching proxy. The effective result is essentially the same as cross-site scripting, as a successful attack causes the browser to receive attacker-supplied data in the context of another domain. However, cache poisoning presents some unique variations including a direct path for session fixation attacks and denial of service. Amit Klein introduced the notion of cache poisoning in a 2004 paper presenting HTTP response splitting attacks and has continued to research different attack vectors and impacts.

HTTP Response Splitting

HTTP response splitting is essentially just another variation on cross-site scripting, but the injected text occurs in the HTTP response header instead of the entity body. Because the injection occurs before the content body, HTTP response splitting can actually have a variety of different impacts ranging from typical cross-site scripting to cache poisoning attacks. It also means that the exploit techniques can be significantly more difficult depending on the browser platform and inline HTTP devices.
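
The mechanics are easiest to see with a redirect that copies user input into a header. The sketch below is purely illustrative: unsafeRedirect and safeRedirect are hypothetical functions that build the response as a string, and rejecting CR/LF is shown as one common defensive check, not as a complete fix.

```javascript
// Vulnerable: user input is copied straight into a header line.
function unsafeRedirect(target) {
  return "HTTP/1.1 302 Found\r\nLocation: " + target + "\r\n\r\n";
}

// Safer: refuse any value containing a carriage return or line feed.
function safeRedirect(target) {
  if (/[\r\n]/.test(target)) {
    throw new Error("CR/LF not allowed in header value");
  }
  return unsafeRedirect(target);
}

// An attacker-supplied value that smuggles in a second header.
const injected = "/home\r\nSet-Cookie: session=attacker";

// Split the response head into lines to see what a client or proxy would parse.
const headerLines = unsafeRedirect(injected).split("\r\n").filter(Boolean);
console.log(headerLines.length); // 3
console.log(headerLines[2]);     // Set-Cookie: session=attacker
```

With two CR/LF pairs in a row the attacker can end the headers entirely and supply a body, or even a whole second response, which is where the cache poisoning variations come from.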

The general HTTP response splitting vector was introduced in Amit Klein’s 2004 whitepaper. Klein further explored the topic with a later whitepaper on HTTP response smuggling, which leverages HTTP request smuggling-type techniques to evade anti-response-splitting measures.

HTTP Request Smuggling

HTTP request smuggling is an interesting attack in that it sidesteps same-origin without attacking the browser or webserver directly. Instead, request smuggling attacks the minor variances between HTTP-handling devices and proxies. HTTP request smuggling has similar results to HTTP response splitting, in that it can be used as a form of cross-site scripting or cache poisoning. Chaim Linhart, Amit Klein, Ronen Heled and Steve Orrin presented a detailed whitepaper on HTTP request smuggling, available from Watchfire and summarized in other locations.

HTML and JavaScript Network Scanning

HTML and JavaScript network scanning is one technique that really highlights the weaknesses of the existing same-origin policy. The approach was introduced in a paper from SPI Dynamics, along with a proof of concept using the iframe element to open connections and trapping the onLoad event to determine if the connections were successful. This approach allows a malicious website to deliver a page that scans the internal network of the client browser and reports the results back to the site. Later variations on the attack included a scanner based entirely on static HTML with no scripting components, which works by interpreting delays in loading style sheets from internal network addresses. I expect we’ll only see more research into this area, especially with the growing popularity of AJAX.

My Ramblings

The same-origin policy is a core part of web security, yet it is in an abysmal state and has shown almost no growth to counter modern attacks. The policy itself lacks any real standard, and the only references are incomplete documentation and off-hand vulnerability disclosures. Many international domains have never been adequately protected by same-origin, and all of the above attacks can be used to bypass it.

New attack vectors and the rise of AJAX have pushed the web beyond the one-size-fits-all approach. Macromedia may have been headed in the right direction with the crossdomain.xml file, but it’s not fine-grained enough. I expect that the replacement for same-origin will need to include explicit, server-provided rules for handling requests to and from external sites. Next week I’ll discuss my ideas, along with other suggestions for fixing same-origin.
