Automated Discovery of Parameter Pollution Vulnerabilities in Web Applications
Marco Balduzzi, Carmen Torrano Gimenez, Davide Balzarotti, and Engin Kirda§
Institute Eurecom, Sophia Antipolis
{balduzzi,balzarotti,kirda}@eurecom.fr
Spanish National Research Council, Madrid
§ Northeastern University, Boston
Abstract
In the last twenty years, web applications have grown
from simple, static pages to complex, full-fledged dynamic
applications. Typically, these applications are built using
heterogeneous technologies and consist of code that runs
both on the client and on the server. Even simple web ap-
plications today may accept and process hundreds of dif-
ferent HTTP parameters to be able to provide users with
interactive services. While injection vulnerabilities such as
SQL injection and cross-site scripting are well-known and
have been intensively studied by the research community, a
new class of injection vulnerabilities called HTTP Parame-
ter Pollution (HPP) has not received as much attention. If
a web application does not properly sanitize the user input
for parameter delimiters, exploiting an HPP vulnerability,
an attacker can compromise the logic of the application to
perform either client-side or server-side attacks.
In this paper, we present the first automated approach for
the discovery of HTTP Parameter Pollution vulnerabilities
in web applications. Using our prototype implementation
called PAPAS (PArameter Pollution Analysis System), we
conducted a large-scale analysis of more than 5,000 pop-
ular websites. Our experimental results show that about
30% of the websites that we analyzed contain vulnerable
parameters and that 46.8% of the vulnerabilities we discov-
ered (i.e., 14% of the total websites) can be exploited via
HPP attacks. The fact that PAPAS was able to find vulnera-
bilities in many high-profile, well-known websites suggests
that many developers are not aware of the HPP problem.
We informed a number of major websites about the vulner-
abilities we identified, and our findings were confirmed.
1 Introduction
In the last twenty years, web applications have grown
from simple, static pages to complex, full-fledged dynamic
applications. Typically, these applications are built using
heterogeneous technologies and consist of code that runs
on the client (e.g., Javascript) and code that runs on the
server (e.g., Java servlets). Even simple web applications
today may accept and process hundreds of different HTTP
parameters to be able to provide users with rich, interactive
services. As a result, dynamic web applications may con-
tain a wide range of input validation vulnerabilities such
as cross site scripting (e.g., [4, 5, 34]) and SQL injec-
tion (e.g., [15, 17]).
Unfortunately, because of their high popularity and a
user base that consists of millions of Internet users, web
applications have become prime targets for attackers. In
fact, according to SANS [19], attacks against web applica-
tions constitute more than 60% of the total attack attempts
observed on the Internet. While flaws such as SQL injec-
tion and cross-site scripting may be used by attackers to
steal sensitive information from application databases and
to launch authentic-looking phishing attacks on vulnerable
servers, many web applications are being exploited to con-
vert trusted websites into malicious servers serving content
that contains client-side exploits. According to SANS, most
website owners fail to scan their application for common
flaws. In contrast, from the attacker’s point of view, auto-
mated tools, designed to target specific web application vul-
nerabilities simplify the discovery and infection of several
thousand websites.
While injection vulnerabilities such as SQL injection and
cross-site scripting are well-known and have been inten-
sively studied, a new class of injection vulnerabilities called
HTTP Parameter Pollution (HPP) has not received as much
attention. HPP was first presented in 2009 by di Paola and
Carettoni at the OWASP conference [27]. HPP attacks con-
sist of injecting encoded query string delimiters into other
existing parameters. If a web application does not prop-
erly sanitize the user input, a malicious user can compro-
mise the logic of the application to perform either client-
side or server-side attacks. One consequence of HPP attacks
is that the attacker can potentially override existing hard-
coded HTTP parameters to modify the behavior of an appli-
cation, bypass input validation checkpoints, and access and
possibly exploit variables that may be out of direct reach.
In this paper, we present the first automated approach for
the discovery of HTTP Parameter Pollution vulnerabilities
in web applications. Our prototype implementation, that we
call PArameter Pollution Analysis System (PAPAS), uses a
black-box scanning technique to inject parameters into web
applications and analyze the generated output to identify
HPP vulnerabilities. We have designed a novel approach
and a set of heuristics to determine if the injected parame-
ters are not sanitized correctly by the web application under
analysis.
To the best of our knowledge, no tools have been pre-
sented to date for the detection of HPP vulnerabilities in
web applications, and no studies have been published on
the topic. At the time of the writing of this paper, the most
effective means of discovering HPP vulnerabilities in web-
sites is via manual inspection. At the same time, it is unclear
how common and significant a threat HPP vulnerabilities
are in existing web applications.
In order to show the feasibility of our approach, we used
PAPAS to conduct a large-scale analysis of more than 5,000
popular websites. Our experimental results demonstrate
that there is reason for concern as about 30% of the websites
that we analyzed contained vulnerable parameters. Further-
more, we verified that 14% of the websites could be ex-
ploited via client-side HPP attacks. The fact that PAPAS
was able to find vulnerabilities in many high-profile, well-
known websites such as Google, Paypal, Symantec, and Mi-
crosoft suggests that many developers are not aware of the
HPP problem.
When we were able to obtain contact information, we
informed the vulnerable websites of the vulnerabilities we
discovered. In the cases where the security officers of the
concerned websites wrote back to us, our findings were con-
firmed.
We have created an online service based on PAPAS (currently in beta version) that allows website maintainers to scan their sites. As proof of ownership of a site, the website owner is given a dynamically-generated token that she can put in the document root of her website. (The PAPAS service is available at: http://papas.iseclab.org)
In summary, the paper makes the following contribu-
tions:
We present the first automated approach for the detec-
tion of HPP vulnerabilities in web applications. Our
approach consists of a component to inject parameters
into web applications and a set of tests and heuristics to
determine if the pages that are generated contain HPP
vulnerabilities.
We describe the architecture and implementation of the
prototype of our approach that we call PAPAS (PA-
rameter Pollution Analysis System). PAPAS is able to
crawl websites and generate a list of HPP vulnerable
URLs.
We present and discuss the large-scale, real-world ex-
periments we conducted with more than 5,000 popu-
lar websites. Our experiments show that HPP vulner-
abilities are prevalent on the web and that many well-
known, major websites are affected. We verified that at
least 46.8% of the vulnerabilities we discovered could
be exploited on the client-side. Our empirical results
suggest that, just like in the early days of cross site
scripting and cross site request forgery [1], many de-
velopers are not aware of the HPP problem, or that they
do not take it seriously.
The paper is structured as follows: The next section gives
an explanation of parameter pollution attacks and provides
examples. Section 3 describes our approach and presents
the main components of PAPAS. Section 4 presents and
discusses the evaluation of PAPAS. Section 5 lists related
work, and Section 6 briefly concludes the paper.
2 HTTP Parameter Pollution Attacks
HTTP Parameter Pollution attacks (HPP) have only re-
cently been presented and discussed [27], and have not re-
ceived much attention so far. An HPP vulnerability allows
an attacker to inject a parameter inside the URLs generated
by a web application. The consequences of the attack de-
pend on the application’s logic, and may vary from a simple
annoyance to a complete corruption of the application’s be-
havior. Because this class of web vulnerability is not widely
known and well-understood yet, in this section, we first ex-
plain and discuss the problem.
Even though injecting a new parameter can sometimes
be enough to exploit an application, the attacker is usually
more interested in overriding the value of an already exist-
ing parameter. This can be achieved by “masking” the old
parameter by introducing a new one with the same name.
For this to be possible, it is necessary for the web applica-
tion to “misbehave” in the presence of duplicated parame-
ters, a problem that is often erroneously confused with the
HPP vulnerability itself. However, since parameter pollu-
tion attacks often rely on duplicated parameters in practice,
we decided to study the parameter duplication behavior of
applications, and measure it in our experiments.
2.1 Parameter Precedence in Web Applications
During the interaction with a web application, the client
often needs to provide input to the program that generates
the requested web page (e.g., a PHP or a Perl script). The
HTTP protocol [12] allows the user’s browser to transfer
information inside the URI itself (i.e., GET parameters),
in the HTTP headers (e.g., in the Cookie field), or inside
the request body (i.e., POST parameters). The adopted
technique depends on the application and on the type and
amount of data that has to be transferred.
For the sake of simplicity, in the following, we focus on
GET parameters. However, note that HPP attacks can be
launched against any other input vector that may contain
parameters controlled by the user.
RFC 3986 [7] specifies that the query component (or query string) of a URI is the part between the ? character and the end of the URI (or the # character). The query string is passed unmodified to the application, and consists of one or more field=value pairs, separated by either an ampersand or a semicolon character. For example, the URI http://host/path/somepage.pl?name=john&age=32 invokes the somepage.pl script, passing the value john for the name parameter and the value 32 for the age parameter. To avoid conflicts, any special character (such as the question mark) inside a parameter value must be encoded in its hexadecimal form (e.g., the question mark becomes %3F).
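As a brief illustration (ours, not part of the original text), the following Python snippet shows how such delimiters are percent-encoded and decoded; urllib.parse is used here purely for demonstration:

from urllib.parse import quote, unquote

# '&', '=' and '?' are query-string delimiters, so they must be
# percent-encoded when they appear inside a parameter value.
value = "a&b=c?"
encoded = quote(value, safe="")
print(encoded)           # a%26b%3Dc%3F
print(unquote(encoded))  # a&b=c?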
This standard technique for passing parameters is
straightforward and is generally well-understood by web
developers. However, the way in which the query string is
processed to extract the single values depends on the appli-
cation, the technology, and the development language that
is used.
For example, consider a web page that contains a check-
box that allows the user to select one or more options in a
form. In a typical implementation, all the check-box items
share the same name, and, therefore, the browser will send
a separate homonym parameter for each item selected by
the user. To support this functionality, most of the pro-
gramming languages used to develop web applications pro-
vide methods for retrieving the complete list of values as-
sociated with a certain parameter. For example, the JSP
getParameterValues method groups all the values to-
gether, and returns them as a list of strings. For the lan-
guages that do not support this functionality, the developer
has to manually parse the query string to extract each single
value.
However, the problem arises when the developer expects
to receive a single item and, therefore, invokes methods
(such as getParameter in JSP) that only return a sin-
gle value. In this case, if more than one parameter with the
same name is present in the query string, the one that is re-
turned can either be the first, the last, or a combination of
all the values. Since there is no standard behavior in this sit-
uation, the exact result depends on the combination of the
programming language that is used, and the web server that
is being deployed. Table 1 shows examples of the parameter
precedence adopted by different web technologies.
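To make the ambiguity concrete, the following Python sketch (illustrative only, not taken from the paper) parses a query string containing a duplicated parameter; which single value a naive accessor returns depends on how the duplicates are collapsed, mirroring the differences summarized in Table 1:

from urllib.parse import parse_qs, parse_qsl

query = "par=first&par=last"

# All values are preserved, like JSP's getParameterValues()
print(parse_qs(query)["par"])         # ['first', 'last']

# Single-value accessors must pick one occurrence:
print(parse_qs(query)["par"][0])      # 'first' (JSP/Tomcat-like behavior)
print(dict(parse_qsl(query))["par"])  # 'last'  (PHP/Apache-like behavior)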
Note that the fact that only one value is returned is not a
vulnerability per se. However, if the developer is not aware
of the problem, the presence of duplicated parameters can
produce an anomalous behavior in the application that can
be potentially exploited by an attacker in combination with
other attacks. In fact, as we explain in the next section,
this is often used in conjunction with HPP vulnerabilities to
override hard-coded parameter values in the application’s
links.
2.2 Parameter Pollution
An HTTP Parameter Pollution (HPP) attack occurs when
a malicious parameter P_inj, preceded by an encoded query string delimiter, is injected into an existing parameter P_host. If P_host is not properly sanitized by the application and its value is later decoded and used to generate a URL A, the attacker is able to add one or more new parameters to A.
The typical client-side scenario consists of persuading a
victim to visit a malicious URL that exploits the HPP vul-
nerability. For example, consider a web application that al-
lows users to cast their vote on a number of different elec-
tions. The application, written in JSP, receives a single pa-
rameter, called poll id, that uniquely identifies the elec-
tion the user is participating in. Based on the value of the pa-
rameter, the application generates a page that includes one
link for each candidate. For example, the following snippet
shows an election page with two candidates where the user
could cast her vote by clicking on the desired link:
Url: http://host/election.jsp?poll_id=4568
Link1: <a href="vote.jsp?poll_id=4568&candidate=white">
Vote for Mr. White</a>
Link2: <a href="vote.jsp?poll_id=4568&candidate=green">
Vote for Mrs. Green</a>
Technology/Server    Tested Method                   Parameter Precedence
ASP/IIS              Request.QueryString("par")      All (comma-delimited string)
PHP/Apache           $_GET["par"]                    Last
JSP/Tomcat           Request.getParameter("par")     First
Perl(CGI)/Apache     Param("par")                    First
Python/Apache        getvalue("par")                 All (List)

Table 1: Parameter precedence in the presence of multiple parameters with the same name

Suppose that Mallory, a Mrs. Green supporter, is interested in subverting the result of the online election. By analyzing the webpage, he realizes that the application does not properly sanitize the poll_id parameter. Hence, Mallory can use the HPP vulnerability to inject another parameter of his choice. He then creates and sends to Alice the following malicious URL:

http://host/election.jsp?poll_id=4568%26candidate%3Dgreen
Note how Mallory “polluted” the poll_id parameter by injecting into it the candidate=green pair. By clicking on the link, Alice is redirected to the original election website where she can cast her vote for the election. However, since the poll_id parameter is URL-decoded and used by the application to construct the links, when Alice visits the page, the malicious candidate value is injected into the URLs:
http://host/election.jsp?poll_id=4568%26candidate%3Dgreen
Link 1: <a href=vote.jsp?poll_id=4568&candidate=green
&candidate=white>Vote for Mr. White</a>
Link 2: <a href=vote.jsp?poll_id=4568&candidate=green
&candidate=green>Vote for Mrs. Green</a>
No matter which link Alice clicks on, the applica-
tion (in this case the vote.jsp script) will receive two
candidate parameters. Furthermore, the first parameter
will always be set to green.
In the scenario we discussed, it is likely that the devel-
oper of the voting application expected to receive only one
candidate name, and, therefore, relied on the provided ba-
sic Java functionality to retrieve a single parameter. As a
consequence, as shown in Table 1, only the first value (i.e.,
green) is returned to the program, and the second value
(i.e., the one carrying Alice’s actual vote) is discarded.
In summary, in the example we presented, since the vot-
ing application is vulnerable to HPP, it is possible for an
attacker to forge a malicious link that, once visited, tam-
pers with the content of the page, and returns only links that
force a vote for Mrs. Green.
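The following Python sketch is a hypothetical re-implementation of the link-generation step (the real application is written in JSP); it only illustrates how the URL-decoded poll_id value carries the injected pair into every generated link:

from urllib.parse import unquote

def build_links(poll_id_raw: str) -> list[str]:
    # The vulnerable application URL-decodes the value and concatenates it,
    # without sanitizing the '&' and '=' delimiters, into the links it emits.
    poll_id = unquote(poll_id_raw)
    return [f"vote.jsp?poll_id={poll_id}&candidate={c}" for c in ("white", "green")]

print(build_links("4568"))
# ['vote.jsp?poll_id=4568&candidate=white', 'vote.jsp?poll_id=4568&candidate=green']

print(build_links("4568%26candidate%3Dgreen"))
# ['vote.jsp?poll_id=4568&candidate=green&candidate=white',
#  'vote.jsp?poll_id=4568&candidate=green&candidate=green']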
Cross-Channel Pollution  HPP attacks can also be used to override parameters between different input channels. A
good security practice when developing a web application
is to accept parameters only from the input channel (e.g.,
GET, POST, or Cookies) where they are supposed to be
supplied. That is, an application that receives data from a
POST request should not accept the same parameters if they
are provided inside the URL. In fact, if this safety rule is ig-
nored, an attacker could exploit an HPP flaw to inject arbi-
trary parameter-value pairs into a channel A to override the
legitimate parameters that are normally provided in another
channel B. Obviously, for this to be possible, a necessary
condition is that the web technology gives precedence to A
with respect to B.
HPP to bypass CSRF tokens One interesting use of HPP
attacks is to bypass the protection mechanism used to pre-
vent cross-site request forgery. A cross-site request forgery
(CSRF) is a confused deputy type of attack [16] that works
by including a malicious link in a page (usually in an im-
age tag) that points to a website in which the victim is sup-
posed to be authenticated. The attacker places parameters
into the link that are required to initiate an unauthorized ac-
tion. When the victim visits the attack page, the target ap-
plication receives the malicious request. Since the request
comes from a legitimate user and includes the cookie asso-
ciated with a valid session, the request is likely to be pro-
cessed.
A common technique to protect web applications against
CSRF attacks consists of using a secret request token (e.g.,
see [20, 25]). A unique token is generated by the applica-
tion and inserted in the URLs of all the sensitive links. When the
application receives a request, it verifies that it contains the
valid token before authorizing the action. Hence, since the
attacker cannot predict the value of the token, she cannot
forge the malicious URL to initiate the action.
A parameter pollution vulnerability can be used to inject
parameters inside the existing links generated by the appli-
cation (that, therefore, include a valid secret token). With
these injected parameters, it may be possible for the attacker
to initiate a malicious action and bypass CSRF protection.
A CSRF bypassing attack using HPP was demonstrated
in 2009 against Yahoo Mail [10]. The parameter injection made it possible to bypass the token protections adopted by Yahoo to protect sensitive operations, allowing the attacker to delete all of a user's emails.
The following example demonstrates a simplified ver-
sion of the Yahoo attack:
Url:
showFolder?fid=Inbox&order=down&tt=24&pSize=25&startMid=0
%2526cmd=fmgt.emptytrash%26DEL=1%26DelFID=Inbox%26
cmd=fmgt.delete
Link:
showMessage?sort=date&order=down&startMid=0
%26cmd%3Dfmgt.emptytrash&DEL=1&DelFID=Inbox&
cmd=fmgt.delete&.rand=1076957714
In the example, the link to display the mail message is
protected by a secret token that is stored in the .rand pa-
rameter. This token prevents an attacker from including the
link inside another page to launch a CSRF attack. How-
ever, by exploiting an HPP vulnerability, the attacker can
still inject the malicious parameters (i.e., deleting all the
mails of a user and emptying the trash can) into the legiti-
mate page. The injection string is a concatenation of the two
commands, where the second command needs to be URL-
encoded twice in order to force the application to clean the
trash can only after the deletion of the mails.
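A minimal Python sketch, based only on the URL shown above, reproduces how the injected value can be built: the "empty trash" command is encoded twice so that it survives one extra round of decoding, while the "delete" commands are encoded once:

from urllib.parse import quote, unquote

once  = quote("&cmd=fmgt.emptytrash", safe="=.")   # '&' -> '%26'
twice = quote(once, safe="=.")                     # '%' -> '%25', giving '%2526'
delete = quote("&DEL=1&DelFID=Inbox&cmd=fmgt.delete", safe="=.")

injected = "0" + twice + delete   # the polluted startMid value
print(injected)
# 0%2526cmd=fmgt.emptytrash%26DEL=1%26DelFID=Inbox%26cmd=fmgt.delete

print(unquote(injected))
# After one decoding round the delete commands become real parameters,
# while the emptytrash command still needs a second decode.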
3 Automated HPP Vulnerability Detection
with PAPAS
Our PArameter Pollution Analysis System (PAPAS) to
automatically detect HPP vulnerabilities in websites con-
sists of four main components: A browser, a crawler, and
two scanners.
The first component is an instrumented browser that is
responsible for fetching the webpages, rendering the con-
tent, and extracting all the links and form URLs contained
in the page.
The second component is a crawler that communicates
with the browser through a bidirectional channel. This
channel is used by the crawler to inform the browser on
the URLs that need to be visited, and on the forms that need
to be submitted. Furthermore, the channel is also used to
retrieve the collected information from the browser.
Every time the crawler visits a page, it passes the ex-
tracted information to the two scanners so that it can be
analyzed. The parameter Precedence Scanner (P-Scan) is
responsible for determining how the page behaves when it
receives two parameters with the same name. The Vulnera-
bility Scanner (V-Scan), in contrast, is responsible for test-
ing the page to determine if it is vulnerable to HPP attacks.
V-Scan does this by attempting to inject a new parameter
inside one of the existing ones and analyzing the output.
The two scanners also communicate with the instrumented
browser in order to execute the tests.
All the collected information is stored in a database that
is later analyzed by a statistics component that groups to-
gether information about the analyzed pages, and generates
a report for the vulnerable URLs.
The general architecture of the system is summarized in
Figure 1. In the following, we describe the approach that is
used to detect HPP vulnerabilities and each component in
more detail.
3.1 Browser and Crawler Components
Whenever the crawler issues a command, such as visiting a new webpage, the instrumented browser in PA-
PAS first waits until the target page is loaded. After the
browser is finished parsing the DOM, executing the client-
side scripts, and loading additional resources, a browser ex-
tension (i.e., plugin) extracts the content, the list of links,
and the forms in the page.
In order to increase the depth that a website can be
scanned with, the instrumented browser in PAPAS uses a
number of simple heuristics to automatically fill forms (sim-
ilarly to previously proposed scanning solutions such as
[24]). For example, random alphanumeric values of 8 char-
acters are inserted into password fields and a default e-
mail address is inserted into fields with the name email,
e-mail, or mail.
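The sketch below illustrates this kind of form-filling heuristic; the function, the field handling, and the default address are hypothetical and only meant to mirror the behavior described above:

import random
import string

DEFAULT_EMAIL = "test@example.com"   # hypothetical default address

def fill_field(name: str, field_type: str) -> str:
    """Return an automatically generated value for a form field."""
    if name.lower() in ("email", "e-mail", "mail"):
        return DEFAULT_EMAIL
    # password (and other text) fields receive a random 8-character
    # alphanumeric value
    return "".join(random.choices(string.ascii_letters + string.digits, k=8))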
For sites where the authentication or the provided inputs
fail (e.g., because of the use of CAPTCHAs), the crawler
can be assisted by manually logging into the application us-
ing the browser, and then specifying a regular expression to
be used to prevent the crawler from visiting the log-out page
(e.g., by excluding links that include the cmd=logout pa-
rameter).
3.2 P-Scan: Analysis of the Parameter Precedence
The P-Scan component analyzes a page to determine
the precedence of parameters if multiple occurrences of the
same parameter are injected into an application. For URLs
that contain several parameters, each one is analyzed until
the page’s precedence has been determined or all available
parameters have been tested.
The algorithm we use to test the precedence of parame-
ters starts by taking the first parameter of the URL (in the
form par1=val1), and generates a new parameter value
val2 that is similar to the existing one. The idea is to gen-
erate a value that would be accepted as being valid by the
application. For example, a parameter that represents a page
number cannot be replaced with a string. Hence, a number
is cloned into a consecutive number, and a string is cloned
into a same-length string with the first two characters mod-
ified.
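A small sketch of this cloning heuristic (the function and the replacement characters are illustrative, not the actual PAPAS code):

def clone_value(value: str) -> str:
    """Generate a plausible variant of an existing parameter value."""
    if value.isdigit():
        # a number is cloned into a consecutive number
        return str(int(value) + 1)
    if len(value) >= 2:
        # a string is cloned into a same-length string with the first
        # two characters modified
        prefix = "zz" if not value.lower().startswith("zz") else "qq"
        return prefix + value[2:]
    return value + "x"

print(clone_value("32"))    # '33'
print(clone_value("john"))  # 'zzhn'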
[Figure 1: Architecture of PAPAS — the instrumented browser (with its browser extension), the crawler, the P-Scan and V-Scan components, a statistics generator, a database, and the generated reports]
In a second step, the scanner asks the browser to gen-
erate two new requests. The first request contains only the
newly generated value val2. In contrast, the second re-
quest contains two copies of the parameter, one with the
original value val1, and one with the value val2.
Suppose, for example, that a page accepts two parameters par1 and par2. In the first iteration, the first parameter is tested for the precedence behavior. That is, a new value new_val is generated and two requests are issued. In sum, the parameter precedence test is run on the pages that result from the following three requests:

Page0 - Original Url: application.php?par1=val1&par2=val2
Page1 - Request 1:    application.php?par1=new_val&par2=val2
Page2 - Request 2:    application.php?par1=val1&par1=new_val&par2=val2
A naive approach to determine the parameter precedence
would be to simply compare the three pages returned by
the previous requests: If Page1 == Page2, then the sec-
ond (last) parameter would have precedence over the first.
If, however, Page2 == Page0, the application is giving
precedence to the first parameter over the second.
Unfortunately, this straightforward approach does not
work well in practice. Modern web applications are very
complex, and often include dynamic content that may still
vary even when the page is accessed with exactly the same
parameters. Publicity banners, RSS feeds, real-time statis-
tics, gadgets, and suggestion boxes are only a few examples
of the dynamic content that can be present in a page and that
may change each time the page is accessed.
The P-Scan component resolves the dynamic content
problem in two stages. First, it pre-processes the page and
tries to eliminate all dynamic content that does not depend
on the values of the application parameters. That is, P-Scan
removes HTML comments, images, embedded contents, in-
teractive objects (e.g., Java applets), CSS stylesheets, cross-
domain iFrames, and client-side scripts. It also uses regular
expressions to identify and remove “timers” that are often
used to report how long it takes to generate the page that
is being accessed. In a similar way, all the date and time
strings on the page are removed.
The last part of the sanitization step consists of removing
all the URLs that reference the page itself. The problem is
that as it is very common for form actions to submit data to
the same page, when the parameters of a page are modified,
the self-referencing URLs also change accordingly. Hence,
to cope with this problem, we also eliminate these URLs.
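The following Python sketch gives a rough idea of this sanitization step; the regular expressions are illustrative stand-ins rather than the patterns actually used by PAPAS:

import re

def strip_dynamic_content(html: str, self_url: str) -> str:
    """Remove content that may change between requests before comparing pages."""
    html = re.sub(r"<!--.*?-->", "", html, flags=re.S)                      # HTML comments
    html = re.sub(r"<script\b.*?</script>", "", html, flags=re.S | re.I)    # client-side scripts
    html = re.sub(r"<style\b.*?</style>", "", html, flags=re.S | re.I)      # CSS stylesheets
    html = re.sub(r"<(img|object|applet|embed|iframe)\b[^>]*>", "", html, flags=re.I)
    html = re.sub(r"\d{1,2}:\d{2}(:\d{2})?", "", html)                      # times and "timers"
    html = re.sub(r"\d{4}-\d{2}-\d{2}", "", html)                           # dates
    return html.replace(self_url, "")                                       # self-referencing URLs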
After the pages have been stripped of their dynamic
components, P-Scan compares them to determine the prece-
dence of the parameters. Let P0’, P1’, and P2’ be the
sanitized versions of Page0, Page1, and Page2. The
comparison procedure consists of five different tests that are
applied until one of the tests succeeds:
I. Identity Test - The identity test checks whether the pa-
rameter under analysis has any impact on the content
of the page. In fact, it is very common for query strings
to contain many parameters that only affect the inter-
nal state, or some “invisible” logic of the application.
Hence, if P0’ == P1’ == P2’, the parameter is
considered to be ineffective.
II. Base Test - The base test is based on the assumption
that the dynamic component stripping process is able
to perfectly remove all dynamic components from the
page that is under analysis. If this is the case, the sec-
ond (last) parameter has precedence over the first if
P1’==P2’. The situation is the opposite if P2’ ==
P0’. Note that despite our efforts to improve the dy-
namic content stripping process as much as possible, in
practice, it is rarely the case that the compared pages
match perfectly.
III. Join Test - The join test checks the pages for indica-
tions that show that the two values of the homonym
parameters are somehow combined together by the ap-
plication. For example, it searches P2’ for two values
that are separated by commas, spaces, or that are con-
tained in the same HTML tag. If there is a positive
match, the algorithm concludes that the application is
merging the values of the parameters.
IV. Fuzzy Test - The fuzzy test is designed to cope with
pages whose dynamic components have not been per-
fectly sanitized. The test aims to handle identical pages
that may show minor differences because of embedded
dynamic parts. The test is based on confidence intervals. We compute two values, S21 and S20, that represent how similar P2' is to the pages P1' and P0' respectively. The similarity algorithm we use is based on the Ratcliff/Obershelp pattern recognition algorithm (also known as gestalt pattern matching [28]), and returns a number between 0 (i.e., completely different) and 1 (i.e., perfect match). The parameter precedence detection algorithm that we use in the fuzzy test works as follows (a runnable sketch is given at the end of this section):
if ABS(S21 - S20) > DISCRIMINATION_THRESHOLD:
    if (S21 > S20) and (S21 > SIMILARITY_THRESHOLD):
        Precedence = last
    elif (S20 > S21) and (S20 > SIMILARITY_THRESHOLD):
        Precedence = first
    else:
        Unknown precedence
else:
    Unknown precedence
To draw a conclusion, the algorithm first checks if the
two similarity values are different enough (i.e., the val-
ues show a difference that is greater than a certain dis-
crimination threshold). If this is the case, the closer
match (if the similarity is over a minimum similarity
threshold) determines the parameter precedence. In
other words, if the page with the duplicated parameters
is very similar to the original page, there is a strong
probability that the web application is only using the
first parameter, and ignoring the second. However, if
the similarity is closer to the page with the artificially
injected parameter, there is a strong probability that the
application is only accepting the second parameter.
The two threshold values have been determined by
running the algorithm on one hundred random web-
pages that failed to pass the base test, and for which
we manually determined the precedence of parame-
ters. The two experimental thresholds (set respectively
to 0.05 and 0.75) were chosen to maximize the accu-
racy of the detection, while minimizing the error rate.
V. Error Test - The error test checks if the application
crashes, or returns an “internal” error when an identi-
cal parameter is injected multiple times. Such an error
usually happens when the application does not expect
to receive multiple parameters with the same name.
Hence, it receives an array (or a list) of parameters in-
stead of a single value. An error occurs if the value is
later used in a function that expects a well-defined type
(such as a number or a string). In this test, we search
the page under analysis for strings that are associated
with common error messages or exceptions. In par-
ticular, we adopted all the regular expressions that the
SqlMap project [13] uses to identify database errors in
MySQL, PostgreSQL, MS SQL Server, Microsoft Ac-
cess, Oracle, DB2, and SQLite.
If none of these five tests succeed, the parameter is dis-
carded from the analysis. This could be, for example, be-
cause of content that is generated randomly on the server-
side. The parameter precedence detection algorithm is then
run again on the next available parameter.
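As a concrete sketch of the fuzzy test described in step IV, Python's difflib.SequenceMatcher computes a Ratcliff/Obershelp-style similarity ratio and can be plugged into the decision logic above; the thresholds are the values reported in this section, while the function itself is only illustrative:

from difflib import SequenceMatcher

DISCRIMINATION_THRESHOLD = 0.05
SIMILARITY_THRESHOLD = 0.75

def fuzzy_precedence(p0: str, p1: str, p2: str) -> str:
    """Compare the sanitized pages P0', P1', P2' and guess the precedence."""
    s21 = SequenceMatcher(None, p2, p1).ratio()   # similarity of P2' to P1'
    s20 = SequenceMatcher(None, p2, p0).ratio()   # similarity of P2' to P0'
    if abs(s21 - s20) > DISCRIMINATION_THRESHOLD:
        if s21 > s20 and s21 > SIMILARITY_THRESHOLD:
            return "last"    # duplicated-parameter page resembles the new-value page
        if s20 > s21 and s20 > SIMILARITY_THRESHOLD:
            return "first"   # duplicated-parameter page resembles the original page
    return "unknown"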
3.3 V-Scan: Testing for HPP vulnerabilities
In this section, we describe how the V-Scan component
tests for the presence of HTTP Parameter Pollution vulner-
abilities in web applications.
For every page that V-Scan receives from the crawler,
it tries to inject a URL-encoded version of an innocuous
parameter into each existing parameter of the query string.
Then, for each injection, the scanner verifies the presence
of the parameter in links, action fields and hidden fields of
forms in the answer page.
For example, in a typical scenario, V-Scan injects the
pair %26foo%3Dbar into the parameter par1=val1
and then checks if the &foo=bar string is included in-
side the URLs of links or forms in the answer page.
Note that we do not check for the presence of the vul-
nerable parameter itself (e.g., by looking for the string
“par1=val1&foo=bar”). This is because web applica-
tions sometimes use a different name for the same parame-
ter in the URL and in the page content. Therefore, the pa-
rameter par1 may appear under a different name inside
the page.
In more detail, V-Scan starts by extracting the list P_URL = [P_U1, P_U2, ..., P_Un] of the parameters that are present in the page URL, and the list P_Body = [P_B1, P_B2, ..., P_Bm] of the parameters that are present in links or forms contained in the page body.
It then computes the following three sets:

P_A = P_URL ∩ P_Body is the set of parameters that appear unmodified in the URL and in the links or forms of the page.

P_B = {p | p ∈ P_URL ∧ p ∉ P_Body} contains the URL parameters that do not appear in the page. Some of these parameters may appear in the page under a different name.

P_C = {p | p ∉ P_URL ∧ p ∈ P_Body} is the set of parameters that appear somewhere in the page, but that are not present in the URL.
First, V-Scan starts by injecting the new parameter in the P_A set. We observed that in practice, in the majority of the cases, the application copies the parameter to the page body and maintains the same name. Hence, there is a high probability that a vulnerability will be identified at this stage. However, if this test does not discover any vulnerability, then the scanner moves on to the second set (P_B). In the second test, the scanner tests for the (less likely) case in which the vulnerable parameter is renamed by the application. Finally, in the third test, V-Scan takes the parameters in the P_C group, attempts to add these to the URL, and uses them as a vector to inject the malicious pair. This is because webpages usually accept a very large number of parameters, not all of which are normally specified in the URL. For example, imagine a case in which we observe that one of the links in the page contains a parameter “language=en”. Suppose, however, that this parameter is not present in the page URL. In the final test, V-Scan would attempt to build a query string like “par1=var1&language=en%26foo%3Dbar”.
Note that the last test V-Scan applies can be executed on
pages with an empty query string (but with parameterized
links/forms), while the first two require pages that already
contain a query string.
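The following sketch shows one way the three sets could be derived from a page; the identifiers follow the description above and are not taken from the actual implementation:

from urllib.parse import urlsplit, parse_qsl

def parameter_sets(page_url, body_urls):
    """Compute P_A, P_B and P_C from the page URL and the URLs found in the body."""
    p_url = {name for name, _ in parse_qsl(urlsplit(page_url).query)}
    p_body = set()
    for u in body_urls:
        p_body |= {name for name, _ in parse_qsl(urlsplit(u).query)}
    p_a = p_url & p_body   # parameters in both the URL and the links/forms
    p_b = p_url - p_body   # URL parameters not found (or renamed) in the page
    p_c = p_body - p_url   # parameters that only appear in links/forms
    return p_a, p_b, p_c

# The injected pair "%26foo%3Dbar" is then appended to the value of a
# parameter chosen from P_A first, then P_B, and finally P_C.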
In our prototype implementation, the V-Scan component encodes the attacker pair using the standard URL encoding schema (see the URL Encoding Reference at http://www.w3schools.com/TAGS/ref_urlencode.asp). Our experiments show that this is sufficient
for discovering HPP flaws in many applications. However,
there is room for improvement as in some cases, the attacker
might need to use different types of encodings to be able to
trigger a bug. For example, this was the case of the HPP
attack against Yahoo (previously described in Section 2)
where the attacker had to double URL-encode the “clean-
ing of the trash can” action.
Handling special cases In our experiments, we identified
two special cases in which, even though our vulnerability
scanner reported an alert, the page was not actually vulner-
able to parameter pollution.
In the first case, one of the URL parameters (or part of
it) is used as the entire target of a link. For example:
Url: index.php?v1=p1&uri=apps%2Femail.jsp%3Fvar1%3Dpar1
%26foo%3Dbar
Link: apps/email.jsp?var1=par1&foo=bar
A parameter is used to store the URL of the target page.
Hence, performing an injection in that parameter is equiva-
lent to modifying its value to point to a different URL. Even
though this technique is syntactically very similar to an HPP
vulnerability, it is not a proper injection case. Therefore, we
decided to consider this case as a false positive of the tool.
The second case that generates false alarms is the op-
posite of the first case. In some pages, the entire URL of
the page becomes a parameter in one of the links. This
can frequently be observed in pages that support printing or
sharing functionalities. For example, imagine an applica-
tion that contains a link to report a problem to the website’s
administrator. The link contains a parameter page that ref-
erences the URL of the page responsible for the problem:
Url: search.html?session_id=jKAmSZx5%26foo%3Dbar&q=shoes
Link: service_request.html?page=search%2ehtml%3f
session_id%3djKAmSZx5&foo=bar&q=shoes
Note that by changing the URL of the page, we also
change the page parameter contained in the link. Clearly,
this is not an HPP vulnerability.
Since the two previous implementation techniques are
quite common in web applications, PAPAS would erro-
neously report these sites as being vulnerable to HPP. To
eliminate such alarms and to make PAPAS suitable for
large-scale analysis, we integrated heuristics into the V-
Scan component to cross-check and verify that the vulner-
abilities that are identified do not correspond to these two
common techniques that are used in practice.
In our prototype implementation, in order to eliminate
these false alarms, V-Scan checks that the parameter in
which the injection is performed does not start with a
scheme specifier string (e.g., http://). Then, it veri-
fies that the parameter as a whole is not used as the tar-
get for a link. Furthermore, it also checks that the entire
URL is not copied as a parameter inside a link. Finally,
our vulnerability analysis component double-checks each
vulnerability by injecting the new parameter without url-
encoding the separator (i.e., by injecting &foo=bar in-
stead of %26foo%3Dbar). If the result is the same, we
know that the query string is simply copied inside another
URL. While such input handling is possibly a dangerous
design decision on the side of the developer, there is a high
probability that it is intentional so we ignore it and do not
report it by default. However, such checks can be deacti-
vated anytime if the analyst would like to perform a more
in-depth analysis of the website.
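A simplified sketch of these cross-checks (illustrative only; the real heuristics operate on the parsed page and its links):

def looks_like_false_positive(param_value: str, page_url: str, link: str) -> bool:
    """Heuristics to discard alerts that are not genuine HPP injections."""
    # 1. the parameter itself starts with a scheme specifier (it holds a URL)
    if param_value.startswith(("http://", "https://")):
        return True
    # 2. the parameter as a whole is used as the target of a link
    if link.split("?", 1)[0].endswith(param_value):
        return True
    # 3. the entire page URL is copied as a parameter inside the link
    if page_url in link:
        return True
    # (a fourth check re-injects the unencoded "&foo=bar" pair and compares
    #  the results, as described above)
    return False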
3.4 Implementation
The browser component of PAPAS is implemented as a
Firefox extension, while the rest of the system is written in
Python. The components communicate over TCP/IP sock-
ets.
Similar to other scanners, it would have been possible to
directly retrieve web pages without rendering them in a real
browser. However, such techniques have the drawback that
they cannot efficiently deal with dynamic content that is of-
ten found on Web pages (e.g., Javascript). By using a real
browser to render the pages we visit, we are able to analyze
the page as it is supposed to appear to the user after the dy-
namic content has been generated. Also, note that unlike
detecting cross site scripting or SQL injections, the ability
to deal with dynamic content is a necessary prerequisite to
be able to test for HPP vulnerabilities using a black-box ap-
proach.
The browser extension has been developed using the
standard technology offered by the Mozilla development
environment: a mix of Javascript and XML User Interface
Language (XUL). We use XPConnect to access Firefox’s
XPCOM components. These components are used for in-
voking GET and POST requests and for communicating
with the scanning component.
PAPAS supports three different operational modes: fast
mode, extensive mode and assisted mode. The fast mode
aims to rapidly test a site until potential vulnerabilities are
discovered. Whenever an alert is generated, the analysis
continues, but the V-Scan component is not invoked to im-
prove the scanning speed. In the extensive mode, the entire
website is tested exhaustively and all potential problems and
injections are logged. The assisted mode allows the scanner
to be used in an interactive way. That is, the crawler pauses
and specific pages can be tested for parameter precedence
and HPP vulnerabilities. The assisted mode can be used by
security professionals to conduct a semi-automated assess-
ment of a web application, or to test websites that require a
particular user authentication.
PAPAS is also customizable and settings such as scan-
ning depths, numbers of injections that are performed, wait-
ing times between requests, and page loading timeouts are
all configurable by the analyst.
3.5 Limitations
Our current implementation of PAPAS has several limi-
tations. First, PAPAS does not support the crawling of links
embedded in active content such as Flash, and therefore, is
not able to visit websites that rely on active content tech-
nologies to navigate among the pages.
Second, currently, PAPAS focuses only on HPP vulner-
abilities that can be exploited via client-side attacks (e.g.,
analogous to reflected XSS attacks) where the user needs
to click on a link prepared by the attacker. Some HPP vul-
nerabilities can also be used to exploit server-side compo-
nents (when the malicious parameter value is not included
in a link but it is decoded and passed to a back-end com-
ponent). However, testing for server-side attacks is more
difficult than testing for client-side attacks as comparing re-
quests and answers is not sufficient (i.e., similar to the dif-
ficulty of detecting stored SQL-injection vulnerabilities via
black-box scanning). We leave the detection of server-side
attacks to future work.
4 Evaluation
We evaluated our detection technique by running two ex-
periments. In the first experiment, we used PAPAS to au-
tomatically scan a list of popular websites with the aim of
measuring the prevalence of HPP vulnerabilities in the wild.
We then selected a limited number of vulnerable sites and,
in a second experiment, performed a more in-depth analysis
of the detected vulnerabilities to gain a better understanding
of the possible consequences of the vulnerabilities our tool
automatically identified.
4.1 HPP Prevalence in Popular Websites
In the first experiment, we collected 5,000 unique URLs
from the public database of Alexa. In particular, we ex-
tracted the top ranked sites from each of the Alexa’s cate-
gories [3]. Each website was considered only once even
if it was present in multiple distinct categories, or with dif-
ferent top-level domain names such as google.com and
google.fr.
The aim of our experiments was to quickly scan as many
websites as possible. Our basic premise was that it would
be likely that the application would contain parameter in-
jection vulnerabilities on many pages and on a large number
of parameters if the developers of the site were not aware of
the HPP threat and had failed to properly sanitize the user
input.
To maximize the speed of the tests, we configured the
crawler to start from the homepage and visit the sub-pages
up to a distance of three (i.e., three clicks away from the
website’s entry point). For the tests, we only considered
links that contained at least one parameter. In addition, we
limited the analysis to 5 instances per page (i.e., a page with
the same URL, but a different query string was considered
a new instance). The global timeout was set to 15 minutes
per site and the browser was customized to quickly load
and render the pages, and run without any user interaction.
Furthermore, we disabled pop-ups, image loading, and any
plug-ins for active content technologies such as Flash or Silverlight. An external watchdog was also configured to monitor and restart the browser in case it became unresponsive.

Category     # of Tested Applications    Category              # of Tested Applications
Internet     698                         Government            132
News         599                         Social Networking     117
Shopping     460                         Video                 114
Games        300                         Financial             110
Sports       256                         Organization          106
Health       235                         University             91
Science      222                         Others               1401
Travel       175

Table 2: TOP15 categories of the analyzed sites
In 13 days of experiments, we successfully scanned
5,016 websites, corresponding to a total of 149,806 unique
pages. For each page, our tool generated a variable amount
of queries, depending on the number of detected parame-
ters. The websites we tested were distributed over 97 coun-
tries and hundreds of different Alexa categories. Table 2
summarizes the 15 categories containing the highest number of tested applications.
Parameter Precedence For each website, the P-Scan
component tested every page to evaluate the order in which
the GET parameters were considered by the application
when two occurrences of the same parameter were spec-
ified. The results were then grouped together in a per-site
summary, as shown in Figure 2. The first column reports the
type of parameter precedence. Last and First indicate that
all the analyzed pages of the application uniformly consid-
ered the last or the first specified value. Union indicates that
the two parameters were combined together to form a sin-
gle value, usually by simply concatenating the two strings
with a space or a comma. In contrast, the parameter prece-
dence is set to inconsistent when different pages of the web-
site present mismatching precedences (i.e., some pages fa-
vor the first parameter’s value, others favor the last). The
inconsistent state, accounting for a total of 25% of the ana-
lyzed applications, is usually a consequence of the fact that
the website has been developed using a combination of het-
erogeneous technologies. For example, the main implemen-
tation language of the website may be PHP, but a few Perl
scripts may still be responsible for serving certain pages.
Even though the lack of a uniform behavior can be suspi-
cious, it is neither a sign, nor a consequence of a vulnerable
application. In fact, each parameter precedence behavior
(even the inconsistent case) is perfectly safe if the applica-
tion’s developers are aware of the HPP threat and know how
to handle a parameter’s value in the proper way. Unfortu-
nately, as shown in the rest of the section, the results of our
experiments suggest that many developers are not aware of
HPP.
Figure 2 shows that for 4% of the websites we analyzed,
our scanner was not able to automatically detect the
parameter precedence. This is usually due to two main rea-
sons. The first reason is that the parameters do not affect
(or only minimally affect) the rendered page. Therefore,
the result of the page comparison does not reach the dis-
crimination threshold. The second reason is the opposite
of the first. That is, the page shows too many differences
even after the removal of the dynamic content, and the re-
sult of the comparison falls below the similarity threshold
(see Section 3.2 for the full algorithm and an explanation of
the threshold values).
The scanner found 238 applications that raised an SQL
error when they were tested with duplicated parameters.
Quite surprisingly, almost 5% of the most popular websites
on the Internet failed to properly handle the user input, and
returned an “internal” error page when a perfectly legal pa-
rameter was repeated twice. Note that providing two param-
eters with the same name is a common practice in many ap-
plications, and most of the programming languages provide
special functionalities to access multiple values. Therefore,
this test was not intended to be an attack against the appli-
cations, but only a check to verify which parameter’s value
was given the precedence. Nevertheless, we were surprised
to note error messages from the websites of many major
companies, banks and government institutions, educational
sites, and other popular websites.
HPP Vulnerabilities PAPAS discovered that 1499 web-
sites (29.88% of the total we analyzed) contained at least
one page vulnerable to HTTP Parameter Injection. That is,
the tool was able to automatically inject an encoded param-
eter inside one of the existing parameters, and was then able
to verify that its URL-decoded version was included in one
of the URLs (links or forms) of the resulting page.
Parameter Precedence    Websites
Last                    2,237 (44.60%)
First                     946 (18.86%)
Union                     381 (7.60%)
Inconsistent            1,251 (24.94%)
Unknown                   201 (4.00%)
Total                   5,016 (100.00%)
Database Errors           238 (4.74%)

Figure 2: Precedence when the same parameter occurs multiple times

However, the fact that it is possible to inject a parameter does not reveal information about the significance and the
consequences of the injection. Therefore, we attempted to
verify the number of exploitable applications (i.e., the sub-
set of vulnerable websites in which the injected parameter
could potentially be used to modify the behavior of the ap-
plication).
We started by splitting the vulnerable set into two sepa-
rate groups. In 872 websites (17.39%), the injection was on
a link or a form’s action field. In the remaining 627 cases
(12.5%), the injection was on a form’s hidden field.
For the first group, our tool verified if the parameter in-
jection vulnerability could be used to override the value of
one of the existing parameters in the application. This is
possible only if the parameter precedence of the page is con-
sistent with the position of the injected value. For example,
if the malicious parameter is always added to the end of the
URL and the first value has parameter precedence, it is im-
possible to override any existing parameter.
When the parameter precedence is not favorable, a vul-
nerable application can still be exploitable by injecting a
new parameter (that differs from all the ones already present
in the URL) that is accepted by the target page.
For example, consider a page target.pl that accepts
an action parameter. Suppose that, on the same site, we
find a page poor.pl vulnerable to HPP:
Url: poor.pl?par1=val1%26action%3Dreset
Link: target.pl?x=y&w=z&par1=val1&action=reset
Since in Perl the parameter precedence is on the first
value, it is impossible to override the x and w parameters.
However, as shown in the example, the attacker can still
exploit the application by injecting the action parameter
that she knows is accepted by the target.pl script. Note
that while the parameter overriding test was completely au-
tomated, this type of injection required a manual supervi-
sion to verify the effects of the injected parameter on the
web application.
The final result was that at least 702 out of the 872 ap-
plications of the first group were exploitable. For the re-
maining 170 pages, we were not able, through a parameter
injection, to affect the behavior of the application.
For the applications in the second group, the impact of
the vulnerability is more difficult to estimate in an auto-
mated fashion. In fact, since modern browsers automati-
cally encode all the form fields, the injected parameter will
still be sent in a url-encoded form, thus making an attack
ineffective.
In such a case, it may still be possible to exploit the appli-
cation using a two-step attack where the malicious value is
injected into the vulnerable field, it is propagated in the form
submission, and it is (possibly) decoded and used in a later
stage. In addition, the vulnerability could also be exploited
to perform a server-side attack, as explained in Section 3.5.
However, using a black-box approach, it is very difficult to
automatically test the exploitability of multi-step or server-
side vulnerabilities. Furthermore, server-side testing might
have had ethical implications (see Section 4.3 for discus-
sion). Therefore, we did not perform any further analysis in
this direction.
To conclude, we were able to confirm that in (at least) 702
out of the 1499 vulnerable websites (i.e., 46.8%) that PA-
PAS identified, it would have been possible to exploit the
HPP vulnerability to override one of the hard-coded param-
eters, or to inject another malicious parameter that would
affect the behavior of the application.
Figure 3 shows the fraction of vulnerable and exploitable
applications grouped by the different Alexa categories. The
results are equally divided, suggesting that important finan-
cial and health institutions do not seem to be more security-
aware and immune to HPP than leisure sites for sporting
and gaming.
False Positives In our vulnerability detection experi-
ments, the false positives rate was 1.12% (10 applications).
All the false alarms were due to parameters that were used
by the application as an entire target for one of the links.
The heuristic we implemented to detect these cases (ex-
plained in Section 3.3) failed because the applications ap-
plied a transformation to the parameter before using it as a link’s URL.

[Figure 3: Vulnerability rate per category — fraction of vulnerable and exploitable applications for Financial, Games, Government, Health, Internet, News, Organization, Science, Shopping, Social Networking, Sports, Travel, University, Video, and Others]
Note that, to maximize efficiency, our results were ob-
tained by crawling each website at a maximum depth of
three pages. In our experiments, we observed that 11% of
the vulnerable pages were directly linked from the home-
page, while the remaining 89% were equally distributed be-
tween the distance of 2 and 3. This trend suggests that it
is very probable that many more vulnerabilities could have
been found by exploring the sites in more depth.
4.2 Examples of Discovered Vulnerabilities
Our final experiments consisted of the further analysis of
some of the vulnerable websites that we identified. Our aim
was to gain an insight into the real consequences of the HPP
vulnerabilities we discovered.
The analysis we performed was assisted by the V-Scan
component. When invoked in extensive mode, V-Scan was
able to explore in detail the web application, enumerating
all the vulnerable parameters. For some of the websites, we
also registered an account and configured the scanner to test
the authenticated part of the website.
HPP vulnerabilities can be abused to run a wide range of
different attacks. In the rest of this section, we discuss the
different classes of problems we identified in our analysis
with the help of real-world examples.
The problems we identified affected many important and
well-known websites such as Microsoft, Google, VMWare,
About.com, Symantec, history.com, flickr, and Paypal.
Since, at the time of writing, we have not yet received con-
firmation that all of the vulnerabilities have been fixed, we
have anonymized the description of the following real-world
cases.
Facebook Share Facebook, Twitter, Digg and other so-
cial networking sites offer a share component to easily share
the content of a webpage over a user profile. Many news
portals nowadays integrate these components with the in-
tent of facilitating the distribution of their news.
By reviewing the vulnerability logs of the tested appli-
cations, we noticed that different sites allowed a parameter
injection on the links referencing the share component of
Facebook. In all those cases, a vulnerable parameter would
allow an attacker to alter the request sent to Facebook and to
trick the victim into sharing a page chosen by the attacker.
For example, it was possible for an attacker to exploit these
vulnerabilities to corrupt a shared link by overwriting the
reference with the URL of a drive-by-download website.
In technical terms, the problem was due to the fact that
it was possible to inject an extra url-to-share parameter that
could overwrite the value of the parameter used by the ap-
plication. For example:
Url:
<site>/shareurl.htm?PG=<default url>&zItl=<description>
%26url-to-share%3Dhttp://www.malicious.com
Link:
http://www.facebook.com/sharer.php?
url-to-share=<default url>&t=<description>&
url-to-share=http://www.malicious.com
Even though the problem lies with the websites that use
the share component, Facebook facilitated the exploitation
by accepting multiple instances of the same parameter, and
always considering the latest value (i.e., the one on the
right).
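To make the mechanics concrete, the following Python sketch reproduces the injection above: the attacker places a URL-encoded url-to-share pair inside the vulnerable description parameter, the vulnerable page decodes it once while building the share link, and a "last value wins" parser on the receiving side resolves the duplicate in the attacker's favor. The parameter names are taken from the example above, the site URLs are placeholders, and the code is only an illustration of the effect, not the actual share component.

from urllib.parse import unquote, parse_qsl, urlsplit

# Value the attacker supplies for the vulnerable description parameter:
# the encoded %26 / %3D survive one decoding step and become a real
# "&url-to-share=..." pair inside the link the page generates.
injected_description = "Breaking%20news%26url-to-share%3Dhttp://www.malicious.com"

# The vulnerable page decodes the parameter once and concatenates it into
# the link referencing the Facebook share component.
share_link = (
    "http://www.facebook.com/sharer.php?"
    "url-to-share=http://www.example.com/article"
    "&t=" + unquote(injected_description)
)

# A receiver that keeps the last occurrence of a duplicate parameter ends
# up sharing the attacker's URL instead of the hard-coded one.
last_value_wins = dict(parse_qsl(urlsplit(share_link).query))
print(last_value_wins["url-to-share"])   # -> http://www.malicious.com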
We notified the security team of Facebook and proposed
a simple solution based on the filtering of all incoming shar-
ing requests that include duplicate parameters. The team
promptly acknowledged the issue and informed us that they
were willing to put in place our countermeasure.
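The sketch below illustrates the kind of filter we proposed: reject (or at least flag) any incoming request whose query string carries the same parameter name more than once. The function name and the example URLs are our own; this is an illustrative sketch of the countermeasure, not Facebook's implementation.

from collections import Counter
from urllib.parse import parse_qsl, urlsplit

def has_duplicate_parameters(url: str) -> bool:
    """Return True if any query parameter occurs more than once."""
    # keep_blank_values=True so empty duplicates (e.g. "a=&a=1") are counted too.
    names = [name for name, _ in
             parse_qsl(urlsplit(url).query, keep_blank_values=True)]
    return any(count > 1 for count in Counter(names).values())

# The polluted share link from the previous example is rejected,
# while a well-formed one passes.
polluted = ("http://www.facebook.com/sharer.php?"
            "url-to-share=http://a.com&t=x&url-to-share=http://www.malicious.com")
clean = "http://www.facebook.com/sharer.php?url-to-share=http://a.com&t=x"

assert has_duplicate_parameters(polluted)
assert not has_duplicate_parameters(clean)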
CSRF via HPP Injection Many applications use hidden
parameters to store a URL that is later used to redirect the
users to an appropriate page. For example, social networks
commonly use this feature to redirect new users to a page
where they can look up a friend’s profile.
In some of these sites, we observed that it was possible
for an attacker to inject a new redirect parameter inside the
registration or the login page so that it could override the
hard-coded parameter’s value. On one social-network web-
site, we were able to inject a custom URL that had the effect
of automatically sending friend requests after the login. In
another site, by injecting the malicious pair into the regis-
tration form, an attacker could perform different actions on
the authenticated area.
This problem is a CSRF attack that is carried out via an
HPP injection. The advantage compared to a normal CSRF
attack is that the attack URL is injected into the real
login/registration page. Moreover, the user does not have to be
already logged into the target website, because the action is
automatically executed when the user logs into the application.
However, just like in normal CSRF, this attack can be pre-
vented by using security tokens.
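For completeness, the sketch below shows the standard token-based defense mentioned above: every state-changing form carries a per-session secret token, and requests whose token does not match are rejected. The function names and the session dictionary are illustrative assumptions; this is a generic example, not code from any of the audited sites.

import hmac
import secrets

def issue_csrf_token(session: dict) -> str:
    """Generate a fresh token and store it in the server-side session."""
    token = secrets.token_hex(16)
    session["csrf_token"] = token
    return token  # embedded as a hidden field in the login/registration form

def verify_csrf_token(session: dict, submitted: str) -> bool:
    """Constant-time comparison of the submitted token against the session copy."""
    expected = session.get("csrf_token", "")
    return bool(expected) and hmac.compare_digest(expected, submitted)

# An injected redirect parameter cannot supply a valid token, so the
# automatically triggered action fails the check.
session = {}
form_token = issue_csrf_token(session)
assert verify_csrf_token(session, form_token)            # legitimate submission
assert not verify_csrf_token(session, "attacker-guess")  # injected/forged request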
Shopping Carts We discovered several HPP vulnerabilities
in online shopping websites that allowed an attacker to
tamper with the user's interaction with the shopping cart
component.
For example, in several shopping websites, we were able
to force the application to select a particular product to be
added to the user's cart. That is, when the victim checks
out and pays for the merchandise, she is actually paying
for a product that is different from the one she
actually selected. On an Italian shopping portal, for exam-
ple, it was even possible to override the ID of the product
in such a way that the browser was still showing the image
and the description of the original product, even when the
victim was actually buying a different one.
Financial Institutions We ran PAPAS against the authen-
ticated and non-authenticated areas of some financial web-
sites and the tool automatically detected several HPP vul-
nerabilities that were potentially exploitable. Since the links
involved sensitive operations (such as increasing account
limits and manipulating credit card operations), we imme-
diately stopped our experiments and promptly informed the
security departments of the involved companies. The prob-
lems were acknowledged and are currently being fixed.
Tampering with Query Results In most cases, the HPP
vulnerabilities that we discovered in our experiments allow
the attacker to tamper with the data provided by the vulner-
able website, and to present to the victim some information
chosen by the attacker.
On several popular news portals, we managed to modify
the news search results to hide certain articles, to display the
news of a certain day under another date, or to filter the news
by a specific source or author. An attacker can exploit these
vulnerabilities to promote particular news items, to conceal
news that could damage his reputation, or even to subvert the
information by replacing an article with an older one.
Some multimedia websites were also vulnerable to HPP
attacks. In several popular sites, an attacker could override
the video links and make them point to a destination of his
choice (e.g., a drive-by download site), or alter the results
of a query to inject malicious multimedia material. In one
case, we were able to automatically register a user for a
specific streaming event.
Similar problems also affected several popular search en-
gines. We noticed that it would have been possible to tam-
per with the results of the search functionality by adding
special keywords, or by manipulating the order in which
the results are shown. We also noticed that on some search
engines, it was possible to replace the content of the com-
mercial suggestion boxes with links to sites owned by the
attacker.
4.3 Ethical Considerations
Crawling and automatically testing a large number of ap-
plications may be considered an ethically sensitive issue.
Clearly, one question that arises is whether it is ethically
acceptable and justifiable to test for vulnerabilities in popular
websites.
Analogous to the real-world experiments conducted by
Jakobsson et al. in [21, 22], we believe that realistic exper-
iments are the only way to reliably estimate success rates
of attacks in the real-world. Unfortunately, criminals do
not have any second thoughts about discovering vulnerabil-
ities in the wild. As researchers, we believe that our ex-
periments helped many websites to improve their security.
Furthermore, we were able to raise some awareness about
HPP problems in the community.
Also, note that:
- PAPAS only performed client-side checks. Similar client-side vulnerability experiments have been performed before in other studies (e.g., for detecting cross site scripting, SQL injections, and CSRF in the wild [24, 29]). Furthermore, we did not perform any server-side vulnerability analysis because such experiments had the potential to cause harm.
- We only provided the applications with innocuous parameters that we knew the applications were already accepting, and did not use any malicious code as input.
- PAPAS was not powerful enough to influence the performance of any website we investigated, and each scan was limited to 15 minutes to further reduce the generated traffic.
- We informed the concerned sites of any critical vulnerabilities that we discovered.
- None of the security groups of the websites that we interacted with complained to us when we informed them that we were researchers, and that we had discovered vulnerabilities on their site with a tool that we were testing. On the contrary, many people were thankful that we informed them about vulnerabilities in their code and helped them make their site more secure.
5 Related work
There are two main approaches [14] to test software
applications for the presence of bugs and vulnerabilities:
white-box testing and black-box testing. In white-box test-
ing, the source code of an application is analyzed to find
flaws. In contrast, in black-box testing, input is fed into
a running application and the generated output is analyzed
for unexpected behavior that may indicate errors. PAPAS
adopts a black-box approach to scan for vulnerabilities.
When analyzing web applications for vulnerabilities,
black-box testing tools (e.g., [2, 8, 24, 33]) are the most
popular. Some of these tools (e.g., [2]) claim to be generic
enough to identify a wide range of vulnerabilities in web
applications. However, recent studies ([6, 11]) have shown
that scanning solutions that claim to be generic have seri-
ous limitations, and that they are not as comprehensive in
practice as they pretend to be.
Two well-known, older web vulnerability detection and
mitigation approaches in the literature are Scott and Sharp's
application-level firewall [30] and Huang et al.'s [17]
vulnerability detection tool that automatically executes SQL
injection attacks. Scott and Sharp's solution allows fine-grained
policies to be defined manually in order to prevent attacks
such as parameter tampering and cross-site scripting.
However, it cannot prevent HPP attacks and has not been
designed with this vulnerability in mind. In comparison,
Huang et al.'s work solely focuses on SQL injection
vulnerability detection using fault injection.
To the best of our knowledge, only one of the available
black-box scanners, Cenzic Hailstorm [9], claims to support
HPP detection. However, a study of its marketing material
reveals that the tool only looks for behavioral differences
when HTTP parameters are duplicated (i.e., not a sufficient
test by itself to detect HPP). Unfortunately, we were not
able to obtain more information about the inner workings
of the tool, as Cenzic did not respond to our request for an
evaluation version.
The injection technique we use is similar to other black-
box approaches such as SecuBat [24] that aim to discover
SQL injection, or reflected cross site scripting vulnerabili-
ties. However, note that conceptually, detecting cross site
scripting or SQL injection is different from detecting HPP.
In fact, our approach required the development of a set of
tests and heuristics to be able to deal with dynamic content
that is often found on webpages today (content that is not
an issue when testing for XSS or SQL injection). Hence,
compared to existing work in literature, our approach for
detecting HPP, and the prototype we present in this paper
are unique.
With respect to white-box testing of web applications,
a large number of static source code analysis tools (e.g.,
[23, 31, 34]) that aim to identify vulnerabilities have been
proposed. These approaches typically employ taint tracking
to help discover if tainted user input reaches a critical func-
tion without being validated. We believe that static code
analysis would be useful and would help developers iden-
tify HPP vulnerabilities. However, to be able to use static
code analysis, it is still necessary for the developers to un-
derstand the concept of HPP. Previous research has shown
that the sanitization process can still be faulty if the devel-
oper does not understand a certain class of vulnerability [4].
Note that there also exists a large body of more general
vulnerability detection and security assessment tools (e.g.,
Nikto [26], and Nessus [32]). Such tools typically rely on
a repository of known vulnerabilities and test for the exis-
tence of these flaws. In comparison, our approach aims to
discover previously unknown HPP vulnerabilities in the ap-
plications that are under analysis.
With respect to scanning, there also exist network-level
tools such as nmap [18]. Tools like nmap can determine the
availability of hosts and accessible services. However, they
cannot detect higher-level application vulnerabilities.
In comparison to the work we present in this paper, to
the best of our knowledge, no large-scale study has been
performed to date to measure the prevalence and the signif-
icance of HPP vulnerabilities in popular websites.
6 Conclusion
Web applications are not what they used to be ten years
ago. Popular web applications have now become more dy-
namic, interactive, complex, and often contain a large num-
ber of multimedia components. Unfortunately, as the pop-
ularity of a technology increases, it also becomes a target
for criminals. As a result, most attacks today are launched
against web applications.
Vulnerabilities such as cross site scripting, SQL injec-
tion, and cross site request forgery are well-known and
have been intensively studied by the research community.
Many solutions have been proposed, and tools have been
released. However, a new class of injection vulnerabili-
ties called HTTP Parameter Pollution (HPP) that was first
presented at the OWASP conference [27] in 2009 has not
received as much attention. If a web application does not
properly sanitize the user input for parameter delimiters, us-
ing an HPP vulnerability, an attacker can compromise the
logic of the application to perform client-side or server-side
attacks.
In this paper, we present the first automated approach for
the discovery of HPP vulnerabilities in web applications.
Our prototype implementation called PArameter Pollution
Analysis System (PAPAS) is able to crawl websites and dis-
cover HPP vulnerabilities by parameter injection. In order
to determine the feasibility of our approach and to assess
the prevalence of HPP vulnerabilities on the Internet today,
we analyzed more than 5,000 popular websites. Our results
show that about 30% of the sites we analyzed contain vul-
nerable parameters and that at least 14% of them can be
exploited using HPP. A large number of well-known, high-
profile websites such as Symantec, Google, VMWare, and
Microsoft were among the sites affected by HPP vulnera-
bilities that we discovered. We informed the sites for which
we could obtain contact information, and some of these sites
wrote back to us and confirmed our findings.
We hope that this paper will help raise awareness and
draw attention to the HPP problem.
Acknowledgments This work has been supported by
the Pôle de Compétitivité SCS (France) through the
MECANOS project and by the French National Research
Agency through the VAMPIRE project. The work has also
received support from Secure Business Austria in Vienna.
References
[1] CERT Advisory CA-2000-02. Malicious HTML Tags Embedded in
Client Web Requests, 2000. http://www.cert.org/
advisories/CA-2000-02.html.
[2] Acunetix. Acunetix Web Vulnerability Scanner. http:
//www.acunetix.com/, 2008.
[3] Alexa Internet, Inc. Alexa - Top Sites by Category.
http://www.alexa.com/topsites/category.
[4] D. Balzarotti, M. Cova, V. Felmetsger, N. Jovanovic,
C. Kruegel, E. Kirda, and G. Vigna. Saner: Composing
Static and Dynamic Analysis to Validate Sanitization
in Web Applications. In IEEE Symposium on Security and
Privacy, 2008.
[5] D. Bates, A. Barth, and C. Jackson. Regular Expressions
Considered Harmful in Client-Side XSS Filters. In 19th
International World Wide Web Conference. (WWW 2010),
2010.
[6] J. Bau, E. Bursztein, D. Gupta, and J. C. Mitchell. State of
the Art: Automated Black-Box Web Application Vulnerability
Testing. In Proceedings of IEEE Security and Privacy,
May 2010.
[7] T. Berners-Lee, R. Fielding, and L. Masinter. RFC 3986,
Uniform Resource Identifier (URI): Generic Syntax, 2005. http:
//rfc.net/rfc3986.html.
[8] Burp Spider. Web Application Security. http://
portswigger.net/spider/, 2008.
[9] Cenzic. Cenzic Hailstorm. http://www.cenzic.
com/, 2010.
[10] S. di Paola and L. Carettoni. Client side Http Parameter
Pollution - Yahoo! Classic Mail Video Poc, May 2009.
http://blog.mindedsecurity.com/2009/05/
client-side-http-parameter-pollution.
html.
[11] A. Doupé, M. Cova, and G. Vigna. Why Johnny Can't Pentest:
An Analysis of Black-Box Web Vulnerability Scanners.
Detection of Intrusions and Malware, and Vulnerability
Assessment, pages 111–131, 2010.
[12] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter,
P. Leach, and T. Berners-Lee. RFC 2616, Hypertext Transfer
Protocol -- HTTP/1.1, 1999. http://www.rfc.net/
rfc2616.html.
[13] B. D. A. G. and M. Stampar. sqlmap. http://sqlmap.
sourceforge.net.
[14] C. Ghezzi, M. Jazayeri, and D. Mandrioli. Fundamentals of
Software Engineering. Prentice-Hall International, 1994.
[15] W. G. J. Halfond and A. Orso. Preventing SQL injection
attacks using AMNESIA. In ICSE ’06: Proceedings of
the 28th international conference on Software engineering,
2006.
[16] N. Hardy. The Confused Deputy: (or why capabilities might
have been invented). ACM SIGOPS Operating Systems Re-
view, 22(4), October 1988.
[17] Y. Huang, S. Huang, and T. Lin. Web Application Secu-
rity Assessment by Fault Injection and Behavior Monitor-
ing. 12th World Wide Web Conference, 2003.
[18] Insecure.org. NMap Network Scanner. http://www.
insecure.org/nmap/, 2010.
[19] SANS Institute. Top Cyber Security Risks,
September 2009. http://www.sans.org/
top-cyber-security-risks/summary.php.
[20] A. Barth, C. Jackson, and J. C. Mitchell. Robust Defenses for
Cross-Site Request Forgery. In 15th ACM Conference on
Computer and Communications Security, 2008.
[21] M. Jakobsson, P. Finn, and N. Johnson. Why and How
to Perform Fraud Experiments. Security & Privacy, IEEE,
6(2):66–68, March-April 2008.
[22] M. Jakobsson and J. Ratkiewicz. Designing ethical phishing
experiments: a study of (ROT13) rOnl query features. In
15th International Conference on World Wide Web (WWW),
2006.
[23] N. Jovanovic, C. Kruegel, and E. Kirda. Pixy: A Static
Analysis Tool for Detecting Web Application Vulnerabilities
(Short Paper). In IEEE Symposium on Security and Privacy,
2006.
[24] S. Kals, E. Kirda, C. Kruegel, and N. Jovanovic. SecuBat: A
Web Vulnerability Scanner. In World Wide Web Conference,
2006.
[25] N. Jovanovic, E. Kirda, and C. Kruegel. Preventing Cross Site
Request Forgery Attacks. In IEEE International Conference
on Security and Privacy in Communication Networks (Se-
cureComm), Baltimore, MD, 2006.
[26] Nikto. Web Server Scanner. http://www.cirt.net/
code/nikto.shtml, 2010.
[27] OWASP AppSec Europe 2009. HTTP Parameter Pollution,
May 2009. http://www.owasp.org/images/b/
ba/AppsecEU09_CarettoniDiPaola_v0.8.pdf.
[28] J. Ratcliff and D. Metzener. Pattern Matching: The Gestalt
Approach. Dr. Dobb's Journal, 7:46, 1988.
[29] Dark Reading. CSRF Flaws Found on Major Websites: Prince-
ton University researchers reveal four sites with cross-site
request forgery flaws and unveil tools to protect against
these attacks, 2008. http://www.darkreading.
com/security/app-security/showArticle.
jhtml?articleID=211201247.
[30] D. Scott and R. Sharp. Abstracting Application-level Web
Security. 11th World Wide Web Conference, 2002.
[31] Z. Su and G. Wassermann. The Essence of Command Injec-
tion Attacks in Web Applications. In Symposium on Princi-
ples of Programming Languages, 2006.
[32] Tenable Network Security. Nessus Open Source Vulnerabil-
ity Scanner Project. http://www.nessus.org/, 2010.
[33] Web Application Attack and Audit Framework. http://
w3af.sourceforge.net/.
[34] Y. Xie and A. Aiken. Static Detection of Security Vulner-
abilities in Scripting Languages. In 15th USENIX Security
Symposium, 2006.