Revisiting Script Injection in ASP.NET

The danger of Cross-Site Scripting (XSS) has to be dealt with in any web application. You do this by validating the input from all possible channels, by constraining it in terms of its range, type and length, and by encoding the output from views. ASP.NET has some built-in validation of requests that can be extended to make it more effective, but this approach has changed with ASP.NET Core, which places the onus on application developers to provide the middleware that performs effective validation fine-tuned to the application. Dino Esposito explains.

Unless you are writing a web application for a small and well-known audience, you should really take a look at the OWASP cheat sheet for .NET projects. The Open Web Application Security Project (OWASP) is an organization that promotes good practices to minimize security risks in web applications. More information on OWASP is available at http://www.owasp.org.

In this article, I’ll focus on the countermeasures and tools built into the classic ASP.NET pipeline, and the ASP.NET MVC stack in particular, to fend off script injection attacks and, in general, any attempt to inject invalid and potentially malicious input into an application. All the rules of web security could be summed up in a single nugget of plain wisdom: never trust any user input. However, it is interesting to see how the perspective on the most appropriate way to deal with raw input has changed from classic ASP.NET to ASP.NET Core. Note that in this article, unless explicitly stated otherwise, I’m referring to the classic ASP.NET MVC 5.x stack and the classic ASP.NET pipeline. I’ll discuss ASP.NET Core later on in a separate section.

ASP.NET Request Barricades

Years ago, ASPX Web Forms pages introduced a page directive to enable or disable request URL validation.

When set to true, the validateRequest directive instructs the runtime to perform a preliminary check on the parametric segments of the URL being called to spot any potentially malicious characters. If any of those characters are found (for example, angle brackets that may be the beginning of a JavaScript snippet) then the pipeline stops the request and throws an exception. The same barrier is set up for the ASP.NET MVC stack. Over the years, though, the request validation filter has been moved out of the application model stack (Web Forms or MVC) and placed just outside the entire request processing cycle. Currently, in fact, the check is performed before the request enters the ASP.NET pipeline, that is, before any of the processing that culminates in a controller method call or in an ASPX postback event.

Regardless of the application model of choice, whether Web Forms or MVC, in ASP.NET any request goes through a sequence of application-level events that can be handled with code in global.asax. The event that signals the receipt of the request and the beginning of any processing work on it is BeginRequest.
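As a minimal sketch of where that event is handled, the global.asax code-behind could look like the following. The class name MvcApplication and the trace message are illustrative assumptions, not something prescribed by the article.

```csharp
using System;
using System.Diagnostics;
using System.Web;

// Code-behind of global.asax (class name is a hypothetical example).
public class MvcApplication : HttpApplication
{
    // ASP.NET wires this handler up by naming convention.
    // It runs once for every request that reaches the application.
    protected void Application_BeginRequest(object sender, EventArgs e)
    {
        // Illustrative diagnostic only: record that a request has arrived.
        Trace.WriteLine("BeginRequest: " + Request.HttpMethod);
    }
}
```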

When the BeginRequest event fires, any request validation work has been done already. If BeginRequest is received by the application code then it also means that the check on the URL was successful. The request validation filter doesn’t do any particularly sophisticated work, but it is enough to detect common script injection attempts and is, in any case, better than nothing. For any occurrence of “<” followed by alphabetical characters, and for any occurrence of “<!” and “&#” found in posted form data, query string, cookies or route data, the system throws an HttpRequestValidationException exception and aborts the request.
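To make the rule concrete, here is a rough sketch of a check that flags the three patterns just listed. It is my own approximation for illustration, with hypothetical names, and not the code Microsoft ships in the runtime.

```csharp
using System;

// Hypothetical helper that mimics the patterns described in the text.
public static class RequestValidationSketch
{
    // Returns true when the value contains "<" followed by a letter, "<!" or "&#".
    public static bool LooksDangerous(string value)
    {
        if (string.IsNullOrEmpty(value)) return false;

        // Stop one character early because every pattern is two characters long.
        for (int i = 0; i < value.Length - 1; i++)
        {
            char current = value[i];
            char next = value[i + 1];

            if (current == '<' && (IsAsciiLetter(next) || next == '!')) return true;
            if (current == '&' && next == '#') return true;
        }
        return false;
    }

    private static bool IsAsciiLetter(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```

With this sketch, a value such as "a < b" passes, whereas a value containing an opening script tag is flagged.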

Note that the request validation input filter is optional in classic ASP.NET and can be disabled altogether for the entire application in web.config, or on a per-page (ASPX) or per-action (controller method) basis. To disable it on a controller method, you apply the ValidateInput attribute with an argument of false.
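A minimal sketch of the per-action switch follows. The controller, action and parameter names are hypothetical.

```csharp
using System.Web.Mvc;

// Hypothetical controller used only to show where the attribute goes.
public class NewsController : Controller
{
    // Request validation is skipped for this action only.
    [HttpPost]
    [ValidateInput(false)]
    public ActionResult Save(string body)
    {
        // body may now contain raw markup: sanitize it before storing or rendering.
        return View();
    }
}
```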

Disabling the filter for the entire application is a bit trickier. First, you set the ValidateRequest property on the Controller class to false. Second, you edit the default configuration of the HTTP runtime (the requestValidationMode attribute of the httpRuntime element) to make it work as it did before ASP.NET 4.0.
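For the second step, the relevant web.config fragment would look roughly like this. Only the httpRuntime line matters here, and any other attributes your application already sets on that element should be kept.

```xml
<configuration>
  <system.web>
    <!-- Revert request validation to the pre-4.0 behavior -->
    <httpRuntime requestValidationMode="2.0" />
  </system.web>
</configuration>
```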

It is with ASP.NET 4.0, in fact, that the request validation filter was moved before BeginRequest, and because of this any programmatic check comes too late. Resetting the configuration of the runtime to a version of ASP.NET prior to 4.0 gives you back full control over the application of the filter. To cover all of your controllers with a single setting, you might want to introduce a custom base controller class that sets the property to disable the filter.
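Such a base class could be as small as the following sketch. The class names are hypothetical, and the approach assumes every controller in the application derives from the base class.

```csharp
using System.Web.Mvc;

// Hypothetical base class: every derived controller skips request validation.
public abstract class UnvalidatedController : Controller
{
    protected UnvalidatedController()
    {
        // Same effect as decorating each action with [ValidateInput(false)].
        ValidateRequest = false;
    }
}

// Example of a controller that inherits the setting.
public class BlogController : UnvalidatedController
{
    public ActionResult Index()
    {
        return View();
    }
}
```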

The ASP.NET Core Approach

Since the beginning there has been a vibrant debate about the actual usefulness of such a blind barrier around request URLs that is unable to detect and interpret context. The primary objection was that an automatic request validation barrier prevents any application from receiving HTML content via a form. This is a real issue for all those applications that need to let users type in HTML markup, say, as the body of a news item to be published. Disabling the filter solves the issue for those applications that need to process (sanitized) HTML, but leaves on the developers’ shoulders the responsibility of actually sanitizing the HTML.

In ASP.NET Core, the ASP.NET team chose a different direction. No middleware is made available that behaves in a way similar to the ASP.NET request validation filter. This is by design. The standpoint is that, in the end, validating the input should be seen as a primary application concern. Having a general-purpose barrier set in place for free might even give a false sense of comfort and safety, and this might lead developers to disregard specific concerns originating in their specific context. A general-purpose barrier carries the risk of being rather porous, even inadequate, and it’s definitely hard to come up with a mix of checks that works for every application. Hence, the ASP.NET Core team decided not to offer any built-in middleware and instead leaves you responsible for building your own. On the other hand, the team ensured that all output from Razor views is automatically encoded. This is the best countermeasure against cross-site scripting (XSS) attacks.
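To give an idea of what building your own might involve, here is a deliberately naive sketch of an ASP.NET Core middleware class that rejects query strings containing suspicious character sequences. The class name and the rule itself are my assumptions. A real application would tailor the checks to its own input channels.

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

// Hypothetical middleware: a starting point, not a complete validation layer.
public class MarkupGuardMiddleware
{
    private readonly RequestDelegate _next;

    public MarkupGuardMiddleware(RequestDelegate next)
    {
        _next = next;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        // Flatten all query string values and look for "<" or "&#".
        bool suspicious = context.Request.Query
            .SelectMany(pair => pair.Value.ToArray())
            .Any(value => value != null && (value.Contains("<") || value.Contains("&#")));

        if (suspicious)
        {
            // Short-circuit the pipeline with a 400 response.
            context.Response.StatusCode = StatusCodes.Status400BadRequest;
            return;
        }

        await _next(context);
    }
}
```

The class would be registered in the application's request pipeline with app.UseMiddleware<MarkupGuardMiddleware>().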

What if I Need to Process HTML in JavaScript?

Let’s assume that your application has one or more HTML forms that are meant to accept (and parse and sanitize) HTML input. The first example that comes to mind is a form that captures a news item or a blog post where you want to use some HTML formatting tags. The first scenario I’ll go through is when the sanitization occurs on the client side and is done via JavaScript.

JavaScript provides the encodeURI global function to encode plain text. The net effect of the function is to encode special characters except the following: ; , / ? : @ & = + $ #. If you use the encodeURIComponent function instead, those characters are encoded as well, and only letters, digits and the marks - _ . ! ~ * ' ( ) are left untouched. To decode what has been encoded, you use the decodeURI function (or decodeURIComponent for text encoded with encodeURIComponent). Let’s have a look at the following code snippet.
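The snippet below is a reconstruction under my own assumptions: the sample value is made up, and the variable named strict is an extra added only to contrast the two encoding functions.

```javascript
// Hypothetical raw value, as a user might type it into a text field.
const input = "<script>alert('Hi');</script>";

// Percent-encoded version, suitable for sending to the server.
const input1 = encodeURI(input);

// Extra variable (not in the original text): the stricter encoding.
const strict = encodeURIComponent(input);

// Markup strings that display the raw text instead of running it.
// In a browser they would be assigned to the innerHTML of a container element.
const input2 = "<xmp>" + input + "</xmp>";
const input3 = "<textarea disabled>" + input + "</textarea>";

console.log(input1); // %3Cscript%3Ealert('Hi');%3C/script%3E
console.log(strict); // %3Cscript%3Ealert('Hi')%3B%3C%2Fscript%3E
```

Be aware that the wrapping used for input2 and input3 is only a display trick: input that itself contains a closing xmp or textarea tag would break out of the wrapper, so it does not replace proper encoding. Note also that the XMP element is obsolete in HTML5.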

The variable named input contains the raw input as the user may have typed it in the input field. The variable input1 contains the safely encoded version of it, ideal for transmitting across the wire to the server. The other two variables, input2 and input3, provide two ways to display encoded HTML and JavaScript content within the current DOM. The first approach involves wrapping the raw input in an XMP element. The second approach wraps the raw input in a disabled textarea element. The figure below shows the final result in a sample page.

The textbox receives some text that can be evaluated as an active script. Below the horizontal line, you can see the encoded text as it was generated by the encodeURI function, the rendering of the raw text once wrapped up in an XMP element and, finally, the output of the raw input within a disabled textarea element. Depending on the expected use of the input markup (processing or just display) you have quite a few options to work with the data safely.

Processing Potentially Malicious Data on the Server

If you keep ValidateRequest enabled, then your controller action methods will never have a chance to parse and process any request that contains markup and tag elements. Hence your decision depends on the answer you give to one core question: does your application need to receive and process HTML formatted input? If the answer is no, then you can probably keep the filter on and will automatically be protected against any attempt to post potentially malicious data. If the answer is yes, you’d better disable the request validation filter and roll your own (effective) sanitization layer. The good news is that in this case ASP.NET MVC has some additional capabilities to keep the necessary work of parsing HTML down to the bare minimum. You can use the AllowHtml attribute on the properties of a view model class used to capture input data through model binding.

Suppose you populate an instance of this class via the model binding engine of ASP.NET MVC.
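The view model and the action that receives it could look like the sketch below. The property names TextInput and Buffer come from the discussion that follows. The class, controller and action names are hypothetical.

```csharp
using System.Web.Mvc;

// Hypothetical view model populated by the model binder.
public class InputViewModel
{
    // Markup is accepted for this property even with request validation on.
    [AllowHtml]
    public string TextInput { get; set; }

    // No attribute: posting markup into this property raises the usual exception.
    public string Buffer { get; set; }
}

public class HomeController : Controller
{
    [HttpPost]
    public ActionResult Save(InputViewModel model)
    {
        // model.TextInput may contain raw HTML at this point and still needs sanitizing.
        return View(model);
    }
}
```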

In the example, the request validation filter is on but any attempt to bind HTML content to TextInput won’t throw any security exception. This is because the member is decorated with the AllowHtml attribute. If you attempt to bind HTML content to the property Buffer, instead, you will get the expected exception. Note that the AllowHtml attribute can only be applied to properties populated through model binding; unlike the Bind attribute, it cannot be applied to a method parameter. In other words, the following code, though logically equivalent, won’t compile.
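A sketch of the rejected form is shown here, with hypothetical names. It is intentionally invalid: the compiler refuses it because AllowHtml is declared as valid on property declarations only.

```csharp
using System.Web.Mvc;

public class HomeController : Controller
{
    // Does NOT compile: AllowHtml cannot be applied to a method parameter.
    [HttpPost]
    public ActionResult Save([AllowHtml] string textInput, string buffer)
    {
        return View();
    }
}
```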

Note that ASP.NET prohibits direct access to input collections such as QueryString and Form if the request validation filter is disabled and potentially dangerous input is acceptable. The following code will fail:
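As an illustration, and assuming a hypothetical posted field named TextInput that carries markup, direct access of this kind is what triggers the failure:

```csharp
using System.Web.Mvc;

public class HomeController : Controller
{
    [HttpPost]
    public ActionResult Save()
    {
        // Throws HttpRequestValidationException when the posted value contains markup.
        string text = Request.Form["TextInput"];

        return Content("Received " + (text ?? string.Empty).Length + " characters");
    }
}
```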

The reason the code fails is that Form or QueryString collections get to contain raw data not sanitized in any way. If direct access is really necessary in code then you have to resort to the following expression:
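A minimal sketch of the unvalidated access follows, again with a hypothetical field name. The Unvalidated property requires ASP.NET 4.5 or later, and whatever you read through it is raw input that you are responsible for sanitizing or encoding.

```csharp
using System.Web;
using System.Web.Mvc;

public class HomeController : Controller
{
    [HttpPost]
    public ActionResult Save()
    {
        // Bypasses request validation for this single read.
        string raw = Request.Unvalidated.Form["TextInput"];

        // Encode before echoing anything back to the browser.
        string safe = HttpUtility.HtmlEncode(raw ?? string.Empty);
        return Content(safe);
    }
}
```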

No issues exist whatsoever if you retrieve posted values through the model binding layer. When it comes to writing your own request validation filter you might want to get inspired by the original one written by Microsoft for ASP.NET. The source code is available as open source and if you have JetBrains’s dotPeek installed (it’s free) you can access it directly from within Visual Studio. Anyway, if in your own request validation layer, you need to invoke the original filter just call it: Request.ValidateInput().

Output Encoding

Request validation comes as a tool to defend against XSS attacks that exploit vulnerabilities in input validation. As a result, client-side script code can be injected in the DOM resulting in unwanted actions and unpredictable results. The most effective line of defense—the only one implemented in ASP.NET Core—is automatically encoding any output being rendered through Razor views. Outside of Razor views, for example in Web Forms pages, you can use the Server.HtmlEncode method. In Razor, you can also use the method Html.Encode to have text explicitly encoded. Should you need to emit un-encoded text from within a Razor view use the Html.Raw method instead.

To prevent automatic encoding of output content you can use special string types such as HtmlString and MvcHtmlString. The former is for Web Forms pages whereas the latter is for use within Razor views. Both classes implement the interface IHtmlString which tells the system not to encode text automatically.

It is important to remember that any use of MvcHtmlString comes with the assumption that the text being rendered is safe as it stands, or has already been encoded.
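The following sketch shows one way to honor that assumption in a custom HTML helper: the dynamic part is encoded first and only then wrapped in an MvcHtmlString. The helper name and the markup it emits are hypothetical.

```csharp
using System.Web;
using System.Web.Mvc;

// Hypothetical HTML helper that returns markup Razor will not encode again.
public static class BadgeHelpers
{
    public static MvcHtmlString Badge(this HtmlHelper html, string label)
    {
        // Encode the untrusted part before composing the markup.
        string safeLabel = HttpUtility.HtmlEncode(label);

        return MvcHtmlString.Create("<span class=\"badge\">" + safeLabel + "</span>");
    }
}
```

In a Razor view the helper would be called as @Html.Badge(Model.Title), where Title is an assumed model property.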

Note that the ASP.NET default encoder class looks for black-listed characters that may indicate some malicious script. There’s an alternative though—using the AntiXssEncoder class. Encoding methods on this class take a white-list approach and define a set of allowed characters and ensure no other characters are found. To replace the default encoder, you add the following to your web.config file:
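The setting in question is the encoderType attribute of the httpRuntime element. The fragment below is a sketch for ASP.NET 4.5 and later, where the class ships in System.Web. Merge the attribute into your existing httpRuntime element rather than adding a second one.

```xml
<configuration>
  <system.web>
    <!-- Use the white-list based encoder for all built-in encoding calls -->
    <httpRuntime encoderType="System.Web.Security.AntiXss.AntiXssEncoder, System.Web, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a" />
  </system.web>
</configuration>
```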

The AntiXssEncoder class has the same HTML encoding method as the default encoder plus others. In particular, the AntiXssEncoder class provides additional static methods to encode in special scenarios, such as for text to be included in a CSS file (method CssEncode), in an XML file (method XmlEncode) and in an HTML form being posted (method HtmlFormUrlEncode).
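The static methods can also be called directly, as in this small sketch. The wrapper class and its method names are hypothetical. The AntiXssEncoder calls are the ones named above, plus the HtmlEncode overload that lets you choose named entities.

```csharp
using System.Web.Security.AntiXss;

// Hypothetical wrapper that picks the encoding matching the output context.
public static class ContextualEncoding
{
    public static string ForHtml(string value)
    {
        // true = prefer named entities such as &lt; over numeric ones.
        return AntiXssEncoder.HtmlEncode(value, true);
    }

    public static string ForCss(string value)
    {
        return AntiXssEncoder.CssEncode(value);
    }

    public static string ForXml(string value)
    {
        return AntiXssEncoder.XmlEncode(value);
    }

    public static string ForFormPost(string value)
    {
        return AntiXssEncoder.HtmlFormUrlEncode(value);
    }
}
```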

Summary

In the end, Cross-Site Scripting (XSS) vulnerabilities are still a significant threat to the security of most web applications. XSS attacks can be prevented in two ways: by validating the input accepted through all possible channels and by encoding any output. Validating request content means avoiding all special characters and constraining all user input to acceptable range, type and length. In ASP.NET MVC, output encoding is automatic in Razor views and incoming requests are also validated for special characters.