CodeWatch

This short post expands upon input validation controls. These routines are in addition to those found in the OWASP ASVS and should be used where possible to help mitigate the risk of injection attacks (XSS, SQLi, LDAPi, XMLi, command injection, etc.).

There are two other major forms of validation that can be added into your application security controls to help protect your users, applications, and systems from attack:

  1. Type Validation: Compare the type of the submitted data against the type that is expected (integer vs. array, string vs. array, etc.).
  2. Length Validation: Compare the length of the submitted data against the length that is expected. For example, if your application contains a form field for a zip code, then the value should be no more than five digits, a dash, and four more digits, for a total of ten characters.
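
As a rough illustration of both checks applied to the zip code example, the sketch below tests the type first, then the length, then the format. The function name `isValidZip` and the accepted formats (five digits, optionally followed by a dash and four digits) are assumptions made for this example, not part of the validation routine used in this series.

```php
<?php
// Hypothetical helper for illustration only.
function isValidZip($input) {
  // Type validation: only accept a string. Form fields normally
  // arrive as strings, but a parameter named "zip[]" arrives as
  // an array.
  if (!is_string($input)) {
    return false;
  }

  // Length validation: five digits, a dash, and four digits is
  // ten characters at most.
  if (strlen($input) > 10) {
    return false;
  }

  // Format check: \z anchors at the very end of the string, so a
  // trailing newline is not accepted.
  return preg_match('/^[0-9]{5}(-[0-9]{4})?\z/', $input) === 1;
}

var_dump(isValidZip("12345"));        // bool(true)
var_dump(isValidZip("12345-6789"));   // bool(true)
var_dump(isValidZip("12345-67890"));  // bool(false)
var_dump(isValidZip(array("12345"))); // bool(false)
```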

 
We will expand upon our whitelist validation routine for ASVS 5.3 found in our first post on XSS countermeasures for these examples.

Type Validation:

Type validation can be performed by manually setting the type of the data before passing it to the validation function. For example:

  // Cast the submitted value to a string.  We want to confirm that
  // the value consists only of digits, but PHP delivers request
  // data as strings, so string is the type we expect.  A missing
  // field becomes an empty string.  After setting the type, we can
  // pass the value to our validation routine.
  $myData = isset($_POST["data"]) ? (string)$_POST["data"] : "";
  $validated = dataValidator("number", $myData);

  function dataValidator($regex, $data) {
    $decodeData = html_entity_decode($data, ENT_QUOTES, 'UTF-8');

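    // \z anchors at the very end of the string.  A plain $ would
    // also match just before a trailing newline, letting "123\n"
    // pass the check unsanitized.
    $typeArray = array(
      "number" => '/^[0-9]+\z/',
      "letter" => '/^[a-zA-Z]+\z/',
      "alphan" => '/^[a-zA-Z0-9]+\z/'
    );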

    $sanitizeArray = array(
      "number" => '/[^0-9]/',
      "letter" => '/[^a-zA-Z]/',
      "alphan" => '/[^a-zA-Z0-9]/'
    );

    if (preg_match($typeArray[$regex], $decodeData)) {
      return $decodeData;
    } else {
      $sanitized = preg_replace($sanitizeArray[$regex], '', $decodeData);
      return $sanitized;
    }
  }

 
PHP provides several types for use in casting. For most PHP web applications you can simply cast every value to “string”, since it is unlikely that you will want to accept arrays in GET/POST requests. In this case, you can change the decode routine to always cast the data to a string:

  $decodeData = html_entity_decode((string)$data, ENT_QUOTES, 'UTF-8');
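
A cast converts rather than rejects: casting an array to a string produces the literal string "Array" and raises a notice (a warning as of PHP 8). Where you would rather refuse unexpected types outright, the type can be checked before the value is used. The following is a minimal sketch of that idea; the helper name `requireString` and the choice to return null for unexpected types are assumptions for this example.

```php
<?php
// Hypothetical helper: return the value only if it is a string,
// otherwise return null so the caller can reject the request.
function requireString($value) {
  return is_string($value) ? $value : null;
}

// Simulated request data; in a real request this comes from $_POST.
$post = array(
  "data"  => "12345",
  "items" => array("1", "2")
);

var_dump(requireString($post["data"]));   // string(5) "12345"
var_dump(requireString($post["items"]));  // NULL
```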

 
Length Validation:

We can automatically restrict the size of the input in PHP by using the substr function, which returns only the desired number of characters. The function takes three arguments: the string to be used, the offset of the first character to return, and the number of characters to return from that offset. Our function can be changed to accept an additional parameter signifying the maximum length of the data:

  // Modify the function to accept another variable that
  // signifies the maximum length of the input data.
  function dataValidator($regex, $data, $dataLen) {
    $decodeData = html_entity_decode((string)$data, ENT_QUOTES, 'UTF-8');

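    // \z anchors at the very end of the string.  A plain $ would
    // also match just before a trailing newline.
    $typeArray = array(
      "number" => '/^[0-9]+\z/',
      "letter" => '/^[a-zA-Z]+\z/',
      "alphan" => '/^[a-zA-Z0-9]+\z/'
    );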

    $sanitizeArray = array(
      "number" => '/[^0-9]/',
      "letter" => '/[^a-zA-Z]/',
      "alphan" => '/[^a-zA-Z0-9]/'
    );
    
    // If a value of 0 is passed as the length for
    // the function, then DON'T truncate the data.
    // Otherwise, set a variable for how many characters
    // there should be in the data.
    if ($dataLen == 0) {

      // Set the size to the full size of the data.  The
      // strlen function returns an integer representing
      // the size of the string.
      $setSize = strlen($decodeData);

    } else {

      // Set the size to that which was passed to the function.
      $setSize = $dataLen;

    }

    if (preg_match($typeArray[$regex], $decodeData)) {

      // Use the PHP substr function to only return the
      // right number of characters, starting at 
      // character 0.
      return substr($decodeData, 0, $setSize);

    } else {
      $sanitized = preg_replace($sanitizeArray[$regex], '', $decodeData);

      // Use the PHP substr function to only return the
      // right number of characters from the sanitized
      // string, starting at character 0.
      return substr($sanitized, 0, $setSize);
    }
  }

 
If we passed the following data to the function above, it would return 146342:

  $myData = "1saa4sdf6sdaf3424";
  $validated = dataValidator("number", $myData, 6);
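
Truncating keeps the application running, but it also silently changes what the user submitted. Where that is not acceptable, the length can be compared and the input rejected instead. A minimal sketch, assuming a hypothetical helper named `withinLength` that is not part of the routine above:

```php
<?php
// Hypothetical helper: accept the data only if it is a string of
// at most $maxLen bytes.  A $maxLen of 0 means "no limit", which
// mirrors the convention used by dataValidator.
function withinLength($data, $maxLen) {
  if (!is_string($data)) {
    return false;
  }
  return $maxLen == 0 || strlen($data) <= $maxLen;
}

var_dump(withinLength("146342", 6));   // bool(true)
var_dump(withinLength("1463424", 6));  // bool(false)
var_dump(withinLength("1463424", 0));  // bool(true)
```

Note that strlen counts bytes, which is the same as characters for the ASCII-only values the whitelist allows.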

 

Conclusion:

These additional checks will improve the quality of the data entering the application while also helping to mitigate many classes of vulnerability. They in no way represent all of the ways in which you can and should perform input validation. Other scenarios could include cases where input data should always start with a specific character, or should always contain a certain series of characters, etc. There are many one-off cases unique to individual applications that can be considered to improve the security and functionality of your web site.
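
The two scenarios mentioned above could be sketched roughly as follows. The helper names `startsWith` and `containsSeries` and the sample values are made up for this example; real rules will depend on the application.

```php
<?php
// Hypothetical helpers for illustration only.

// True if $data begins with $prefix (byte-wise comparison).
function startsWith($data, $prefix) {
  return strncmp($data, $prefix, strlen($prefix)) === 0;
}

// True if $needle appears anywhere in $data.
function containsSeries($data, $needle) {
  return strpos($data, $needle) !== false;
}

var_dump(startsWith("#ff00aa", "#"));                // bool(true)
var_dump(startsWith("ff00aa", "#"));                 // bool(false)
var_dump(containsSeries("INV-2010-001", "-2010-"));  // bool(true)
```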

Just like that, we are done with OWASP 2010 A2 – Cross-Site Scripting (XSS). Our next post will cover a new section in the OWASP 2010 Top 10. It will remain a surprise until posted.
