Proper filtering and validation is a common problem when processing data submitted by users from an online form. Inadequate filtering and validation is arguably also the number one source of security vulnerabilities on a website. Furthermore, it can be quite awkward to have the filters and validators scattered all over the application. A chaining mechanism would resolve these issues neatly, and would also allow you to exert control over the order in which the filters and validators are processed.
PHP provides a built-in function, filter_input_array(), that, at first glance, seems well suited for this task. Looking more deeply into its functionality, however, it soon becomes apparent that this function was designed in the early days, and is not up to modern requirements for flexibility and protection against attack. Accordingly, we will instead present a much more flexible mechanism based on an array of callbacks performing filtering and validation.
First, define Application\Filter\Result. The primary function of the Result class is to hold an $item value, which is either the filtered value or a boolean result of validation. Another property, $messages, holds an array of messages populated during the filtering or validation operation. In the constructor, the value supplied for $messages is converted to an array if it is not one already. You might observe that both properties are defined public. This is to facilitate ease of access:
namespace Application\Filter;
class Result
{
public $item; // (mixed) filtered data | (bool) result of validation
public $messages = array(); // [(string) message, (string) message ]
public function __construct($item, $messages)
{
$this->item = $item;
if (is_array($messages)) {
$this->messages = $messages;
} else {
$this->messages = [$messages];
}
}
Next, we define a method to merge one Result instance with another. This is important because at some point we will be processing the same value through a chain of filters. In such a case, we want the newly filtered value to overwrite the existing one, but we want the messages to be merged:
public function mergeResults(Result $result)
{
$this->item = $result->item;
$this->mergeMessages($result);
}
public function mergeMessages(Result $result)
{
if (isset($result->messages) && is_array($result->messages)) {
$this->messages = array_merge($this->messages, $result->messages);
}
}
We also define a method to merge validation results. Here, a value of FALSE, up or down the validation chain, must cause the entire result to be FALSE:
public function mergeValidationResults(Result $result)
{
if ($this->item === TRUE) {
$this->item = (bool) $result->item;
}
$this->mergeMessages($result);
}
}
Next, define an Application\Filter\CallbackInterface interface. You will note that we are taking advantage of the PHP 7 ability to declare the return type, which ensures that we get a Result instance in return:
namespace Application\Filter;
interface CallbackInterface
{
public function __invoke ($item, $params) : Result;
}
To keep messages in one place, define an Application\Filter\Messages class with a static $messages property. We provide methods to set all messages, or just one message. The $messages property has been made public for easier access:
namespace Application\Filter;
class Messages
{
const MESSAGE_UNKNOWN = 'Unknown';
public static $messages;
public static function setMessages(array $messages)
{
self::$messages = $messages;
}
public static function setMessage($key, $message)
{
self::$messages[$key] = $message;
}
public static function getMessage($key)
{
return self::$messages[$key] ?? self::MESSAGE_UNKNOWN;
}
}
Next, define an Application\Filter\AbstractFilter class that implements core functionality. As mentioned previously, this class will be relatively lightweight and we do not need to worry about specific filters or validators as they will be supplied through configuration. We use the UnexpectedValueException class, provided as part of the Standard PHP Library (SPL), in order to throw a descriptive exception in case one of the callbacks does not implement CallbackInterface:
namespace Application\Filter;
use UnexpectedValueException;
abstract class AbstractFilter
{
// code described in the next several bullets
First, we define class constants that hold the default values:
const BAD_CALLBACK = 'Must implement CallbackInterface';
const DEFAULT_SEPARATOR = '<br>' . PHP_EOL;
const MISSING_MESSAGE_KEY = 'item.missing';
const DEFAULT_MESSAGE_FORMAT = '%20s : %60s';
const DEFAULT_MISSING_MESSAGE = 'Item Missing';
Next, we define the core properties. $separator is used in conjunction with filtering and validation messages. $callbacks represents the array of callbacks that perform filtering and validation. $assignments maps data fields to filters and/or validators. $missingMessage is represented as a property so that it can be overwritten (for example, for multi-language websites). Finally, $results is an array of Application\Filter\Result objects and is populated by the filtering or validation operation:
protected $separator; // used for message display
protected $callbacks;
protected $assignments;
protected $missingMessage;
protected $results = array();
Next, we define the __construct() method. Its main function is to set the array of callbacks and assignments. It also either sets values or accepts defaults for the separator (used in message display) and the missing message:
public function __construct(array $callbacks, array $assignments,
$separator = NULL, $message = NULL)
{
$this->setCallbacks($callbacks);
$this->setAssignments($assignments);
$this->setSeparator($separator ?? self::DEFAULT_SEPARATOR);
$this->setMissingMessage($message
?? self::DEFAULT_MISSING_MESSAGE);
}
Next, we define a series of methods for getting and setting callbacks. Note that setOneCallback() checks to see if the callback implements CallbackInterface. If it does not, an UnexpectedValueException is thrown:
public function getCallbacks()
{
return $this->callbacks;
}
public function getOneCallback($key)
{
return $this->callbacks[$key] ?? NULL;
}
public function setCallbacks(array $callbacks)
{
foreach ($callbacks as $key => $item) {
$this->setOneCallback($key, $item);
}
}
public function setOneCallback($key, $item)
{
if ($item instanceof CallbackInterface) {
$this->callbacks[$key] = $item;
} else {
throw new UnexpectedValueException(self::BAD_CALLBACK);
}
}
public function removeOneCallback($key)
{
if (isset($this->callbacks[$key]))
unset($this->callbacks[$key]);
}
Next, we define methods to retrieve results. If only the filtered values are needed, use getItemsAsArray(); otherwise getResults() returns an array of Result objects:
public function getResults()
{
return $this->results;
}
public function getItemsAsArray()
{
$return = array();
if ($this->results) {
foreach ($this->results as $key => $item)
$return[$key] = $item->item;
}
return $return;
}
Retrieving messages is a matter of looping through $this->results and extracting the $messages property. For convenience, we also added getMessageString() with some formatting options. To easily produce the messages, we use the PHP 7 yield from syntax. This has the effect of turning getMessages() into a delegating generator, so the caller iterates over a Generator rather than receiving an array. Each array of messages becomes a sub-generator:
public function getMessages()
{
if ($this->results) {
foreach ($this->results as $key => $item)
if ($item->messages) yield from $item->messages;
} else {
return array();
}
}
public function getMessageString($width = 80, $format = NULL)
{
if (!$format)
$format = self::DEFAULT_MESSAGE_FORMAT . $this->separator;
$output = '';
if ($this->results) {
foreach ($this->results as $key => $value) {
if ($value->messages) {
foreach ($value->messages as $message) {
$output .= sprintf(
$format, $key, trim($message));
}
}
}
}
return $output;
}
public function setMissingMessage($message)
{
$this->missingMessage = $message;
}
public function setSeparator($separator)
{
$this->separator = $separator;
}
public function getSeparator()
{
return $this->separator;
}
public function getAssignments()
{
return $this->assignments;
}
public function setAssignments(array $assignments)
{
$this->assignments = $assignments;
}
// closing bracket for class AbstractFilter
}
Next, define Application\Filter\Filter. We make this class extend AbstractFilter in order to provide the core functionality described previously:
namespace Application\Filter;
class Filter extends AbstractFilter
{
// code
}
Within this class, we define a process() method that scans an array of data and applies filters as per the array of assignments. If there are no assigned filters for this data set, we simply return NULL:
public function process(array $data)
{
if (!(isset($this->assignments)
&& count($this->assignments))) {
return NULL;
}
Otherwise, we initialize $this->results to an array of Result objects where the $item property is the original value from $data, and the $messages property is an empty array:
foreach ($data as $key => $value) {
$this->results[$key] = new Result($value, array());
}
We then make a copy of $this->assignments and check to see if there are any global filters (identified by the '*' key). If so, we run processGlobalAssignment() and then unset the '*' key:
$toDo = $this->assignments;
if (isset($toDo['*'])) {
$this->processGlobalAssignment($toDo['*'], $data);
unset($toDo['*']);
}
Finally, we loop through any remaining assignments, calling processAssignment():
foreach ($toDo as $key => $assignment) {
$this->processAssignment($assignment, $key);
}
} // closing brace for process()
In processGlobalAssignment() we need to loop through the array of callbacks. In this case, however, because these assignments are global, we also need to loop through the entire data set, and apply each global filter in turn:
protected function processGlobalAssignment($assignment, $data)
{
foreach ($assignment as $callback) {
if ($callback === NULL) continue;
foreach ($data as $k => $value) {
$result = $this->callbacks[$callback['key']]($this->results[$k]->item,
$callback['params']);
$this->results[$k]->mergeResults($result);
}
}
}
The tricky bit is this line of code:
$result = $this->callbacks[$callback['key']]($this->results[$k]->item, $callback['params']);
Remember, each callback is actually an instance of an anonymous class that defines the PHP magic __invoke() method. The arguments supplied are the actual data item to be filtered, and an array of parameters. By running $this->callbacks[$callback['key']]() we are in fact magically calling __invoke().
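To see this mechanism in isolation, here is a minimal sketch of an invokable anonymous class stored in an array. The Invokable interface, the 'upper' key, and the 'suffix' parameter are hypothetical stand-ins used only for illustration; they are not part of the Application\Filter classes.

```php
<?php
// Hypothetical stand-in for CallbackInterface, so the sketch runs on its own.
interface Invokable
{
    public function __invoke($item, $params);
}

// An array of callbacks keyed by name; 'upper' is an illustrative key.
$callbacks = [
    'upper' => new class () implements Invokable {
        public function __invoke($item, $params)
        {
            return strtoupper($item) . ($params['suffix'] ?? '');
        }
    },
];

// Appending () to the array element calls __invoke() on the stored object.
$viaArray = $callbacks['upper']('abc', ['suffix' => '!']);

// The equivalent explicit method call.
$viaMethod = $callbacks['upper']->__invoke('abc', ['suffix' => '!']);

var_dump($viaArray === $viaMethod); // both calls go through __invoke()
```

Because the object is looked up by key and then called like a function, the filter class never needs to know which concrete callback it is running.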
In processAssignment(), in a manner akin to processGlobalAssignment(), we need to execute each remaining callback assigned to each data key:
protected function processAssignment($assignment, $key)
{
foreach ($assignment as $callback) {
if ($callback === NULL) continue;
$result = $this->callbacks[$callback['key']]($this->results[$key]->item,
$callback['params']);
$this->results[$key]->mergeResults($result);
}
}
} // closing brace for Application\Filter\Filter
Create an Application\Filter folder. In this folder, create the following class files, using code from the preceding steps:
| Application\Filter\* class file | Code described in these steps |
|---|---|
| Result.php | 3 - 5 |
| CallbackInterface.php | 6 |
| Messages.php | 7 |
| AbstractFilter.php | 8 - 15 |
| Filter.php | 16 - 22 |
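Before wiring everything together, it can help to confirm how Result merges values and messages. The following is a rough sketch that uses a condensed copy of the Result class (without its namespace) so that it runs on its own; the sample strings are illustrative only.

```php
<?php
// Condensed copy of the Result class, so the sketch is self-contained.
class Result
{
    public $item;
    public $messages = [];
    public function __construct($item, $messages)
    {
        $this->item = $item;
        $this->messages = is_array($messages) ? $messages : [$messages];
    }
    public function mergeResults(Result $result)
    {
        $this->item = $result->item;
        $this->mergeMessages($result);
    }
    public function mergeMessages(Result $result)
    {
        $this->messages = array_merge($this->messages, $result->messages);
    }
    public function mergeValidationResults(Result $result)
    {
        if ($this->item === TRUE) {
            $this->item = (bool) $result->item;
        }
        $this->mergeMessages($result);
    }
}

// Filtering: the newest value overwrites the old one, messages accumulate.
$filtered = new Result('  <b>abc</b> ', []);
$filtered->mergeResults(new Result('<b>abc</b>', 'Item was trimmed'));
$filtered->mergeResults(new Result('abc', 'Tags were removed from this item'));

// Validation: once the result is FALSE, a later TRUE cannot restore it.
$valid = new Result(TRUE, []);
$valid->mergeValidationResults(new Result(FALSE, 'Invalid email address'));
$valid->mergeValidationResults(new Result(TRUE, []));

var_dump($filtered->item, $filtered->messages, $valid->item);
```

The filtered item ends up as the last value in the chain, while the validation item stays FALSE after the first failure.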
Next, take the code discussed in step 5, and use it to configure an array of messages in a chap_06_post_data_config_messages.php file. Each callback references the Messages::$messages property. Here is a sample configuration:
<?php
use Application\Filter\Messages;
Messages::setMessages(
[
'length_too_short' => 'Length must be at least %d',
'length_too_long' => 'Length must be no more than %d',
'required' => 'Please be sure to enter a value',
'alnum' => 'Only letters and numbers allowed',
'float' => 'Only numbers or decimal point',
'email' => 'Invalid email address',
'in_array' => 'Not found in the list',
'trim' => 'Item was trimmed',
'strip_tags' => 'Tags were removed from this item',
'filter_float' => 'Converted to a decimal number',
'phone' => 'Phone number is [+n] nnn-nnn-nnnn',
'test' => 'TEST',
'filter_length' => 'Reduced to specified length',
]
);
Next, create a chap_06_post_data_config_callbacks.php callback configuration file that contains configuration for filtering callbacks, as described in step 4. Each callback should follow this generic template:
'callback_key' => new class () implements CallbackInterface
{
public function __invoke($item, $params) : Result
{
$changed = array();
$filtered = $item; /* replace with the filtering operation on $item */
if ($filtered !== $item) $changed = Messages::$messages['callback_key'];
return new Result($filtered, $changed);
}
}
The callbacks themselves must implement the interface and return a Result instance. We can take advantage of the PHP 7 anonymous class capability by defining each callback as an instance of an anonymous class that implements CallbackInterface. Here is how an array of filtering callbacks might look:
use Application\Filter\ { Result, Messages, CallbackInterface };
$config = [ 'filters' => [
'trim' => new class () implements CallbackInterface
{
public function __invoke($item, $params) : Result
{
$changed = array();
$filtered = trim($item);
if ($filtered !== $item)
$changed = Messages::$messages['trim'];
return new Result($filtered, $changed);
}
},
'strip_tags' => new class ()
implements CallbackInterface
{
public function __invoke($item, $params) : Result
{
$changed = array();
$filtered = strip_tags($item);
if ($filtered !== $item)
$changed = Messages::$messages['strip_tags'];
return new Result($filtered, $changed);
}
},
// etc.
]
];
For test purposes, we will use the prospects table as a target. Instead of providing data from $_POST, we will construct an array of good and bad data:

You can now create a chap_06_post_data_filtering.php script that sets up autoloading and includes the messages and callbacks configuration files:
<?php
require __DIR__ . '/../Application/Autoload/Loader.php';
Application\Autoload\Loader::init(__DIR__ . '/..');
include __DIR__ . '/chap_06_post_data_config_messages.php';
include __DIR__ . '/chap_06_post_data_config_callbacks.php';
You then need to define assignments that represent a mapping between the data fields and filter callbacks. Use the * key to define a global filter that applies to all data:
$assignments = [
'*' => [ ['key' => 'trim', 'params' => []],
['key' => 'strip_tags', 'params' => []] ],
'first_name' => [ ['key' => 'length',
'params' => ['length' => 128]] ],
'last_name' => [ ['key' => 'length',
'params' => ['length' => 128]] ],
'city' => [ ['key' => 'length',
'params' => ['length' => 64]] ],
'budget' => [ ['key' => 'filter_float', 'params' => []] ],
];
Next, define good and bad test data:
$goodData = [
'first_name' => 'Your Full',
'last_name' => 'Name',
'address' => '123 Main Street',
'city' => 'San Francisco',
'state_province' => 'California',
'postal_code' => '94101',
'phone' => '+1 415-555-1212',
'country' => 'US',
'email' => 'your@email.address.com',
'budget' => '123.45',
];
$badData = [
'first_name' => 'This+Name<script>bad tag</script>Valid!',
'last_name' => 'ThisLastNameIsWayTooLongAbcdefghijklmnopqrstuvwxyz0123456789Abcdefghijklmnopqrstuvwxyz0123456789Abcdefghijklmnopqrstuvwxyz0123456789Abcdefghijklmnopqrstuvwxyz0123456789',
//'address' => '', // missing
'city' => ' ThisCityNameIsTooLong012345678901234567890123456789012345678901234567890123456789 ',
//'state_province'=> '', // missing
'postal_code' => '!"£$%^Non Alpha Chars',
'phone' => ' 12345 ',
'country' => 'XX',
'email' => 'this.is@not@an.email',
'budget' => 'XXX',
];
Finally, you can create an Application\Filter\Filter instance, and test the data:
$filter = new Application\Filter\Filter(
$config['filters'], $assignments);
$filter->setSeparator(PHP_EOL);
$filter->process($goodData);
echo $filter->getMessageString();
var_dump($filter->getItemsAsArray());
$filter->process($badData);
echo $filter->getMessageString();
var_dump($filter->getItemsAsArray());
Processing good data produces no messages other than one indicating that the value for the float field was converted from string to float. The bad data, on the other hand, produces the following output:

You will also notice that tags were removed from first_name, and that both last_name and city were truncated.
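The assignments reference length and filter_float callbacks, which the configuration listing elides with // etc. The following is a minimal sketch of how they might look, not the original implementation. It assumes that length truncates with substr() to $params['length'] and that filter_float simply casts to float. The Sketch namespace and the stand-in Result and CallbackInterface definitions are there only so that the block runs on its own, and the message strings are hard-coded copies of the filter_length and filter_float entries in the messages configuration.

```php
<?php
namespace Sketch;

// Minimal stand-ins for Result and CallbackInterface (assumption: same shape
// as the Application\Filter versions), included so the sketch is runnable.
class Result
{
    public $item;
    public $messages;
    public function __construct($item, $messages)
    {
        $this->item = $item;
        $this->messages = is_array($messages) ? $messages : [$messages];
    }
}

interface CallbackInterface
{
    public function __invoke($item, $params) : Result;
}

$filters = [
    // Assumed behavior: cut the value down to $params['length'] bytes.
    'length' => new class () implements CallbackInterface {
        public function __invoke($item, $params) : Result
        {
            $changed  = [];
            $filtered = substr((string) $item, 0, (int) $params['length']);
            if ($filtered !== $item) {
                $changed = 'Reduced to specified length';
            }
            return new Result($filtered, $changed);
        }
    },
    // Assumed behavior: cast the value to float; non-numeric input becomes 0.0.
    'filter_float' => new class () implements CallbackInterface {
        public function __invoke($item, $params) : Result
        {
            $changed  = [];
            $filtered = (float) $item;
            if ($filtered !== $item) {
                $changed = 'Converted to a decimal number';
            }
            return new Result($filtered, $changed);
        }
    },
];

$short = $filters['length']('San Francisco', ['length' => 3]);
$float = $filters['filter_float']('123.45', []);
var_dump($short->item, $short->messages, $float->item, $float->messages);
```

With a strict comparison, a numeric string always differs from its float cast, which is consistent with the good data producing a single conversion message for the budget field.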
The filter_input_array() function takes two main arguments: the input source (a pre-defined constant such as INPUT_POST, which indicates one of the $_* PHP super-globals, in this case $_POST), and an array with field names as keys and filters or validators as values. This function performs not only filtering operations, but validation as well. The constants labeled sanitize (FILTER_SANITIZE_*) are actually filters.
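As a rough illustration, the sketch below uses filter_var_array(), which accepts the same kind of definition array but reads from an ordinary array instead of a super-global, so it can be run from the command line. The field names and sample values are borrowed from the bad test data above; the choice of filters per field is an assumption made for this example.

```php
<?php
// Field name => filter constant, or an array with a 'filter' key.
$definition = [
    'email'  => FILTER_VALIDATE_EMAIL,               // validator
    'budget' => ['filter' => FILTER_VALIDATE_FLOAT], // validator
    'phone'  => FILTER_SANITIZE_NUMBER_INT,          // sanitizer (a filter)
];

$data = [
    'email'  => 'this.is@not@an.email',
    'budget' => '123.45',
    'phone'  => ' 12345 ',
];

// In a web request this would be: filter_input_array(INPUT_POST, $definition)
$clean = filter_var_array($data, $definition);

var_dump($clean);
```

Validators return FALSE when a value fails, whereas sanitizers return a cleaned-up value, which is why the two kinds of result have to be interpreted differently.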
Documentation and examples of filter_input_array() can be found at http://php.net/manual/en/function.filter-input-array.php. You might also have a look at the different types of filters that are available on http://php.net/manual/en/filter.filters.php.