What are the best PHP input sanitizing functions?

I am trying to come up with a function that I can pass all my strings through to sanitize them, so that the string that comes out of it will be safe for database insertion. But there are so many filtering functions out there that I am not sure which ones I should use or need.

Please help me fill in the blanks:

function filterThis($string) {
    $string = mysql_real_escape_string($string);
    $string = htmlentities($string);
    // etc...
    return $string;
}

Stop!

You're making a mistake here. Oh, no, you've picked the right PHP functions to make your data a bit safer. That's fine. Your mistake is in the order of operations, and how and where to use these functions.

It's important to understand the difference between sanitizing and validating user data, escaping data for storage, and escaping data for presentation.

Sanitizing and Validating User Data

When users submit data, you need to make sure that they've provided something you expect.

Sanitization and Filtering

For example, if you expect a number, make sure the submitted data is a number. You can also cast user data into other types. Everything submitted is initially treated like a string, so forcing known-numeric data into being an integer or float makes sanitization fast and painless.
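As a rough sketch of the casting idea (the variable names and sample values are made up for illustration), note that a plain cast is lenient while the filter extension's integer validator is strict:

```php
<?php
// Hypothetical raw form values; everything arrives as a string.
$rawQuantity = '42';
$rawPrice    = '19.99';
$rawBogus    = '12abc';

// Casting forces the value into the target type.
$quantity = (int) $rawQuantity;
$price    = (float) $rawPrice;

// A cast silently keeps any leading digits, so '12abc' becomes 12.
$lenient = (int) $rawBogus;

// filter_var() with FILTER_VALIDATE_INT returns false for the same value.
$strict = filter_var($rawBogus, FILTER_VALIDATE_INT);
```

Which behavior you want depends on whether you would rather repair bad input or reject it.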

What about free-form text fields and textareas? You need to make sure that there's nothing unexpected in those fields. Mainly, you need to make sure that fields that should not have any HTML content do not actually contain HTML. There are two ways you can deal with this problem.

First, you can try escaping HTML input with htmlspecialchars. You should not use htmlentities to neutralize HTML, as it will also perform encoding of accented and other characters that it thinks also need to be encoded.

Second, you can try removing any possible HTML. strip_tags is quick and easy, but also sloppy. HTML Purifier does a much more thorough job of both stripping out all HTML and also allowing a selective whitelist of tags and attributes through.
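To make the difference between the two approaches concrete, here is a small sketch. The sample strings are invented, and HTML Purifier is left out because it is a third-party library:

```php
<?php
// Hypothetical free-form input that contains markup.
$comment = 'Hello <b>world</b> <script>alert("x")</script>';

// Option 1: escape. The markup survives as harmless, visible text.
$escaped = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

// Option 2: strip. The tags vanish, but the text between them stays.
$stripped = strip_tags($comment);

// htmlentities() also rewrites characters that are not dangerous at all.
$word        = "caf\u{e9}"; // "café" as UTF-8
$viaSpecial  = htmlspecialchars($word, ENT_QUOTES, 'UTF-8');
$viaEntities = htmlentities($word, ENT_QUOTES, 'UTF-8');
```

Note that strip_tags leaves the script body behind as plain text, which is part of why it counts as sloppy.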

Modern PHP versions ship with the filter extension, which provides a comprehensive way to sanitize user input.

Validation

Making sure that submitted data is free from unexpected content is only half of the job. You also need to try and make sure that the data submitted contains values you can actually work with.

If you're expecting a number between 1 and 10, you need to check that value. If you're using one of those new fancy HTML5-era numeric inputs with a spinner and steps, make sure that the submitted data is in line with the step.
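One possible way to check both the range and the step, sketched with a hypothetical helper named validateSteppedInt (it is not part of PHP), and assuming steps are counted from the minimum as they are for HTML number inputs:

```php
<?php
// Hypothetical helper: accept an integer between $min and $max that lands on a step.
function validateSteppedInt(string $raw, int $min, int $max, int $step = 1): ?int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => $min, 'max_range' => $max],
    ]);
    if ($value === false) {
        return null; // not an integer, or outside the allowed range
    }
    // Assumption: steps are counted from the minimum value.
    if (($value - $min) % $step !== 0) {
        return null;
    }
    return $value;
}

$ok      = validateSteppedInt('7', 1, 10);     // in range
$tooBig  = validateSteppedInt('11', 1, 10);    // out of range
$offStep = validateSteppedInt('5', 1, 10, 3);  // valid steps are 1, 4, 7, 10
```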

If that data came from what should be a drop-down menu, make sure that the submitted value is one that appeared in the menu.
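A minimal sketch of checking against the list, assuming the options live in an array that is also used to render the menu. The function names and the color list are hypothetical:

```php
<?php
// Hypothetical option list, assumed to be the same array used to render the menu.
$allowedColors = ['red', 'green', 'blue'];

// Single-value controls (select, radio): the value must be a string on the list.
function isAllowedChoice($submitted, array $allowed): bool
{
    // The third argument enables strict comparison, so 0 or true cannot
    // loosely match a string option.
    return is_string($submitted) && in_array($submitted, $allowed, true);
}

// Multi-value controls (checkbox groups, multi-selects) arrive as arrays.
function areAllowedChoices($submitted, array $allowed): bool
{
    if (!is_array($submitted)) {
        return false;
    }
    foreach ($submitted as $choice) {
        if (!isAllowedChoice($choice, $allowed)) {
            return false;
        }
    }
    return true;
}
```

The same two functions cover radio buttons and checkboxes as well as drop-downs.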

What about text inputs that fulfill other needs? For example, date inputs should be validated through strtotime or the DateTime class, and the given date should fall within the range you expect. What about email addresses? The previously mentioned filter extension can check that an address is well-formed, though I'm a fan of the is_email library.
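Here is a hedged sketch of both checks. The helper parseDateInRange is hypothetical, and it assumes dates arrive in Y-m-d form, as they do from an HTML date input:

```php
<?php
// Hypothetical helper: parse a Y-m-d date and require it to fall inside a range.
function parseDateInRange(string $raw, string $earliest, string $latest): ?DateTimeImmutable
{
    // The leading "!" resets the time fields, so the result is midnight.
    $date = DateTimeImmutable::createFromFormat('!Y-m-d', $raw);

    // Impossible dates such as 2024-02-31 roll over instead of failing,
    // so round-trip the value to catch them.
    if ($date === false || $date->format('Y-m-d') !== $raw) {
        return null;
    }

    $min = new DateTimeImmutable($earliest);
    $max = new DateTimeImmutable($latest);

    return ($date >= $min && $date <= $max) ? $date : null;
}

// The filter extension returns the address when it is well-formed, false otherwise.
$goodEmail = filter_var('user@example.com', FILTER_VALIDATE_EMAIL);
$badEmail  = filter_var('not-an-email', FILTER_VALIDATE_EMAIL);
```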

The same is true for all other form controls. Have radio buttons? Validate against the list. Have checkboxes? Validate against the list. Have a file upload? Make sure the file is of an expected type, and treat the filename like unfiltered user data.

Every modern browser comes with a complete set of developer tools built right in, which makes it trivial for anyone to manipulate your form. Your code should assume that the user has completely removed all client-side restrictions on form content!

Escaping Data for Storage

Now that you've made sure that your data is in the expected format and contains only expected values, you need to worry about persisting that data to storage.

Every single data storage mechanism has a specific way to make sure data is properly escaped and encoded. If you're building SQL, then the accepted way to pass data in queries is through prepared statements with placeholders.

One of the better ways to work with most SQL databases in PHP is the PDO extension. It follows the common pattern of preparing a statement, binding variables to the statement, then sending the statement and variables to the server. If you haven't worked with PDO before, here's a pretty good MySQL-oriented tutorial.
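The prepare-bind-execute pattern looks roughly like this. The sketch assumes the pdo_sqlite driver is available and uses an in-memory SQLite database only so that it is self-contained; for MySQL you would swap in a mysql: DSN with credentials. The table and the sample value are invented:

```php
<?php
// Assumption: pdo_sqlite is installed. An in-memory database keeps the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

// Hypothetical hostile input. It is never spliced into the SQL text.
$hostileName = "Robert'); DROP TABLE users;--";

// Prepare, bind, execute.
$insert = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
$insert->bindValue(':name', $hostileName, PDO::PARAM_STR);
$insert->execute();

// Values can also be passed straight to execute().
$select = $pdo->prepare('SELECT name FROM users WHERE name = :name');
$select->execute([':name' => $hostileName]);
$stored = $select->fetchColumn();
```

The value comes back byte for byte as it went in, with no manual escaping anywhere.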

Some SQL databases have their own specialty extensions in PHP, including SQL Server, PostgreSQL and SQLite 3. Each of those extensions has prepared statement support that operates in the same prepare-bind-execute fashion as PDO. Sometimes you may need to use these extensions instead of PDO to support non-standard features or behavior.

MySQL also has its own PHP extensions. Two of them, in fact. You only want to ever use the one called mysqli. The old "mysql" extension was deprecated in PHP 5.5 and removed entirely in PHP 7.0, and it is not safe or sane to use in the modern era.

I'm personally not a fan of mysqli. The way it performs variable binding on prepared statements is inflexible and can be a pain to use. When in doubt, use PDO instead.

If you are not using an SQL database to store your data, check the documentation for the database interface you're using to determine how to safely pass data through it.

When possible, make sure that your database stores your data in an appropriate format. Store numbers in numeric fields. Store dates in date fields. Store money in a decimal field, not a floating point field. Review the documentation provided by your database on how to properly store different data types.

Escaping Data for Presentation

Every time you show data to users, you must make sure that the data is safely escaped, unless you know that it shouldn't be escaped.

When emitting HTML, you should almost always pass any data that was originally user-supplied through htmlspecialchars. In fact, the only time you shouldn't do this is when you know that the user provided HTML and that it has already been sanitized using a whitelist.
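In templates this usually ends up as a tiny wrapper. The helper name e() and the sample value below are assumptions for illustration, and the flags are one reasonable choice rather than the only one:

```php
<?php
// Hypothetical helper: escape a value at the moment it is written into HTML.
function e(string $value): string
{
    // ENT_QUOTES escapes both quote styles, which matters inside attributes.
    // ENT_SUBSTITUTE replaces invalid byte sequences instead of returning ''.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Hypothetical stored value that tries to break out of an attribute.
$displayName = '"><script>alert(1)</script>';

$html = '<input type="text" name="display_name" value="' . e($displayName) . '">';
```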

Sometimes you need to generate some Javascript using PHP. Javascript does not have the same escaping rules as HTML! A safe way to provide user-supplied values to Javascript via PHP is through json_encode.
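A short sketch of the json_encode approach. The JSON_HEX_* flags are my addition, on the assumption that the result is embedded in an inline script block, where a literal closing script tag in the data would otherwise end the block early:

```php
<?php
// Hypothetical user-supplied value that tries to close the script block.
$userValue = '</script><script>alert("x")</script>';

// Assumption: the output goes inside an inline <script> element, so
// angle brackets, ampersands and quotes are emitted as \uXXXX escapes.
$flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT;

$js = 'var greeting = ' . json_encode($userValue, $flags) . ';';
```

JavaScript decodes the escapes back into the original string, so the value is unchanged from the script's point of view.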

And More

There are many more nuances to data validation.

For example, character set encoding can be a huge trap. Your application should follow the practices outlined in "UTF-8 all the way through". There are hypothetical attacks that can occur when you treat string data as the wrong character set.

Earlier I mentioned browser debug tools. These tools can also be used to manipulate cookie data. Cookies should be treated as untrusted user input.

Data validation and escaping are only one aspect of web application security. You should make yourself aware of web application attack methodologies so that you can build defenses against them.

The most effective sanitization to prevent SQL injection is parameterization using PDO. Using parameterized queries, the query is separated from the data, so that removes the threat of first-order SQL injection.

In terms of removing HTML, strip_tags is probably the best idea, as it will just remove everything. htmlentities does what it sounds like, so that works, too. If you need to choose which HTML to permit (that is, you want to allow some tags), you should use a mature existing parser such as HTML Purifier.

PHP now has the nice filter_input functions, which for instance free you from hunting for 'the ultimate e-mail regex' now that there is a built-in FILTER_VALIDATE_EMAIL type. My own filter class (which uses JavaScript to highlight faulty fields) can be initiated by either an Ajax request or a normal form post.

My 5 cents.

Nobody here understands the way mysql_real_escape_string works. This function does not filter or "sanitize" anything, so you cannot use it as some universal filter that will save you from injection. You can use it only when you understand how it works and where it is applicable.

I have already written an answer to a very similar question: "In PHP when submitting strings to the database should I take care of illegal characters using htmlspecialchars() or use a regular expression?" See it for the full explanation of database-side safety.

As for htmlentities, Charles is right to tell you to separate these functions. Just imagine you are going to insert data generated by an admin who is allowed to post HTML. Your function will spoil it.

That said, I'd advise against htmlentities altogether. This function became obsolete a long time ago. If you want to replace only the characters that matter for HTML safety, such as <, >, & and ", use the function that was developed for exactly that purpose: htmlspecialchars().

PHP filters are used to validate and sanitize external input. The PHP filter extension has many of the functions needed for checking user input, and is designed to make data validation easier and quicker. The filter_list() function can be used to list what the PHP filter extension offers.
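For instance, a quick way to see what is on offer is to map each filter name to its numeric ID. This is only a sketch; the exact list depends on the PHP version:

```php
<?php
// Build a name => ID map of every filter the extension provides.
$filters = [];
foreach (filter_list() as $name) {
    $filters[$name] = filter_id($name);
}

// Print the map when run from the command line.
foreach ($filters as $name => $id) {
    echo $name, ' => ', $id, PHP_EOL;
}
```

The IDs correspond to the FILTER_* constants, so 'validate_email' maps to FILTER_VALIDATE_EMAIL.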

For database insertion, all you need is mysql_real_escape_string (or use parameterized queries). You generally don't want to alter data before saving it, which is what would happen if you used htmlentities. That would lead to a garbled mess later on when you ran it through htmlentities again to display it somewhere on a webpage.

Use htmlentities when you are displaying the data on a webpage somewhere.

Somewhat related: if you are sending submitted data somewhere in an email, like with a contact form for instance, be sure to strip newlines from any data that will be used in the headers (like the From: name and email address, the subject, etc.).

$input = preg_replace('/\s+/', ' ', $input);

If you don't do this, it's just a matter of time before the spam bots find your form and abuse it. I learned that the hard way.

There is no point in simply passing the input through all these functions. They all have different meanings, and data doesn't get "cleaner" by calling more escape functions. If you want to store user input in MySQL, you need to use only mysql_real_escape_string. It is then fully escaped to store safely in the database.

Please be aware that when using filter_var() with FILTER_SANITIZE_NUMBER_FLOAT and FILTER_SANITIZE_NUMBER_INT the result will be a string, even if the input value is actually a float or an int. Use FILTER_VALIDATE_FLOAT and FILTER_VALIDATE_INT, which will convert the result to the expected type.
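The string-versus-number difference is easy to see side by side. The sample values are made up:

```php
<?php
$raw = '42';

// The sanitize filter returns a string, even for clean numeric input.
$sanitized = filter_var($raw, FILTER_SANITIZE_NUMBER_INT);

// The validate filter returns a real integer, or false on failure.
$validated = filter_var($raw, FILTER_VALIDATE_INT);

// Sanitizing repairs bad input by dropping characters; validating rejects it.
$repaired = filter_var('4a2', FILTER_SANITIZE_NUMBER_INT);
$rejected = filter_var('4a2', FILTER_VALIDATE_INT);
```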

Comments
  • For insertion, it's fine to just sanitize against SQL injection using mysql_real_escape_string. It's when you're using the SELECTed data (in HTML output or in a PHP formula/function) that you should apply htmlentities.
  • See stackoverflow.com/questions/60174/… for an answer specific to cleaning for database insertion (it gives an example of PDO, which others have mentioned below).
  • And when specifying it, be sure it's on the list of supported encodings.
  • And do not use htmlentities at all. Replace it with htmlspecialchars, whose purpose is to replace just characters like < and >, not every character with its entity.
  • Just be sure to not call htmlspecialchars twice, because he speaks of it in the "When users submit data part" and in the "When displaying the data" part.
  • Upvoted. The most helpful answer I've read from many Q&A(s) regarding SQL Injection.
  • Absolutely a Quality answer with many explanations and links for future users to explore more options. Got a one-up from me too...
  • Aw man, I wrote up that giant wall of text just because I didn't see anyone mention HTML Purifier, and here you beat me by like 40 minutes. ;)
  • Shouldn't you only strip HTML on output? IMO you should never change input data - you never know when you'll need it
  • mysql_real_escape_string escapes needed characters inside a string. It's not strictly filtering or sanitizing, but enclosing a string in quotes neither is (and everybody does it, I pretty much never saw a question about it). So nothing is sanitized when we write SQL? Of course not. What prevents the SQL injection is the use of mysql_real_escape_string. Also the enclosing quotes, but everybody does it, and if you test what you do, you end up with a SQL syntax error with this omission. The real dangerous part is handled with mysql_real_escape_string.