Accelerated Security Course - Episode 1: Never Trust Foreign Data
This article was written by Damien Metzger, and first published on the PrestaShop blog, on June 22nd, 2011.
If there’s only one rule regarding security that a developer must adhere to, it’s this: Never Trust Foreign Data. Together, we’ll take a look at what this modern saying signifies:
- Never: In the realm of security, you must never aim lower than perfection. Code that is 99% secure is flawed code. A hacker doesn’t need much to break in: once a weakness or entry point is found, he can often extend his power and add back doors.
- Trust: What we mean by saying “don’t trust!” is that you should always consider all data as malicious. Always assume that the variable you use has been set by a hacker, and you’ll never be caught off guard. Don’t try to think about how the hacker can do it, just always assume that he or she can.
- Foreign: The definition of “foreign” data must be very strict. In general, any variable that has not been assigned within the function in which it is used must be considered foreign. Even if you passed the parameter yourself just a few minutes earlier, you don’t know which other developer will modify your code later, nor will you necessarily remember a few months from now exactly what you did.
- Data: Data is generally a variable. It can be passed as a parameter in GET, from a cookie, or even from $_SERVER (no less foreign than the rest!). However, data can also be the contents of a file, or even the path to that file, the response from a web service, the headers of HTTP requests and much more. Anything that is not directly displayed in your favorite code editor is data.
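As a rough illustration of that last point, a request header read from $_SERVER can be handled like any other foreign value. The sketch below is in PHP; the function name readUserAgent, the printable-ASCII filter and the 255-character cap are assumptions made for this example, not rules from the article.

```php
<?php
// Hypothetical helper: reads the User-Agent header while treating it as foreign data.
// The array is passed in (typically $_SERVER) so the function assumes nothing
// about where its input came from.
function readUserAgent(array $server): string
{
    $raw = $server['HTTP_USER_AGENT'] ?? '';

    // The client chooses this header: it may be missing or of an unexpected type.
    if (!is_string($raw)) {
        return '';
    }

    // Keep printable ASCII only (drops CR/LF and other control characters).
    $clean = (string) preg_replace('/[^\x20-\x7E]/', '', $raw);

    // Cap the length; 255 is an arbitrary limit chosen for this sketch.
    return substr($clean, 0, 255);
}
```

A typical call would be readUserAgent($_SERVER). The caller gets a bounded, printable string or an empty one, whatever the client sent.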
We must therefore remember that not everything we use is reliable. The advantage of constantly keeping this idea in mind is that it quickly becomes natural. It becomes unthinkable not to operate this way.
What, then, is the solution? What should we do with our data, if we cannot use it however we want?
- Do checks. If you expect an integer, then double-check that you are using an integer. If it’s an MD5 hash, then it should contain nothing but 32 hexadecimal characters. Write regular expressions, whitelist when possible, and blacklist when it’s not. Be aware, however, that regexps are CPU-intensive; when possible, use simpler functions instead.
- Make casts. With a cast, you are 100% sure of the type of data that you use. That’s something... Of course, a cast to string will not solve your problems, and constant vigilance is always important.
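The two pieces of advice above can be sketched in PHP as follows. The function names and the sample input array are hypothetical and exist only for this illustration; ctype_digit(), ctype_xdigit() and in_array() are standard PHP functions used here as the "simpler functions" that avoid a regexp.

```php
<?php
// Hypothetical validators; the names are illustrative, not part of any library.

// Integer check without a regexp: ctype_digit() accepts only non-empty strings of 0-9.
function isUnsignedInt($value): bool
{
    return is_int($value) ? $value >= 0 : (is_string($value) && ctype_digit($value));
}

// MD5 check: exactly 32 hexadecimal characters.
function isMd5($value): bool
{
    return is_string($value) && strlen($value) === 32 && ctype_xdigit($value);
}

// Whitelist: accept only values known in advance (strict comparison).
function isAllowedSortOrder($value): bool
{
    return in_array($value, ['asc', 'desc'], true);
}

// Cast: the result is always an int, whatever was submitted.
$input = ['id_product' => '42; DROP TABLE users']; // stand-in for $_GET
$idProduct = (int) ($input['id_product'] ?? 0);    // 42
```

The cast only guarantees the type. Whether 42 is an identifier this user is allowed to see still has to be checked separately.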