3.11 Preventing SQL Injection Attacks
3.11.1 Problem
You are developing an application that
interacts with a SQL database, and you need to defend against SQL
injection attacks.
3.11.2 Solution
SQL injection attacks are most common in web applications that use a
database to store data, but they can occur anywhere that a SQL
command string is constructed from any type of input from a user.
Specifically, a SQL injection attack is mounted by inserting
characters into the command string that creates a compound command in
a single string. For example, suppose a query string is created with
a WHERE clause that is constructed from user
input. A proper command might be:
SELECT * FROM people WHERE first_name="frank";
If the value "frank" comes directly
from user input and is not properly validated, an attacker could
include a closing double quote and a semicolon that would complete
the SELECT command and allow the attacker to
append additional commands. For example:
SELECT * FROM people WHERE first_name="frank"; DROP TABLE people;
Obviously, the best way to avoid SQL injection attacks is to not
create SQL command strings that include any user
input. In some small number of
applications, this may be feasible, but more frequently it is not.
Avoid including user input in SQL commands as much as you can, but
where it cannot be avoided, you should escape dangerous characters.
3.11.3 Discussion
SQL injection attacks are really just general input validation
problems. Unfortunately, there is no perfect solution to preventing
these types of attacks. Your best defense is to apply strict checking
of input—even going so far as to refuse questionable input
rather than attempt to escape it—and hope that that is a strong
enough defense.
There are two main approaches that can be taken to avoid SQL
injection attacks:
- Restrict user input to the smallest character set possible, and refuse any input that contains character outside of that set.
-
In many cases, user input needs to be used in queries such as looking
up a username or a message number, or some other relatively simple
piece of information. It is rare to need any character in a user name
other than the set of alphanumeric characters. Similarly, message
numbers or other similar identifiers can safely be restricted to
digits.
With SQL, problems start to occur when symbol characters that have
special meaning are allowed. Examples of such characters are quotes
(both double and single), semicolons, percent symbols, hyphens, and
underscores. Avoid these characters wherever possible; they are often
unnecessary, and allowing them at all just makes things more
difficult for everyone except an attacker.
- Escape characters that have special significant to SQL command processors.
-
In SQL parlance, anything that is not a keyword or an identifier is a
literal. Keywords are portions of a SQL command such as
SELECT or WHERE, and an
identifier would typically be the name of a table or the name of a
field. In some cases, SQL syntax allows literals to appear without
enclosing quotes, but as a general rule you should always enclose
literals with quotes.
Literals should always be enclosed in single quotes
('), but some SQL implementations allow you to use
either single or double quotes ("). Whichever you choose to use,
always close the literal with the same character with which you
opened it.
Within literals, most characters are safe to leave unescaped, and in
many cases, it is not possible to escape them. Certainly, with
whichever quoting character you choose to use with your literals, you
may need to allow that character inside the literal. Escaping quotes
is done by doubling up on the quote character. Other characters that
should always be escaped are control characters and the escape
character itself (a backslash).
Finally, if you are using the LIKE keyword in a
WHERE clause, you may wish to prevent input from
containing wildcard characters. In fact, it is a good idea to prevent
wildcard characters in most circumstances.
Wildcard
characters include the percent symbol, underscore, and square
brackets.
You can use the function spc_escape_sql(
), shown at the end of this section, to escape
all of the characters that we've mentioned. As a
convenience (and partly due to necessity), the function will also
surround the escaped string with the quote character of your choice.
The return from the function will be the quoted and escaped version
of the input string. If an error occurs (e.g., out of memory, or an
invalid quoting character chosen), the return will be
NULL.
spc_escape_sql( ) requires three arguments:
- input
-
The string that is to be escaped.
- quote
-
The quote character to use. It must be either a single or double
quote. Any other character will cause spc_escape_sql(
) to return failure.
- wildcards
-
If this argument is specified as 0, wildcard characters recognized by
the LIKE operator in a WHERE
clause will not be escaped; otherwise, they will be. You should only
escape wildcards when you are going to be using the escaped string as
the right-hand side for the LIKE operator.
#include <stdlib.h>
#include <string.h>
char *spc_escape_sql(const char *input, char quote, int wildcards) {
char *out, *ptr;
const char *c;
/* If every character in the input needs to be escaped, the resulting string
* would at most double in size. Also, include room for the surrounding
* quotes.
*/
if (quote != '\'' && quote != '\"') return 0;
if (!(out = ptr = (char *)malloc(strlen(input) * 2 + 2 + 1))) return 0;
*ptr++ = quote;
for (c = input; *c; c++) {
switch (*c) {
case '\'': case '\"':
if (quote == *c) *ptr++ = *c;
*ptr++ = *c;
break;
case '%': case '_': case '[': case ']':
if (wildcards) *ptr++ = '\\';
*ptr++ = *c;
break;
case '\\': *ptr++ = '\\'; *ptr++ = '\\'; break;
case '\b': *ptr++ = '\\'; *ptr++ = 'b'; break;
case '\n': *ptr++ = '\\'; *ptr++ = 'n'; break;
case '\r': *ptr++ = '\\'; *ptr++ = 'r'; break;
case '\t': *ptr++ = '\\'; *ptr++ = 't'; break;
default:
*ptr++ = *c;
break;
}
}
*ptr++ = quote;
*ptr = 0;
return out;
}
|