Table of Contents for
Using SQLite

Version ebook / Retour

Cover image for bash Cookbook, 2nd Edition Using SQLite by Jay A. Kreibich Published by O'Reilly Media, Inc., 2010
  1. Cover
  2. Using SQLite
  3. O'Reilly Strata Conference
  4. Using SQLite
  5. Dedication
  6. A Note Regarding Supplemental Files
  7. Preface
  8. SQLite Versions
  9. Email Lists
  10. Example Code Download
  11. How We Got Here
  12. Conventions Used in This Book
  13. Using Code Examples
  14. Safari® Books Online
  15. How to Contact Us
  16. 1. What Is SQLite?
  17. Self-Contained, No Server Required
  18. Single File Database
  19. Zero Configuration
  20. Embedded Device Support
  21. Unique Features
  22. Compatible License
  23. Highly Reliable
  24. 2. Uses of SQLite
  25. Database Junior
  26. Application Files
  27. Application Cache
  28. Archives and Data Stores
  29. Client/Server Stand-in
  30. Teaching Tool
  31. Generic SQL Engine
  32. Not the Best Choice
  33. Big Name Users
  34. 3. Building and Installing SQLite
  35. SQLite Products
  36. Precompiled Distributions
  37. Documentation Distribution
  38. Source Distributions
  39. Building
  40. Build and Installation Options
  41. An sqlite3 Primer
  42. Summary
  43. 4. The SQL Language
  44. Learning SQL
  45. Brief Background
  46. General Syntax
  47. SQL Data Languages
  48. Data Definition Language
  49. Data Manipulation Language
  50. Transaction Control Language
  51. System Catalogs
  52. Wrap-up
  53. 5. The SELECT Command
  54. SQL Tables
  55. The SELECT Pipeline
  56. Advanced Techniques
  57. SELECT Examples
  58. What’s Next
  59. 6. Database Design
  60. Tables and Keys
  61. Common Structures and Relationships
  62. Normal Form
  63. Indexes
  64. Transferring Design Experience
  65. Closing
  66. 7. C Programming Interface
  67. API Overview
  68. Library Initialization
  69. Database Connections
  70. Prepared Statements
  71. Bound Parameters
  72. Convenience Functions
  73. Result Codes and Error Codes
  74. Utility Functions
  75. Summary
  76. 8. Additional Features and APIs
  77. Date and Time Features
  78. ICU Internationalization Extension
  79. Full-Text Search Module
  80. R*Trees and Spatial Indexing Module
  81. Scripting Languages and Other Interfaces
  82. Mobile and Embedded Development
  83. Additional Extensions
  84. 9. SQL Functions and Extensions
  85. Scalar Functions
  86. Aggregate Functions
  87. Collation Functions
  88. SQLite Extensions
  89. 10. Virtual Tables and Modules
  90. Introduction to Modules
  91. Module API
  92. Simple Example: dblist Module
  93. Advanced Example: weblog Module
  94. Best Index and Filter
  95. Wrap-Up
  96. A. SQLite Build Options
  97. Shell Directives
  98. ENABLE_READLINE
  99. Default Values
  100. SQLITE_DEFAULT_AUTOVACUUM
  101. SQLITE_DEFAULT_CACHE_SIZE
  102. SQLITE_DEFAULT_FILE_FORMAT
  103. SQLITE_DEFAULT_JOURNAL_SIZE_LIMIT
  104. SQLITE_DEFAULT_MEMSTATUS
  105. SQLITE_DEFAULT_PAGE_SIZE
  106. SQLITE_DEFAULT_TEMP_CACHE_SIZE
  107. YYSTACKDEPTH
  108. Sizes and Limits
  109. SQLITE_MAX_ATTACHED
  110. SQLITE_MAX_COLUMN
  111. SQLITE_MAX_COMPOUND_SELECT
  112. SQLITE_MAX_DEFAULT_PAGE_SIZE
  113. SQLITE_MAX_EXPR_DEPTH
  114. SQLITE_MAX_FUNCTION_ARG
  115. SQLITE_MAX_LENGTH
  116. SQLITE_MAX_LIKE_PATTERN_LENGTH
  117. SQLITE_MAX_PAGE_COUNT
  118. SQLITE_MAX_PAGE_SIZE
  119. SQLITE_MAX_SQL_LENGTH
  120. SQLITE_MAX_TRIGGER_DEPTH
  121. SQLITE_MAX_VARIABLE_NUMBER
  122. Operation and Behavior
  123. SQLITE_CASE_SENSITIVE_LIKE
  124. SQLITE_HAVE_ISNAN
  125. SQLITE_OS_OTHER
  126. SQLITE_SECURE_DELETE
  127. SQLITE_THREADSAFE
  128. SQLITE_TEMP_STORE
  129. Debug Settings
  130. SQLITE_DEBUG
  131. SQLITE_MEMDEBUG
  132. Enable Extensions
  133. SQLITE_ENABLE_ATOMIC_WRITE
  134. SQLITE_ENABLE_COLUMN_METADATA
  135. SQLITE_ENABLE_FTS3
  136. SQLITE_ENABLE_FTS3_PARENTHESIS
  137. SQLITE_ENABLE_ICU
  138. SQLITE_ENABLE_IOTRACE
  139. SQLITE_ENABLE_LOCKING_STYLE
  140. SQLITE_ENABLE_MEMORY_MANAGEMENT
  141. SQLITE_ENABLE_MEMSYS3
  142. SQLITE_ENABLE_MEMSYS5
  143. SQLITE_ENABLE_RTREE
  144. SQLITE_ENABLE_STAT2
  145. SQLITE_ENABLE_UPDATE_DELETE_LIMIT
  146. SQLITE_ENABLE_UNLOCK_NOTIFY
  147. YYTRACKMAXSTACKDEPTH
  148. Limit Features
  149. SQLITE_DISABLE_LFS
  150. SQLITE_DISABLE_DIRSYNC
  151. SQLITE_ZERO_MALLOC
  152. Omit Core Features
  153. B. sqlite3 Command Reference
  154. Command-Line Options
  155. Interactive Dot-Commands
  156. .backup
  157. .bail
  158. .databases
  159. .dump
  160. .echo
  161. .exit
  162. .explain
  163. .headers
  164. .help
  165. .import
  166. .indices
  167. .iotrace
  168. .load
  169. .log
  170. .mode
  171. .nullvalue
  172. .output
  173. .prompt
  174. .quit
  175. .read
  176. .restore
  177. .schema
  178. .separator
  179. .show
  180. .tables
  181. .timeout
  182. .timer
  183. .width
  184. C. SQLite SQL Command Reference
  185. SQLite SQL Commands
  186. ALTER TABLE
  187. ANALYZE
  188. ATTACH DATABASE
  189. BEGIN TRANSACTION
  190. COMMIT TRANSACTION
  191. CREATE INDEX
  192. CREATE TABLE
  193. CREATE TRIGGER
  194. CREATE VIEW
  195. CREATE VIRTUAL TABLE
  196. DELETE
  197. DETACH DATABASE
  198. DROP INDEX
  199. DROP TABLE
  200. DROP TRIGGER
  201. DROP VIEW
  202. END TRANSACTION
  203. EXPLAIN
  204. INSERT
  205. PRAGMA
  206. REINDEX
  207. RELEASE SAVEPOINT
  208. REPLACE
  209. ROLLBACK TRANSACTION
  210. SAVEPOINT
  211. SELECT
  212. UPDATE
  213. VACUUM
  214. D. SQLite SQL Expression Reference
  215. Literal Expressions
  216. Logic Representations
  217. Unary Expressions
  218. Binary Expressions
  219. Function Calls
  220. Column Names
  221. General Expressions
  222. AND
  223. BETWEEN
  224. CASE
  225. CAST
  226. COLLATE
  227. EXISTS
  228. GLOB
  229. IN
  230. IS
  231. ISNULL
  232. LIKE
  233. MATCH
  234. NOTNULL
  235. OR
  236. RAISE
  237. REGEXP
  238. SELECT
  239. E. SQLite SQL Function Reference
  240. Scalar Functions
  241. abs()
  242. changes()
  243. coalesce()
  244. date()
  245. datetime()
  246. glob()
  247. ifnull()
  248. hex()
  249. julianday()
  250. last_insert_rowid()
  251. length()
  252. like()
  253. load_extension()
  254. lower()
  255. ltrim()
  256. match()
  257. max()
  258. min()
  259. nullif()
  260. quote()
  261. random()
  262. randomblob()
  263. regex()
  264. replace()
  265. round()
  266. rtrim()
  267. sqlite_compileoption_get()
  268. sqlite_compileoption_used()
  269. sqlite_source_id()
  270. sqlite_version()
  271. strftime()
  272. substr()
  273. time()
  274. total_changes()
  275. trim()
  276. typeof()
  277. upper()
  278. zeroblob()
  279. Aggregate Functions
  280. avg()
  281. count()
  282. group_concat()
  283. max()
  284. min()
  285. sum()
  286. total()
  287. F. SQLite SQL PRAGMA Reference
  288. SQLite PRAGMAs
  289. auto_vacuum
  290. cache_size
  291. case_sensitive_like
  292. collation_list
  293. count_changes
  294. database_list
  295. default_cache_size
  296. encoding
  297. foreign_keys
  298. foreign_key_list
  299. freelist_count
  300. full_column_names
  301. fullfsync
  302. ignore_check_constraints
  303. incremental_vacuum
  304. index_info
  305. index_list
  306. integrity_check
  307. journal_mode
  308. journal_size_limit
  309. legacy_file_format
  310. locking_mode
  311. lock_proxy_file
  312. lock_status
  313. max_page_count
  314. omit_readlock
  315. page_count
  316. page_size
  317. parser_trace
  318. quick_check
  319. read_uncommitted
  320. recursive_triggers
  321. reverse_unordered_selects
  322. schema_version
  323. secure_delete
  324. short_column_names
  325. sql_trace
  326. synchronous
  327. table_info
  328. temp_store
  329. temp_store_directory
  330. user_version
  331. vdbe_trace
  332. vdbe_listing
  333. writable_schema
  334. G. SQLite C API Reference
  335. API Datatypes
  336. sqlite3
  337. sqlite3_backup
  338. sqlite3_blob
  339. sqlite3_context
  340. sqlite3_int64, sqlite3_uint64, sqlite_int64, sqlite_uint64
  341. sqlite3_module
  342. sqlite3_mutex
  343. sqlite3_stmt
  344. sqlite3_value
  345. sqlite3_vfs
  346. API Functions
  347. sqlite3_aggregate_context()
  348. sqlite3_auto_extension()
  349. sqlite3_backup_finish()
  350. sqlite3_backup_init()
  351. sqlite3_backup_pagecount()
  352. sqlite3_backup_remaining()
  353. sqlite3_backup_step()
  354. sqlite3_bind_xxx()
  355. sqlite3_bind_parameter_count()
  356. sqlite3_bind_parameter_index()
  357. sqlite3_bind_parameter_name()
  358. sqlite3_blob_bytes()
  359. sqlite3_blob_close()
  360. sqlite3_blob_open()
  361. sqlite3_blob_read()
  362. sqlite3_blob_write()
  363. sqlite3_busy_handler()
  364. sqlite3_busy_timeout()
  365. sqlite3_changes()
  366. sqlite3_clear_bindings()
  367. sqlite3_close()
  368. sqlite3_collation_needed()
  369. sqlite3_column_xxx()
  370. sqlite3_column_bytes()
  371. sqlite3_column_count()
  372. sqlite3_column_database_name()
  373. sqlite3_column_decltype()
  374. sqlite3_column_name()
  375. sqlite3_column_origin_name()
  376. sqlite3_column_table_name()
  377. sqlite3_column_type()
  378. sqlite3_commit_hook()
  379. sqlite3_compileoption_get()
  380. sqlite3_compileoption_used()
  381. sqlite3_complete()
  382. sqlite3_config()
  383. sqlite3_context_db_handle()
  384. sqlite3_create_collation()
  385. sqlite3_create_function()
  386. sqlite3_create_module()
  387. sqlite3_data_count()
  388. sqlite3_db_config()
  389. sqlite3_db_handle()
  390. sqlite3_db_mutex()
  391. sqlite3_db_status()
  392. sqlite3_declare_vtab()
  393. sqlite3_enable_load_extension()
  394. sqlite3_enable_shared_cache()
  395. sqlite3_errcode()
  396. sqlite3_errmsg()
  397. sqlite3_exec()
  398. sqlite3_extended_errcode()
  399. sqlite3_extended_result_codes()
  400. sqlite3_file_control()
  401. sqlite3_finalize()
  402. sqlite3_free()
  403. sqlite3_free_table()
  404. sqlite3_get_autocommit()
  405. sqlite3_get_auxdata()
  406. sqlite3_get_table()
  407. sqlite3_initialize()
  408. sqlite3_interrupt()
  409. sqlite3_last_insert_rowid()
  410. sqlite3_libversion()
  411. sqlite3_libversion_number()
  412. sqlite3_limit()
  413. sqlite3_load_extension()
  414. sqlite3_log()
  415. sqlite3_malloc()
  416. sqlite3_memory_highwater()
  417. sqlite3_memory_used()
  418. sqlite3_mprintf()
  419. sqlite3_mutex_alloc()
  420. sqlite3_mutex_enter()
  421. sqlite3_mutex_free()
  422. sqlite3_mutex_held()
  423. sqlite3_mutex_leave()
  424. sqlite3_mutex_notheld()
  425. sqlite3_mutex_try()
  426. sqlite3_next_stmt()
  427. sqlite3_open()
  428. sqlite3_open_v2()
  429. sqlite3_overload_function()
  430. sqlite3_prepare_xxx()
  431. sqlite3_profile()
  432. sqlite3_progress_handler()
  433. sqlite3_randomness()
  434. sqlite3_realloc()
  435. sqlite3_release_memory()
  436. sqlite3_reset()
  437. sqlite3_reset_auto_extension()
  438. sqlite3_result_xxx()
  439. sqlite3_result_error_xxx()
  440. sqlite3_rollback_hook()
  441. sqlite3_set_authorizer()
  442. sqlite3_set_auxdata()
  443. sqlite3_shutdown()
  444. sqlite3_sleep()
  445. sqlite3_snprintf()
  446. sqlite3_soft_heap_limit()
  447. sqlite3_sourceid()
  448. sqlite3_sql()
  449. sqlite3_status()
  450. sqlite3_step()
  451. sqlite3_stmt_status()
  452. sqlite3_strnicmp()
  453. sqlite3_table_column_metadata()
  454. sqlite3_threadsafe()
  455. sqlite3_total_changes()
  456. sqlite3_trace()
  457. sqlite3_unlock_notify()
  458. sqlite3_update_hook()
  459. sqlite3_user_data()
  460. sqlite3_value_xxx()
  461. sqlite3_value_bytes()
  462. sqlite3_value_numeric_type()
  463. sqlite3_value_type()
  464. sqlite3_version[]
  465. sqlite3_vfs_find()
  466. sqlite3_vfs_register()
  467. sqlite3_vfs_unregister()
  468. sqlite3_vmprintf()
  469. Index
  470. About the Author
  471. Colophon
  472. Copyright

Bound Parameters

Statement parameters are special tokens that are inserted into the SQL command string before it is passed to one of the sqlite3_prepare_xxx() functions. They act as a placeholder for any literal value, such as a bare number or a single quote string. After the statement is prepared, but before it is executed, you can bind specific values to each statement parameter. Once you’re done executing a statement, you can reset the statement, bind new values to the parameters, and execute the statement again—only this time with the new values.

Parameter Tokens

SQLite supports five different styles of statement parameters. These short string tokens are placed directly into the SQL command string, which can then be passed to one of the sqlite3_prepare_xxx() functions. Once the statement is prepared, the individual parameters are referenced by index.

?

An anonymous parameter with automatic index. As the statement is processed, each anonymous parameter is assigned a unique, sequential index value, starting with one.

?<index>

Parameter with explicit numeric index. Duplicate indexes allow the same value to be bound multiple places in the same statement.

:<name>

A named parameter with an automatic index. Duplicate names allow the same value to be bound multiple places in the same statement.

@<name>

A named parameter with an automatic index. Duplicate names allow the same value to be bound multiple places in the same statement. Works exactly like the colon parameter.

$<name>

A named parameter with an automatic index. Duplicate names allow the same value to be bound multiple places in the same statement. This is an extended syntax to support Tcl variables. Unless you’re doing Tcl programming, I suggest you use the colon format.

To get an idea of how these work, consider this INSERT statement:

INSERT INTO people (id, name) VALUES ( ?, ? );

The two statement parameters represent the id and name values being inserted. Parameter indexes start at one, so the first parameter that represents the id value has an index of one, and the parameter used to reference the name value has an index of two.

Notice that the second parameter, which is likely a text value, does not have single quotes around it. The single quotes are part of the string-literal representation, and are not required for a parameter value.

Warning

Statement parameters should not be put in quotes. The notation '?' designates a one-character text value, not a parameter.

Once this statement has been prepared, it can be used to insert multiple rows. For each row, simply bind the appropriate values, step through the statement, and then reset the statement. After the statement has been reset, new values can be bound to the parameters and the statement can be stepped again.

You can also use explicit index values:

INSERT INTO people (id, name) VALUES ( ?1, ?2 );

Using explicit parameter indexes has two major advantages. First, you can have multiple instances of the same index value, allowing the same value to be bound to more than one place in the same statement.

Second, explicit indexes allow the parameters to appear out of order. There can even be gaps in the index sequence. This can help simplify application code maintenance if the query is modified and parameters are added or removed.

This level of abstraction can be taken even further by using named parameters. In this case, you allow SQLite to assign parameter index values as it sees fit, in a similar fashion to anonymous parameters. The difference is that you can ask SQLite to tell you the index value of a specific parameter based off the name you’ve given it. Consider this statement:

INSERT INTO people (id, name) VALUES ( :id, :name );

In this case, the parameter values are quite explicit. As we will see in the next section, the code that binds values to these parameters is also quite explicit, making it very clear what is going on. Best of all, it doesn’t matter if new parameters are added. As long as the existing names remain unchanged, the code will properly find and bind the named parameters.

Note, however, that parameters can only be used to replace literal values, such as quoted strings or numeric values. Parameters cannot be used in place of identifiers, such as table names or column names. The following bit of SQL is invalid:

SELECT * FROM ?;   -- INCORRECT: Cannot use a parameter as an identifier

If you attempt to prepare this statement, it will fail. This is because the parameter (which acts as an unknown literal value) is being used where an identifier is required. This is invalid, and the statement will not prepare correctly.

Within a statement, it is best to choose a specific parameter style and stick with it. Mixing anonymous parameters with explicit indexes or named parameters is likely to cause confusion about what index belongs to which parameter.

Personally, I prefer to used the colon-name-style parameters. Using named parameters eliminates the need to know any specific index values, allowing you to just reference the name at runtime. The use of short, significant names can also make the intent of both your SQL statements and your bind code easier to understand.

Binding Values

When you first prepare a statement with parameters, all of the parameters start out with a NULL assigned to them. Before you execute the statement, you can bind specific values to each parameter using the sqlite3_bind_xxx() family of functions.

There are nine sqlite3_bind_xxx() functions available, plus a number of utility functions. These functions can be called any time after the statement is prepared, but before sqlite3_step() is called for the first time. Once sqlite3_step() has been called, these functions cannot be called again until the statement is reset.

All the sqlite3_bind_xxx() functions have the same first and second parameters and return the same result. The first parameter is always a pointer to an sqlite3_stmt, and the second is the index of the parameter to bind. Remember that for anonymous parameters, the first index value starts with one. For the most part, the third parameter is the value to bind. The fourth parameter, if present, indicates the length of the data value in bytes. The fifth parameter, if present, is a function pointer to a memory management callback.

Warning

Remember that bind index values start with one (1), unlike result column indexes, which start with zero (0).

All the bind functions return an integer error code, which is equal to SQLITE_OK upon success.

The bind functions are:

int sqlite3_bind_blob( sqlite3_stmt *stmt, int pidx, const void *data, int data_len, mem_callback )

Binds an arbitrary length binary data BLOB.

int sqlite3_bind_double( sqlite3_stmt *stmt, int pidx, double data )

Binds a 64-bit floating point value.

int sqlite3_bind_int( sqlite3_stmt *stmt, int pidx, int data )

Binds a 32-bit signed integer value.

int sqlite3_bind_int64( sqlite3_stmt *stmt, int pidx, sqlite3_int64 )

Binds a 64-bit signed integer value.

int sqlite3_bind_null( sqlite3_stmt *stmt, int pidx )

Binds a NULL datatype.

int sqlite3_bind_text( sqlite3_stmt *stmt, int pidx, const char *data, int data_len, mem_callback )

Binds an arbitrary length UTF-8 encoded text value. The length is in bytes, not characters. If the length parameter is negative, SQLite will compute the length of the string up to, but not including, the null terminator. It is recommended that the manually computed lengths do not include the terminator (the terminator will be included when the value is returned).

int sqlite3_bind_text16( sqlite3_stmt *stmt, int pidx, const void *data, int data_len, mem_callback )

Binds an arbitrary length UTF-16 encoded text value. The length is in bytes, not characters. If the length parameter is negative, SQLite will compute the length of the string up to, but not including, the null terminator. It is recommended that the manually computed lengths do not include the terminator (the terminator will be included when the value is returned).

int sqlite3_bind_zeroblob( sqlite3_stmt *stmt, int pidx, int len )

Binds an arbitrary length binary data BLOB, where each byte is set to zero (0x00). The only additional parameter is a length value, in bytes. This function is particularly useful for creating large BLOBs that can then be updated with the incremental BLOB interface. See sqlite3_blob_open() in Appendix G for more details.

In addition to these type-specific bind functions, there is also a specialized function:

int sqlite3_bind_value( sqlite3_stmt *stmt, int pidx, const sqlite3_value *data_value )

Binds the type and value of an sqlite3_value structure. An sqlite3_value structure can hold any data format.

The text and BLOB variants of sqlite3_bind_xxx() require you to pass a buffer pointer for the data value. Normally this buffer and its contents must remain valid until a new value is bound to that parameter, or the statement is finalized. Since that might be some time later in the code, these bind functions have a fifth parameter that controls how the buffer memory is handled and possibly released.

If the fifth parameter is either NULL or the constant SQLITE_STATIC, SQLite will take a hands-off approach and assume the buffer memory is either static or that your application code is taking care of maintaining and releasing any memory.

If the fifth parameter is the constant SQLITE_TRANSIENT, SQLite will make an internal copy of the buffer. This allows you to release your buffer immediately (or allow it to go out of scope, if it happens to be on the stack). SQLite will automatically release the internal buffer at an appropriate time.

The final option is to pass a valid void mem_callback( void* ptr ) function pointer. This callback will be called when SQLite is done with the buffer and wants to release it. If the buffer was allocated with sqlite3_malloc() or sqlite3_realloc(), you can pass a reference to sqlite3_free() directly. If you allocated the buffer with a different set of memory management calls, you’ll need to pass a reference to a wrapper function that calls the appropriate memory release function.

Once a value has been bound to a parameter, there is no way to extract that value back out of the statement. If you need to reference a value after it has been bound, you must keep track of it yourself.

To help you figure out what parameter index to use, there are three utility functions:

int sqlite3_bind_parameter_count( sqlite3_stmt *stmt )

Returns an integer indicating the largest parameter index. If no explicit numeric indexes are used ( ?<number> ), this will the be the number of unique parameters that appear in a statement. If explicit numeric indexes are used, there may be gaps in the number sequence.

int sqlite3_bind_parameter_index( sqlite3_stmt *stmt, const char *name )

Returns the index of a named parameter. The name must include any leading character (such as “:”) and must be given in UTF-8, even if the statement was prepared from UTF-16. A zero is returned if a parameter with a matching name cannot be found.

const char* sqlite3_bind_parameter_name( sqlite3_stmt *stmt, int pidx )

Returns the full text representation of a specific parameter. The text is always UTF-8 encoded and includes the leading character.

Using the sqlite3_bind_parameter_index() function, you can easily find and bind named parameters. The sqlite3_bind_xxx() functions will properly detect an invalid index range, allowing you to look up the index and bind a value in one line:

sqlite3_bind_int(stmt, sqlite3_bind_parameter_index(stmt, ":pid"), pid);

If you want to clear all of the bindings back to their initial NULL defaults, you can use the function sqlite3_clear_bindings():

int sqlite3_clear_bindings( sqlite3_stmt *stmt )

Clears all parameter bindings in a statement. After calling, all parameters will have a NULL bound to them. This will cause the memory management callback to be called on any text or BLOB values that were bound with a valid function pointer. Currently, this function always returns SQLITE_OK.

If you want to be absolutely sure bound values won’t leak from one statement execution to the next, it is best to clear the bindings any time you reset the statement. If you’re doing manual memory management on data buffers, you can free any memory used by bound values after this function is called.

Security and Performance

There are significant security advantages to using bound parameters. Many times people will manipulate SQL strings to substitute the values they want to use. For example, consider building an SQL statement in C using the string function snprintf():

snprintf(buf, buf_size,
         "INSERT INTO people( id, name ) VALUES ( %d, '%s' );",
         id_val, name_val);

In this case we do need single quotes around the string value, as we’re trying to form a literal representation. If we pass in these C values:

id_val = 23;
name_val = "Fred";

Then we get the following SQL statement in our buffer:

INSERT INTO people( id, name ) VALUES ( 23, 'Fred');

This seems simple enough, but the danger with a statement like this is that the variables need to be sanitized before they’re passed into the SQL statement. For example, consider these values:

id_val = 23;
name_val = "Fred' ); DROP TABLE people;";

This would cause our snprintf() to create the following SQL command sequence, with the individual commands split out onto their own lines for clarity:

INSERT INTO people( id, name ) VALUES ( 23, 'Fred' );
DROP TABLE people;
' );

While that last statement is nonsense, the second statement is cause for concern.

Thankfully, things are not quite as bad as they seem. The sqlite3_prepare_xxx() functions will only prepare a single statement (up to the first semicolon), unless you explicitly pass the remainder of the SQL command string to another sqlite3_prepare_xxx() call. That limits what can be done in a case like this, unless your code automatically prepares and executes multiple statements from a single command buffer.

Be warned, however, that the interfaces provided by many scripting languages will do exactly that, and will automatically process multiple SQL statements passed in with a single call. The SQLite convenience functions, including sqlite3_exec(), will also automatically process multiple SQL commands passed in through a single string. What makes sqlite3_exec() particularly dangerous is that the convenience functions don’t allow the use of bound values, forcing you to programmatically build SQL command statements and opening you up to problems. Later in the chapter, we’ll take a closer look at sqlite3_exec() and why it usually isn’t the best choice.

Even if SQLite will only process the first command, damage can still be done with subqueries and other commands. Bad input can also force a statement to fail. Consider the result if the name value is:

Fred', 'extra junk

If you’re updating a series of records based off this id value, you had better wrap all the commands up in a transaction and be prepared to roll it back if you encounter an error. If you just assume the commands will work, you’ll end up with an inconsistent database.

This type of attack is known as an SQL injection attack. An SQL injection attack inserts SQL command fragments into data values, causing the database to execute arbitrary SQL commands. Unfortunately, it is extremely common for websites to be susceptible to this kind of attack. It also borders on inexcusable, because it is typically very easy to avoid.

One defense against SQL injections is to try to sanitize any string values received from an untrusted source. For example, you might try to substitute all single quote characters with two single quote characters (the standard SQL escape mechanism). This can get quite complex, however, and you’re putting utter faith in the code’s ability to correctly sanitize untrusted strings.

A much easier way to defend yourself against SQL injections is to use SQL statement parameters. Injection attacks depend on a data value being represented as a literal value in an SQL command statement. The attack only works if the attack value is passed through the SQL parser, where it alters the meaning of the surrounding SQL commands.

In the case of SQL parameters, the bound values are never passed through the SQL parser. An SQL statement is only parsed when the command is prepared. If you’re using parameters, the SQL engine parses only the parameter tokens. Later, when you bind a value to a specific parameter, that value is bound directly in its native format (i.e. string, integer, etc.) and is not passed through the SQL parser. As long as you’re careful about how you extract and display the string, it is perfectly safe to directly bind an untrusted string value to a parameter value without fear of an SQL injection.

Besides avoiding injection attacks, parameters can also be faster and use less memory than string manipulations. Using a function such as snprintf() requires an SQL command template, and a sufficiently large output buffer. The string manipulation functions also need working memory, plus you may need additional buffers to copy and sanitize values. Additionally, a number of datatypes, such as integers and floating-point numbers (and especially BLOBs), often take up significantly more space in their string representation. This further increases memory usage. Finally, once the final command buffer has been created, all the data needs to be passed through the SQL parser, where the literal data values are converted back into their native format and stored in additional buffers.

Compare that to the resource usage of preparing and binding a statement. When using parameters, the SQL command statement is essentially static, and can be used as is, without modification or additional buffers. The parser doesn’t need to deal with converting and storing literal values. In fact, the data values normally never leave their native format, further saving time and memory by avoiding conversion in and out of a string representation.

Using parameters is the safe and wise choice, even for situations when a statement is only used once. It may take a few extra lines of code, but the process will be safer, faster, and more memory efficient.

Example

This example executes an INSERT statement. Although this statement is only executed once, it still uses bind parameters to protect against possible injection attacks. This eliminates the need to sanitize the input value.

The statement is first prepared with a statement parameter. The data value is then bound to the statement parameter before we execute the prepared statement:

    char            *data = ""; /* default to empty string */
    sqlite3_stmt    *stmt = NULL;
    int             idx = -1;

    /* ... set "data" pointer ... */
    /* ... open database ... */

    rc = sqlite3_prepare_v2( db, "INSERT INTO tbl VALUES ( :str )", -1, &stmt, NULL );
    if ( rc != SQLITE_OK) exit( -1 );

    idx = sqlite3_bind_parameter_index( stmt, ":str" );
    sqlite3_bind_text( stmt, idx, data, -1, SQLITE_STATIC );

    rc = sqlite3_step( stmt );
    if (( rc != SQLITE_DONE )&&( rc != SQLITE_ROW )) exit ( -1 );
    
    sqlite3_finalize( stmt );

    /* ... close database ... */

In this case we look for either an SQLITE_DONE or an SQLITE_ROW return value. Both are possible. Although the INSERT itself will be fully executed on the first call to sqlite3_step(), if PRAGMA count_changes is enabled, then the statement may return a value. In this case, we want to ignore any potential return value without triggering an error, so we must check for both possible return codes. For more details, see count_changes in Appendix F.

Potential Pitfalls

It is important to understand that all parameters must have some literal associated with them. As soon as you prepare a statement, all the parameters are set to NULL. If you fail to bind an alternate value, the parameter still has a literal NULL associated with it. This has a few ramifications that are not always obvious.

The general rule of thumb is that bound parameters act as literal string substitutions. Although they offer additional features and protections, if you’re trying to figure out the expected behavior of a parameter substitution, it is safe to assume you’ll get the exact same behavior as if the parameter was a literal string substitution.

In the case of an INSERT statement, there is no way to force a default value to be used. For example, if you have the following statement:

INSERT INTO membership ( pid, gid, type ) VALUES ( :pid, :gid, :type );

Even if the type column has a default value available, there is no way this statement can use it. If you fail to bind a value to the :type parameter, a NULL will be inserted, rather than the default value. The only way to insert a default value into the type column is to use a statement that doesn’t reference it, such as:

INSERT INTO membership ( pid, gid ) VALUES ( :pid, :gid );

This means that if you’re heavily dependent on database-defined default values, you may need to prepare several variations of an INSERT statement to cover the different cases when different data values are available. Of course, if your application code is aware of the proper default values, it can simply bind that value to the proper parameter.

The other area where parameters can cause surprises is in NULL comparisons. For example, consider the statement:

SELECT * FROM employee WHERE manager = :manager;

This works for normal values, but if a NULL is bound to the :manager parameter, no rows will ever be returned. If you need the ability to test for a NULL in the manager column, make sure you use the IS operator:

SELECT * FROM employee WHERE manager IS :manager;

For more details, see IS in Appendix D.

This behavior also makes it tricky to “stack” conditionals. For example, if you have the statement:

SELECT * FROM employee WHERE manager = :manager AND project = :project;

you must provide a meaningful value for both :manager and :project. If you want the ability to search on a manager, on a project, or on a manager and a project, you need to prepare multiple statements, or you need to add a bit more logic:

...WHERE ( manager = :manager OR :manager IS NULL )
     AND ( project = :project OR :project IS NULL );

This query will ignore one (or both) of the value conditions if you assign a NULL to the appropriate parameters. This expression won’t let you explicitly search for NULLs, but that can be done with additional parameters and logic. Preparing more flexible statements reduces the number of unique statements you need to manage, but it also tends to make them more complex and can make them run slower. If they get too complex, it might make more sense to simply define a new set of statements, rather than adding more and more parameter logic to the same statement.