Table of Contents for
Using SQLite

Version ebook / Retour

Cover image for bash Cookbook, 2nd Edition Using SQLite by Jay A. Kreibich Published by O'Reilly Media, Inc., 2010
  1. Cover
  2. Using SQLite
  3. O'Reilly Strata Conference
  4. Using SQLite
  5. Dedication
  6. A Note Regarding Supplemental Files
  7. Preface
  8. SQLite Versions
  9. Email Lists
  10. Example Code Download
  11. How We Got Here
  12. Conventions Used in This Book
  13. Using Code Examples
  14. Safari® Books Online
  15. How to Contact Us
  16. 1. What Is SQLite?
  17. Self-Contained, No Server Required
  18. Single File Database
  19. Zero Configuration
  20. Embedded Device Support
  21. Unique Features
  22. Compatible License
  23. Highly Reliable
  24. 2. Uses of SQLite
  25. Database Junior
  26. Application Files
  27. Application Cache
  28. Archives and Data Stores
  29. Client/Server Stand-in
  30. Teaching Tool
  31. Generic SQL Engine
  32. Not the Best Choice
  33. Big Name Users
  34. 3. Building and Installing SQLite
  35. SQLite Products
  36. Precompiled Distributions
  37. Documentation Distribution
  38. Source Distributions
  39. Building
  40. Build and Installation Options
  41. An sqlite3 Primer
  42. Summary
  43. 4. The SQL Language
  44. Learning SQL
  45. Brief Background
  46. General Syntax
  47. SQL Data Languages
  48. Data Definition Language
  49. Data Manipulation Language
  50. Transaction Control Language
  51. System Catalogs
  52. Wrap-up
  53. 5. The SELECT Command
  54. SQL Tables
  55. The SELECT Pipeline
  56. Advanced Techniques
  57. SELECT Examples
  58. What’s Next
  59. 6. Database Design
  60. Tables and Keys
  61. Common Structures and Relationships
  62. Normal Form
  63. Indexes
  64. Transferring Design Experience
  65. Closing
  66. 7. C Programming Interface
  67. API Overview
  68. Library Initialization
  69. Database Connections
  70. Prepared Statements
  71. Bound Parameters
  72. Convenience Functions
  73. Result Codes and Error Codes
  74. Utility Functions
  75. Summary
  76. 8. Additional Features and APIs
  77. Date and Time Features
  78. ICU Internationalization Extension
  79. Full-Text Search Module
  80. R*Trees and Spatial Indexing Module
  81. Scripting Languages and Other Interfaces
  82. Mobile and Embedded Development
  83. Additional Extensions
  84. 9. SQL Functions and Extensions
  85. Scalar Functions
  86. Aggregate Functions
  87. Collation Functions
  88. SQLite Extensions
  89. 10. Virtual Tables and Modules
  90. Introduction to Modules
  91. Module API
  92. Simple Example: dblist Module
  93. Advanced Example: weblog Module
  94. Best Index and Filter
  95. Wrap-Up
  96. A. SQLite Build Options
  97. Shell Directives
  98. ENABLE_READLINE
  99. Default Values
  100. SQLITE_DEFAULT_AUTOVACUUM
  101. SQLITE_DEFAULT_CACHE_SIZE
  102. SQLITE_DEFAULT_FILE_FORMAT
  103. SQLITE_DEFAULT_JOURNAL_SIZE_LIMIT
  104. SQLITE_DEFAULT_MEMSTATUS
  105. SQLITE_DEFAULT_PAGE_SIZE
  106. SQLITE_DEFAULT_TEMP_CACHE_SIZE
  107. YYSTACKDEPTH
  108. Sizes and Limits
  109. SQLITE_MAX_ATTACHED
  110. SQLITE_MAX_COLUMN
  111. SQLITE_MAX_COMPOUND_SELECT
  112. SQLITE_MAX_DEFAULT_PAGE_SIZE
  113. SQLITE_MAX_EXPR_DEPTH
  114. SQLITE_MAX_FUNCTION_ARG
  115. SQLITE_MAX_LENGTH
  116. SQLITE_MAX_LIKE_PATTERN_LENGTH
  117. SQLITE_MAX_PAGE_COUNT
  118. SQLITE_MAX_PAGE_SIZE
  119. SQLITE_MAX_SQL_LENGTH
  120. SQLITE_MAX_TRIGGER_DEPTH
  121. SQLITE_MAX_VARIABLE_NUMBER
  122. Operation and Behavior
  123. SQLITE_CASE_SENSITIVE_LIKE
  124. SQLITE_HAVE_ISNAN
  125. SQLITE_OS_OTHER
  126. SQLITE_SECURE_DELETE
  127. SQLITE_THREADSAFE
  128. SQLITE_TEMP_STORE
  129. Debug Settings
  130. SQLITE_DEBUG
  131. SQLITE_MEMDEBUG
  132. Enable Extensions
  133. SQLITE_ENABLE_ATOMIC_WRITE
  134. SQLITE_ENABLE_COLUMN_METADATA
  135. SQLITE_ENABLE_FTS3
  136. SQLITE_ENABLE_FTS3_PARENTHESIS
  137. SQLITE_ENABLE_ICU
  138. SQLITE_ENABLE_IOTRACE
  139. SQLITE_ENABLE_LOCKING_STYLE
  140. SQLITE_ENABLE_MEMORY_MANAGEMENT
  141. SQLITE_ENABLE_MEMSYS3
  142. SQLITE_ENABLE_MEMSYS5
  143. SQLITE_ENABLE_RTREE
  144. SQLITE_ENABLE_STAT2
  145. SQLITE_ENABLE_UPDATE_DELETE_LIMIT
  146. SQLITE_ENABLE_UNLOCK_NOTIFY
  147. YYTRACKMAXSTACKDEPTH
  148. Limit Features
  149. SQLITE_DISABLE_LFS
  150. SQLITE_DISABLE_DIRSYNC
  151. SQLITE_ZERO_MALLOC
  152. Omit Core Features
  153. B. sqlite3 Command Reference
  154. Command-Line Options
  155. Interactive Dot-Commands
  156. .backup
  157. .bail
  158. .databases
  159. .dump
  160. .echo
  161. .exit
  162. .explain
  163. .headers
  164. .help
  165. .import
  166. .indices
  167. .iotrace
  168. .load
  169. .log
  170. .mode
  171. .nullvalue
  172. .output
  173. .prompt
  174. .quit
  175. .read
  176. .restore
  177. .schema
  178. .separator
  179. .show
  180. .tables
  181. .timeout
  182. .timer
  183. .width
  184. C. SQLite SQL Command Reference
  185. SQLite SQL Commands
  186. ALTER TABLE
  187. ANALYZE
  188. ATTACH DATABASE
  189. BEGIN TRANSACTION
  190. COMMIT TRANSACTION
  191. CREATE INDEX
  192. CREATE TABLE
  193. CREATE TRIGGER
  194. CREATE VIEW
  195. CREATE VIRTUAL TABLE
  196. DELETE
  197. DETACH DATABASE
  198. DROP INDEX
  199. DROP TABLE
  200. DROP TRIGGER
  201. DROP VIEW
  202. END TRANSACTION
  203. EXPLAIN
  204. INSERT
  205. PRAGMA
  206. REINDEX
  207. RELEASE SAVEPOINT
  208. REPLACE
  209. ROLLBACK TRANSACTION
  210. SAVEPOINT
  211. SELECT
  212. UPDATE
  213. VACUUM
  214. D. SQLite SQL Expression Reference
  215. Literal Expressions
  216. Logic Representations
  217. Unary Expressions
  218. Binary Expressions
  219. Function Calls
  220. Column Names
  221. General Expressions
  222. AND
  223. BETWEEN
  224. CASE
  225. CAST
  226. COLLATE
  227. EXISTS
  228. GLOB
  229. IN
  230. IS
  231. ISNULL
  232. LIKE
  233. MATCH
  234. NOTNULL
  235. OR
  236. RAISE
  237. REGEXP
  238. SELECT
  239. E. SQLite SQL Function Reference
  240. Scalar Functions
  241. abs()
  242. changes()
  243. coalesce()
  244. date()
  245. datetime()
  246. glob()
  247. ifnull()
  248. hex()
  249. julianday()
  250. last_insert_rowid()
  251. length()
  252. like()
  253. load_extension()
  254. lower()
  255. ltrim()
  256. match()
  257. max()
  258. min()
  259. nullif()
  260. quote()
  261. random()
  262. randomblob()
  263. regex()
  264. replace()
  265. round()
  266. rtrim()
  267. sqlite_compileoption_get()
  268. sqlite_compileoption_used()
  269. sqlite_source_id()
  270. sqlite_version()
  271. strftime()
  272. substr()
  273. time()
  274. total_changes()
  275. trim()
  276. typeof()
  277. upper()
  278. zeroblob()
  279. Aggregate Functions
  280. avg()
  281. count()
  282. group_concat()
  283. max()
  284. min()
  285. sum()
  286. total()
  287. F. SQLite SQL PRAGMA Reference
  288. SQLite PRAGMAs
  289. auto_vacuum
  290. cache_size
  291. case_sensitive_like
  292. collation_list
  293. count_changes
  294. database_list
  295. default_cache_size
  296. encoding
  297. foreign_keys
  298. foreign_key_list
  299. freelist_count
  300. full_column_names
  301. fullfsync
  302. ignore_check_constraints
  303. incremental_vacuum
  304. index_info
  305. index_list
  306. integrity_check
  307. journal_mode
  308. journal_size_limit
  309. legacy_file_format
  310. locking_mode
  311. lock_proxy_file
  312. lock_status
  313. max_page_count
  314. omit_readlock
  315. page_count
  316. page_size
  317. parser_trace
  318. quick_check
  319. read_uncommitted
  320. recursive_triggers
  321. reverse_unordered_selects
  322. schema_version
  323. secure_delete
  324. short_column_names
  325. sql_trace
  326. synchronous
  327. table_info
  328. temp_store
  329. temp_store_directory
  330. user_version
  331. vdbe_trace
  332. vdbe_listing
  333. writable_schema
  334. G. SQLite C API Reference
  335. API Datatypes
  336. sqlite3
  337. sqlite3_backup
  338. sqlite3_blob
  339. sqlite3_context
  340. sqlite3_int64, sqlite3_uint64, sqlite_int64, sqlite_uint64
  341. sqlite3_module
  342. sqlite3_mutex
  343. sqlite3_stmt
  344. sqlite3_value
  345. sqlite3_vfs
  346. API Functions
  347. sqlite3_aggregate_context()
  348. sqlite3_auto_extension()
  349. sqlite3_backup_finish()
  350. sqlite3_backup_init()
  351. sqlite3_backup_pagecount()
  352. sqlite3_backup_remaining()
  353. sqlite3_backup_step()
  354. sqlite3_bind_xxx()
  355. sqlite3_bind_parameter_count()
  356. sqlite3_bind_parameter_index()
  357. sqlite3_bind_parameter_name()
  358. sqlite3_blob_bytes()
  359. sqlite3_blob_close()
  360. sqlite3_blob_open()
  361. sqlite3_blob_read()
  362. sqlite3_blob_write()
  363. sqlite3_busy_handler()
  364. sqlite3_busy_timeout()
  365. sqlite3_changes()
  366. sqlite3_clear_bindings()
  367. sqlite3_close()
  368. sqlite3_collation_needed()
  369. sqlite3_column_xxx()
  370. sqlite3_column_bytes()
  371. sqlite3_column_count()
  372. sqlite3_column_database_name()
  373. sqlite3_column_decltype()
  374. sqlite3_column_name()
  375. sqlite3_column_origin_name()
  376. sqlite3_column_table_name()
  377. sqlite3_column_type()
  378. sqlite3_commit_hook()
  379. sqlite3_compileoption_get()
  380. sqlite3_compileoption_used()
  381. sqlite3_complete()
  382. sqlite3_config()
  383. sqlite3_context_db_handle()
  384. sqlite3_create_collation()
  385. sqlite3_create_function()
  386. sqlite3_create_module()
  387. sqlite3_data_count()
  388. sqlite3_db_config()
  389. sqlite3_db_handle()
  390. sqlite3_db_mutex()
  391. sqlite3_db_status()
  392. sqlite3_declare_vtab()
  393. sqlite3_enable_load_extension()
  394. sqlite3_enable_shared_cache()
  395. sqlite3_errcode()
  396. sqlite3_errmsg()
  397. sqlite3_exec()
  398. sqlite3_extended_errcode()
  399. sqlite3_extended_result_codes()
  400. sqlite3_file_control()
  401. sqlite3_finalize()
  402. sqlite3_free()
  403. sqlite3_free_table()
  404. sqlite3_get_autocommit()
  405. sqlite3_get_auxdata()
  406. sqlite3_get_table()
  407. sqlite3_initialize()
  408. sqlite3_interrupt()
  409. sqlite3_last_insert_rowid()
  410. sqlite3_libversion()
  411. sqlite3_libversion_number()
  412. sqlite3_limit()
  413. sqlite3_load_extension()
  414. sqlite3_log()
  415. sqlite3_malloc()
  416. sqlite3_memory_highwater()
  417. sqlite3_memory_used()
  418. sqlite3_mprintf()
  419. sqlite3_mutex_alloc()
  420. sqlite3_mutex_enter()
  421. sqlite3_mutex_free()
  422. sqlite3_mutex_held()
  423. sqlite3_mutex_leave()
  424. sqlite3_mutex_notheld()
  425. sqlite3_mutex_try()
  426. sqlite3_next_stmt()
  427. sqlite3_open()
  428. sqlite3_open_v2()
  429. sqlite3_overload_function()
  430. sqlite3_prepare_xxx()
  431. sqlite3_profile()
  432. sqlite3_progress_handler()
  433. sqlite3_randomness()
  434. sqlite3_realloc()
  435. sqlite3_release_memory()
  436. sqlite3_reset()
  437. sqlite3_reset_auto_extension()
  438. sqlite3_result_xxx()
  439. sqlite3_result_error_xxx()
  440. sqlite3_rollback_hook()
  441. sqlite3_set_authorizer()
  442. sqlite3_set_auxdata()
  443. sqlite3_shutdown()
  444. sqlite3_sleep()
  445. sqlite3_snprintf()
  446. sqlite3_soft_heap_limit()
  447. sqlite3_sourceid()
  448. sqlite3_sql()
  449. sqlite3_status()
  450. sqlite3_step()
  451. sqlite3_stmt_status()
  452. sqlite3_strnicmp()
  453. sqlite3_table_column_metadata()
  454. sqlite3_threadsafe()
  455. sqlite3_total_changes()
  456. sqlite3_trace()
  457. sqlite3_unlock_notify()
  458. sqlite3_update_hook()
  459. sqlite3_user_data()
  460. sqlite3_value_xxx()
  461. sqlite3_value_bytes()
  462. sqlite3_value_numeric_type()
  463. sqlite3_value_type()
  464. sqlite3_version[]
  465. sqlite3_vfs_find()
  466. sqlite3_vfs_register()
  467. sqlite3_vfs_unregister()
  468. sqlite3_vmprintf()
  469. Index
  470. About the Author
  471. Colophon
  472. Copyright

Tables and Keys

Tables may look like simple two-dimensional grids of simple values, but a well-defined table has a fair amount of structure. Different columns can play different roles. Some columns act as unique identifiers that define the intent and purpose of each row. Other columns hold supporting data. Still other columns act as external references that link rows in one table to rows in another table. When designing a table, it is important to understand why each column is there, and what role each column is playing.

Keys Define the Table

When designing a table, you usually start by specifying the primary key. The primary key consists of one or more columns that uniquely identify each row of a table. In a sense, the primary key values represent the fundamental identity of each row in the table. The primary key columns identify what the table is all about. All the other columns should provide supporting data that is directly relevant to the primary key.

Sometimes the primary key is an actual unique data value, such as a room number or a hostname. Very often the primary key is simply an arbitrary identification number, such as an employee or student ID number. The important point is that primary keys must be unique over every possible entry in the table, not just the rows that happen to be in there right now. This is why names are normally not used as primary keys—in a large group of people, it is too easy to end up with duplicate (or very similar) names.

While there is nothing particularly special about the columns that make up a primary key, the keys themselves play a very important role in the design and use of a database. Their role as a unique identifier for each row makes them analogous to the key of a hash table, or the key to a dictionary class of data structure. They are essentially “lookup” values for the rows of a table.

A primary key can be identified in the CREATE TABLE command. Explicitly identifying the primary key will cause the database system to automatically create a UNIQUE index across all of the primary key columns. Declaring the primary key also allows for some syntax shortcuts when establishing relationships between tables.

For example, this table definition defines the employee_id field to be a primary key:

CREATE TABLE employee (
    employee_id   INTEGER   PRIMARY KEY   NOT NULL,
    name          TEXT   NOT NULL
    /* ...etc... */
);

For more information on the syntax used to define a primary key, see the section Primary keys.

In schema documentation, primary keys are often indicated with the abbreviation “PK.” It is also common to double underline primary keys when drawing out tables, as shown in Figure 6-1.

Primary keys are sometimes identified with the abbreviation PK. It is also common to use a double underline when diagramming the table.
Figure 6-1. Primary keys are sometimes identified with the abbreviation PK. It is also common to use a double underline when diagramming the table.

Many database queries use a table’s primary key as either the input or output. The database might be asked to return the row with a given key, such as, “return the record for employee #953.” It is also common to ask for collections of keys, such as, “gather the ID values for all employees hired more than two years ago.” This set of keys might then be joined to another table as part of a report.

Foreign Keys

In addition to identifying each row in a table, primary keys are also central to joining tables together. Since the primary key acts as a unique row identifier, you can create a reference to a specific row by recording the row’s primary key. This is known as a foreign key. A foreign key is a copy or recording of a primary key, and is used as a reference or pointer to a different (or “foreign”) row, most often in a different table.

Like the primary key, columns that make up a foreign key can be identified within the CREATE TABLE command. In this example, we define the format of a task assignment. Each task gets assigned to a specific employee by referencing the employee_id field from the employee table:

CREATE TABLE task_assignment (
    task_assign_id   INTEGER   PRIMARY KEY,
    task_name        TEXT      NOT NULL,
    employee_id      INTEGER   NOT NULL   REFERENCES employee( employee_id )
    /* ...etc... */
);

The REFERENCES constraint indicates that this column is a foreign key. The constraint indicates which table is referenced and, optionally, which column. If no column is indicated, the foreign key will reference the primary key (meaning the column reference used in the prior example is not required, since employee_id is the primary key of the employee table). The vast majority of foreign keys will reference a primary key, but if a column other than the primary key is used, that column must have a UNIQUE constraint, or it must have a single-column UNIQUE index.

A foreign key can also be defined as a table constraint. In that case, there may be multiple local columns that refer to multiple columns in the referenced table. The referenced columns must be a multicolumn primary key, or they must otherwise have a multicolumn UNIQUE index. A foreign key definition can include several other optional parts. For the full syntax, see CREATE TABLE in Appendix C.

Unlike a table’s own primary key, foreign keys are not required to be unique. This is because multiple foreign key values (multiple rows) in one table may refer to the same row in another table. This is known as a one-to-many relationship. Please see the section One-to-Many Relationships. Foreign keys are often marked with the abbreviation “FK,” as shown in Figure 6-2.

Foreign keys are copies of the primary key from another row. Foreign keys act as references or pointers to other rows. They are often identified with the abbreviation FK.
Figure 6-2. Foreign keys are copies of the primary key from another row. Foreign keys act as references or pointers to other rows. They are often identified with the abbreviation FK.

Foreign Key Constraints

Declaring foreign keys in the table definition allows the database to enforce foreign key constraints. Foreign key constraints are used to keep foreign key references in sync. Among other things, foreign key constraints can prevent “dangling references” by requiring that all foreign key values correctly match a row value from the columns of the referenced table. Foreign keys can also be set to NULL. A NULL clearly marks the foreign key as unassigned, which is a bit different than having an invalid value. In many cases, unassigned foreign keys don’t fit the design of the database. In that case, the foreign key columns should be declared with a NOT NULL constraint.

Using our previous example, foreign key constraints would require that every task_assignment.employee_id element needs to contain the value of a valid employee.employee_id. By default, a NULL foreign key would also be allowed, but we’ve defined the task_assignment.employee_id column with a NOT NULL constraint. This demands that every task reference a valid employee.

Native foreign key support was added in SQLite 3.6.19, but is turned off by default. You must use the PRAGMA foreign_keys command to turn on foreign key constraints. This was done to avoid problems with existing applications and database files. A future version of SQLite may have foreign key constraints enabled by default. If your application is dependent on this setting, it should explicitly turn it on or off.

Modifications to either the foreign key table or the referenced table can potentially cause violations of the foreign key constraint. For example, if a statement attempted to update a task_assignment.employee_id value to an invalid employee_id, the foreign key constraint would be violated. Similarly, if an employee row was assigned a new employee_id value, any existing task_assignment references that point to the old value would become invalid. This would also violate the foreign key constraint.

If the database detects a change that would cause a foreign key violation, there are several actions that can be taken. The default action is to prohibit the change. For example, if you attempted to delete an employee that still had tasks assigned to them, the delete would fail. You would need to delete the tasks or transfer them to a different employee before you could delete the original employee.

Other conflict resolutions are also available. For example, using an ON DELETE CASCADE foreign key definition, deleting an employee would cause the database to automatically delete any tasks assigned to that employee. For more information on conflict resolutions and other advanced foreign key options, please see the SQLite website. Up-to-date documentation on SQLite’s support for foreign keys can be found at http://www.sqlite.org/foreignkeys.html.

Correctly defining foreign keys is one of the most critical aspects of data integrity and security. Once defined, foreign key constraints make sure that data relationships remain valid and consistent.

Generic ID Keys

If you look at most database designs, you’ll notice that a large number of the tables have a generic ID column that is used as the primary key. The ID is typically an arbitrary integer value that is automatically assigned as new rows are inserted. The number may be incrementally assigned, or it might be assigned from a sequence mechanism that is guaranteed to never assign the same ID number twice.

When an SQLite column is defined as an INTEGER PRIMARY KEY, that column will replace the hidden ROWID column that acts as the root column of every table. Using an INTEGER PRIMARY KEY allows for some significant performance enhancements. It also allows SQLite to automatically assign sequenced ID values. For more details, see Primary keys.

At first, a generic ID field might seem like a design cheat. If each table should have a specific and well-defined role, then the primary key should reflect what makes any given row unique—reflecting, in part, the essential definition of the items in a table. Using a generic and arbitrary ID to define that uniqueness seems to be missing the point.

From a theoretical standpoint, that may be correct, but this is one of those areas where theory bumps into reality, and reality usually wins.

Practically speaking, many datasets don’t have a truly unique representation. For example, the names of people are not sufficiently unique to be used as database keys. Names are reasonably unique, and they do a fair job at identifying individuals in person, but they lack the inherent and complete uniqueness that good database design demands. Names also change from time to time, such as when people get married.

The more you dig around, the more you’ll find that the world is full of data like this. Data that is sufficiently unique and stable for casual use, but not truly, absolutely unique or fixed enough to use as a smart database key. In these situations, the best solution is to simply use an arbitrary ID that the database designer has total control over, even if it is meaningless outside of the database. This type of key is known as a surrogate key.

There are also situations when the primary key consists of three or four (or more!) columns. This is somewhat rare, but there are situations when it does come up. If you’ve got a large multicolumn primary key in the middle of a complex set of relationships, it can be a big pain to create and match all those multicolumn foreign keys. To simplify such situations, it is often easier to simply create an arbitrary ID and use that as the primary key.

Using an arbitrary ID is also useful if the customary primary key is physically large. Because each foreign key is a full copy of the primary key, it is unwise to use a lengthy text value or BLOB as the primary key. Rather, it would be better to use an arbitrary identifier and simply reference to the identifier.

One final comment on key names. There is often a temptation to name a generic ID field something simple, such as id. After all, if you’ve got an employee table, it might seem somewhat redundant to name the primary key employee_id; you end up with a lot of column references that read employee.employee_id, when it seems that employee.id is clear enough.

Well, by itself, it is clear enough, but primary keys tend to show up in other tables as foreign keys. While employee.employee_id might be slightly redundant, the name task_assignment.employee_id is not. That name also gives you significant clues about the column’s function (a foreign key) and what table and column it references (the employee_id column, which is the PK column of the employee table). Using the same name for primary keys and foreign keys makes the inherent meaning and linkage a lot more obvious. It also allows shortcuts, such as the NATURAL JOIN or JOIN...USING( ) syntax. Both of these forms require that matching columns have the exact same name.

Using a more explicit name also avoids the problem of having multiple tables, each with a column named id. Such a common name can make things somewhat confusing if you join together three or four tables. While I wouldn’t necessarily prefix every column name, keeping primary key names unique within a database (and using the exact same name for foreign keys) can make the intent of the database design a lot more clear.

Keep It Specific

The biggest stumbling block for beginning database developers is that they don’t create enough tables. Less experienced developers tend to view tables as large and monumental structures. There tends to be a feeling that tables are important, and that each table needs a fair number of columns containing significant amounts of data to justify its existence. If a design change calls for a new column or set of data points, there tends to be a strong desire to lump everything together into large centralized tables rather than breaking things apart into logical groups.

The “tables must be big” mentality will quickly lead to poor designs. While you will have a few large, significant tables at the core of most designs, a typical design will also have a fair number of smaller auxiliary tables that may only hold two or three columns of data. In fact, in some cases you may find yourself building tables that consist of nothing but external references to other tables. Creating a new table is the only means you have to define a new data structure or data container, so any time you find yourself needing a new container, that’s going to indicate a new table.

Every table should have a well-defined and clear role, but that doesn’t always mean it will be big or stuffed with data.