TL;DR
When Opening the DB
PRAGMA journal_mode = wal; -- different implementation of the atomicity properties
PRAGMA synchronous = normal; -- synchronise less often to the filesystem
PRAGMA foreign_keys = on; -- check foreign key reference, slightly worst performance
And check user_version
to apply any migrations, for instance with this Rust library.
When Closing the DB
PRAGMA analysis_limit=400; -- make sure pragma optimize does not take too long
PRAGMA optimize; -- gather statistics to improve query optimization
Introduction
SQL pragma are statements (like SELECT …
or CREATE TABLE …
) that change the database behaviors or call a special functions. This post is a short list of SQLite pragma I use in my projects built on SQLite, to get better performance and more consistency.
Performance
File system Interactions
The following pragma statements set the way SQLite deals with the file system.
journal_mode wal
By default, when applying changes, SQLite uses a rollback journal, which is basically a copy of the data before the changes. Changing the journal mode to “Write-Ahead Log” is known to bring significantly better performance in most cases. It also allows concurrent readers with one writer. To activate it, run:
PRAGMA journal_mode = wal;
Some less common use cases are incompatible with this journal mode, for instance having a database on a network file system. The full list of drawbacks is listed in the documentation1.
synchronous normal
To ensure integrity of the database, one of the mechanism SQLite uses is the file system synchronization operations. However, these synchronizations are quite costly. With WAL journal mode enabled, your database will still be consistent while synchronizing less often:
PRAGMA synchronous = normal;
Compared to the default synchronous = full
, committed transactions could be rolled back if there is a power loss (although not if the application crashes).
Optimize
To execute statement as efficiently as possible, SQLite has a query planner, which tries to read the tables to provide good performance (for instance by evaluating the WHERE
clauses that select the fewest rows first).
This query planner sometimes needs to know whether a column has many values or only a few, repeated values (like a boolean column would). To know this, it can’t read the whole table, as that may prove as costly as running the part of the query that is being optimized, so it uses some statistics collected in internal tables.
Using optimize right before closing the database connexion collects the statistics for some table columns. These columns are chosen mainly based on the queries executed during the connexion: if a query had benefited from more accurate statistics, the corresponding columns are analyzed.
For instance:
PRAGMA analysis_limit=400;
PRAGMA optimize;
In the above example, pragma analysis_limit ensures that optimize
won’t run for too long, by limiting the number of rows read.
Allow Using More Memory
These two pragma could result in better performance, depending on your hardware and software configuration.
- Keep temporary storage in memory:
PRAGMA temp_store = 2
. However, setting this pragma does not guarantee that the temporary storage will be held in memory. Please refer to the documentation for other parameters that may change the final outcome. - Keep more of the database pages in memory: for instance, use 32 MiB of memory for this purpose with
PRAGMA cache_size = -32000
. Note that the OS cache is already keeping parts of the database file in RAM, so this might end up wasting memory.
Consistency
user_version
The database’s schema can evolve over time. Your application could start with a car
table to represent a car but later on, you realize you want to represent bicycles, so you add a bicycle
table.
To keep track of the different versions of the schema, some libraries maintain an internal table with a single row with a version number. It then performs migrations as needed. In our example, version 1 would have only the car
table and version 2, also the bicycle
table. The migration from version 1 to version 2 add the bicycle
table.
SQLite offers the user_version pragma, to keep track of these versions. It is an integer at a fixed offset in the database file. It is simpler and more efficient than maintaining a table with versions, in particular because the table has to be found in the database file while the integer is available right away.
The drawback is that this is not portable between database engines. If you only use SQLite though, it is almost always the best option. To use it, you can write some code to check the value of the user_version
pragma value right after opening the database. If it is lower than expected, then atomically 1) run the necessary migrations and 2) increment the user_version
pragma value. I wrote a library to ease this task in rust and here is an example in python.
foreign_keys on
This may come as a surprise for folks with experience with other database system, but SQLite does not enforce foreign key constraints by default. Consequently, a foreign key can point to a row that does not exist.
This can be fixed with pragma foreign_keys:
PRAGMA foreign_keys = ON;
This comes at a performance cost however, because more checks are performed when inserting values with foreign keys. The cost is usually negligible but your mileage may vary.
Note: pragma foreign_key_check can be used to check a particular table for violated foreign key constraint. This can be useful before enabling foreign_keys
on a database with existing data.
STRICT
👾 Important
Requires SQLite version 3.37.0 (2021-11-27)
Another peculiarity of SQLite is dynamic typing through type affinity. When you define a column of type INTEGER
in most other database engines, a value of type TEXT
inserted in that column will be converted or return an error. But with SQLite, type affinities mean that if the value is not of the expected type, it will be converted if possible or stored with a different type if necessary. That’s handy when prototyping, but you may want a stricter behavior for consistency with other major SQL databases or when working with strictly typed languages.
To this end, SQLite supports the STRICT keyword at table creation. For instance, if a table t
is created like so:
CREATE TABLE t(a INTEGER) STRICT;
then
INSERT INTO t VALUES(1);
INSERT INTO t VALUES('1');
are successful but
INSERT INTO t VALUES('f');
is not and returns Runtime error: cannot store TEXT value in INTEGER column t.a (19)
.
Note: Without the STRICT
keyword at table creation, this last INSERT INTO …
statement would have been successful and would have just returned a string.
Go deeper
The full list of supported pragma with detailed descriptions is available in the documentation. Some more options can be tweaked for the specific scenario of running a SQLite for a server.
A significant improvement to WAL mode is being developped, where the WAL file won’t grow to unbound sizes. ↩︎
Liked this post? Subscribe:
Discussions
This blog does not host comments, but you can reply via email or participate in one of the discussions below: