SQLite3
Since last week, SQLite3 is no longer just a library I’ve heard of, but an integral part of my current research work, where I decided to store the data between processing steps in SQLite3. Read on for a quick look at why I didn’t regret this choice yet ;)
First, if you have not heard about SQLite3 previously, check out their feature list: Basically, it is a tiny SQL database, written in C, which can be embedded into other applications. The database is self-contained in a single file, and is portable between different operating systems. Moreover, there are a wide range of bindings for SQLite, for example, Python 3.0 comes with it by default.
Getting started with it is easy, simply follow the C/C++ API introduction to SQLite3. What’s nice is that they provide a single source file which you include into your build, to get the whole database engine, so including is really straightforward and does not require binding against a library.
The library itself is very well designed, and robust: Every possible error is caught, and as long as you check the return codes of the library, you shouldn’t have any problems to use it (basically they do all the stuff I described once in my API design tips, and more ;) ). Setting up a database takes a line. Inserting data and retrieving is very straightforward, and can be done with minimal overhead (2 lines to set up your query, one for iteration, one per element retrieved, and one to terminate it – that’s it).
In my application I’ve set up a pipeline, where various C++ and Python tools process data in turn, and store their intermediate results in a SQL database. This has turned out to be a very good approach for data gathering, because you can rapidly check the data without writing visualisation tools, and writing those tools is very fast as you can use expressive SQL queries for your data.
A side note: You should wrap long inserts into a transaction, otherwise you’ll flush to disk after each row. The simplest way is to execute a BEGIN TRANSACTION
using sqlite3_exec
before your insert statements (for the insert, I tend to use a prepared statement), and issue an END TRANSACTION
after finalizing the prepared statement. The difference between prepared statement and a direct execution is not big, but the difference between a single large transaction and several thousand smaller ones is really dramatic.
All in all, I’m very pleased so far with SQLite, and if you have structured data in your application, it is really worth a look. Especially as it is very fast, easy to integrate and well documented.
Update: Added a tip on the usage.