How to share single SQLite connection in multi-threaded Python application

I am trying to write a multi-threaded Python application in which a single SQLite connection is shared among threads, but I am unable to get this to work. The real application is a CherryPy web server, but the following simple code demonstrates my problem.

What change or changes do I need to make to run the sample code below successfully?

When I run this program with THREAD_COUNT set to 1 it works fine, and my database is updated as I expect (that is, the letter "X" is appended to the text value in the SectorGroup column).

When I run it with THREAD_COUNT set to anything higher than 1, all threads but one terminate prematurely with SQLite-related exceptions. Different threads throw different exceptions, with no discernible pattern, including:

OperationalError: cannot start a transaction within a transaction 

(occurs on the UPDATE statement)

OperationalError: cannot commit - no transaction is active 

(occurs on the .commit() call)

InterfaceError: Error binding parameter 0 - probably unsupported type. 

(occurs on the UPDATE and the SELECT statements)

IndexError: tuple index out of range

(this one has me completely puzzled, it occurs on the statement group = rows[0][0] or '', but only when multiple threads are running)

Here is the code:

import sqlite3
from threading import Thread, current_thread

CONNECTION = sqlite3.connect('./database/mydb', detect_types=sqlite3.PARSE_DECLTYPES, check_same_thread=False)
CONNECTION.row_factory = sqlite3.Row

def commands(start_id):

    # loop over 100 records, read the SectorGroup column, and write it back with "X" appended.
    for inv_id in range(start_id, start_id + 100):

        rows = CONNECTION.execute('SELECT SectorGroup FROM Investment WHERE InvestmentID = ?;', [inv_id]).fetchall()
        if rows:
            group = rows[0][0] or ''
            msg = '{} inv {} = {}'.format(current_thread().name, inv_id, group)
            print msg
            CONNECTION.execute('UPDATE Investment SET SectorGroup = ? WHERE InvestmentID = ?;', [group + 'X', inv_id])

        CONNECTION.commit()

if __name__ == '__main__':

    THREAD_COUNT = 10

    for i in range(THREAD_COUNT):
        t = Thread(target=commands, args=(i*100,))
        t.start()

It's not safe to share a connection between threads; at the very least you need to use a lock to serialize access. Do also read http://docs.python.org/2/library/sqlite3.html#multithreading as older SQLite versions have more issues still.

The check_same_thread option appears deliberately under-documented in that respect, see http://bugs.python.org/issue16509.

You could use a connection per thread instead, or look to SQLAlchemy for a connection pool (and a very efficient unit-of-work and queuing system to boot).

I ran into the SQLite threading problem when writing a simple WSGI server for fun and learning. WSGI is multi-threaded by nature when running under Apache. The following code seems to work for me:

import sqlite3
import threading

class LockableCursor:
    def __init__(self, cursor):
        self.cursor = cursor
        self.lock = threading.Lock()

    def execute(self, arg0, arg1=None):
        # Called either as execute(sql) or as execute('all'|'one', sql);
        # the lock serializes every use of the shared cursor.
        with self.lock:
            self.cursor.execute(arg1 if arg1 else arg0)
            if arg1:
                if arg0 == 'all':
                    return self.cursor.fetchall()
                elif arg0 == 'one':
                    return self.cursor.fetchone()

def dictFactory(cursor, row):
    # Map each row to a dict keyed by column name.
    aDict = {}
    for iField, field in enumerate(cursor.description):
        aDict[field[0]] = row[iField]
    return aDict

class Db:
    def __init__(self, app):
        self.app = app

    def connect(self):
        # Will create the db file if it doesn't exist yet.
        self.connection = sqlite3.connect(self.app.dbFileName, check_same_thread=False, isolation_level=None)
        self.connection.row_factory = dictFactory
        self.cs = LockableCursor(self.connection.cursor())

Example of use:

if not ok and self.user:    # Not logged out
    # Get role data for any later use
    userIdsRoleIds = self.cs.execute('all', 'SELECT role_id FROM users_roles WHERE user_id == {}'.format(self.user['id']))

    for userIdRoleId in userIdsRoleIds:
        self.userRoles.append(self.cs.execute('one', 'SELECT name FROM roles WHERE id == {}'.format(userIdRoleId['role_id'])))

Another example:

self.cs.execute('CREATE TABLE users (id INTEGER PRIMARY KEY, email_address, password, token)')
self.cs.execute('INSERT INTO users (email_address, password) VALUES ("{}", "{}")'.format(self.app.defaultUserEmailAddress, self.app.defaultUserPassword))

# Create roles table and insert default role
self.cs.execute('CREATE TABLE roles (id INTEGER PRIMARY KEY, name)')
self.cs.execute('INSERT INTO roles (name) VALUES ("{}")'.format(self.app.defaultRoleName))

# Create users_roles table and assign default role to default user
self.cs.execute('CREATE TABLE users_roles (id INTEGER PRIMARY KEY, user_id, role_id)')

defaultUserId = self.cs.execute('one', 'SELECT id FROM users WHERE email_address = "{}"'.format(self.app.defaultUserEmailAddress))['id']
defaultRoleId = self.cs.execute('one', 'SELECT id FROM roles WHERE name = "{}"'.format(self.app.defaultRoleName))['id']

self.cs.execute('INSERT INTO users_roles (user_id, role_id) VALUES ({}, {})'.format(defaultUserId, defaultRoleId))

Complete program using this construction downloadable at: http://www.josmith.org/

N.B. The code above is experimental; there may be (fundamental) issues when using it with (many) concurrent requests, e.g. as part of a WSGI server. Performance is not critical for my application. The simplest thing would probably have been to just use MySQL, but I like to experiment a little, and the zero-installation aspect of SQLite appealed to me. If anyone thinks the code above is fundamentally flawed, please react, as my purpose is to learn. If not, I hope this is useful for others.

I'm guessing here, but it looks like the reason you are doing this is a performance concern.

Python threads aren't performant in any meaningful way for this use case. Instead, use SQLite transactions, which are super fast.

If you do all your updates in a transaction, you'll find an order of magnitude speedup.
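To illustrate, the question's loop can do all of its updates inside one transaction and commit once at the end, instead of committing per row. A sketch using the question's schema (Investment, SectorGroup, InvestmentID):

```python
import sqlite3

def append_x(conn, start_id, count=100):
    # "with conn" opens a transaction and commits once on success
    # (or rolls back on error), rather than committing per statement.
    with conn:
        for inv_id in range(start_id, start_id + count):
            row = conn.execute(
                'SELECT SectorGroup FROM Investment WHERE InvestmentID = ?',
                (inv_id,)).fetchone()
            if row is not None:
                group = row[0] or ''
                conn.execute(
                    'UPDATE Investment SET SectorGroup = ? WHERE InvestmentID = ?',
                    (group + 'X', inv_id))
```

Batching like this avoids the per-commit fsync cost, which is where most of the time goes in the original loop.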

Comments
  • Why do they need to share a connection? This seems like a bad idea.
  • Did you read docs.python.org/2/library/sqlite3.html#multithreading? Just create a connection per thread instead.
  • Basically, you need to do your own locking if you share the connection between threads; see bugs.python.org/issue16509 for a hint in that direction. You'd be better off using SQLAlchemy and letting it handle pooling (it adds an efficient unit-of-work and queuing system as well).
  • @MartijnPieters - No, I completely missed that little note at the bottom of the SQLite page. I mistakenly assumed that the existence of check_same_thread in the sqlite3 python module along with the default compilation option of serialized in SQLite made this possible. Please convert your comment to an answer so that I can mark it correct for the next person who comes along.
  • Thanks, I've converted to a connection per thread. That caused the testing code for my application (the real one, not the sample code) to terminate threads early with "database is locked" errors even though I have no lengthy write operations. However, extending the database timeout setting seems to have fixed that without any visible harm to performance. Thanks!
  • Extending the database timeout was not the solution. I still had a few cross-thread uses of connection objects. Once I got rid of those and was very careful to always call .close() on every connection once I was done with it (including in threads that only read from the database), I was able to set the timeout back to the default and reliably run the application under heavy load.