Alex's Slip-box

These are my org-mode notes in sort of Zettelkasten style

Local timeouts and ActiveRecord transactions

There may be cases where the default statement_timeout value for a database connection might not be high enough for certain long-running queries. One option is to just increase the value for the connection. Another option is the increase the timeout locally, for a specific query.

# Connection timeout

For an ActiveRecord Postgres database configuration, a statement_timeout can be set in the database.yml. In this example, all queries to this database will timeout after 1 second (can use other units, ms is the default).

production_replica:
  replica: true
  variables:
    statement_timeout: 1s

It’s a good idea to set a value for this since the default is 0 which disables timeouts and could grind down an application with unexpectedly long running queries.

# Local timeout

A local time can be used to temporarily override the statement_timeout defined on the connection. A local timeout must be declared within a transaction, otherwise it will be ignored and the connection statement_timeout will be used. The timeout of 2s below will apply only for the transaction and will take precedence over the timeout for the connection (1s) See also https://www.postgresql.org/docs/16/sql-set.html

BEGIN;
SET LOCAL statement_timeout = '2s';
SELECT pg_sleep(1.8);
COMMIT;

Without the local timeout, the pg_sleep(1.8) will raise a timeout error given a statement_timeout for the connection is 1s.

ERROR:  canceling statement due to statement timeout

# How can this be used with ActiveRecord’s query interface?

We know we need a transaction, within which we need to execute a SET statement. Putting that together, we might do something like this

ActiveRecord::Base.transaction do
  ActiveRecord::Base.connection.execute("SET LOCAL statement_timeout = '2s';")
  Message.all
end

Strangely, we’ll see this in the logs:

TRANSACTION (0.7ms)  BEGIN
(3.4ms)  SET LOCAL statement_timeout = '2s';
TRANSACTION (0.5ms)  COMMIT
Message Load (0.4ms)  SELECT "messages".* FROM "messages" /* loading for pp */ LIMIT $1  [["LIMIT", 11]]

The transaction appears to COMMIT before the select statement making the local timeout useless.

In fact, this can be tested using pg_sleep. Given the statement_timeout of 1s defined for the connection in the database.yml, the following code…

ActiveRecord::Base.transaction do
  ActiveRecord::Base.connection.execute("SET LOCAL statement_timeout = '2s';")
  Message.select('id, pg_sleep(1.8)')
end

…will raise a timeout error:

An error occurred when inspecting the object: #<ActiveRecord::QueryCanceled:"PG::QueryCanceled: ERROR:  canceling statement due to statement timeout\n">

If we change the ruby code to use execute for both statements…

ActiveRecord::Base.transaction do
  ActiveRecord::Base.connection.execute("SET LOCAL statement_timeout = '2s';")
  ActiveRecord::Base.connection.execute(Message.all.to_sql)
end

…then we something that looks more like what one would expect. Both the local timeout and the query are within the transaction.

TRANSACTION (0.4ms)  BEGIN
(2.4ms)  SET LOCAL statement_timeout = '2s';
(1.2ms)  SELECT "messages".* FROM "messages"
TRANSACTION (0.9ms)  COMMIT

And if we re-do the test above with pg_timeout

ActiveRecord::Base.transaction do
  ActiveRecord::Base.connection.execute("SET LOCAL statement_timeout = '2s';")
  ActiveRecord::Base.connection.execute(
    Message.select('id, pg_sleep(1.8)').to_sql
  )
end

It will complete with no timeout error, again appearing to be within the DB transaction.

TRANSACTION (0.3ms)  BEGIN
(2.0ms)  SET LOCAL statement_timeout = '2s';
(1803.2ms)  SELECT id, pg_sleep(1.8) FROM "messages"
TRANSACTION (0.6ms)  COMMIT

I didn’t know why, in the previous examples, the query was being run outside of the transaction, so I asked about it https://github.com/rails/rails/issues/50201

TL;DR This happens because the transaction block is returning an ActiveRecord::Relation object which is lazily loaded. It is loaded after the transaction block completes (eg, when iterating over the collection or when inspect is called on it by virtue of running it in the rails console). Therefore, chaining .load will force the query to be performed within the transaction block.

ActiveRecord::Base.transaction do
  ActiveRecord::Base.connection.execute("SET LOCAL statement_timeout = '2s';")
  Message.all.load
end
TRANSACTION (0.2ms)  BEGIN
(2.9ms)  SET LOCAL statement_timeout = '2s';
Message Load (7.1ms)  SELECT "messages".* FROM "messages"
TRANSACTION (0.3ms)  COMMIT

While that explains the issue, I find this behavior undesirable. Admittedly, this is likely a rare edge case. Most uses of a transaction will involve performing write operations where this problem does not arise (eg, no lazily loaded relations with update, create, etc). Still, perhaps it is possible to change this behavior so that relations are always loaded when within a transaction block.

Search Results