Selecting and locking records for update with PostgreSQL

In PostgreSQL you can lock records while selecting them using FOR UPDATE. When using a LEFT JOIN you can run into an error if your selection includes NULL records for the joined table, to prevent this you should specify for which table(s) the records should be locked like so:

SELECT * FROM notes LEFT JOIN authors ON notes.author_id = authors.id FOR UPDATE OF notes;

If you also want to lock the author records in the example above you would have to use a regular join and lock the records without an author in a second query.

In some cases you may only want to select records that have not been locked yet (in cases where any match to the selection, for example in a queue like table), you can use the SKIP LOCKED modifier to do this like so:

SELECT * FROM notes LEFT JOIN authors ON notes.author_id = authors.id FOR UPDATE OF notes SKIP LOCKED;

You cannot use FOR UPDATE with aggregate functions such as COUNT, if you are using an ORM of some sorts you may have to retrieve your records and count them afterwards in code.

For more information take a look at the official PostgreSQL documentation for this feature.

Using SQL as an every day tool

Every now and then I find myself in the situation where I need to analyze data coming varying sources, be it in the form of files (spreadsheets, CSV, etc.) or data spread across different servers and/or database platforms. The way that I often see people handle this is by loading the data into a spreadsheet (using Excel, LibreOffice or an alternative) and then start filtering and fidgeting away with spreadsheet formulas and functions, creating temporary columns and/or copying and pasting things around until they get their desired result. When you only have to examine a single table of data this is not much of an issue, it may in fact be the best option to start with: its low-barrier and using column filters is pretty intuitive in most spreadsheet software that I’ve used.

When your searching and filtering needs become more complicated and/or involve multiple tables of data that you want to combine in one way or another things tend to get a bit hairy (at least for me) and I personally prefer using SQL instead. In this article I will give an in-depth explanation on how you can do that using a contrived example, if you are experienced with SQL, command line tools and the shell you can probably skim large parts of this article, if most of this is new to you it hopefully contains enough detail for you to follow along.

Continue reading “Using SQL as an every day tool”