Why Is A Query Slow?

by Szymon LipiƄski

A couple of days ago a colleague of mine had quite a normal problem. A query was slow. The whole application was written in Java, using Hibernate, all the tables were really small.

There were about 30 rows in one table. 10 rows in another. One small join. One call of a PostGIS function.

The query took several seconds to return just about 20 rows. Quite bad.

My first try was of course checking how much time the query runs. The problematic time of a couple of seconds was the time the application needed for getting the data. This was not the time needed by database. Sometimes they differ a lot, so that was my first idea.

We changed the log_min_duration_statement to 0 in the postgresql.conf to log all the queries, even the shortest ones. Then we run the application and had a bunch of other queries in the log. We found the proper query, rerun that with EXPLAIN ANALYZE… and the query finished after 50 ms.

It just confirmed my assumptions. The query was fast.

What was slow then? The query was fast, downloading the selected data was slow.

It turned out that the 30 geometries in the table were quite huge, something like a couple of megabytes for each. The problem was not with the query, but with the speed of downloading the data.

Too often programmers think that adding an index to a database makes that database faster (in fact it makes it slower but I will write about that another time).

The index should be added when needed, after careful checking what the real problem is.