I was talking to a bunch of friends the other day, and one of them started telling a story: at the startup they’re working at, the production Postgres instance was slow, and there had been some deliberation about making it a bit bigger (ordering a bigger, more expensive virtual machine). This went on for a while, until he asked the other guy, “Do you know why it is slow? Did you run EXPLAIN on it?” The other guy was confused; it turned out they didn’t know what EXPLAIN is, and they didn’t have the correct indexes. Once they added the correct index, they got a 600x speedup without changing the shape of the VM.
I pointed out that this was nothing new, as I had exactly the same exchange with our common acquaintance back in the year 2000: that person was complaining that their database was slow and they needed a new server, and I asked whether their queries were using the correct indexes (“what is an index?”).
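For anyone who has never poked at a query plan, here is roughly what that check looks like. This is a minimal sketch using SQLite from Python’s standard library so it runs anywhere without a server; on Postgres you’d run EXPLAIN (or EXPLAIN ANALYZE) on the slow query instead, and the orders table and its columns are made up for illustration.

```python
import sqlite3

# Made-up table standing in for the slow production query's target.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 0.1) for i in range(100_000)],
)

query = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index on customer_id, the plan is a full table scan.
for row in conn.execute("EXPLAIN QUERY PLAN " + query, (42,)):
    print(row)  # e.g. (..., 'SCAN orders') -- exact wording varies by SQLite version

# Add the index the query actually needs.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# Now the plan becomes an index search instead of touching every row.
for row in conn.execute("EXPLAIN QUERY PLAN " + query, (42,)):
    print(row)  # e.g. (..., 'SEARCH orders USING INDEX idx_orders_customer ...')
```

The entire fix is one CREATE INDEX statement; no bigger VM required.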
Another friend on the call joined the party and told a story about an application that fetches the entire contents of the database and then filters out the few rows the app actually needs. The lead developer (who used to be a CTO of a startup) wasn’t able to speed it up no matter what they did! Except for the one thing they didn’t try: writing a correct database query that fetches only the records they need. Again, a concept that has been known for at least the past 20 years, but suddenly unavailable.
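In code, the difference between the two approaches looks roughly like this (again SQLite to keep it self-contained, with a made-up customers table standing in for whatever that app was actually loading):

```python
import sqlite3

# Hypothetical table: 50,000 customers, of which only a small fraction are "active".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, active INTEGER)")
conn.executemany(
    "INSERT INTO customers (active) VALUES (?)",
    [(1 if i % 100 == 0 else 0,) for i in range(50_000)],
)

# The anti-pattern: pull the entire table into the application, then filter there.
everything = conn.execute("SELECT id, active FROM customers").fetchall()
wanted_slow = [row for row in everything if row[1] == 1]

# The fix: let the database do the filtering and return only the rows the app needs.
wanted_fast = conn.execute(
    "SELECT id, active FROM customers WHERE active = 1"
).fetchall()

assert wanted_slow == wanted_fast  # same result, a tiny fraction of the data moved
```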
What do all of these cases have in common, you ask? They all prove the point that our industry is consistently failing to solve well-known problems and keeps cycling around them in weird, self-repeating nonsense loops. It should be basic common sense to use indexes (and, in general, to use data structures and access methods adequate to the task at hand). But no, everyone is self-taught, and because they learned their databases while building their joke of a website with barely 3 visitors per day, they never needed the performance, so they never learned.