When to use inherited tables in PostgreSQL?

Question

There are some major reasons for using table inheritance in PostgreSQL.

Let's say we have some tables for statistics, which are created and filled each month:

statistics
    - statistics_2010_04 (inherits statistics)
    - statistics_2010_05 (inherits statistics)

In this example, each table holds 2,000,000 rows. Each table has a CHECK constraint to make sure that only data for the matching month gets stored in it.
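A minimal sketch of how such a layout could be created. The columns id and value are assumptions made up for illustration; the post itself only names the tables and a date column.

```sql
-- Parent table. Only the "date" column appears in the post;
-- "id" and "value" are hypothetical.
CREATE TABLE statistics (
    id    bigint  NOT NULL,
    date  date    NOT NULL,
    value integer
);

-- One child table per month. The CHECK constraint restricts
-- each child to the rows of its own month.
CREATE TABLE statistics_2010_04 (
    CHECK (date >= '2010-04-01' AND date < '2010-05-01')
) INHERITS (statistics);

CREATE TABLE statistics_2010_05 (
    CHECK (date >= '2010-05-01' AND date < '2010-06-01')
) INHERITS (statistics);

-- Rows go directly into the child that matches their month.
INSERT INTO statistics_2010_04 (id, date, value)
VALUES (1, '2010-04-10', 42);
```

Inserting an April row into statistics_2010_05 would be rejected by that table's CHECK constraint.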

So what makes inheritance a cool feature, and why is it cool to split the data?

Performance: When selecting data with SELECT * FROM statistics WHERE date BETWEEN x AND y, Postgres only scans the tables whose CHECK constraints can match the condition (constraint exclusion). E.g. SELECT * FROM statistics WHERE date BETWEEN '2010-04-01' AND '2010-04-15' only scans the table statistics_2010_04; the other month tables won't get touched. Fast!
Index size: We have no big fat table with a big fat index on the date column. We have small tables per month with small indexes, which means faster reads.
Maintenance: We can run VACUUM FULL, REINDEX and CLUSTER on each month table without locking all the other data.
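The points above can be checked with a few statements. This is a sketch that assumes the statistics tables from the listing earlier in the post already exist with a date column.

```sql
-- Constraint exclusion must be active for the planner to skip children.
-- "partition" has been the default since PostgreSQL 8.4.
SHOW constraint_exclusion;

-- The plan should mention only the parent table (usually empty)
-- and statistics_2010_04, not the other month tables.
EXPLAIN
SELECT * FROM statistics
WHERE date BETWEEN '2010-04-01' AND '2010-04-15';

-- Maintenance on a single month takes locks on that child only.
VACUUM FULL statistics_2010_04;
REINDEX TABLE statistics_2010_04;
```

CLUSTER works the same way per child table, but it needs an index on that table to order the rows by.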

For the correct use of table inheritance as a performance booster, look at the PostgreSQL manual.
You need to set CHECK constraints on each table to tell the database on which key your data gets split (partitioned).

I make heavy use of table inheritance, especially when it comes to storing log data grouped by month. Hint: If you store data that will never change (log data), create your indexes with CREATE INDEX ON () WITH (fillfactor=100); This means no space for updates will be reserved in the index, so the index is smaller on disk.
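Spelled out for one of the month tables, the hint might look like the following sketch. The index name is hypothetical, and the table is assumed to exist with a date column.

```sql
-- Fully packed index for data that is never updated.
-- The index name is made up for illustration.
CREATE INDEX statistics_2010_04_date_idx
    ON statistics_2010_04 (date)
    WITH (fillfactor = 100);
```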

UPDATE:
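For tables, fillfactor already defaults to 100, from http://www.postgresql.org/docs/9.1/static/sql-createtable.html:

The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default.

This default applies to tables only. B-tree indexes default to a fillfactor of 90, so setting fillfactor=100 on an index over never-changing data still makes a difference.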
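To see whether a table or index carries an explicit fillfactor, the reloptions column of the pg_class catalog can be queried. A sketch, assuming the relations are named after the statistics_2010_04 table from the post:

```sql
-- reloptions is NULL when no storage parameter was set explicitly,
-- i.e. the built-in default is in effect.
SELECT relname, relkind, reloptions
FROM pg_class
WHERE relname LIKE 'statistics_2010_04%';
```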
