Is it better to create an index before filling a table with data, or after the data is in place?

Creating index after data insert is more efficient way (it even often recomended to drop index before batch import and after import recreate it).

Syntetic example (PostgreSQL 9.1, slow development machine, one million rows):

CREATE TABLE test1(id serial, x integer);
INSERT INTO test1(id, x) SELECT x.id, x.id*100 FROM generate_series(1,1000000) AS x(id);
-- Time: 7816.561 ms
CREATE INDEX test1_x ON test1 (x);
-- Time: 4183.614 ms

Insert and then create index – about 12 sec

CREATE TABLE test2(id serial, x integer);
CREATE INDEX test2_x ON test2 (x);
-- Time: 2.315 ms
INSERT INTO test2(id, x) SELECT x.id, x.id*100 FROM generate_series(1,1000000) AS x(id);
-- Time: 25399.460 ms

Create index and then insert – about 25.5 sec (more than two times slower)

Leave a Comment

Hata!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)