Table with 80 million records and adding an index takes more than 18 hours (or forever)! Now what?

Ok turns out that this problem was more than just a simple create a table, index it and forget problem 🙂 Here’s what I did just in case someone else faces the same problem (I have used an example of IP Address but it works for other data types too):

Problem: Your table has millions of entries and you need to add an index really fast

Usecase: Consider storing millions of IP addresses in a lookup table. Adding the IP addresses should not be a big problem but creating an index on them takes more than 14 hours.

Solution: Partition your table using MySQL’s Partitioning strategy

Case #1: When the table you want is not yet created

CREATE TABLE IPADDRESSES(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  ipaddress BIGINT UNSIGNED,
  PRIMARY KEY(id, ipaddress)
) ENGINE=MYISAM
PARTITION BY HASH(ipaddress)
PARTITIONS 20;

Case #2: When the table you want is already created.
There seems to be a way to use ALTER TABLE to do this but I have not yet figured out a proper solution for this. Instead, there is a slightly inefficient solution:

CREATE TABLE IPADDRESSES_TEMP(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  ipaddress BIGINT UNSIGNED,
  PRIMARY KEY(id)
) ENGINE=MYISAM;

Insert your IP addresses into this table. And then create the actual table with partitions:

CREATE TABLE IPADDRESSES(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  ipaddress BIGINT UNSIGNED,
  PRIMARY KEY(id, ipaddress)
) ENGINE=MYISAM
PARTITION BY HASH(ipaddress)
PARTITIONS 20;

And then finally

INSERT INTO IPADDRESSES(ipaddress) SELECT ipaddress FROM IPADDRESSES_TEMP;
DROP TABLE IPADDRESSES_TEMP;
ALTER TABLE IPADDRESSES ADD INDEX(ipaddress)

And there you go… indexing on the new table took me about 2 hours on a 3.2GHz machine with 1GB RAM 🙂 Hope this helps.

Leave a Comment