Is there a REAL performance difference between INT and VARCHAR primary keys?

I was a bit annoyed by the lack of benchmarks for this online, so I ran a test myself.

Note though that I don’t do it on a regular basic, so please check my setup and steps for any factors that could have influenced the results unintentionally, and post your concerns in comments.

The setup was as follows:

  • Intel® Core™ i7-7500U CPU @ 2.70GHz × 4
  • 15.6 GiB RAM, of which I ensured around 8 GB was free during the test.
  • 148.6 GB SSD drive, with plenty of free space.
  • Ubuntu 16.04 64-bit
  • MySQL Ver 14.14 Distrib 5.7.20, for Linux (x86_64)

The tables:

create table jan_int (data1 varchar(255), data2 int(10), myindex tinyint(4)) ENGINE=InnoDB;
create table jan_int_index (data1 varchar(255), data2 int(10), myindex tinyint(4), INDEX (myindex)) ENGINE=InnoDB;
create table jan_char (data1 varchar(255), data2 int(10), myindex char(6)) ENGINE=InnoDB;
create table jan_char_index (data1 varchar(255), data2 int(10), myindex char(6), INDEX (myindex)) ENGINE=InnoDB;
create table jan_varchar (data1 varchar(255), data2 int(10), myindex varchar(63)) ENGINE=InnoDB;
create table jan_varchar_index (data1 varchar(255), data2 int(10), myindex varchar(63), INDEX (myindex)) ENGINE=InnoDB;

Then, I filled 10 million rows in each table with a PHP script whose essence is like this:

$pdo = get_pdo();

$keys = [ 'alabam', 'massac', 'newyor', 'newham', 'delawa', 'califo', 'nevada', 'texas_', 'florid', 'ohio__' ];

for ($k = 0; $k < 10; $k++) {
    for ($j = 0; $j < 1000; $j++) {
        $val="";
        for ($i = 0; $i < 1000; $i++) {
            $val .= '("' . generate_random_string() . '", ' . rand (0, 10000) . ', "' . ($keys[rand(0, 9)]) . '"),';
        }
        $val = rtrim($val, ',');
        $pdo->query('INSERT INTO jan_char VALUES ' . $val);
    }
    echo "\n" . ($k + 1) . ' millon(s) rows inserted.';
}

For int tables, the bit ($keys[rand(0, 9)]) was replaced with just rand(0, 9), and for varchar tables, I used full US state names, without cutting or extending them to 6 characters. generate_random_string() generates a 10-character random string.

Then I ran in MySQL:

  • SET SESSION query_cache_type=0;
  • For jan_int table:
    • SELECT count(*) FROM jan_int WHERE myindex = 5;
    • SELECT BENCHMARK(1000000000, (SELECT count(*) FROM jan_int WHERE myindex = 5));
  • For other tables, same as above, with myindex = 'califo' for char tables and myindex = 'california' for varchar tables.

Times of the BENCHMARK query on each table:

  • jan_int: 21.30 sec
  • jan_int_index: 18.79 sec
  • jan_char: 21.70 sec
  • jan_char_index: 18.85 sec
  • jan_varchar: 21.76 sec
  • jan_varchar_index: 18.86 sec

Regarding table & index sizes, here’s the output of show table status from janperformancetest; (w/ a few columns not shown):

|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Name              | Engine | Version | Row_format | Rows    | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Collation              |
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| jan_int           | InnoDB |      10 | Dynamic    | 9739094 |             43 |   422510592 |               0 |            0 |   4194304 |           NULL | utf8mb4_unicode_520_ci |  
| jan_int_index     | InnoDB |      10 | Dynamic    | 9740329 |             43 |   420413440 |               0 |    132857856 |   7340032 |           NULL | utf8mb4_unicode_520_ci |   
| jan_char          | InnoDB |      10 | Dynamic    | 9726613 |             51 |   500170752 |               0 |            0 |   5242880 |           NULL | utf8mb4_unicode_520_ci |  
| jan_char_index    | InnoDB |      10 | Dynamic    | 9719059 |             52 |   513802240 |               0 |    202342400 |   5242880 |           NULL | utf8mb4_unicode_520_ci |  
| jan_varchar       | InnoDB |      10 | Dynamic    | 9722049 |             53 |   521142272 |               0 |            0 |   7340032 |           NULL | utf8mb4_unicode_520_ci |   
| jan_varchar_index | InnoDB |      10 | Dynamic    | 9738381 |             49 |   486539264 |               0 |    202375168 |   7340032 |           NULL | utf8mb4_unicode_520_ci | 
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|

My conclusion is that there’s no performance difference for this particular use case.

Leave a Comment

Hata!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)