Ask The Real Tom

July 30, 2010



MySQL solution: Speed up pattern searches starting with a wildchar (like ‘%abc%’)

Filed under: Uncategorized — Tags: , , , , , , , , — admin @ 10:04 am

Searches on MySQL with expression like ” … like ‘%abc%'” can be incredibly slow on large tables.
The problem is, MySQL can not use indices for expression starting with a wildcard “%”.
Neither MySQL supports function based indexes.

I will explain a simple solution. But first lets have a look to our problem.
Table size: 2G
Row count: 33’000
Query response time: ~ 25s

mysql> desc images;
+-------------+-----------------------------------------------------------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-----------------------------------------------------------------------------+------+-----+---------+----------------+
| imageid | int(11) | NO | PRI | NULL | auto_increment |
| filename | varchar(200) | YES | MUL | NULL | |
| description | varchar(200) | YES | | NULL | |
| data | longblob | YES | | NULL | |
| type | enum('coversmall','cover','coverback','screenshot','specialcover','banner') | YES | MUL | NULL | |
+-------------+-----------------------------------------------------------------------------+------+-----+---------+----------------+
5 rows in set (0.00 sec)

The index can not be used


mysql> explain select imageid,filename,description,type from images where filename like '%call%' order by filename;
+----+-------------+--------+------+---------------+------+---------+------+--------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+------+---------------+------+---------+------+--------+-----------------------------+
| 1 | SIMPLE | images | ALL | NULL | NULL | NULL | NULL | 371398 | Using where; Using filesort |
+----+-------------+--------+------+---------------+------+---------+------+--------+-----------------------------+
1 row in set (0.00 sec)

Index can be use if the expression does not start with a wild card


mysql> explain select imageid,filename,description,type from images where filename like 'call%' order by filename;
+----+-------------+--------+-------+---------------------+---------------------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+---------------------+---------------------+---------+------+------+-------------+
| 1 | SIMPLE | images | range | idx_images_filename | idx_images_filename | 603 | NULL | 66 | Using where |
+----+-------------+--------+-------+---------------------+---------------------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> explain select imageid,filename,description,type from images where filename like 'call%abc' order by filename;
+----+-------------+--------+-------+---------------------+---------------------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+---------------------+---------------------+---------+------+------+-------------+
| 1 | SIMPLE | images | range | idx_images_filename | idx_images_filename | 603 | NULL | 66 | Using where |
+----+-------------+--------+-------+---------------------+---------------------+---------+------+------+-------------+

Since we are searching for the filename, we a new column for the filename. Add a character in front of the string and index the new column.


mysql> alter table images add vfilename varchar(201);

Query OK, 33064 rows affected (3 min 4.73 sec)
Records: 33064 Duplicates: 0 Warnings: 0

update images set vfilename=concat('a',filename);

Query OK, 33064 rows affected (52.40 sec)
Rows matched: 33064 Changed: 33064 Warnings: 0

create index idx_images_vfilename on images (vfilename);

Now we change our application so it add always the same character “a” in the pattern.
So we can use the index.


mysql> explain select imageid,filename,description,type from images where vfilename like 'a%abc%' order by filename;
+----+-------------+--------+-------+----------------------+----------------------+---------+------+-------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+----------------------+----------------------+---------+------+-------+-----------------------------+
| 1 | SIMPLE | images | range | idx_images_vfilename | idx_images_vfilename | 603 | NULL | 75840 | Using where; Using filesort |
+----+-------------+--------+-------+----------------------+----------------------+---------+------+-------+-----------------------------+
1 row in set (0.00 sec)

The perfomance benefit is, that we can do now a index scan (full/range scan) which read much less data than a full table scan.
We reduced i/o, saved our cache and speed up the query by magnitudes!

This is a real ask the real tom solution!

Powered by WordPress