|
One more idea. Table digg_pageviews grows fast and searches there are done to tell when you last time looked on some story. Query looks on fields pv_type, pv_page_id and pv_user_id fields to identify tiny final set of result fields. Only index it can use is defined:
KEY `pv_type` (`pv_type`,`pv_page_id`,`pv_user_id`)
So, first thing in key is pv_type which is defined as enum('story','out','profile') default NULL
Meanwhile
mysql> show keys from digg_pageviews;
+----------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| digg_pageviews | 0 | PRIMARY | 1 | pv_id | A | 1536729 | NULL | NULL | | BTREE | |
| digg_pageviews | 1 | pv_type | 1 | pv_type | A | 18 | NULL | NULL | YES | BTREE | |
| digg_pageviews | 1 | pv_type | 2 | pv_page_id | A | 22598 | NULL | NULL | | BTREE | |
| digg_pageviews | 1 | pv_type | 3 | pv_user_id | A | 59104 | NULL | NULL | | BTREE | |
| digg_pageviews | 1 | pv_page_id | 1 | pv_page_id | A | 15680 | NULL | NULL | | BTREE | |
| digg_pageviews | 1 | pv_user_id | 1 | pv_user_id | A | 328 | NULL | NULL | | BTREE | |
+----------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
6 rows in set (1.40 sec)
mysql>
We see that cardinality of pv_type is low and query can run much better if it can use pv_user_id or pv_page_id fields. Which on, depends on exact query. For example Google crawler constantly looking on our pages and making new records in this table with pv_user_id=0. Then pv_page_id is better field because it is more unique. Sometime pv_user_id is better. So I added two more indexes to let query optimizer to choose from:
alter table digg_pageviews add key pv_page_id (pv_page_id), add key pv_user_id (pv_user_id);
Warning! Adding those indexes can be really slow and block whole site! I made it 4am at night automatically to avoid problems. After that change I cannot see related queries in mysql processlist anymore. This means, they went MUCH faster.
I wrote down more notes somewhere but cannot find them at moment. Anyway I identify slow queries just looking "show processlist" many times. If you see something there often, then it is probably slow and needs to be optimized.
|