Given the following table of sports matches:
| id_ | p1_id | p2_id | match_date | p1_stat | p2_stat |
|---|---|---|---|---|---|
| 852666 | 1 | 2 | 01/01/1997 | 1301 | 249 |
| 852842 | 1 | 2 | 13/01/1997 | 2837 | 2441 |
| 853471 | 2 | 1 | 05/05/1997 | 1474 | 952 |
| 4760 | 2 | 1 | 25/05/1998 | 1190 | 1486 |
| 6713 | 2 | 1 | 18/01/1999 | 2084 | 885 |
| 9365 | 2 | 1 | 01/11/1999 | 2894 | 2040 |
| 11456 | 1 | 2 | 15/05/2000 | 2358 | 1491 |
| 13022 | 1 | 2 | 14/08/2000 | 2722 | 2401 |
| 29159 | 1 | 2 | 26/08/2002 | 431 | 2769 |
| 44915 | 1 | 2 | 07/10/2002 | 1904 | 482 |
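For a selected match id_, I want to return the stat each of the two players recorded in their own previous match, regardless of whether that player was p1 or p2 in that earlier match. See the expected output below for id_ = 11456: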
| id_ | p1_id | p2_id | match_date | p1_stat | p2_stat | p1_prev_stat | p2_prev_stat |
|---|---|---|---|---|---|---|---|
| 11456 | 1 | 2 | 15/05/2000 | 2358 | 1491 | 2040 | 2894 |
On a table of this size, the following SQL runs without any problem:
WITH cte_1 AS (
  (
    SELECT
      id_,
      match_date,
      p1_id AS player_id,
      p1_stat AS stat
    FROM
      test.match_table
    UNION ALL
    SELECT
      id_,
      match_date,
      p2_id AS player_id,
      p2_stat AS stat
    FROM
      test.match_table
  )
),
cte_2 AS (
  SELECT
    id_,
    player_id,
    LAG(stat) OVER (
      PARTITION BY player_id
      ORDER BY
        match_date,
        id_
    ) AS prev_stat
  FROM
    cte_1
)
SELECT
  m.*,
  cte_p1.prev_stat AS p1_prev_stat,
  cte_p2.prev_stat AS p2_prev_stat
FROM
  test.match_table AS m
  JOIN cte_2 AS cte_p1 ON cte_p1.id_ = m.id_
    AND cte_p1.player_id = m.p1_id
  JOIN cte_2 AS cte_p2 ON cte_p2.id_ = m.id_
    AND cte_p2.player_id = m.p2_id
WHERE m.id_ = 11456
ORDER BY m.match_date
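However, the real table has 1.3 million rows, and there the query takes about 12 seconds. According to this answer, the problem seems to be that the CTE loads all or most of the table's rows rather than only the rows that are needed. The solution given there does not cover this case, though.
Can anyone suggest how to improve the performance?
Here is the SQL to create the small table: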
CREATE TABLE `match_table` (
  `id_` int NOT NULL AUTO_INCREMENT,
  `p1_id` int NOT NULL,
  `p2_id` int NOT NULL,
  `match_date` date NOT NULL,
  `p1_stat` int DEFAULT NULL,
  `p2_stat` int DEFAULT NULL,
  PRIMARY KEY (`id_`),
  KEY `ix__p1_id` (`p1_id`),
  KEY `ix__p2_id` (`p2_id`),
  KEY `ix__match_date` (`match_date`),
  KEY `ix__comp` (`p1_id`, `p2_id`, `match_date`)
);
INSERT INTO `match_table`
VALUES
(4760, 2, 1, '1998-05-25', 1190, 1486),
(6713, 2, 1, '1999-01-18', 2084, 885),
(9365, 2, 1, '1999-11-01', 2894, 2040),
(11456, 1, 2, '2000-05-15', 2358, 1491),
(13022, 1, 2, '2000-08-14', 2722, 2401),
(29159, 1, 2, '2002-08-26', 431, 2769),
(44915, 1, 2, '2002-10-07', 1904, 482),
(852666, 1, 2, '1997-01-01', 1301, 249),
(852842, 1, 2, '1997-01-13', 2837, 2441),
(
853471, 2, 1, '1997-05-05', 1474, 952
);
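First, make sure the right indexes exist. The columns id_, p1_id, p2_id and match_date are all used by your query, so each of them should be indexed. The CREATE TABLE statement above already does this (the primary key on id_, plus ix__p1_id, ix__p2_id and ix__match_date), so check that the production table matches it.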
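Beyond single-column indexes, a lookup of the form "this player's most recent match before a given date" may benefit from composite indexes that lead with the player column and continue with the date. The following is a minimal sketch under that assumption. The index names ix__p1_date and ix__p2_date are made up for illustration, and whether the optimizer actually picks them depends on the shape of the query, which EXPLAIN can show.

```sql
-- Hypothetical composite indexes (names are illustrative only).
-- InnoDB appends the primary key to every secondary index, so listing
-- id_ explicitly is optional; it is spelled out here for clarity.
ALTER TABLE test.match_table
  ADD INDEX ix__p1_date (p1_id, match_date, id_),
  ADD INDEX ix__p2_date (p2_id, match_date, id_);

-- Inspect the plan for a single-role lookup: player 1's latest match
-- as p1 before 2000-05-15.
EXPLAIN
SELECT p1_stat
FROM test.match_table
WHERE p1_id = 1
  AND match_date < '2000-05-15'
ORDER BY match_date DESC, id_ DESC
LIMIT 1;
```

Each extra index adds some cost to every write, so it is worth keeping only those that EXPLAIN shows are being used.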
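Second, it is tempting to drop the two CTEs (common table expressions) and call LAG() directly in the main query, which avoids duplicating every row:
SELECT
  m.*,
  LAG(m.p1_stat) OVER (PARTITION BY m.p1_id ORDER BY m.match_date, m.id_) AS p1_prev_stat,
  LAG(m.p2_stat) OVER (PARTITION BY m.p2_id ORDER BY m.match_date, m.id_) AS p2_prev_stat
FROM test.match_table AS m
WHERE m.id_ = 11456
ORDER BY m.match_date;
Be careful, though: as written, this query does not return the expected output. Window functions are evaluated after the WHERE clause, so with WHERE m.id_ = 11456 each window contains only that single row and both LAG() calls return NULL. Partitioning by p1_id (or p2_id) also ignores the matches in which the same player appeared in the other role. Whatever replaces the CTEs has to see each player's earlier matches in both roles while still reading only that player's rows.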
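One way to read only the rows that matter is to swap the window function for a correlated subquery per player. The sketch below is a different technique from LAG(), not a variant of it, and it assumes that "previous" means the latest match ordered by match_date and then id_, as in the original query. Because window functions run after WHERE, a LAG() in a query filtered to a single id_ sees only that one row, which is why the lookup is written as a subquery here.

```sql
SELECT
  m.*,
  (
    -- Latest earlier match of the player who is p1 in the selected match,
    -- taking the stat from whichever side that player was on.
    SELECT CASE WHEN p.p1_id = m.p1_id THEN p.p1_stat ELSE p.p2_stat END
    FROM test.match_table AS p
    WHERE (p.p1_id = m.p1_id OR p.p2_id = m.p1_id)
      AND (p.match_date < m.match_date
           OR (p.match_date = m.match_date AND p.id_ < m.id_))
    ORDER BY p.match_date DESC, p.id_ DESC
    LIMIT 1
  ) AS p1_prev_stat,
  (
    -- Same lookup for the player who is p2 in the selected match.
    SELECT CASE WHEN p.p1_id = m.p2_id THEN p.p1_stat ELSE p.p2_stat END
    FROM test.match_table AS p
    WHERE (p.p1_id = m.p2_id OR p.p2_id = m.p2_id)
      AND (p.match_date < m.match_date
           OR (p.match_date = m.match_date AND p.id_ < m.id_))
    ORDER BY p.match_date DESC, p.id_ DESC
    LIMIT 1
  ) AS p2_prev_stat
FROM test.match_table AS m
WHERE m.id_ = 11456;
```

On the sample data the previous match for both players is id_ 9365, so this should give p1_prev_stat = 2040 and p2_prev_stat = 2894, matching the expected output in the question. A player with no earlier match gets NULL. How fast it runs on the 1.3 million row table depends on how the optimizer handles the OR across p1_id and p2_id, so checking the plan with EXPLAIN is advisable.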
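If that still does not meet your performance needs, consider creating a summary table that holds one row per player per match with that player's stat. Whenever a new match result is added, update the summary table as well. This uses more storage and slows down writes, but it can make reads much faster.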
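A summary table along these lines could look like the sketch below. The table name player_match_stat and its column names are hypothetical, and the sketch assumes a player never appears as both p1 and p2 in the same match.

```sql
-- Hypothetical summary table: one row per player per match.
CREATE TABLE test.player_match_stat (
  player_id  int  NOT NULL,
  match_date date NOT NULL,
  match_id   int  NOT NULL,
  stat       int  DEFAULT NULL,
  -- Clustered by player, then date, then match, so a player's history
  -- is stored in the order the lookup needs.
  PRIMARY KEY (player_id, match_date, match_id)
);

-- One-off backfill from the existing table. Every new match would also
-- need its two rows added here, from application code or a trigger.
INSERT INTO test.player_match_stat (player_id, match_date, match_id, stat)
SELECT p1_id, match_date, id_, p1_stat FROM test.match_table
UNION ALL
SELECT p2_id, match_date, id_, p2_stat FROM test.match_table;

-- Previous stat for both players of match 11456.
SELECT
  m.*,
  (
    SELECT s.stat
    FROM test.player_match_stat AS s
    WHERE s.player_id = m.p1_id
      AND (s.match_date < m.match_date
           OR (s.match_date = m.match_date AND s.match_id < m.id_))
    ORDER BY s.match_date DESC, s.match_id DESC
    LIMIT 1
  ) AS p1_prev_stat,
  (
    SELECT s.stat
    FROM test.player_match_stat AS s
    WHERE s.player_id = m.p2_id
      AND (s.match_date < m.match_date
           OR (s.match_date = m.match_date AND s.match_id < m.id_))
    ORDER BY s.match_date DESC, s.match_id DESC
    LIMIT 1
  ) AS p2_prev_stat
FROM test.match_table AS m
WHERE m.id_ = 11456;
```

Because each subquery filters on a single player_id, the role of the player in the earlier match no longer matters, and the OR across p1_id and p2_id disappears. The price is the extra table to keep in sync on every insert, update and delete of a match.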