Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IndexFullScan with higher cost is chosen to avoid sorting for window function #46177

Closed
pcqz opened this issue Aug 17, 2023 · 4 comments · Fixed by #48186 or #56414
Closed

IndexFullScan with higher cost is chosen to avoid sorting for window function #46177

pcqz opened this issue Aug 17, 2023 · 4 comments · Fixed by #48186 or #56414
Assignees
Labels
affects-5.4 This bug affects the 5.4.x(LTS) versions. affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. found/gs found by gs severity/major sig/planner SIG: Planner type/bug The issue is confirmed as a bug.

Comments

@pcqz
Copy link

pcqz commented Aug 17, 2023

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

mysql> explain format='verbose'  select row_number() over(order by a.k), a.* from (select * from sbtest1 where id<10) a;
+------------------------------------+------------+--------------+-----------+-----------------------------+--------------------------------------------------------------------------------------------------+
| id                                 | estRows    | estCost      | task      | access object               | operator info                                                                                    |
+------------------------------------+------------+--------------+-----------+-----------------------------+--------------------------------------------------------------------------------------------------+
| Projection_8                       | 2.85       | 16898882.30  | root      |                             | Column#6, sbtest1.sbtest1.id, sbtest1.sbtest1.k, sbtest1.sbtest1.c, sbtest1.sbtest1.pad          |
| └─Window_9                         | 2.85       | 16898880.88  | root      |                             | row_number()->Column#6 over(order by sbtest1.sbtest1.k rows between current row and current row) |
|   └─IndexLookUp_13                 | 2.85       | 16898880.88  | root      |                             |                                                                                                  |
|     ├─Selection_12(Build)          | 2.85       | 253400000.00 | cop[tikv] |                             | lt(sbtest1.sbtest1.id, 10)                                                                       |
|     │ └─IndexFullScan_10           | 1000000.00 | 203500000.00 | cop[tikv] | table:sbtest1, index:k_1(k) | keep order:true                                                                                  |
|     └─TableRowIDScan_11(Probe)     | 2.85       | 898.48       | cop[tikv] | table:sbtest1               | keep order:false                                                                                 |
+------------------------------------+------------+--------------+-----------+-----------------------------+--------------------------------------------------------------------------------------------------+
6 rows in set (0.00 sec)

2. What did you expect to see? (Required)

mysql> explain format='verbose'  select row_number() over(order by a.k), a.* from (select * from sbtest1 use index(primary) where id<10) a;
+-------------------------------+---------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------+
| id                            | estRows | estCost | task      | access object | operator info                                                                                    |
+-------------------------------+---------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------+
| Projection_8                  | 2.85    | 561.13  | root      |               | Column#6, sbtest1.sbtest1.id, sbtest1.sbtest1.k, sbtest1.sbtest1.c, sbtest1.sbtest1.pad          |
| └─Window_9                    | 2.85    | 559.71  | root      |               | row_number()->Column#6 over(order by sbtest1.sbtest1.k rows between current row and current row) |
|   └─Sort_12                   | 2.85    | 559.71  | root      |               | sbtest1.sbtest1.k                                                                                |
|     └─TableReader_11          | 2.85    | 222.33  | root      |               | data:TableRangeScan_10                                                                           |
|       └─TableRangeScan_10     | 2.85    | 898.48  | cop[tikv] | table:sbtest1 | range:[0,10), keep order:false                                                                   |
+-------------------------------+---------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)

3. What did you see instead (Required)

IndexFullScan with higher cost is chosen.

4. What is your TiDB version? (Required)

v7.1.0

@pcqz pcqz added the type/bug The issue is confirmed as a bug. label Aug 17, 2023
@jebter jebter added sig/planner SIG: Planner severity/major affects-7.1 This bug affects the 7.1.x(LTS) versions. labels Aug 18, 2023
@ti-chi-bot ti-chi-bot bot added may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 labels Aug 18, 2023
@seiya-annie
Copy link

/found gs

@ti-chi-bot ti-chi-bot bot added the found/gs found by gs label Sep 20, 2023
@qw4990
Copy link
Contributor

qw4990 commented Oct 13, 2023

A minimum reproducible case:

 CREATE TABLE `sbtest` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */,
  KEY `k` (`k`)
);

mysql> explain format='verbose'  select row_number() over(order by a.k) from (select * from sbtest where id<10) a;
+------------------------------+----------+------------+-----------+--------------------------+----------------------------------------------------------------------------------------------+
| id                           | estRows  | estCost    | task      | access object            | operator info                                                                                |
+------------------------------+----------+------------+-----------+--------------------------+----------------------------------------------------------------------------------------------+
| Projection_9                 | 10.00    | 169018.81  | root      |                          | Column#6->Column#7                                                                           |
| └─Window_10                  | 10.00    | 169017.81  | root      |                          | row_number()->Column#6 over(order by test.sbtest.k rows between current row and current row) |
|   └─IndexReader_13           | 10.00    | 169017.81  | root      |                          | index:Selection_12                                                                           |
|     └─Selection_12           | 10.00    | 2534000.00 | cop[tikv] |                          | lt(test.sbtest.id, 10)                                                                       |
|       └─IndexFullScan_11     | 10000.00 | 2035000.00 | cop[tikv] | table:sbtest, index:k(k) | keep order:true, stats:pseudo                                                                |
+------------------------------+----------+------------+-----------+--------------------------+----------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)


mysql> explain format='verbose'  select row_number() over(order by a.k) from (select * from sbtest use index(primary) where id<10) a;
+-------------------------------+---------+---------+-----------+---------------+----------------------------------------------------------------------------------------------+
| id                            | estRows | estCost | task      | access object | operator info                                                                                |
+-------------------------------+---------+---------+-----------+---------------+----------------------------------------------------------------------------------------------+
| Projection_9                  | 10.00   | 2008.16 | root      |               | Column#6->Column#7                                                                           |
| └─Window_10                   | 10.00   | 2007.16 | root      |               | row_number()->Column#6 over(order by test.sbtest.k rows between current row and current row) |
|   └─Sort_13                   | 10.00   | 2007.16 | root      |               | test.sbtest.k                                                                                |
|     └─TableReader_12          | 10.00   | 285.52  | root      |               | data:TableRangeScan_11                                                                       |
|       └─TableRangeScan_11     | 10.00   | 3015.62 | cop[tikv] | table:sbtest  | range:[0,10), keep order:false, stats:pseudo                                                 |
+-------------------------------+---------+---------+-----------+---------------+----------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)

@qw4990
Copy link
Contributor

qw4990 commented Oct 13, 2023

When generating Scan operators, if the upper operator needs ordered data, there are 2 choices:
Iterate just some of paths that can keep data in this order, generate plans.
Iterate other remaining paths, add an explicit Sort upon their plans to keep data in order.
Currently, the optimizer will try 1 first, if it can find any valid plan, it'll return and finish the process directly; otherwise, it'll try 2.
Now, the problem is plans generated from 2 may be better than 1 sometimes, but the optimzier finishes the process too early:
img_v2_eb82fca8-bc67-4542-8603-11bb6e0fcb1g

@qw4990
Copy link
Contributor

qw4990 commented Oct 13, 2023

Window and StreamAgg have this problem.
Another case about StreamAgg:
55d8f14a-407a-4e7f-88b3-05175b48f6f0

@ti-chi-bot ti-chi-bot added the affects-7.5 This bug affects the 7.5.x(LTS) versions. label Oct 23, 2023
@qw4990 qw4990 added affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. and removed may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 labels Nov 1, 2023
ti-chi-bot bot pushed a commit that referenced this issue Nov 3, 2023
ti-chi-bot bot pushed a commit that referenced this issue Nov 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-5.4 This bug affects the 5.4.x(LTS) versions. affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. found/gs found by gs severity/major sig/planner SIG: Planner type/bug The issue is confirmed as a bug.
Projects
None yet
5 participants