planner: don't calc the heavy expression used in ORDER BY stmt twice #58208

Merged
merged 10 commits into from
Dec 23, 2024

Conversation

EricZequan
Contributor

@EricZequan EricZequan commented Dec 12, 2024

What problem does this PR solve?

Issue Number: ref #54245 , close #56318

Problem Summary:
cherry-pick : https://github.com/tidbcloud/tidb-cse/pull/1426

In the original plan, the root task recalculates the distance for a vector search even though the result has already been calculated on the storage node. For example:

mysql> explain SELECT id FROM t ORDER BY vec_l2_distance(val, '[1,3,6]') LIMIT 2;
+----------------------------------+---------+-----------+---------------+-----------------------------------------------------------------------+
| id                               | estRows | task      | access object | operator info                                                         |
+----------------------------------+---------+-----------+---------------+-----------------------------------------------------------------------+
| Projection_7                     | 2.00    | root      |               | test.t.id                                                             |
| └─Projection_15                  | 2.00    | root      |               | test.t.id, test.t.val                                                 |
|   └─TopN_8                       | 2.00    | root      |               | Column#4, offset:0, count:2                                           |
|     └─Projection_16              | 2.00    | root      |               | test.t.id, test.t.val, vec_l2_distance(test.t.val, [1,3,6])->Column#4 |
|       └─TableReader_14           | 2.00    | root      |               | data:TopN_13                                                          |
|         └─TopN_13                | 2.00    | cop[tikv] |               | vec_l2_distance(test.t.val, [1,3,6]), offset:0, count:2               |
|           └─TableFullScan_12     | 3.00    | cop[tikv] | table:t       | keep order:false, stats:pseudo                                        |
+----------------------------------+---------+-----------+---------------+-----------------------------------------------------------------------+
7 rows in set (0.01 sec)

mysql> explain SELECT id FROM t2 ORDER BY vec_cosine_distance(val, '[1,3,6]') LIMIT 2;
+----------------------------------------+---------+--------------+------------------------------------+------------------------------------------------------------------------------+
| id                                     | estRows | task         | access object                      | operator info                                                                |
+----------------------------------------+---------+--------------+------------------------------------+------------------------------------------------------------------------------+
| Projection_7                           | 2.00    | root         |                                    | test.t2.id                                                                   |
| └─Projection_29                        | 2.00    | root         |                                    | test.t2.id, test.t2.val                                                      |
|   └─TopN_11                            | 2.00    | root         |                                    | Column#5, offset:0, count:2                                                  |
|     └─Projection_30                    | 2.00    | root         |                                    | test.t2.id, test.t2.val, vec_cosine_distance(test.t2.val, [1,3,6])->Column#5 |
|       └─TableReader_26                 | 2.00    | root         |                                    | MppVersion: 1, data:ExchangeSender_25                                        |
|         └─ExchangeSender_25            | 2.00    | mpp[tiflash] |                                    | ExchangeType: PassThrough                                                    |
|           └─Projection_27              | 2.00    | mpp[tiflash] |                                    | test.t2.id, test.t2.val                                                      |
|             └─TopN_24                  | 2.00    | mpp[tiflash] |                                    | Column#4, offset:0, count:2                                                  |
|               └─Projection_28          | 2.00    | mpp[tiflash] |                                    | test.t2.id, test.t2.val, vec_cosine_distance(test.t2.val, [1,3,6])->Column#4 |
|                 └─TableFullScan_23     | 2.00    | mpp[tiflash] | table:t2, index:idx_embedding(val) | keep order:false, stats:pseudo, annIndex:COSINE(val..[1,3,6], limit:2)       |
+----------------------------------------+---------+--------------+------------------------------------+------------------------------------------------------------------------------+
10 rows in set (0.00 sec)

After this PR, the plan can be optimized by reusing the distance column computed on the storage node, which avoids exchanging the vector column:

mysql> explain select id from t1 order by vec_cosine_distance(vec, '[1,1,1]') limit 10;
+----------------------------------+---------+--------------+------------------------------------+------------------------------------------------------------------------------+
| id                               | estRows | task         | access object                      | operator info                                                                |
+----------------------------------+---------+--------------+------------------------------------+------------------------------------------------------------------------------+
| Projection_7                     | 10.00   | root         |                                    | test.t1.id                                                                   |
| └─TopN_11                        | 10.00   | root         |                                    | Column#9, offset:0, count:10                                                 |
|   └─TableReader_28               | 10.00   | root         |                                    | MppVersion: 1, data:ExchangeSender_27                                        |
|     └─ExchangeSender_27          | 10.00   | mpp[tiflash] |                                    | ExchangeType: PassThrough                                                    |
|       └─TopN_26                  | 10.00   | mpp[tiflash] |                                    | Column#9, offset:0, count:10                                                 |
|         └─Projection_25          | 10.00   | mpp[tiflash] |                                    | test.t1.id, test.t1.vec, vec_cosine_distance(test.t1.vec, [1,1,1])->Column#9 |
|           └─TableFullScan_24     | 10.00   | mpp[tiflash] | table:t1, index:idx_embedding(vec) | keep order:false, stats:pseudo, annIndex:COSINE(vec..[1,1,1], limit:10)      |
+----------------------------------+---------+--------------+------------------------------------+------------------------------------------------------------------------------+
7 rows in set (0.00 sec)

mysql> explain SELECT id FROM t ORDER BY vec_l2_distance(val, '[1,3,6]') LIMIT 2;
+--------------------------------+----------+-----------+---------------+-----------------------------------------------------------+
| id                             | estRows  | task      | access object | operator info                                             |
+--------------------------------+----------+-----------+---------------+-----------------------------------------------------------+
| Projection_8                   | 2.00     | root      |               | test.t.id                                                 |
| └─TopN_12                      | 2.00     | root      |               | Column#4, offset:0, count:2                               |
|   └─TableReader_25             | 2.00     | root      |               | data:TopN_24                                              |
|     └─TopN_24                  | 2.00     | cop[tikv] |               | Column#4, offset:0, count:2                               |
|       └─Projection_23          | 10000.00 | cop[tikv] |               | test.t.id, vec_l2_distance(test.t.val, [1,3,6])->Column#4 |
|         └─TableFullScan_20     | 10000.00 | cop[tikv] | table:t       | keep order:false, stats:pseudo                            |
+--------------------------------+----------+-----------+---------------+-----------------------------------------------------------+
6 rows in set (0.00 sec)

What changed and how does it work?

We add getPushedDownTopN4VectorSearch to build the partial TopN and give it a child physicalProjection that resolves the distance column used by the partial TopN. The root plan then reuses that same column instead of evaluating the expression again.
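The rewrite described above can be sketched outside the planner. This is a minimal illustration, not the code in this PR: the `expr` type, `isHeavy`, `pushDownProjection`, and the `dis0` column name are all hypothetical, and only vector distance functions are assumed to count as heavy. It shows the shape of the change: the child projection keeps its existing columns, gains one computed column per heavy ORDER BY item, and the TopN sorts by a reference to that column.

```go
package main

import (
	"strconv"
	"strings"
)

// expr is a hypothetical, simplified ORDER BY expression.
// fn is empty for a plain column reference, in which case args[0] is the column name.
type expr struct {
	fn   string
	args []string
}

// String renders the expression roughly the way EXPLAIN prints it.
func (e expr) String() string {
	if e.fn == "" {
		return e.args[0]
	}
	return e.fn + "(" + strings.Join(e.args, ", ") + ")"
}

// isHeavy is an assumption for this sketch: only vector distance functions are heavy.
func isHeavy(e expr) bool {
	return e.fn == "vec_l2_distance" || e.fn == "vec_cosine_distance"
}

// pushDownProjection appends one computed column per heavy ORDER BY item to the
// child schema and rewrites those items into references to the new columns.
// Existing columns are kept, so the projection only grows.
func pushDownProjection(schema []string, byItems []expr) (proj []string, rewritten []expr) {
	proj = append(proj, schema...)
	rewritten = make([]expr, len(byItems))
	for i, item := range byItems {
		if !isHeavy(item) {
			rewritten[i] = item // cheap items are left untouched
			continue
		}
		alias := "dis" + strconv.Itoa(i) // hypothetical column name
		proj = append(proj, item.String()+"->"+alias)
		rewritten[i] = expr{args: []string{alias}}
	}
	return proj, rewritten
}
```

Calling `pushDownProjection([]string{"id", "val"}, ...)` with a single `vec_l2_distance(val, [1,3,6])` item yields a projection of `id`, `val`, and the aliased distance, with the TopN ordering by `dis0` only.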

With 768-dimensional vectors and 10000 rows of vector data, we measured the execution time of SELECT id FROM table_name ORDER BY Vec_Cosine_Distance(embedding, search_vector) limit 10; and saw a performance improvement of a little over 20%. ⬆️

before: execute time 15.81586ms
after: execute time 12.27633ms
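
For reference, the improvement figure follows directly from the two timings above. The helper below is only an illustrative sketch (the function name is hypothetical), computing the fraction of execution time saved:

```go
package main

// relativeImprovement returns the fraction of time saved when execution
// time drops from before to after (both in the same unit).
func relativeImprovement(before, after float64) float64 {
	return (before - after) / before
}
```

With `before = 15.81586` and `after = 12.27633`, the saved fraction is (15.81586 - 12.27633) / 15.81586, or roughly 22%.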

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Signed-off-by: “EricZequan” <[email protected]>
@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-tests-checked release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Dec 12, 2024

tiprow bot commented Dec 12, 2024

Hi @EricZequan. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@EricZequan
Contributor Author

/cc @breezewish PTAL~

Signed-off-by: “EricZequan” <[email protected]>
@breezewish
Member

/ok-to-test

@ti-chi-bot ti-chi-bot bot added the ok-to-test Indicates a PR is ready to be tested. label Dec 12, 2024

codecov bot commented Dec 12, 2024

Codecov Report

Attention: Patch coverage is 84.12698% with 20 lines in your changes missing coverage. Please review.

Project coverage is 73.9225%. Comparing base (f05cbdd) to head (0c3027e).
Report is 39 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #58208        +/-   ##
================================================
+ Coverage   73.2252%   73.9225%   +0.6972%     
================================================
  Files          1681       1682         +1     
  Lines        463134     471572      +8438     
================================================
+ Hits         339131     348598      +9467     
+ Misses       103197     102159      -1038     
- Partials      20806      20815         +9     
Flag Coverage Δ
integration 43.4009% <26.1904%> (?)
unit 72.7427% <84.1269%> (+0.3823%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 53.0100% <ø> (+0.3190%) ⬆️
parser ∅ <ø> (∅)
br 45.1516% <ø> (-0.8534%) ⬇️

Member

@breezewish breezewish left a comment

The rest LGTM

@winoros winoros changed the title planner: Reuse the vector distance column in TopN planner: don't calc the heavy expression used in ORDER BY stmt twice Dec 12, 2024
@winoros
Member

winoros commented Dec 12, 2024

/hold until two approvals from planner

@ti-chi-bot ti-chi-bot bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 12, 2024
Signed-off-by: “EricZequan” <[email protected]>
@EricZequan
Contributor Author

/retest

Signed-off-by: “EricZequan” <[email protected]>
Signed-off-by: “EricZequan” <[email protected]>
Signed-off-by: “EricZequan” <[email protected]>
pkg/planner/core/task.go Outdated Show resolved Hide resolved
pkg/planner/core/task.go Outdated Show resolved Hide resolved
pkg/planner/core/task.go Show resolved Hide resolved
pkg/planner/core/task.go Show resolved Hide resolved
pkg/planner/core/task.go Outdated Show resolved Hide resolved
@EricZequan
Contributor Author

/retest

Signed-off-by: “EricZequan” <[email protected]>
@EricZequan
Contributor Author

/retest

Contributor

@AilinKid AilinKid left a comment

rest lgtm

pkg/planner/core/task.go Outdated Show resolved Hide resolved
Signed-off-by: “EricZequan” <[email protected]>
@EricZequan
Contributor Author

/retest

1 similar comment
@EricZequan
Contributor Author

/retest

Contributor

@AilinKid AilinKid left a comment

rest LGTM

@EricZequan
Contributor Author

/retest

Signed-off-by: “EricZequan” <[email protected]>
@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Dec 20, 2024
@elsa0520
Contributor

/unhold

@ti-chi-bot ti-chi-bot bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 23, 2024
Comment on lines +884 to +890
byItemIndex := make([]int, 0)
for i, byItem := range p.ByItems {
	if ContainHeavyFunction(byItem.Expr) {
		byItemIndex = append(byItemIndex, i)
	}
}
if fixValue && len(byItemIndex) > 0 {
Member

Could you add a plan unit test to cover this case (multiple heavy byItems and also some not-heavy byItems altogether)?
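
The mixed case asked about here can be illustrated with a standalone sketch of the index-collecting loop. This is not the planner's code: `byItem`, `containHeavyFunction`, and `heavyByItemIndexes` are simplified, hypothetical stand-ins, and the heaviness rule (any `vec_*_distance` call) is an assumption made only for the example.

```go
package main

import "strings"

// byItem is a hypothetical, simplified ORDER BY item holding only the expression text.
type byItem struct{ expr string }

// containHeavyFunction is a stand-in for the planner's ContainHeavyFunction.
// Assumption: an expression is heavy if it calls a vec_*_distance function.
func containHeavyFunction(expr string) bool {
	e := strings.ToLower(expr)
	return strings.HasPrefix(e, "vec_") && strings.Contains(e, "_distance(")
}

// heavyByItemIndexes returns the positions of the heavy ORDER BY items,
// preserving their original order and skipping cheap items.
func heavyByItemIndexes(items []byItem) []int {
	idx := make([]int, 0, len(items))
	for i, it := range items {
		if containHeavyFunction(it.expr) {
			idx = append(idx, i)
		}
	}
	return idx
}
```

For items `vec_l2_distance(val, '[1,3,6]')`, `id`, and `vec_cosine_distance(val, '[1,3,6]')`, the sketch collects indexes 0 and 2 and leaves the plain `id` item alone.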

// └─Byitem: vec_distance(vec, '[1,2,3]')
// └─Schema: id, vec
//
// New: DataSource(id, vec) -> Projection(id, vec->dis) -> TopN(by dis) -> Projection(id)
Member

@breezewish breezewish Dec 23, 2024

It seems that this comment is incorrect? It does not actually eliminate any columns from the projection; it appends new ones:

New: DataSource(id, vec) -> Projection(id, vec, vec->dis) -> TopN(by dis) -> Projection(id)

Contributor Author

@EricZequan EricZequan Dec 24, 2024

These comments will be fixed in the next cherry-pick~


ti-chi-bot bot commented Dec 23, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: AilinKid, zanmato1984

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added lgtm approved and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Dec 23, 2024

ti-chi-bot bot commented Dec 23, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-12-20 07:49:53.070744069 +0000 UTC m=+1202383.159546596: ☑️ agreed by AilinKid.
  • 2024-12-23 17:22:34.225563492 +0000 UTC m=+1495944.314366030: ☑️ agreed by zanmato1984.

@ti-chi-bot ti-chi-bot bot merged commit 33f0727 into pingcap:master Dec 23, 2024
24 checks passed
Labels
approved lgtm ok-to-test Indicates a PR is ready to be tested. release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sometimes the pushed down TopN(tikv, tiflash) can keep the order expression to save cpu or network
6 participants