query is very slow when keep order is true #54969

Closed
guo-shaoge opened this issue Jul 26, 2024 · 5 comments · Fixed by #55141 or #55196
Labels
affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. report/customer Customers have encountered this bug. severity/major sig/sql-infra SIG: SQL Infra type/bug The issue is confirmed as a bug. type/regression

Comments

@guo-shaoge (Collaborator) commented Jul 26, 2024

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

if it.req.KeepOrder {
	// Don't set high concurrency for the keep-order case. It wastes a lot of memory and gains nothing.
	// TL;DR
	// For a keep-order coprocessor request, the cop tasks are handled one by one, so if we set a
	// higher concurrency, the data is just cached and not consumed for a while, which increases memory usage.
	// Setting the concurrency to 2 can reduce the memory usage, and testing shows it does not necessarily
	// decrease the performance.
	// For ReqTypeAnalyze, we keep its concurrency to avoid slow analyze (see https://github.com/pingcap/tidb/issues/40162 for details).
	if it.concurrency > 2 && it.req.Tp != kv.ReqTypeAnalyze {
		oldConcurrency := it.concurrency
		partitionNum := req.KeyRanges.PartitionNum()
		if partitionNum > it.concurrency {
			partitionNum = it.concurrency
		}
		it.concurrency = 2
		if it.concurrency < partitionNum {
			it.concurrency = partitionNum
		}
		failpoint.Inject("testRateLimitActionMockConsumeAndAssert", func(val failpoint.Value) {
			if val.(bool) {
				// When the concurrency is too small, the test case tests/realtikvtest/sessiontest.TestCoprocessorOOMAction can't trigger the OOM condition.
				it.concurrency = oldConcurrency
			}
		})
	}
	if it.smallTaskConcurrency > 20 {
		it.smallTaskConcurrency = 20
	}
	it.sendRate = util.NewRateLimit(2 * (it.concurrency + it.smallTaskConcurrency))
	it.respChan = nil
} else {
	it.respChan = make(chan *copResponse)
	it.sendRate = util.NewRateLimit(it.concurrency + it.smallTaskConcurrency)
}

This code is intended to make memory usage more efficient. The basic idea is that when keep order is true, there is no need to start many cop workers to receive data, because the data must be returned in order anyway.

However, if selection pushdown occurs, the coprocessor may need to read many rows to produce the desired data. In that case, keeping the concurrency this low can significantly slow down the query.
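
To make the effect of the clamp concrete, here is a minimal, self-contained re-derivation of the capping arithmetic from the snippet above. keepOrderConcurrency is a hypothetical helper written for illustration; it is not a function in TiDB:

package main

import "fmt"

// keepOrderConcurrency mirrors the clamp quoted above: when the caller asks
// for more than 2 workers, the keep-order path drops to 2, but never below
// the partition count (itself capped by the requested concurrency).
func keepOrderConcurrency(concurrency, partitionNum int) int {
	if concurrency <= 2 {
		return concurrency
	}
	if partitionNum > concurrency {
		partitionNum = concurrency
	}
	result := 2
	if result < partitionNum {
		result = partitionNum
	}
	return result
}

func main() {
	// With the default tidb_distsql_scan_concurrency of 15 and a
	// non-partitioned table (partitionNum == 0), only 2 workers remain.
	fmt.Println(keepOrderConcurrency(15, 0)) // 2
	// A table with 8 partitions keeps 8 workers.
	fmt.Println(keepOrderConcurrency(15, 8)) // 8
}

So on a non-partitioned table, a keep-order scan runs with at most 2 cop workers regardless of the session's scan concurrency, which is exactly what hurts when a pushed-down selection must scan many rows for each row it returns.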

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiDB version? (Required)

master

@guo-shaoge guo-shaoge added the type/bug The issue is confirmed as a bug. label Jul 26, 2024
@guo-shaoge (Collaborator, Author)

introduced by #35975

@tiancaiamao (Contributor)

Maybe we can simply relax the restriction on concurrency when the cop request goes to TiFlash, because in the TiFlash scenario, TiDB memory usage is not the top concern.

Other changes need more experiments, like the case you mentioned here:

However, if selection pushdown occurs, the coprocessor may need to read many rows to produce the desired data. In that case, keeping the concurrency this low can significantly slow down the query.

If the selection pushdown filters out a lot of data, the cop response packet is small enough that increasing concurrency would not increase memory usage ... but it's more difficult to predict how large the returned packet will be.
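
A hedged sketch of that relaxation, for illustration only (the storeType type and constants stand in for TiDB's kv.StoreType / kv.TiFlash; this is not the actual fix that was merged):

package main

import "fmt"

// storeType stands in for TiDB's kv.StoreType; real code would use the
// kv.TiKV / kv.TiFlash constants from the kv package.
type storeType int

const (
	tikv storeType = iota
	tiflash
)

// shouldClampKeepOrder sketches the suggestion above: only apply the
// keep-order concurrency clamp when the request is served by TiKV, since
// TiDB-side memory pressure is not the top concern for TiFlash reads.
func shouldClampKeepOrder(keepOrder bool, store storeType) bool {
	return keepOrder && store != tiflash
}

func main() {
	fmt.Println(shouldClampKeepOrder(true, tikv))    // true: clamp to 2 workers
	fmt.Println(shouldClampKeepOrder(true, tiflash)) // false: keep full concurrency
}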

@kennedy8312

/type regression

@winoros (Member) commented Aug 5, 2024

More than two oncall incidents are related to this problem.
Changing the severity to make sure we do not forget to bring the fix to the later minor releases.

@seiya-annie

/report customer
