Improved performance of String.repeat()
#64489
Would it be an unreasonable loss of performance to switch between the two algorithms depending on how long the String is?
I'm not sure the method itself is as optimized as it could be, and I don't know why it's slower for small inputs... But idk how to improve it. Maybe avoid using a zero terminator for the intermediate strings, or do a single resize at the beginning, or something else? I'm open to ideas.
String String::repeat(int p_count) const {
	ERR_FAIL_COND_V_MSG(p_count < 0, "", "Parameter count should be a non-negative number.");
	if (p_count == 0 || is_empty()) {
		return "";
	}
	if (p_count == 1) {
		return *this;
	}
	int len = length();
	String new_string = *this;
	new_string.resize(p_count * len + 1);
	char32_t *dst = new_string.ptrw();
	int offset = 1;
	int stride = 1;
	while (offset < p_count) {
		memcpy(dst + offset * len, dst, stride * len * sizeof(char32_t));
		offset += stride;
		stride = MIN(stride * 2, p_count - offset);
	}
	dst[p_count * len] = _null;
	return new_string;
}
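For readers without the engine sources at hand, the same doubling idea can be sketched outside Godot. This is only an illustration under stated assumptions: `repeat_doubling` is a hypothetical name, and `std::u32string` stands in for Godot's `String`, so the explicit null terminator write is not needed here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical standalone helper (not Godot API): repeats `s` `count` times
// by copying an already-written prefix of the buffer onto its tail.
std::u32string repeat_doubling(const std::u32string &s, int count) {
	if (count <= 0 || s.empty()) {
		return std::u32string();
	}
	const std::size_t n = static_cast<std::size_t>(count);
	const std::size_t len = s.size();

	// Single allocation up front; std::u32string keeps its own terminator.
	std::u32string out(len * n, U'\0');
	char32_t *dst = &out[0];
	std::memcpy(dst, s.data(), len * sizeof(char32_t)); // First copy of `s`.

	std::size_t offset = 1; // Copies of `s` written so far.
	std::size_t stride = 1; // Copies to append with the next memcpy.
	while (offset < n) {
		// Source [0, stride) and destination [offset, offset + stride) do not
		// overlap because stride never exceeds offset.
		std::memcpy(dst + offset * len, dst, stride * len * sizeof(char32_t));
		offset += stride;
		stride = std::min(stride * 2, n - offset);
	}
	return out;
}
```

Each pass doubles the amount copied until the remainder is smaller than the written prefix, so the number of `memcpy` calls grows roughly with the logarithm of the repeat count rather than linearly.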
(force-pushed from b0bc971 to a18cae4)
This comment was marked as outdated.
@MewPurPur You need to fix the formatting (you have spaces instead of tabs), see Code style guidelines for how-to (or go directly to hooks).
(force-pushed from a18cae4 to 32a8d31)
No idea how that happened.
No idea why the checks don't pass. But I want to try to re-implement this with memcpy anyway. I figure the resizes from the concatenation are most likely the reason for the performance drop on small strings and inputs. I don't want it merged like this, because 80% of the use cases would see a performance drop.
It happens because of line 537 in dbd1524: in that case the destination and source memory blocks overlap, as src + rhs_len * sizeof(char32_t) == dst (the last char32_t to copy from src is at the same address as the first char32_t in dst). That's undefined behavior, as memcpy requires these not to overlap.
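As a general illustration of the constraint mentioned above (not the code from the referenced line), the sketch below checks two byte ranges for overlap and falls back to `memmove`, which is defined for overlapping ranges, when they intersect. Both helper names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if byte ranges [a, a + n) and [b, b + n) intersect.
bool ranges_overlap(const void *a, const void *b, std::size_t n) {
	const std::uintptr_t pa = reinterpret_cast<std::uintptr_t>(a);
	const std::uintptr_t pb = reinterpret_cast<std::uintptr_t>(b);
	return pa < pb + n && pb < pa + n;
}

// Hypothetical helper: copies `count` characters, using memmove on overlap
// because memcpy has undefined behavior for overlapping ranges.
void copy_chars(char32_t *dst, const char32_t *src, std::size_t count) {
	const std::size_t bytes = count * sizeof(char32_t);
	if (ranges_overlap(dst, src, bytes)) {
		std::memmove(dst, src, bytes);
	} else {
		std::memcpy(dst, src, bytes);
	}
}
```

In a hot path the check itself costs something, so an algorithm that guarantees disjoint ranges by construction can keep using plain `memcpy`.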
(force-pushed from 32a8d31 to 068cb43)
Ended up doing a condition for different lengths after all. Thankfully, it turns out for strings longer than 4 characters, this solution is always beneficial.
Looks way overcomplicated to me. I've edited my previous comment with a new version using
(force-pushed from 068cb43 to 72fb219)
@kleonc Oh, my, god. The performance difference is stunning, especially compared to how it used to be. All of the tests now take between 5 and 8 ms in the benchmark.
(force-pushed from 72fb219 to b0a5aab)
I'm not sure why the godot-cpp test is failing. I restarted the build; if it fails again, it might require a rebase, in case this PR was made at a time when godot-cpp was in a broken state.
This will need a rebase indeed to fix the CI.
(force-pushed from b0a5aab to dae64e5)
Never done rebases, hope I did well. Also I removed the conditions for basic cases since the default algorithm handles them fast now.
Thanks!
@MewPurPur Do you have the benchmark script available? I'm working on backporting this optimization to
@Calinou Yes, just gimme a few mins to get to my laptop! I'll pass it to you on rocketchat
I was interested in implementing Array.repeat for my game, so I looked into whether the method is already optimized in String.repeat; it wasn't, and also, it didn't avoid expensive computations in simple cases.
Optimization is like the fast power algorithm but with string concatenation via memcpy. Scales much better with big strings or more repeats.
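The fast power analogy can be made concrete with a small sketch. It assumes a hypothetical `repeat_by_squaring` helper built on `std::u32string` and ordinary concatenation, so it shows the idea rather than the PR's memcpy-based code: the repeat count is walked bit by bit, the way an exponent is in exponentiation by squaring.

```cpp
#include <string>

// Hypothetical helper: repeats `base` `count` times using the same
// square-and-multiply structure as fast exponentiation.
std::u32string repeat_by_squaring(std::u32string base, unsigned count) {
	std::u32string result;
	while (count > 0) {
		if (count & 1u) {
			result += base; // This bit of `count` is set: append the current block.
		}
		count >>= 1;
		if (count > 0) {
			base += base; // Double the block for the next, higher bit.
		}
	}
	return result;
}
```

This variant reallocates as the strings grow, which matches the concern raised in the discussion about resizes hurting small inputs; sizing the buffer once and copying into it avoids that.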
Benchmark (updated):