Big integer Karatsuba algorithm #108
Comments
GNU says the threshold can be reduced to 10 words, in case I feel like optimizing it. |
Hm. I think I could do a bit better if I had composable (perhaps generic) methods like:

    public static func increment(
        _ base: inout UnsafeMutableBufferPointer<UInt>,
        by rhs: UnsafeBufferPointer<UInt>,
        times lhs: UnsafeBufferPointer<UInt>) {
        if lhs.count < 40 || rhs.count < 40 {
            ...
        } else {
            ...
        }
    }

    public static func incrementByLongAlgorithm(
        _ base: inout UnsafeMutableBufferPointer<UInt>,
        by rhs: UnsafeBufferPointer<UInt>,
        times lhs: UnsafeBufferPointer<UInt>) {
        ...
    }

    public static func incrementByKaratsubaAlgorithm(
        _ base: inout UnsafeMutableBufferPointer<UInt>,
        by rhs: UnsafeBufferPointer<UInt>,
        times lhs: UnsafeBufferPointer<UInt>) {
        ...
    }

By that I mean, I wrote something quite neat and it'll take a day to clean up and push. Edit: |
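For readers following along, the threshold dispatch above can be sketched with plain arrays. This is not the in-place buffer-pointer code from the thread: `multiply`, `multiplyLong`, `multiplyKaratsuba`, `add`, and `subtract` are hypothetical names, the words are assumed to be little-endian `[UInt]`, the split point is an assumption, and every recursion level allocates freely.

```swift
// Hypothetical sketch: little-endian [UInt] words, allocating at every level.
let threshold = 20 // words, the value mentioned in the thread (must be >= 4 here)

/// base += rhs, shifted up by `offset` words. Assumes the sum fits in `base`.
func add(_ rhs: [UInt], to base: inout [UInt], at offset: Int) {
    var carry = false
    var index = offset
    for word in rhs {
        let (s1, o1) = base[index].addingReportingOverflow(word)
        let (s2, o2) = s1.addingReportingOverflow(carry ? 1 : 0)
        base[index] = s2
        carry = o1 || o2
        index += 1
    }
    while carry {
        let (s, o) = base[index].addingReportingOverflow(1)
        base[index] = s
        carry = o
        index += 1
    }
}

/// base -= rhs. Assumes base >= rhs, so the borrow never leaves `base`.
func subtract(_ rhs: [UInt], from base: inout [UInt]) {
    var borrow = false
    var index = 0
    for word in rhs {
        let (d1, o1) = base[index].subtractingReportingOverflow(word)
        let (d2, o2) = d1.subtractingReportingOverflow(borrow ? 1 : 0)
        base[index] = d2
        borrow = o1 || o2
        index += 1
    }
    while borrow {
        let (d, o) = base[index].subtractingReportingOverflow(1)
        base[index] = d
        borrow = o
        index += 1
    }
}

/// Schoolbook multiplication; the result has a.count + b.count words.
func multiplyLong(_ a: [UInt], _ b: [UInt]) -> [UInt] {
    var result = [UInt](repeating: 0, count: a.count + b.count)
    for (i, x) in a.enumerated() {
        var carry: UInt = 0
        for (j, y) in b.enumerated() {
            // x * y + result[i + j] + carry always fits in two words.
            let product = x.multipliedFullWidth(by: y)
            let (s1, o1) = product.low.addingReportingOverflow(result[i + j])
            let (s2, o2) = s1.addingReportingOverflow(carry)
            result[i + j] = s2
            carry = product.high + (o1 ? 1 : 0) + (o2 ? 1 : 0)
        }
        result[i + b.count] = carry
    }
    return result
}

/// Dispatches on operand size, mirroring the `increment` outline above.
func multiply(_ a: [UInt], _ b: [UInt]) -> [UInt] {
    if a.count < threshold || b.count < threshold {
        return multiplyLong(a, b)
    } else {
        return multiplyKaratsuba(a, b)
    }
}

func multiplyKaratsuba(_ a: [UInt], _ b: [UInt]) -> [UInt] {
    let k = min(a.count, b.count) / 2 // floored split, so high parts are >= k words
    let (a0, a1) = (Array(a[..<k]), Array(a[k...]))
    let (b0, b1) = (Array(b[..<k]), Array(b[k...]))
    let z0 = multiply(a0, b0) // low product, 2k words
    let z2 = multiply(a1, b1) // high product, the remaining words
    var sa = a1 + [0]; add(a0, to: &sa, at: 0) // a0 + a1 with room for a carry
    var sb = b1 + [0]; add(b0, to: &sb, at: 0) // b0 + b1 with room for a carry
    var z1 = multiply(sa, sb) // middle product before correction
    subtract(z0, from: &z1)
    subtract(z2, from: &z1)
    var result = z0 + z2 // z0 and z2 do not overlap, so concatenation places both
    add(z1, to: &result, at: k)
    return result
}
```

Calling `multiply(x, y)` with two operands of 20 or more words takes the Karatsuba path, and anything smaller falls back to `multiplyLong`. The in-place accumulation and the reduced allocations discussed in this thread are deliberately left out of the sketch.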
I'll be doing fewer allocations, so it appears that I can lower the threshold to 20 words. |
I can remove an additional allocation and some additions if I make a decrementing version too. Hm. Dunno whether it's worth it, but it'd be neat insofar as every level of recursion allocating |
Welp. It's possible to add the middle product in-place, but when I tried it I got random results. So I suppose there might be some finicky lifetime thing to think about when doing it. Or maybe the recursion breaks exclusive access. It's not an issue I want to tackle at the moment, however. |
I want to eventually end up with a non-recursive algorithm, but I'll work towards it incrementally. |
Hm. The middle product can reuse the 1st product's allocation. It's at least something 🤷‍♂️
|
I chose to normalize each subsequence because I saw this in [silly string] - [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...] |
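As an illustration of what normalizing a subsequence could look like, here is a minimal sketch that trims the most significant zero words from a slice. It assumes little-endian word order, and `normalized` is a hypothetical helper, not a function from the project.

```swift
/// Returns `words` without its most significant zero words.
/// Assumes little-endian order: the last element is the most significant word.
func normalized(_ words: ArraySlice<UInt>) -> ArraySlice<UInt> {
    var end = words.endIndex
    while end > words.startIndex, words[end - 1] == 0 {
        end -= 1
    }
    return words[words.startIndex ..< end]
}
```

With a helper like this, a half that consists only of zeros normalizes to an empty slice, so the recursion can skip work on it instead of multiplying long runs of zeros.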
I suppose there is still some work to do. Zzz.
|
It's late. I'll fix it tomorrow. Also, I think I can write |
Observation. It works if I floor the partition, not ceil it. Is that a coincidence? Feels like it. Maybe not? Zzz. |
Hm. I'm rather confident that flooring solves it: the high part is then the larger one, and it must be trimmable to fit with a half shift, because otherwise it would not fit with a full shift, which it must because that's just |
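One way to read the fitting argument is through the usual Karatsuba identity. The notation below is assumed rather than taken from the thread: B is the word base, x has n words, y has m words, and k is the split point in words.

```latex
% Assumed notation: B = 2^w is the word base; x has n words, y has m words.
\begin{aligned}
x &= x_1 B^k + x_0, \qquad y = y_1 B^k + y_0, \qquad 0 \le x_0, y_0 < B^k \\
xy &= z_2 B^{2k} + z_1 B^k + z_0 \\
z_0 &= x_0 y_0, \qquad z_2 = x_1 y_1, \qquad z_1 = (x_0 + x_1)(y_0 + y_1) - z_0 - z_2 \\
z_1 B^k &\le xy < B^{n+m} \;\Longrightarrow\; z_1 < B^{n+m-k}
\end{aligned}
```

So once its high zero words are trimmed, the middle product always fits in the n + m - k words above the half shift. This bound holds for any split point, so it only covers the "must be trimmable" part; whether floor or ceil matters in practice depends on the buffer layout, which is not shown here.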
|
I'm adding Karatsuba multiplication to UIntXL (#33).
Computing the 10,000,000th Fibonacci element now takes 2.6 seconds on my M1 MacBook Pro. With long multiplication only, the same calculation takes 34.4 seconds. The threshold for using the Karatsuba algorithm is 40 words. Edit: I have reduced it to 2.3 seconds and 20 words, with some improvements.
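The thread does not say how the Fibonacci benchmark is computed. One common approach where multiplication dominates the running time is fast doubling, sketched here on plain `UInt` purely as an assumption about the shape of such a benchmark; `fibonacciPair` is a hypothetical name, and a real run would use the big integer type instead.

```swift
/// Returns (F(n), F(n + 1)) using the fast-doubling identities
/// F(2j) = F(j) * (2 * F(j + 1) - F(j)) and F(2j + 1) = F(j)^2 + F(j + 1)^2.
/// With 64-bit UInt this only works for n <= 92 before the arithmetic overflows.
func fibonacciPair(_ n: UInt) -> (UInt, UInt) {
    if n == 0 { return (0, 1) }
    let (a, b) = fibonacciPair(n / 2) // a = F(j), b = F(j + 1), j = n / 2
    let c = a * (2 * b - a)           // F(2j)
    let d = a * a + b * b             // F(2j + 1)
    return n % 2 == 0 ? (c, d) : (d, c + d)
}
```

Each level performs a constant number of multiplications on numbers about half the final size, which is why a faster multiplication for large operands shows up so directly in a benchmark of this kind.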