Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Text de/encoding abstraction #86

Closed
oscbyspro opened this issue Sep 22, 2023 · 14 comments
Closed

Text de/encoding abstraction #86

oscbyspro opened this issue Sep 22, 2023 · 14 comments
Labels
addition oh, so shiny!
Milestone

Comments

@oscbyspro
Copy link
Owner

oscbyspro commented Sep 22, 2023

The text conversions of NBKDoubleWidth and NBKFlexibleWidth can almost be abstracted and packaged as a standalone module, perhaps as a formatter. I think that would be neat. I'll see what changes I need to make, I don't think it's much.

@oscbyspro oscbyspro added the addition oh, so shiny! label Sep 22, 2023
@oscbyspro
Copy link
Owner Author

oscbyspro commented Sep 22, 2023

One idea that does not require additional protocol requirements is copying the words to a temporary allocation and then doing whatever from there. Copying the words has some overhead, but I don't know whether it's significant compared to the rest of it (I doubt it). With some conveniences on StrictBinaryInteger (#85) it wouldn't even be that bad to perform arithmetic on an UnsafeMutableBufferPointer. (On the topic of encoding.)

@oscbyspro oscbyspro added the maybe to do, or not to do? label Sep 23, 2023
@oscbyspro oscbyspro removed the maybe to do, or not to do? label Sep 24, 2023
@oscbyspro oscbyspro added this to the v0.13.0 milestone Sep 24, 2023
oscbyspro added a commit that referenced this issue Sep 24, 2023
@oscbyspro
Copy link
Owner Author

oscbyspro commented Sep 24, 2023

Note: if radix literal encoding is important, then I think it should be part of a dedicated hexadecimal (etc) formatter. I think that's cleaner, because otherwise you end up with some silly branch checking the radix or whether the radix matches the literal. It's also non-option for most radices.

@oscbyspro
Copy link
Owner Author

oscbyspro commented Sep 24, 2023

At a glance, I think I can abstract the decoding part (efficiently) if binary integers have an init(words:) method.

@oscbyspro
Copy link
Owner Author

oscbyspro commented Sep 24, 2023

Maybe I want some kind of static maxBitWidth, so I can break too-large-fixed-width decoding early. Or a static bitWidth: Int? to check for fixed-width? Just throwing some ideas around. A single implementation would be nice.

@oscbyspro
Copy link
Owner Author

oscbyspro commented Sep 24, 2023

I can save some multiplications with a staggered approach (more significant for big integers):

while ... {
    .....
    words[endIndex] = SUISS.multiplyFullWidth(&words, by: radix.power, add: word, upTo: endIndex) // new
    words.formIndex(after: &endIndex)
}

Additionally, you don't need to pre-initialize the memory when doing that. You can do it in-loop.

@oscbyspro
Copy link
Owner Author

oscbyspro commented Sep 24, 2023

Hm. If I'm using an init(words:) method, maybe I should try ContigousArray(...) vs withUnsafeTemporaryAllocation(...), since a big integer might just take the array without copying anything (assuming the big integer is array-based).

Edit: withUnsafeTemporaryAllocation(...) much better, for fixed-width at least.

@oscbyspro
Copy link
Owner Author

I see some initial performance hits, but I believe I can fix most of them. Still, it's a huge maintainability win if NBKDoubleWidth, NBKFlexibleWidth and NBKSigned can share one text decoding/encoding implementation.

oscbyspro added a commit that referenced this issue Sep 25, 2023
@oscbyspro
Copy link
Owner Author

oscbyspro commented Sep 26, 2023

I don't think it will end up being too complex, so I might create a private namespace rather than a new module.

@oscbyspro oscbyspro changed the title TextKit Text conversion abstraction Sep 26, 2023
oscbyspro added a commit that referenced this issue Sep 26, 2023
@oscbyspro oscbyspro changed the title Text conversion abstraction Text de/encoding abstraction Sep 26, 2023
@oscbyspro
Copy link
Owner Author

Looking at Instruments, it appears that UInt256.init(words:) accounts for about 5% of UInt256.init(_:radix:).

oscbyspro added a commit that referenced this issue Sep 26, 2023
@oscbyspro
Copy link
Owner Author

oscbyspro commented Sep 27, 2023

It turns out that short circuiting decoding based on length is a bit finicky and can only be done exactly in base 2.


Maybe I'll let it behave as-if I'm converting it to a big integer first (which is basically what I'm doing with pointers).

@oscbyspro
Copy link
Owner Author

oscbyspro commented Sep 27, 2023

Hm. I suppose I can perform perfect radix decoding without a temporary allocation, via a custom lazy collection.

Edit: Hm, maybe not. The subscript would need to be throwing or return optional elements.

@oscbyspro
Copy link
Owner Author

Hm. Most of the decoding performance issue is solved by making the decoding algorithm non-generic and then:

@inlinable static func decode<Magnitude: NBKUnsignedInteger>(digits: NBK.UnsafeUTF8, solution: PerfectRadixSolution<UInt>, as type: Magnitude.Type = Magnitude.self) -> Magnitude? {
    var    magnitude:   Magnitude?
    self.decode(digits: digits, solution: solution) { magnitude = Magnitude(words: $0) }
    return magnitude as Magnitude?
}

@usableFromInline static func decode(digits: NBK.UnsafeUTF8, solution: PerfectRadixSolution<UInt>, perform: (NBK.UnsafeWords) -> Void) -> Void? { 
    ...
}

@oscbyspro
Copy link
Owner Author

Note to self: merge future branches with pull requests so you have something to reference later. Zzz.

@oscbyspro
Copy link
Owner Author

The the encoder works for Signed<Magnitude>, but I need a sign-magnitude-pair decoding method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
addition oh, so shiny!
Projects
None yet
Development

No branches or pull requests

1 participant