Document timestamp encoder precision fix #1623
base: series/0.18
Conversation
```scala
DNumber(epochSecondsWithNanos)
val bd = (BigDecimal(ts.epochSecond) + (BigDecimal(
  ts.nano
) * BigDecimal(10).pow(-9))).underlying.stripTrailingZeros
```
potential issue: this is different from the current behavior in the regular encoders, because it strips all the trailing zeros, rather than setting the scale to 9 at all times. Based on this comment, jsoniter will respect the scale set on the BigDecimal, so stripping all trailing zeros seems like a more general solution, rather than doing that only when nano == 0.

Please let me know what you think; it definitely has to be consistent across both visitors, so I will adjust once this discussion is resolved.
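To make the trade-off concrete, here is a minimal sketch (the values are illustrative) contrasting a fixed scale of 9 with stripping trailing zeros:

```scala
import java.math.{BigDecimal => JBigDecimal}

val epochSecond = 1L
val nano = 500000000 // 500 ms

// Fixed scale of 9: the fractional part always carries nine digits.
val scaled = JBigDecimal.valueOf(epochSecond)
  .add(JBigDecimal.valueOf(nano.toLong, 9))

// Stripping trailing zeros keeps only the significant digits.
val stripped = scaled.stripTrailingZeros

println(scaled.toPlainString)   // 1.500000000
println(stripped.toPlainString) // 1.5
```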
I'm happy with stripping trailing zeros, as long as the adjustment on the jsoniter-based codecs is not too complicated.
Perhaps I missed something, but to me it looks like it's as straightforward as the Document fix.
@Baccata can you take a look?
```scala
} else {
  out.writeVal(BigDecimal(x.epochSecond) + x.nano / 1000000000.0)
}
val bd = BigDecimal(x.epochSecond) + (x.nano * BigDecimal(10).pow(-9))
```
issue: shouldn't it be `BigDecimal(x.epochSecond) + (BigDecimal(x.nano) * BigDecimal(10).pow(-9))`? Otherwise we might lose some information due to overflow before the conversion to BigDecimal happens.
The result of this multiplication is already a `BigDecimal`, so I think it is fine.

I'm happy with the change. Did you want to do anything else on this PR?
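A small check of the point above (a sketch with illustrative values): in Scala the Int operand is implicitly converted to BigDecimal before the multiplication runs, so the product is computed exactly and no Int overflow is possible:

```scala
// Maximal nanosecond value, the worst case for a hypothetical overflow.
val nano = 999999999

// The implicit int2bigDecimal conversion promotes `nano` before `*` runs,
// so both expressions yield the same exact BigDecimal.
val viaImplicit = nano * BigDecimal(10).pow(-9)
val explicit    = BigDecimal(nano) * BigDecimal(10).pow(-9)

println(viaImplicit == explicit) // true
```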
I stored the following script in

```scala
//> using jmh
//> using jmhVersion 1.37
package bench

import java.util.concurrent.TimeUnit
import org.openjdk.jmh.annotations.*

@BenchmarkMode(Array(Mode.Throughput))
@OutputTimeUnit(TimeUnit.SECONDS)
@Warmup(iterations = 2, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 2, time = 1, timeUnit = TimeUnit.SECONDS)
@State(Scope.Thread)
@Fork(1)
class Timestamp {

  @Param(Array("123456789", "123456000", "123000000", "0"))
  var nano: Int = 123456789

  var epochSecond: Long = System.currentTimeMillis() / 1000

  @Benchmark
  def pow() = BigDecimal((BigDecimal(epochSecond) + (nano * BigDecimal(10).pow(-9))).underlying.stripTrailingZeros)

  @Benchmark
  def scale() = BigDecimal(java.math.BigDecimal.valueOf(epochSecond).add(java.math.BigDecimal.valueOf(nano.toLong, 9).stripTrailingZeros))
}
```

On
The version you posted unfortunately breaks the

Below you can find the results for the version that passes all tests; it behaves slightly worse, but still much better than the

```scala
@Benchmark
def scalaBd() = BigDecimal(epochSecond) + BigDecimal(nano * 1 / 1000000000.0).underlying().stripTrailingZeros()
```

measured on JDK 21, AMD Ryzen 7 3700X.

EDIT: unfortunately this one breaks the same test for Scala Native
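For context (a sketch on the JVM, with illustrative values): the `scalaBd` variant computes the fraction in Double arithmetic first, and Scala's `BigDecimal(double)` goes through the double's shortest decimal string form. On the JVM this happens to round-trip cleanly for such values; the Scala Native failure mentioned above suggests the Double path behaves differently there.

```scala
val nano = 123456789

// The fraction is computed in Double arithmetic first...
val asDouble = nano * 1 / 1000000000.0

// ...then converted to BigDecimal via the double's shortest decimal form.
val bd = BigDecimal(asDouble).underlying().stripTrailingZeros()

println(bd.toPlainString) // 0.123456789
```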
The workaround for Scala.js and Scala Native is using a yet more optimized version:

```scala
BigDecimal({
  val es = java.math.BigDecimal.valueOf(epochSecond)
  if (nano == 0) es
  else es.add(java.math.BigDecimal.valueOf(nano, 9).stripTrailingZeros)
})
```

Here are the results of the benchmarks:
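The core primitive in the workaround above is `java.math.BigDecimal.valueOf(unscaledVal, scale)`, which produces unscaledVal × 10^(-scale) without any `pow` call, so a nanosecond count maps to its fractional-second value directly (a sketch with illustrative values):

```scala
import java.math.{BigDecimal => JBigDecimal}

// valueOf(unscaledVal, scale) = unscaledVal * 10^(-scale)
val fraction = JBigDecimal.valueOf(123456789L, 9)
println(fraction.toPlainString) // 0.123456789

// Combined with the epoch seconds, trailing zeros stripped:
val encoded = JBigDecimal.valueOf(1L)
  .add(JBigDecimal.valueOf(123000000L, 9).stripTrailingZeros)
println(encoded.toPlainString) // 1.123
```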
@plokhotnyuk thanks for the help, all green now
@msosnicki I've provided a serialization method for timestamps in the 2.32.0 release of jsoniter-scala, so after a version upgrade you can serialize timestamps without any allocations or redundant CPU overhead, like this:

```scala
def encodeValue(x: Timestamp, out: JsonWriter): Unit =
  out.writeTimestampVal(x.epochSecond, x.nano)
```

Also,

```scala
BigDecimal({
  val es = java.math.BigDecimal.valueOf(epochSecond)
  if (nano == 0) es
  else es.add(java.math.BigDecimal.valueOf({
    if (epochSecond < 0) -nano
    else nano
  }.toLong, 9).stripTrailingZeros)
})
```
@msosnicki Please read my previous comment, which I updated recently, where I propose a more correct and efficient solution.
Will add a comment about the new jsoniter method. Question about the negative nano from your example: where is it documented? The same value before 1970 could be expressed in two ways: by flipping the sign of nano like you did, or by adding to a negative
@msosnicki @Baccata I've found only the following documentation about the timestamp serialization methods: https://smithy.io/2.0/spec/protocol-traits.html#timestampformat-trait

It requires truncation to milliseconds and does not mention any support for dates before the Unix epoch (Jan 1, 1970). Also, some method implementations of

```scala
def epochMilli: Long = epochSecond * 1000 + {
  if (epochSecond < 0) -nano
  else nano
} / 1000000
```
Yeah, I think the logic needs to change in multiple places to correctly handle values below the epoch. Will look into that.

EDIT: only changed the decoding side; nanos are expected to always be positive (as in
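For reference (a sketch): `java.time.Instant` itself keeps nano in [0, 999999999] even for pre-epoch values, rolling the second downward instead of using a negative nano:

```scala
import java.time.Instant

// One nanosecond before the epoch: Instant normalizes the negative
// adjustment into epochSecond = -1 and a positive nano.
val justBeforeEpoch = Instant.ofEpochSecond(0, -1)

println(justBeforeEpoch.getEpochSecond) // -1
println(justBeforeEpoch.getNano)        // 999999999
```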
PR Checklist (not all items are relevant to all PRs)
fixes #1622