Replace protobuf-java(lite) with pure Kotlin implementations on JVM #148

garyp · 2021-04-25T23:51:50Z

This avoids compatibility issues when applications have transitive dependencies on different versions of protobuf-java(lite) via pbandk and some other library (e.g. Firebase). Since we no longer depend on protobuf-java(lite), we now also bundle the well-known types proto files ourselves. Applications using the Protobuf Gradle Plugin expect these proto files to be available in pbandk (or one of its dependencies) in order to run protoc-gen-kotlin.

Additional changes included in this PR:

Allow reading multiple messages from an InputStream

Since protobuf messages are not self-delimiting, by default decodeFromStream() reads until the end of the stream and tries to decode every byte it reads as part of a single message. Applications will often prefix a message with its length when writing multiple messages to a single output stream. When consuming such a stream, the application can read the length first and then pass it to decodeFromStream() to make sure only that many bytes are read from the stream. This PR also modifies encodeToStream() to return the number of bytes that were written to the stream.
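
A minimal sketch of the length-prefix pattern described above, using only java.io so that it runs without pbandk. The 4-byte big-endian prefix and the helper names writeFrame/readFrame are assumptions made for illustration, not pbandk API. The payload is treated as opaque bytes; a real caller would encode a message on the write side and, on the read side, pass the length to decodeFromStream() instead of reading the bytes itself.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.DataInputStream
import java.io.DataOutputStream
import java.io.EOFException
import java.io.InputStream
import java.io.OutputStream

// Hypothetical helper: writes a 4-byte big-endian length followed by the payload.
// Returns the total number of bytes written (prefix plus payload).
fun writeFrame(out: OutputStream, payload: ByteArray): Int {
    val data = DataOutputStream(out)
    data.writeInt(payload.size)
    data.write(payload)
    data.flush()
    return Int.SIZE_BYTES + payload.size
}

// Hypothetical helper: reads one frame, or returns null if the stream ends
// before a full length prefix could be read.
fun readFrame(input: InputStream): ByteArray? {
    val data = DataInputStream(input)
    val size = try {
        data.readInt()
    } catch (e: EOFException) {
        return null
    }
    // With pbandk, this is the point where `size` would be handed to
    // decodeFromStream() so that only that many bytes are consumed.
    val payload = ByteArray(size)
    data.readFully(payload)
    return payload
}

fun main() {
    val buffer = ByteArrayOutputStream()
    writeFrame(buffer, "first".toByteArray(Charsets.UTF_8))
    writeFrame(buffer, "second message".toByteArray(Charsets.UTF_8))

    val input = ByteArrayInputStream(buffer.toByteArray())
    while (true) {
        val frame = readFrame(input) ?: break
        println("${frame.size} bytes: ${String(frame, Charsets.UTF_8)}")
    }
}
```

A fixed-width prefix keeps the sketch short; a varint prefix would work the same way, as long as the writer and reader agree on the format.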

Allow running JVM conformance tests with different I/O implementations

By setting the PBANDK_CONFORMANCE_JVM_IO environment variable to either BYTE_BUFFER or STREAM, the conformance tests will encode/decode using ByteBuffer or InputStream/OutputStream, respectively, on the JVM instead of the default ByteArray path. This is handy as a quick test that those I/O paths work correctly. It's also handy as a rough benchmark, since the conformance test involves a fair amount of protocol buffer encoding/decoding.
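
As a rough illustration of how such a switch could be read, the sketch below parses the variable into an enum and falls back to the ByteArray path when it is unset. The JvmIo enum and parseJvmIo function are hypothetical names, not the actual conformance test code; the value names mirror the ones used in the hyperfine commands in this PR.

```kotlin
// Hypothetical enum; value names match those passed to PBANDK_CONFORMANCE_JVM_IO
// in the benchmark commands.
enum class JvmIo { BYTE_ARRAY, BYTE_BUFFER, STREAM }

// Parses the raw variable value. An unset or blank value selects BYTE_ARRAY
// (the default path); an unknown value makes valueOf() throw
// IllegalArgumentException.
fun parseJvmIo(raw: String?): JvmIo =
    if (raw.isNullOrBlank()) JvmIo.BYTE_ARRAY
    else JvmIo.valueOf(raw.trim().uppercase())

fun main() {
    val mode = parseJvmIo(System.getenv("PBANDK_CONFORMANCE_JVM_IO"))
    // A real test harness would dispatch to the matching encode/decode path here.
    println("Conformance I/O mode: $mode")
}
```

Failing loudly on an unrecognized value is a deliberate choice in this sketch: a mistyped mode would otherwise silently benchmark the default path.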

I did some very rough benchmarks of this change by running the conformance tests using both the old protobuf-java and the new pure-Kotlin implementations. In these runs, the pure-Kotlin implementation is roughly 70% slower than the protobuf-java implementation when encoding/decoding using ByteArrays or ByteBuffers (about 2.4 s vs. 1.4 s mean run time). The pure-Kotlin implementation is comparable in speed (might even be slightly faster) when encoding/decoding using InputStream/OutputStream. This isn't completely surprising, since the protobuf-java library has specialized implementations that use sun.misc.Unsafe for faster access to the byte array when sun.misc.Unsafe is available, whereas our pure-Kotlin implementation only uses the official ByteArray APIs.

This benchmark was run under the OpenJDK JVM on macOS. Results may differ on Android (since the ART runtime is very different from a typical desktop JVM, and the protobuf-javalite library is also implemented differently from protobuf-java), but I don't have an easy way to run the benchmarks on an Android device. The conformance test runner communicates with pbandk using protocol buffers sent over stdin/stdout. I modified the pbandk conformance test code to allow choosing whether to perform that stdin/stdout communication using ByteArray, ByteBuffer, or InputStream/OutputStream.

These are the results with the previous protobuf-java pbandk implementation:

» hyperfine -w 3 -m 100 -L jvm_io BYTE_ARRAY,BYTE_BUFFER,STREAM 'env PBANDK_CONFORMANCE_JVM_IO={jvm_io} ./conformance/test-conformance.sh jvm'
Benchmark #1: env PBANDK_CONFORMANCE_JVM_IO=BYTE_ARRAY ./conformance/test-conformance.sh jvm
  Time (mean ± σ):      1.405 s ±  0.022 s    [User: 240.9 ms, System: 42.4 ms]
  Range (min … max):    1.361 s …  1.522 s    100 runs

Benchmark #2: env PBANDK_CONFORMANCE_JVM_IO=BYTE_BUFFER ./conformance/test-conformance.sh jvm
  Time (mean ± σ):      1.408 s ±  0.026 s    [User: 240.8 ms, System: 42.5 ms]
  Range (min … max):    1.351 s …  1.541 s    100 runs

Benchmark #3: env PBANDK_CONFORMANCE_JVM_IO=STREAM ./conformance/test-conformance.sh jvm
  Time (mean ± σ):      1.423 s ±  0.066 s    [User: 246.0 ms, System: 45.0 ms]
  Range (min … max):    1.332 s …  1.602 s    100 runs

Summary
  'env PBANDK_CONFORMANCE_JVM_IO=BYTE_ARRAY ./conformance/test-conformance.sh jvm' ran
    1.00 ± 0.02 times faster than 'env PBANDK_CONFORMANCE_JVM_IO=BYTE_BUFFER ./conformance/test-conformance.sh jvm'
    1.01 ± 0.05 times faster than 'env PBANDK_CONFORMANCE_JVM_IO=STREAM ./conformance/test-conformance.sh jvm'

and these are the results with the new pure-Kotlin pbandk implementation:

» hyperfine -w 3 -m 100 -L jvm_io BYTE_ARRAY,BYTE_BUFFER,STREAM 'env PBANDK_CONFORMANCE_JVM_IO={jvm_io} ./conformance/test-conformance.sh jvm'
Benchmark #1: env PBANDK_CONFORMANCE_JVM_IO=BYTE_ARRAY ./conformance/test-conformance.sh jvm
  Time (mean ± σ):      2.412 s ±  0.040 s    [User: 2.604 s, System: 0.332 s]
  Range (min … max):    2.345 s …  2.554 s    100 runs

Benchmark #2: env PBANDK_CONFORMANCE_JVM_IO=BYTE_BUFFER ./conformance/test-conformance.sh jvm
  Time (mean ± σ):      2.456 s ±  0.032 s    [User: 2.751 s, System: 0.387 s]
  Range (min … max):    2.365 s …  2.539 s    100 runs

Benchmark #3: env PBANDK_CONFORMANCE_JVM_IO=STREAM ./conformance/test-conformance.sh jvm
  Time (mean ± σ):      1.316 s ±  0.018 s    [User: 243.4 ms, System: 44.6 ms]
  Range (min … max):    1.286 s …  1.392 s    100 runs

Summary
  'env PBANDK_CONFORMANCE_JVM_IO=STREAM ./conformance/test-conformance.sh jvm' ran
    1.83 ± 0.04 times faster than 'env PBANDK_CONFORMANCE_JVM_IO=BYTE_ARRAY ./conformance/test-conformance.sh jvm'
    1.87 ± 0.04 times faster than 'env PBANDK_CONFORMANCE_JVM_IO=BYTE_BUFFER ./conformance/test-conformance.sh jvm'
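
For reference, the relative change between the two runs can be worked out from the reported means. This is only a small arithmetic sketch; the function name percentSlower is made up for illustration, and the inputs are the mean times copied from the hyperfine output above.

```kotlin
// Relative change of the new time versus the old one, in percent.
// Positive means the new implementation is slower.
fun percentSlower(beforeSeconds: Double, afterSeconds: Double): Double =
    (afterSeconds / beforeSeconds - 1.0) * 100.0

fun main() {
    // (mode, protobuf-java mean, pure-Kotlin mean), in seconds.
    val means = listOf(
        Triple("BYTE_ARRAY", 1.405, 2.412),
        Triple("BYTE_BUFFER", 1.408, 2.456),
        Triple("STREAM", 1.423, 1.316)
    )
    for ((mode, before, after) in means) {
        println("%-11s %+.1f%%".format(mode, percentSlower(before, after)))
    }
}
```

With these inputs the ByteArray and ByteBuffer paths come out a little over 70% slower and the stream path several percent faster. The figures are whole-run wall-clock times for the test script, so they likely include fixed costs such as JVM startup and should be read as a rough guide only.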

seanadkinson

LGTM 👍 lmk if I should review the files with the license at the top, since those are meaty.

conformance/lib/src/nativeMain/kotlin/pbandk/conformance/Platform.kt

runtime/src/commonJvmAndroid/kotlin/pbandk/internal/binary/InputStreamWireReader.kt

JeroenMols

Looks great! I must admit that my knowledge of the project is still too limited to understand all the changes in depth, but I tried to leave some meaningful comments.

Thanks also for the clear performance data! Honestly, I'm not too worried about that, because:

- protobuf is most likely used in conjunction with some kind of network request, so we should ensure "serialization/deserialization time" <<< "network request time" instead of focussing on absolute times
- time per serialization/deserialization is still low: if I understand the data correctly, the entire conformance suite runs in under 2 sec? How many tests are there in the suite? Assuming it's 1000, we are looking at ~2 ms per test, which is below the frame rendering time at 60 fps.
- there is a way to optimize this further (using streams), which we can explain in the readme, OR we can ensure that the default examples use the fast method (e.g. provide a Retrofit converter based on streams)
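
The per-test estimate in the second point can be checked with a few lines of arithmetic. The test count of 1000 is the reviewer's guess rather than a measured number, and the helper name perTestMillis is hypothetical; the total time is the slowest mean from the pure-Kotlin benchmark run in the PR description.

```kotlin
// Average time per test in milliseconds, given a total run time in seconds.
fun perTestMillis(totalSeconds: Double, testCount: Int): Double =
    totalSeconds * 1000.0 / testCount

fun main() {
    val assumedTestCount = 1000              // reviewer's assumption, not measured
    val totalSeconds = 2.456                 // slowest pure-Kotlin mean (BYTE_BUFFER)
    val frameBudgetMillis = 1000.0 / 60.0    // time available per frame at 60 fps

    val perTest = perTestMillis(totalSeconds, assumedTestCount)
    println("per test: $perTest ms, frame budget: $frameBudgetMillis ms")
    println("fits in one frame: ${perTest < frameBudgetMillis}")
}
```

Under that assumption, even the slowest measured run stays well inside a single 60 fps frame per test, which supports the point being made.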

protoc-gen-kotlin/lib/build.gradle.kts

runtime/build.gradle.kts

runtime/src/commonJvmAndroid/kotlin/pbandk/internal/binary/BinaryMessageDecoderJvm.kt

runtime/src/commonMain/kotlin/pbandk/InvalidProtocolBufferException.kt

The conformance test communicates with the conformance test runner using protocol buffer messages over stdin/stdout. By default it uses `encodeToByteArray()` and `decodeFromByteArray()` to encode/decode the messages on all platforms. By setting the `PBANDK_CONFORMANCE_JVM_IO` environment variable to either `BYTE_BUFFER` or `STREAM`, the conformance tests will instead encode/decode using `ByteBuffer` or `InputStream`/`OutputStream`, respectively, on the JVM. This is handy as a quick test that those I/O paths work correctly. It's also handy as a rough benchmark, since the conformance test involves a fair amount of protocol buffer encoding/decoding.

This avoids compatibility issues when applications have transitive dependencies on different versions of protobuf-java(lite) via pbandk and some other library (e.g. Firebase). Also modify the `encodeToStream()` method to return the number of bytes that were written to the stream. This can be useful information for the caller.

Since we no longer depend on protobuf-java(lite), we now have to bundle the well-known types proto files ourselves. Applications using the Protobuf Gradle Plugin expect these proto files to be available in pbandk (or one of its dependencies) in order to run `protoc-gen-kotlin`.

…d proto files

Now that we're bundling the well-known type proto files, we no longer need to read them from the copy of protobuf installed on the build system. This update also pulled in a newer version of `descriptor.proto` with an added field.

garyp requested a review from JeroenMols April 26, 2021 00:14

seanadkinson approved these changes Apr 26, 2021

JeroenMols approved these changes Apr 26, 2021

garyp force-pushed the rm-protobuf-java branch from e871552 to f8d6e24 Compare April 26, 2021 17:24

garyp added 5 commits April 28, 2021 17:17

garyp force-pushed the rm-protobuf-java branch from f8d6e24 to 8bfd99b Compare April 29, 2021 00:47

garyp marked this pull request as ready for review April 29, 2021 00:47

garyp merged commit af13914 into master Apr 29, 2021

garyp deleted the rm-protobuf-java branch April 29, 2021 15:47

garyp mentioned this pull request Apr 29, 2021

PBandK 0.10.0.beta3 incompatible with Firebase Performance monitoring 19.0.10 or lower #138

Closed

garyp linked an issue Apr 29, 2021 that may be closed by this pull request

PBandK 0.10.0.beta3 incompatible with Firebase Performance monitoring 19.0.10 or lower #138

Closed
