
Metadata vs. OneOf: which is better? #96

Closed
huan opened this issue Oct 24, 2020 · 4 comments
Labels: enhancement New feature or request · proto Protocol Buffer


huan commented Oct 24, 2020

After we found that @chatie/grpc < 0.18 sends big (e.g. 100 MB) files directly inside the gRPC request, which blocks the Node.js event loop (wechaty/puppet-service#80), we decided to use a gRPC stream service to solve this problem. (#90)

However, a gRPC streaming service can only accept a single message type, which forces us to find a way to pass additional parameters when we are using a stream request.

There are two options for that:

  1. Use metadata with the gRPC call. It is clean and simple; however, it does not support static typing. (see Let's remove the extra info inside stream message #94)
  2. Use oneof in Protocol Buffers. It supports static typing; however, it makes the stream more complicated. (see GRPC Stream tools for FileBox windmemory/repro-grpc-bug#1)

To be decided.
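For illustration, a minimal plain-TypeScript sketch of what each option implies for typing (hypothetical shapes, not the generated protobuf classes): with metadata, the extra arguments travel as untyped string pairs beside the call, while oneof turns each stream message into a discriminated union the compiler can check exhaustively.

```typescript
// Option 1: metadata. Extra args ride beside the call as untyped string
// pairs; every stream message then only carries data.
// (GrpcMetadata is a hypothetical stand-in: gRPC metadata is essentially
// a string-to-string map.)
type GrpcMetadata = Record<string, string>

function describeMetadata (md: GrpcMetadata): string {
  // Nothing stops a typo'd key or a missing value at compile time.
  return `conversationId: ${md['conversationId'] ?? '(missing)'}`
}

// Option 2: oneof. Each stream message is a discriminated union that the
// compiler checks exhaustively (mirrors `oneof { string name; bytes data; }`).
type FileBoxChunk =
  | { kind: 'name'; name: string }
  | { kind: 'data'; data: Uint8Array }

function describeChunk (chunk: FileBoxChunk): string {
  switch (chunk.kind) {
    case 'name': return `name: ${chunk.name}`
    case 'data': return `data: ${chunk.data.length} bytes`
  }
}
```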

Appendix

  1. Backpressuring in Streams
huan added the proto Protocol Buffer and enhancement New feature or request labels on Oct 24, 2020

lhr0909 commented Oct 25, 2020

Does it make sense to follow a protocol similar to S3 multipart upload requests? Would it be easier than streams?

https://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html

That said, I think it is okay to keep the metadata in each stream message (no need for oneof), since it is just a small price to pay to make the processing easy.

windmemory (Member) commented

I personally think there is no clearly better choice; the two designs have their own pros and cons:

Use metadata

  • Pros: more convenient to use and implement; less code
  • Cons: loses the typing constraint; if anyone makes a mistake in the code, it can only be found at runtime

Use oneof of the protobuf

  • Pros: better typing constraint (static typing)
  • Cons: a hidden protocol (the first payload has to carry the arguments other than the FileBox, then the FileBox name, and finally the data stream), and more complicated code

So in my opinion, this is a choice between dynamic typing and static typing, and I am okay with both designs.
But from the Wechaty community's perspective, I think the oneof design would be better, since the whole code style of Wechaty leans toward stronger constraints.
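The "hidden protocol" drawback can be illustrated with a tiny parser over plain objects (hypothetical stand-ins for the generated protobuf messages): the required ordering (arguments first, then the file name, then data chunks) is invisible to the type system, so it must be enforced at runtime.

```typescript
// Hypothetical stand-ins for the generated protobuf messages.
type StreamMsg =
  | { conversationId: string }
  | { name: string }
  | { data: Uint8Array }

interface ParsedArgs { conversationId: string; name: string; chunks: Uint8Array[] }

// The ordering (args, then name, then data chunks) is only a convention:
// nothing in the types forbids sending messages out of order, so the
// receiver has to validate the sequence by hand.
function parse (msgs: StreamMsg[]): ParsedArgs {
  const [first, second, ...rest] = msgs
  if (!first || !('conversationId' in first)) throw new Error('expected conversationId first')
  if (!second || !('name' in second)) throw new Error('expected file name second')
  const chunks: Uint8Array[] = []
  for (const m of rest) {
    if (!('data' in m)) throw new Error('expected only data after the header messages')
    chunks.push(m.data)
  }
  return { conversationId: first.conversationId, name: second.name, chunks }
}
```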


huan commented Oct 25, 2020

TypeScript code for using oneof with Protocol Buffers

1. Decode FileBox from ProtoBuf

const decoder = () => new TypedTransform<FileBoxChunk, any>({
  transform: (chunk: FileBoxChunk, _: any, callback: any) => {
    if (!chunk.hasData()) {
      return callback(new Error('no data'))
    }
    const data = chunk.getData()
    callback(null, data)
  },
  objectMode: true,
})

async function toFileBox (
  stream: Readable<FileBoxChunk>,
): Promise<FileBox> {
  const chunk = await firstData(stream)
  if (!chunk.hasName()) {
    throw new Error('no name')
  }
  const fileName = chunk.getName()
  const fileStream = stream.pipe(decoder())

  const fileBox = FileBox.fromStream(fileStream, fileName)

  return fileBox
}
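The `firstData` helper used above is not shown in the comment; assuming its job is to resolve with the first object read from an objectMode stream while leaving the rest available to a later `pipe`, a stdlib sketch could look like this:

```typescript
import { Readable } from 'stream'

// Sketch of a `firstData` helper (assumed behavior, not the original source):
// resolve with the first object emitted by an objectMode Readable, then pause
// the stream so remaining chunks stay buffered until a consumer pipes/resumes.
async function firstData<T> (stream: Readable): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const onData = (chunk: T) => {
      stream.pause()  // stop flowing; a later pipe() will resume and get the rest
      cleanup()
      resolve(chunk)
    }
    const onEnd = () => { cleanup(); reject(new Error('stream ended before any data')) }
    const onError = (err: Error) => { cleanup(); reject(err) }
    const cleanup = () => {
      stream.off('data', onData)
      stream.off('end', onEnd)
      stream.off('error', onError)
    }
    stream.on('data', onData)
    stream.on('end', onEnd)
    stream.on('error', onError)
  })
}
```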

2. Encode FileBox to ProtoBuf

const encoder = () => new TypedTransform<any, FileBoxChunk> ({
  transform: (chunk: any, _: any, callback: any) => {
    const fileBoxChunk = new FileBoxChunk()
    fileBoxChunk.setData(chunk)
    callback(null, fileBoxChunk)
  },
  objectMode: true,
})

async function toFileBoxChunk (
  fileBox: FileBox,
): Promise<Readable<FileBoxChunk>> {
  const stream = new PassThrough({ objectMode: true })

  const chunk = new FileBoxChunk()
  chunk.setName(fileBox.name)

  // FIXME: Huan(202010) write might return false
  // See: Backpressuring in Streams - https://nodejs.org/en/docs/guides/backpressuring-in-streams/

  stream.write(chunk)

  fileBox
    .pipe(encoder())
    .pipe(stream)

  return stream
}
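On the FIXME above: `write()` returning `false` means the stream's internal buffer has reached its high-water mark, and the producer should wait for the `'drain'` event before writing again (see the Backpressuring in Streams guide linked in the appendix). A stdlib sketch of that pattern, using a hypothetical helper name:

```typescript
import { Writable } from 'stream'

// Write a chunk and, if the internal buffer is full (write() returns false),
// wait for the 'drain' event before resolving, so callers can `await` safely
// instead of ignoring the return value.
async function writeWithBackpressure (
  stream: Writable,
  chunk: unknown,
): Promise<void> {
  if (stream.write(chunk)) return
  await new Promise<void>((resolve, reject) => {
    const onDrain = () => { stream.off('error', onError); resolve() }
    const onError = (err: Error) => { stream.off('drain', onDrain); reject(err) }
    stream.once('drain', onDrain)
    stream.once('error', onError)
  })
}
```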

3. Decode MessageSendFile Args from ProtoBuf

const decoder = () => new TypedTransform<
  MessageSendFileStreamRequest,
  FileBoxChunk
>({
  transform: (chunk: MessageSendFileStreamRequest, _: any, callback: any) => {
    if (!chunk.hasFileBoxChunk()) {
      return callback(new Error('no file box chunk'))
    }
    const fileBoxChunk = chunk.getFileBoxChunk()
    callback(null, fileBoxChunk)
  },
  objectMode: true,
})

async function toMessageSendFileStreamRequestArgs (
  stream: Readable<MessageSendFileStreamRequest>
): Promise<MessageSendFileStreamRequestArgs> {
  const chunk = await firstData(stream)
  if (!chunk.hasConversationId()) {
    throw new Error('no conversation id')
  }
  const conversationId = chunk.getConversationId()

  const fileBoxChunkStream = stream.pipe(decoder())
  const fileBox = await toFileBox(fileBoxChunkStream)

  return {
    conversationId,
    fileBox,
  }
}

4. Encode MessageSendFile Args to ProtoBuf

const encoder = () => new TypedTransform<
  FileBoxChunk,
  MessageSendFileStreamRequest
> ({
  transform: (chunk: FileBoxChunk, _: any, callback: any) => {
    const req = new MessageSendFileStreamRequest()
    req.setFileBoxChunk(chunk)
    callback(null, req)
  },
  objectMode: true,
})

async function toMessageSendFileStreamRequest (
  conversationId: string,
  fileBox: FileBox,
): Promise<Readable<MessageSendFileStreamRequest>> {
  const stream = new PassThrough({ objectMode: true })

  const req1 = new MessageSendFileStreamRequest()
  req1.setConversationId(conversationId)
  stream.write(req1)

  const fileBoxChunkStream = await toFileBoxChunk(fileBox)
  fileBoxChunkStream
    .pipe(encoder())
    .pipe(stream)

  return stream
}
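The header-first pattern shared by steps 2 and 4 (write the argument message into a PassThrough, then pipe the encoded body into the same stream) can be exercised end to end with stdlib streams; the shapes below are plain-object stand-ins for the generated protobuf classes:

```typescript
import { PassThrough, Transform } from 'stream'

// Plain-object stand-in for MessageSendFileStreamRequest (hypothetical shape).
type Req = { conversationId: string } | { data: string }

// Wrap every incoming body chunk into a request message, mirroring encoder().
const wrap = () => new Transform({
  objectMode: true,
  transform (chunk: string, _enc, callback) {
    callback(null, { data: chunk } as Req)
  },
})

// Header-first stream: write the args message, then pipe the body through,
// as toMessageSendFileStreamRequest() does above.
function toRequestStream (conversationId: string, body: PassThrough): PassThrough {
  const out = new PassThrough({ objectMode: true })
  out.write({ conversationId } as Req)  // message 1: the call arguments
  body.pipe(wrap()).pipe(out)           // messages 2..n: the data chunks
  return out
}
```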

Source code:

  1. https://github.com/windmemory/repro-grpc-bug/blob/cefbb4f88846c25644756f55fd55aa830d747dcf/src/grpc-stream-parser/to-file-box.ts
  2. https://github.com/windmemory/repro-grpc-bug/blob/cefbb4f88846c25644756f55fd55aa830d747dcf/src/grpc-stream-parser/message-send-file-stream-request.ts

Unit Tests:

  1. https://github.com/windmemory/repro-grpc-bug/blob/cefbb4f88846c25644756f55fd55aa830d747dcf/src/grpc-stream-parser/to-file-box.spec.ts
  2. https://github.com/windmemory/repro-grpc-bug/blob/cefbb4f88846c25644756f55fd55aa830d747dcf/src/grpc-stream-parser/message-send-file-stream-request.spec.ts


huan commented Oct 26, 2020

@lhr0909 @windmemory Thank you very much for sharing your great opinions on this important topic!

Finally, we will go with the oneof approach for our gRPC streaming service, because it provides strong typing in all languages, including Python, Go, Java, PHP, C#, etc.

Cheers!
