
Metadata vs. OneOf: which is better? #96

Closed
huan opened this issue Oct 24, 2020 · 4 comments
Labels: enhancement New feature or request · proto Protocol Buffer


huan commented Oct 24, 2020

After we found that @chatie/grpc < 0.18 sends big (e.g. 100 MB) files directly inside the gRPC request, which blocks the Node.js event loop (wechaty/puppet-service#80), we decided to use a gRPC stream service to solve this problem. (#90)

However, a gRPC streaming service can only accept a single message type, which forces us to find a way to pass additional parameters when we are using a stream request.

There are two options for that:

  1. Use metadata with the gRPC call. It is clean and simple; however, it does not support static typing. (see Let's remove the extra info inside stream message #94)
  2. Use oneof in Protocol Buffers. It supports static typing; however, it makes the stream more complicated. (see GRPC Stream tools for FileBox windmemory/repro-grpc-bug#1)

To be decided.
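For illustration, a minimal plain-TypeScript sketch of what each option implies for typing (hypothetical shapes, not the generated protobuf classes): with metadata, the extra arguments travel as untyped string pairs beside the call, while oneof turns each stream message into a discriminated union the compiler can check exhaustively.

```typescript
// Option 1: metadata. Extra args ride beside the call as untyped string
// pairs; every stream message then only carries data.
// (GrpcMetadata is a hypothetical stand-in: gRPC metadata is essentially
// a string-to-string map.)
type GrpcMetadata = Record<string, string>

function describeMetadata (md: GrpcMetadata): string {
  // Nothing stops a typo'd key or a missing value at compile time.
  return `conversationId: ${md['conversationId'] ?? '(missing)'}`
}

// Option 2: oneof. Each stream message is a discriminated union that the
// compiler checks exhaustively (mirrors `oneof { string name; bytes data; }`).
type FileBoxChunk =
  | { kind: 'name'; name: string }
  | { kind: 'data'; data: Uint8Array }

function describeChunk (chunk: FileBoxChunk): string {
  switch (chunk.kind) {
    case 'name': return `name: ${chunk.name}`
    case 'data': return `data: ${chunk.data.length} bytes`
  }
}
```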

Appendix

  1. Backpressuring in Streams
huan added the proto Protocol Buffer and enhancement New feature or request labels on Oct 24, 2020

lhr0909 commented Oct 25, 2020

Does it make sense to follow a protocol similar to S3 multipart upload requests? Would it be easier than streams?

https://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html

That said, I think it is okay to keep the metadata in each stream message (no need for oneof), since it is just a small price to pay to make the processing easy.

windmemory (Member) commented

I personally think there is no clearly better choice; the two designs have their own pros and cons:

Use metadata

  • Pros: more convenient to use and implement; less code
  • Cons: loses the typing constraint; if anyone makes a mistake in the code, it can only be found at runtime

Use oneof of the protobuf

  • Pros: better typing constraint (static typing)
  • Cons: a hidden protocol (the first payload has to carry the arguments other than the FileBox, then the FileBox name, and finally the data stream), and more complicated code

So in my opinion, this is a choice between dynamic typing and static typing, and I am okay with both designs.
But from the Wechaty community's perspective, I think the oneof design would be better, since the whole code style of Wechaty leans toward stronger constraints.
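The "hidden protocol" drawback can be illustrated with a tiny parser over plain objects (hypothetical stand-ins for the generated protobuf messages): the required ordering (arguments first, then the file name, then data chunks) is invisible to the type system, so it must be enforced at runtime.

```typescript
// Hypothetical stand-ins for the generated protobuf messages.
type StreamMsg =
  | { conversationId: string }
  | { name: string }
  | { data: Uint8Array }

interface ParsedArgs { conversationId: string; name: string; chunks: Uint8Array[] }

// The ordering (args, then name, then data chunks) is only a convention:
// nothing in the types forbids sending messages out of order, so the
// receiver has to validate the sequence by hand.
function parse (msgs: StreamMsg[]): ParsedArgs {
  const [first, second, ...rest] = msgs
  if (!first || !('conversationId' in first)) throw new Error('expected conversationId first')
  if (!second || !('name' in second)) throw new Error('expected file name second')
  const chunks: Uint8Array[] = []
  for (const m of rest) {
    if (!('data' in m)) throw new Error('expected only data after the header messages')
    chunks.push(m.data)
  }
  return { conversationId: first.conversationId, name: second.name, chunks }
}
```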


huan commented Oct 25, 2020

TypeScript code for using oneof with Protocol Buffers

1. Decode FileBox from ProtoBuf

const decoder = () => new TypedTransform<FileBoxChunk, any>({
  transform: (chunk: FileBoxChunk, _: any, callback: any) => {
    if (!chunk.hasData()) {
      return callback(new Error('no data'))
    }
    const data = chunk.getData()
    callback(null, data)
  },
  objectMode: true,
})

async function toFileBox (
  stream: Readable<FileBoxChunk>,
): Promise<FileBox> {
  const chunk = await firstData(stream)
  if (!chunk.hasName()) {
    throw new Error('no name')
  }
  const fileName = chunk.getName()
  const fileStream = stream.pipe(decoder())

  const fileBox = FileBox.fromStream(fileStream, fileName)

  return fileBox
}
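The `firstData` helper used above is not shown in the comment; assuming its job is to resolve with the first object read from an objectMode stream while leaving the rest available to a later `pipe`, a stdlib sketch could look like this:

```typescript
import { Readable } from 'stream'

// Sketch of a `firstData` helper (assumed behavior, not the original source):
// resolve with the first object emitted by an objectMode Readable, then pause
// the stream so remaining chunks stay buffered until a consumer pipes/resumes.
async function firstData<T> (stream: Readable): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const onData = (chunk: T) => {
      stream.pause()  // stop flowing; a later pipe() will resume and get the rest
      cleanup()
      resolve(chunk)
    }
    const onEnd = () => { cleanup(); reject(new Error('stream ended before any data')) }
    const onError = (err: Error) => { cleanup(); reject(err) }
    const cleanup = () => {
      stream.off('data', onData)
      stream.off('end', onEnd)
      stream.off('error', onError)
    }
    stream.on('data', onData)
    stream.on('end', onEnd)
    stream.on('error', onError)
  })
}
```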

2. Encode FileBox to ProtoBuf

const encoder = () => new TypedTransform<any, FileBoxChunk> ({
  transform: (chunk: any, _: any, callback: any) => {
    const fileBoxChunk = new FileBoxChunk()
    fileBoxChunk.setData(chunk)
    callback(null, fileBoxChunk)
  },
  objectMode: true,
})

async function toFileBoxChunk (
  fileBox: FileBox,
): Promise<Readable<FileBoxChunk>> {
  const stream = new PassThrough({ objectMode: true })

  const chunk = new FileBoxChunk()
  chunk.setName(fileBox.name)

  // FIXME: Huan(202010) write might return false
  // See: Backpressuring in Streams - https://nodejs.org/en/docs/guides/backpressuring-in-streams/

  stream.write(chunk)

  fileBox
    .pipe(encoder())
    .pipe(stream)

  return stream
}
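On the FIXME above: `write()` returning `false` means the stream's internal buffer has reached its high-water mark, and the producer should wait for the `'drain'` event before writing again (see the Backpressuring in Streams guide linked in the appendix). A stdlib sketch of that pattern, using a hypothetical helper name:

```typescript
import { Writable } from 'stream'

// Write a chunk and, if the internal buffer is full (write() returns false),
// wait for the 'drain' event before resolving, so callers can `await` safely
// instead of ignoring the return value.
async function writeWithBackpressure (
  stream: Writable,
  chunk: unknown,
): Promise<void> {
  if (stream.write(chunk)) return
  await new Promise<void>((resolve, reject) => {
    const onDrain = () => { stream.off('error', onError); resolve() }
    const onError = (err: Error) => { stream.off('drain', onDrain); reject(err) }
    stream.once('drain', onDrain)
    stream.once('error', onError)
  })
}
```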

3. Decode MessageSendFile Args from ProtoBuf

const decoder = () => new TypedTransform<
  MessageSendFileStreamRequest,
  FileBoxChunk
>({
  transform: (chunk: MessageSendFileStreamRequest, _: any, callback: any) => {
    if (!chunk.hasFileBoxChunk()) {
      return callback(new Error('no file box chunk'))
    }
    const fileBoxChunk = chunk.getFileBoxChunk()
    callback(null, fileBoxChunk)
  },
  objectMode: true,
})

async function toMessageSendFileStreamRequestArgs (
  stream: Readable<MessageSendFileStreamRequest>
): Promise<MessageSendFileStreamRequestArgs> {
  const chunk = await firstData(stream)
  if (!chunk.hasConversationId()) {
    throw new Error('no conversation id')
  }
  const conversationId = chunk.getConversationId()

  const fileBoxChunkStream = stream.pipe(decoder())
  const fileBox = await toFileBox(fileBoxChunkStream)

  return {
    conversationId,
    fileBox,
  }
}

4. Encode MessageSendFile Args to ProtoBuf

const encoder = () => new TypedTransform<
  FileBoxChunk,
  MessageSendFileStreamRequest
> ({
  transform: (chunk: FileBoxChunk, _: any, callback: any) => {
    const req = new MessageSendFileStreamRequest()
    req.setFileBoxChunk(chunk)
    callback(null, req)
  },
  objectMode: true,
})

async function toMessageSendFileStreamRequest (
  conversationId: string,
  fileBox: FileBox,
): Promise<Readable<MessageSendFileStreamRequest>> {
  const stream = new PassThrough({ objectMode: true })

  const req1 = new MessageSendFileStreamRequest()
  req1.setConversationId(conversationId)
  stream.write(req1)

  const fileBoxChunkStream = await toFileBoxChunk(fileBox)
  fileBoxChunkStream
    .pipe(encoder())
    .pipe(stream)

  return stream
}
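The header-first pattern shared by steps 2 and 4 (write the argument message into a PassThrough, then pipe the encoded body into the same stream) can be exercised end to end with stdlib streams; the shapes below are plain-object stand-ins for the generated protobuf classes:

```typescript
import { PassThrough, Transform } from 'stream'

// Plain-object stand-in for MessageSendFileStreamRequest (hypothetical shape).
type Req = { conversationId: string } | { data: string }

// Wrap every incoming body chunk into a request message, mirroring encoder().
const wrap = () => new Transform({
  objectMode: true,
  transform (chunk: string, _enc, callback) {
    callback(null, { data: chunk } as Req)
  },
})

// Header-first stream: write the args message, then pipe the body through,
// as toMessageSendFileStreamRequest() does above.
function toRequestStream (conversationId: string, body: PassThrough): PassThrough {
  const out = new PassThrough({ objectMode: true })
  out.write({ conversationId } as Req)  // message 1: the call arguments
  body.pipe(wrap()).pipe(out)           // messages 2..n: the data chunks
  return out
}
```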

Source code:

  1. https://github.com/windmemory/repro-grpc-bug/blob/cefbb4f88846c25644756f55fd55aa830d747dcf/src/grpc-stream-parser/to-file-box.ts
  2. https://github.com/windmemory/repro-grpc-bug/blob/cefbb4f88846c25644756f55fd55aa830d747dcf/src/grpc-stream-parser/message-send-file-stream-request.ts

Unit Tests:

  1. https://github.com/windmemory/repro-grpc-bug/blob/cefbb4f88846c25644756f55fd55aa830d747dcf/src/grpc-stream-parser/to-file-box.spec.ts
  2. https://github.com/windmemory/repro-grpc-bug/blob/cefbb4f88846c25644756f55fd55aa830d747dcf/src/grpc-stream-parser/message-send-file-stream-request.spec.ts


huan commented Oct 26, 2020

@lhr0909 @windmemory Thank you very much for sharing your great opinions on this important topic!

Finally, we will go with the oneof approach for our gRPC streaming service, because it provides strong typing in all languages, including Python, Go, Java, PHP, C#, etc.

Cheers!
