feat: support mapping ObjectId message as mongodb.ObjectId #467

Merged
merged 1 commit into from
Jan 9, 2022

davearata-snorack
Contributor

MongoDB is a popular database for JS / TypeScript projects, and the default type for a MongoDB id is an ObjectId. When building protobuf messages from mongo documents, it becomes tedious to convert ObjectIds to strings. For example, if we have a simple mongo document with the fields _id and name, like so:

{
  _id: ObjectId(),
  name: "foo"
}

and a protobuf definition like this:

message Document {
  string _id = 1;
  string name = 2;
}

You would need to call _id.toString() any time you want to send the document as a protobuf message, and you would need to call new mongodb.ObjectId(message._id) any time you want to convert a message to a mongo document. If it were just one field, it would not be so bad. But if you have a more complex document with arrays of ObjectIds or nested objects with ObjectIds, you can end up writing large functions that do nothing but iterate over fields to convert ObjectIds to strings and vice versa.
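To make the boilerplate concrete, here is a minimal sketch of the manual conversion described above. `FakeObjectId` is a hypothetical stand-in for `mongodb.ObjectId` so the snippet runs without the driver, and `TodoDoc`, `TodoMessage`, and `toMessage` are illustrative names that do not come from ts-proto or this PR.

```typescript
// Hypothetical stand-in for mongodb.ObjectId (assumption: only toString() matters here).
class FakeObjectId {
  readonly hex: string;
  constructor(hex: string) {
    this.hex = hex;
  }
  toString(): string {
    return this.hex;
  }
}

// A Mongo document with ObjectIds at several levels.
interface TodoDoc {
  _id: FakeObjectId;
  name: string;
  assigneeIds: FakeObjectId[];
  parent?: { _id: FakeObjectId };
}

// The matching protobuf message shape when ids are declared as plain strings.
interface TodoMessage {
  _id: string;
  name: string;
  assigneeIds: string[];
  parent?: { _id: string };
}

// Every ObjectId-bearing field needs its own hand-written conversion.
function toMessage(doc: TodoDoc): TodoMessage {
  const message: TodoMessage = {
    _id: doc._id.toString(),
    name: doc.name,
    assigneeIds: doc.assigneeIds.map((id) => id.toString()),
  };
  if (doc.parent) {
    message.parent = { _id: doc.parent._id.toString() };
  }
  return message;
}
```

The reverse direction needs a mirror-image function, which is the duplication the flag is meant to remove.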

This change adds an optional flag, useObjectId. The flag defaults to false, so behavior is unchanged unless it is set. Similar to the useDate flag, when useObjectId is set to true, any message named ObjectId is mapped to mongodb.ObjectId so that the conversion is handled for the client. This assumes that the ObjectId message has a single field called value that is a string, so the protobuf definition would look like this:

message ObjectId {
  string value = 1;
}

This also requires the client to install the mongodb package from npm. There are two tests: one where the ObjectId message is defined in the same proto file as the message that has an ObjectId field, and a second where the ObjectId message is defined in an external file.
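As a rough illustration of the conversion this relies on, the sketch below pairs two helpers that go through the single `value` field. The helper names follow the ones settled on later in this review (`fromProtoObjectId` / `toProtoObjectId`), but the bodies are an assumption rather than the generated code, and `FakeMongoObjectId` is a hypothetical stand-in for `mongodb.ObjectId` so the snippet runs on its own.

```typescript
// Hypothetical stand-in for mongodb.ObjectId; it only validates and stores the hex string.
class FakeMongoObjectId {
  readonly hex: string;
  constructor(hex: string) {
    if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
      throw new Error(`not a 24 character hex string: ${hex}`);
    }
    this.hex = hex;
  }
  toString(): string {
    return this.hex;
  }
}

// Mirrors `message ObjectId { string value = 1; }`.
interface ProtoObjectId {
  value: string;
}

// Proto message -> driver type (assumed body, for illustration only).
function fromProtoObjectId(oid: ProtoObjectId): FakeMongoObjectId {
  return new FakeMongoObjectId(oid.value);
}

// Driver type -> proto message (assumed body, for illustration only).
function toProtoObjectId(oid: FakeMongoObjectId): ProtoObjectId {
  return { value: oid.toString() };
}
```

Both directions depend on `value` being a string, which is why the message shape matters even though nothing enforces it.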

Owner

@stephenh left a comment

Cool.

Honestly I'm a little torn to add yet-another-flag into the mix, but I get the value-add and boilerplate reduction you're going after. And this sort of "just hack in your own use cases" is one of the benefits ts-proto tries to provide.

Just brainstorming, it seems like ideally this sort of "bring your own custom types" would be great to somehow be achievable via config settings instead of code changes...

Like basically what you're doing is setting up "for protobuf types that match ...this name..., then use ...this JS type... as the field, with ...this function for encode mapping..., ...this function for decode mapping..., and ...this function for from json handling...". Etc.

Each of those "...this function..." snippets could probably be ts-poet imp strings that would resolve to non-ts-proto-generated files that users would just ship / write themselves...

Dunno @webmaster128 / @boukeversteegh / @aikoven see ^ musing if any of you are particularly excited about running with / would benefit applications you're using ts-proto in.

But @davearata-snorack , unless you're really jazzed to explore the config-based approach (which would be great! :-)), I'm good with the current approach as well. Maybe we could eventually port it over to the config-based approach if/when that gets implemented.

I had a few minor naming questions, want to remove a few integration tests, but otherwise I'm good approving + shipping this after that.

src/main.ts Outdated
@@ -75,6 +76,10 @@ import { generateGenericServiceDefinition } from './generate-generic-service-def
export function generateFile(ctx: Context, fileDesc: FileDescriptorProto): [string, Code] {
  const { options, utils } = ctx;

  if (options.useObjectId) {
    imp('mongodb*mongodb');
Owner

In general, this is not how imp works...like just calling it won't really do anything, you need to have it used in a code... string that is actually output for the "auto-import" logic to kick in. I.e. you can probably just remove this line b/c it shouldn't be doing anything.

Contributor Author

removed this line

src/main.ts Outdated
const fromObjectId = conditionalOutput(
  'fromObjectId',
  code`
    function fromObjectId(oid: ObjectId): ${mongodb}.ObjectId {
Owner

Maybe call this fromProtoObjectId?

Contributor Author

changed this to fromProtoObjectId

src/main.ts Outdated
const toObjectId = conditionalOutput(
  'toObjectId',
  code`
    function toObjectId(oid: ${mongodb}.ObjectId): ObjectId {
Owner

Same, maybe toProtoObjectId?

Contributor Author

changed this to toProtoObjectId

src/options.ts Outdated
@@ -34,6 +34,7 @@ export type Options = {
  forceLong: LongOption;
  useOptionals: boolean | 'none' | 'messages' | 'all'; // boolean is deprecated
  useDate: DateOption;
  useObjectId: boolean;
Owner

Let's call this useMongoObjectId? Given it's got mongodb specific imports/etc, I think that will be more clear.

Contributor Author

changed this to useMongoObjectId

@@ -0,0 +1 @@
useObjectId=false
Owner

I appreciate the comprehensiveness, but let's just remove the use-object-id-false test, I think.

The other tests essentially test the "false" codepath, and each new integration/* test we add adds a bit of non-zero overhead (due to the extra proto codegen/etc. steps each one has), so I'd prefer to just remove it for now.

Contributor Author

removed this test

README.markdown Outdated
@@ -671,6 +678,20 @@ The representation of `google.protobuf.Timestamp` is configurable by the `useDat
| --------------------------- | ---------------------- | ------------------------------------ | ---------------- |
| `google.protobuf.Timestamp` | `Date` | `{ seconds: number, nanos: number }` | `string` |

## ObjectId
Owner

Hm, maybe just drop this section? I think the - With ... bullet point up above is good enough.

Contributor Author

removed this part of the doc

@@ -333,6 +338,8 @@ Generated code will be placed in the Gradle build directory.

- With `--ts_proto_opt=useDate=false`, fields of type `google.protobuf.Timestamp` will not be mapped to type `Date` in the generated types. See [Timestamp](#timestamp) for more details.

- With `--ts_proto_opt=useObjectId=true`, fields of a type called ObjectId where the message is constructed to have on field called value that is a string will be mapped to type `mongodb.ObjectId` in the generated types. This will require your project to install the mongodb npm package. See [ObjectId](#objectid) for more details.
Owner

Pedantically, afaiu we don't actually look inside of the ObjectId protobuf type to see that it only has a single value: string field; currently we're just matching on the type name and calling that good enough.

Which is fine, but let's update the docs to match that, and just say "protobuf fields of a type ObjectId will be mapped to mongodb.ObjectId in the generated types."

Contributor Author

The code relies on the structure of the ObjectId protobuf message: it must have a field called value of type string. That's how we convert between a mongodb.ObjectId and a proto ObjectId.

Owner

Ah yeah, agreed we rely on .value but technically the isObjectId function is only checking the name of the message type:

https://github.com/stephenh/ts-proto/pull/467/files#diff-c54113cf61ec99691748a3890bfbeb00e10efb3f0a76f03a0fd9ec49072e410aR407

Just the way this is worded, "With ...this setting..., fields called ObjectId where the message ... has field called value" made me think that isObjectId really was going to check both message type name & like message.fields.length === 1 & message.fields[0].name === value && message.fields[0].type === string.
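The difference between the two checks can be sketched as follows. `FieldSketch` and `MessageSketch` are hypothetical, simplified shapes invented for this example (the real code works on protobuf descriptors), `isObjectIdByName` models the name-only matching described above, and `isObjectIdByShape` is the stricter check the wording suggested, which the PR does not implement.

```typescript
// Hypothetical, simplified descriptor shapes for illustration only.
interface FieldSketch {
  name: string;
  type: "string" | "bytes" | "int32" | "message";
}
interface MessageSketch {
  name: string;
  fields: FieldSketch[];
}

// Name-only matching: any message called ObjectId qualifies.
function isObjectIdByName(message: MessageSketch): boolean {
  return message.name === "ObjectId";
}

// Stricter variant: the name must match AND the message must have exactly one
// field, called `value`, of type string.
function isObjectIdByShape(message: MessageSketch): boolean {
  const [only] = message.fields;
  return (
    message.name === "ObjectId" &&
    message.fields.length === 1 &&
    only !== undefined &&
    only.name === "value" &&
    only.type === "string"
  );
}
```

A message named ObjectId with, say, a single `bytes` field passes the first check but not the second, so under name-only matching the docs should describe the requirement on `value` as an assumption rather than something that is verified.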

string value = 1;
}

message Todo {
Owner

In the same vein of reducing our integration/* sprawl, I'd be tempted to make your use-objectid-true-external-import be the only integration test this PR adds.

Having ObjectId in a separate proto file I think is a good / important boundary case to test, and given that is covered by the other test, personally I'm pretty fine with trusting that the same-file use case works as well, w/o an explicit test (unless for some reason we end up having a bug / regression b/c of it).

Contributor Author

removed this test

@webmaster128
Collaborator

I pretty much share @stephenh's concerns here. Maintaining well known types that are shared across the protobuf industry probably makes sense. Adding software specific code will hardly scale.

I think it would be relatively straightforward to create a plugin system where you map a fully qualified protobuf type name to a plugin file that implements the methods from main.ts.

Is there any official proto definition of an ObjectId? Looking at the type definition itself, it should be either a single 12-byte bytes value or 3 components (timestamp, random, counter) as 3x bytes. Writing that as a 24-character hex string on the wire seems pointless.
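For reference, the size difference between the two encodings can be seen with a small conversion sketch. `hexToBytes` and `bytesToHex` are illustrative helpers written for this example, not functions from ts-proto or the mongodb driver.

```typescript
// Convert a 24 character hex string into the 12 raw bytes it represents.
function hexToBytes(hex: string): Uint8Array {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24 character hex string");
  }
  const bytes = new Uint8Array(12);
  for (let i = 0; i < 12; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

// Convert 12 raw bytes back into a lowercase 24 character hex string.
function bytesToHex(bytes: Uint8Array): string {
  if (bytes.length !== 12) {
    throw new Error("expected exactly 12 bytes");
  }
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}
```

A `string value` field carries the 24 hex characters as a 24-byte payload, while a `bytes value` field would carry the same id in 12 bytes.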

@davearata-snorack
Contributor Author

@stephenh @webmaster128 Thanks for the feedback! This is my first time working with some of the modules like ts-poet, so I wasn't fully aware of things like the auto-import functionality.

I hear what you mean when you compare my ObjectId specific implementation vs a generic extensible implementation. I am not sure if I would be the best person to take on that implementation as that seems to be a bit larger of a task and I am not sure I have the availability to take that on, but I do see the value in it.

For now, I will work on the feedback you provided and update the PR.

@webmaster128
Collaborator

I would give this plugin system a shot based on this particular use case. I think it's very helpful for some of the things we are doing as well. In a recent use case, we need to convert decimal types and I'd also love to convert Timestamps to DateWithNanoseconds.

But what I would need for this is 👇. Timestamp can be customized via google.protobuf.Timestamp. Is there any fully qualified name for this message ObjectId that is widely accepted?

Is there any official proto definition of an ObjectId?

@davearata-snorack
Contributor Author

No, I don't believe there is an official proto definition of an ObjectId. I went with a simple

message ObjectId {
  string value = 1;
}

because that was how I was converting ObjectIds. My understanding is that it is safe to call _id.toString() and new mongodb.ObjectId(somestring) and get consistent conversion results. That being said, here is the ObjectId TypeScript definition. It looks like the property id: Buffer is the one you are referencing:

/**
 * A class representation of the BSON ObjectId type.
 * @public
 */
declare class ObjectId {
    _bsontype: 'ObjectId';
    /* Excluded from this release type: index */
    static cacheHexString: boolean;
    /* Excluded from this release type: [kId] */
    /* Excluded from this release type: __id */
    /**
     * Create an ObjectId type
     *
     * @param inputId - Can be a 24 character hex string, 12 byte binary Buffer, or a number.
     */
    constructor(inputId?: string | number | ObjectId | ObjectIdLike | Buffer | Uint8Array);
    /*
    * The ObjectId bytes
    * @readonly
    */
    id: Buffer;
    /*
    * The generation time of this ObjectId instance
    * @deprecated Please use getTimestamp / createFromTime which returns an int32 epoch
    */
    generationTime: number;
    /** Returns the ObjectId id as a 24 character hex string representation */
    toHexString(): string;
    /* Excluded from this release type: getInc */
    /**
     * Generate a 12 byte id buffer used in ObjectId's
     *
     * @param time - pass in a second based timestamp.
     */
    static generate(time?: number): Buffer;
    /**
     * Converts the id into a 24 character hex string for printing
     *
     * @param format - The Buffer toString format parameter.
     */
    toString(format?: string): string;
    /** Converts to its JSON the 24 character hex string representation. */
    toJSON(): string;
    /**
     * Compares the equality of this ObjectId with `otherID`.
     *
     * @param otherId - ObjectId instance to compare against.
     */
    equals(otherId: string | ObjectId | ObjectIdLike): boolean;
    /** Returns the generation date (accurate up to the second) that this ID was generated. */
    getTimestamp(): Date;
    /* Excluded from this release type: createPk */
    /**
     * Creates an ObjectId from a second based number, with the rest of the ObjectId zeroed out. Used for comparisons or sorting the ObjectId.
     *
     * @param time - an integer number representing a number of seconds.
     */
    static createFromTime(time: number): ObjectId;
    /**
     * Creates an ObjectId from a hex string representation of an ObjectId.
     *
     * @param hexString - create a ObjectId from a passed in 24 character hexstring.
     */
    static createFromHexString(hexString: string): ObjectId;
    /**
     * Checks if a value is a valid bson ObjectId
     *
     * @param id - ObjectId instance to validate.
     */
    static isValid(id: string | number | ObjectId | ObjectIdLike | Buffer | Uint8Array): boolean;
    /* Excluded from this release type: toExtendedJSON */
    /* Excluded from this release type: fromExtendedJSON */
    inspect(): string;
}

@boukeversteegh
Collaborator

boukeversteegh commented Jan 8, 2022

Well, this does kind of point to the need for custom serialization / deserialization logic, but I'm also not in favor of putting application specific transformation logic within ts-proto. A hook would be preferable. Also in this case I would probably go for a generic transformer function that replaces all ObjectIds with strings.

For example (not tested):

function fix(message: any): void {
  if (typeof message === 'object' && message != null) {
    if (message.objectId) {
      message.objectId = message.objectId.toString();
    }
    // Object.values yields values only, so there is no [key, value] pair to destructure.
    Object.values(message).forEach((value) => fix(value));
  }
}

Or another approach, that can be used without modifying ts-proto (this only works for encode/decode):

  • Define your messages like this:
message User {
  google.protobuf.Value id= 1;
}
  • Overwrite the .wrap of Value. This is a decorator that detects mongo object ids, and calls toString before passing on.
    const _wrap = Value.wrap;
    Value.wrap = (value) => {
      return isMongoObjectId(value) ? _wrap(value.toString()) : _wrap(value);
    };

Now you can call User.encode({id: someMongoObjectId}) and it will serialize as string.

    expect(User.decode(User.encode({id: someMongoObjectId}).finish())).toEqual({id: "38f7gw3478fw9457gt9w457t"});

Unfortunately this cannot be done with StringValue because there is no StringValue.wrap which is called during encoding/decoding. This is something that could be fixed though, it would be a neater implementation in my opinion.

You can even do this with StringValue! Just wrap the StringValue.encode method with your own logic.

@stephenh
Owner

stephenh commented Jan 9, 2022

Hm, @boukeversteegh I think the .wrap hooks would work at runtime, but my assumption is that @davearata-snorack wants the generated FooMessage.oid fields to really be a mongodb.ObjectId instance, so that passing data in/out of ts-proto/mongo passes type checks.

Also I would assume that the rest of the services within @davearata-snorack 's org already use a protobuf type of theirPackage.ObjectId, so switching their *.proto files over to something generic like google.protobuf.Value is not that doable, at least for this current use case.

Per the plugin system, I agree we should definitely head that way; i.e. I could see something like: --ts_proto_opt=customTypes=./customTypes.json where customTypes.json looks something like:

{
  "somePackage.ObjectId": {
    "type": "MongoDb@mongodb",
    "encode": "[email protected]",
    "decode": "[email protected]",
    "fromJson": "[email protected]"
  }
}

And fromProto / toProto / fromJson would all be export-d top-level functions in the objectIdMapper.ts file, that the user would be responsible for creating by hand / separately from the ts-proto output.

But, for now I'm going to merge this PR as-is, because if anything I like to implement things the super-simple / super-hard-coded way first, as @davearata-snorack has done, and then use that to guide making fancier abstractions like the custom types solution (for whoever wants to take it on; I personally probably won't have time soon).

Then, if/when the custom types are implemented, I'd also be fine making a breaking change of removing useMongoObjectId and have users like @davearata-snorack just migrate to the new custom types approach.

(If anything, if we don't like it, the existence of useMongoObjectId in the codebase will be a good reminder to implement the custom types system. :-D )

@stephenh stephenh merged commit 8b23897 into stephenh:main Jan 9, 2022
stephenh pushed a commit that referenced this pull request Jan 9, 2022
# [1.100.0](v1.99.0...v1.100.0) (2022-01-09)

### Features

* support mapping ObjectId message as mongodb.ObjectId ([#467](#467)) ([8b23897](8b23897))
@stephenh
Owner

stephenh commented Jan 9, 2022

🎉 This PR is included in version 1.100.0 🎉

Your semantic-release bot 📦🚀

@boukeversteegh
Collaborator

I like your approach! Premature abstraction often leads to bad design, but to avoid it we have to accept over-specification at first. Thanks for that insight!

@davearata-snorack
Contributor Author

@webmaster128 @boukeversteegh @stephenh thanks for the feedback, help and merging the PR! If you decide to go down the path of a more configurable abstraction, I would love to help.
