Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should discard and populate first make a copy of the result objects? #150

Closed
eddyystop opened this issue Apr 14, 2017 · 18 comments
Closed
Labels

Comments

@eddyystop
Copy link
Collaborator

This is relavent when Sequelize and Mongoose return ORMs rather than POJS objects. @daffl and @marshall think this is a good idea.

The discard and remove hooks work the same way. We should probably change both. We would have to choose a deep clone utility. They are slow but would only have to process a limited number of records.

I have concerns about making deep copies of every recursive record in populate. I think this would degrade performance. We were concerned enough about populate's performance that we included perf measurement features in the hook.

Perhaps we should consider running .toJson or .toObject if its found on a result record. This way the performance penalty is paid only by Sequelize use without raw:true (now that Mongoose defaults to lean:true).

However I had a longish discussion with ryan.wheale. He has come around to the following point of view:

"Well, I am on board with the idea that the ORM "goodness" is probably only used a fraction of the time. Most services will just load data, do some simple data mods, and then send it out.

In the case that users need the ORM stuff, they will need to instantiate the data. This could ship as a pre-built hook for convenience and would be easy to document.

The documentation could clearly state that instantiating ORM objects will likely break feathers hooks so that nothing comes by surprise (as it currently does)"

This would mean that hook.result must contain POJO and if the dev wanted the ORM, they would run a hook and the ORM would be instantiated on, say, hook.params.instantiated.

So ... decisions. I'm attracted to ryan.wheale's idea.

@marshallswain
Copy link
Member

I'm on board with everything you just said. I didn't think about performance penalties of deep cloning everything.

@eddyystop
Copy link
Collaborator Author

I would like ryan.wheale to contribute his thoughts.

@DesignByOnyx
Copy link

In order to play well with others, feathers-sequelize should always run raw queries. This is the first part of the discussion that needs a concrete decision as this will likely break existing apps (including the one I've been working on for the past year).

From there, I think that the ORM adapters can ship with a simple hydrate or instantiate hook that users can use to get ORM instances if they so choose (because we're nice like that ;). After having thought about this for some time I think it also makes sense to have the inverse serialize hook as tell to turn the data back into an POJO for further use. This would allow people to compose hook pipelines in an easy-to-conceptualize way:

[
  someCommonHook(),
  instantiatORMObject(),
  processORMObject(),
  serializeORMObject(),
  someOtherCommonHook()
]

@eddyystop
Copy link
Collaborator Author

eddyystop commented Apr 17, 2017

@DesignByOnyx, for clarity, would both the POJO and the ORM be at hook.result? You previously were playing around with the idea of hook.data always being a POJO, and any instantiated ORM would always be hook.params.instantiated.

@DesignByOnyx
Copy link

DesignByOnyx commented Apr 17, 2017

I think it makes the most sense to instantiate on hook.result - the biggest reason is that data gets out of sync really quickly and you start to experience the whole "source of truth" issue. If the ORM objects are mutated, then someone (or something) has to be responsible for updating hook.result before the response goes out. This seems like unnecessary confusion and I think it's easier to document and debug the instantiate -> process -> serialize flow, always using hook.result to hold the data.

Another compelling reason is that by using hook.result - users don't have to re-serialize the data. The ORM objects automatically get serialized when the response is sent.

@sasdf
Copy link

sasdf commented Apr 18, 2017

For In-memory database (feathers-memory), applying discard on after hook will mutate the object in database.
Testing code:

'use strict'
/* Versions:
 *   "feathers": "^2.1.1",
 *   "feathers-hooks": "^1.8.1",
 *   "feathers-hooks-common": "^3.0.0",
 *   "feathers-memory": "^1.1.0",
 */

const feathers = require('feathers')
const app = feathers()
const hooks = require('feathers-hooks')
const { discard } = require('feathers-hooks-common')
const db = require('feathers-memory')

app.configure(hooks())
app.use('/test', db())

app.service('test').hooks({
  after: {
    create: [
      discard('bad')
    ]
  }
})

app.service('test').create({
  good: 1,
  bad: 1
})

app.service('test').get(0).then(
  console.log.bind(console.log)
  // Output: { good: 1, id: 0 }
)

@eddyystop
Copy link
Collaborator Author

eddyystop commented Apr 18, 2017

Place this hook before the discard:

hook => {
  hook.result = JSON.parse(JSON.stringify(hook.result));
  return hook;
}

I suspect feathers-memory is returning the actual object in the database in hook.result. Mutating it, mutates the database.

@daffl
Copy link
Member

daffl commented Apr 18, 2017

Yes, feathers-memory does not make a copy of the result. I think those hooks should make shallow copies of the data (and deep copies of objects that are accessed via the dot . operator).

@marshallswain
Copy link
Member

marshallswain commented Apr 18, 2017

You don't think shallow copies would lead to confusion? To avoid confusion, seems like maybe feathers-memory should return a deep clone inside the service. Isn't it the only adapter that's inconsistent?

@sasdf
Copy link

sasdf commented Apr 18, 2017

Returning deep clone make sense, but maybe it will losing capability to store any js internal Object(something circular or connected with callback, etc)? Maybe left an option in constructor?

@eddyystop
Copy link
Collaborator Author

This problem is not limited to hooks.

Besides this present issue, @daffl helped me realize that if you modify data in service.on('created', data => ...) that modification will mutate the data in the service.create(...).then(data => ...). (I had to clone deeply as I absolutely had to use a mutated version of that event data .)

I suggest it needs to be handled elsewhere to solve it throughout Feathers.

I also suggest a deep copy would have to be made, else the problem will just be buried deeper in sub-objects.

@eddyystop
Copy link
Collaborator Author

I've thought some more about this.

There is no documentation about which Feathers param objects may not be safely mutated. So people will run into unexpected issues which may become time consuming.

Even if the common hooks make (deep) copies where needed, developers will still be exposed to the problem as they write their own hooks.

I think Feathers core should at least freeze any non-mutateable objects, preferably deeply. That way problems should pop up quicker and the reason would be fairly obvious.

This seems a fair balance between speed and safety.

@eikaramba
Copy link

Just updated to latest common-hooks and experienced this problem with sequelize. IMHO why not handling the different cases of ORM inside the hook? e.g. for sequelize checking if items.dataValues is there or setDataValues is a function. Sure, this means supporting every single ORM that feathers currently support, but that's how it is. Just providing shallow copies of the objects would break a lot of stuff and although i'm already having a unsequelize() hook myself for such use cases, i'm not sure if this is a good idea to have by default.

BTW: here's my own discard copy that handles sequelize objects as well. If anyone has the same problem currently

const { getItems } = require('feathers-hooks-common');
....
discard: function (...fieldNames) {
    return function (hook) {
      let items = getItems(hook);

      (Array.isArray(items) ? items : [items]).forEach(item => {
        fieldNames.forEach(fieldName => {
          let value = typeof item.setDataValue === "function"? item.dataValues[fieldName] :item[fieldName];
          if (value !== undefined) {
              if(typeof item.setDataValue === "function"){
                delete item.dataValues[fieldName];
              }else{
                delete item[fieldName];
              }
          }
        });
      });

      return hook;
    }
  },

@eddyystop
Copy link
Collaborator Author

The new populate and serialize hooks always required POJO. This is not new.

No common hook (except the legacy populate) handles any ORM object, so 'fixing' the problem in populate isn't fixing it. feathers-mongoose v4.1 already defaults to a plain JS object. We have a plan for Sequelize which DesignByOnyx describes above.

Since Feathers only handles POJO its your responsibility to provide it such, with { query: { raw: true } } in the base method call, and select: { raw: true } in the includes.

@eikaramba
Copy link

eikaramba commented May 1, 2017

i started from the cli skeleton and as a new user i would assume everything works as described(the current code the cli is generating uses discard and does not to prevent this issue from happening). so then the docs need to be adapted. can this not be handled by the database adapter, if that is really required in the future?

@marshallswain
Copy link
Member

Yeah. We can probably add a hook to the generator when you select Sequelize. That would be a good thing to learn right up front.

@eddyystop
Copy link
Collaborator Author

@marshallswain The solution we've settled on is #150 (comment). We will eventually have to remove any normalization hooks we add to the generator.

@eddyystop
Copy link
Collaborator Author

Now that #150 (comment) has been implemented I think we can close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants