-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for populates++ hook #38
Comments
Looks really good @eddyystop! I would probably keep it as Can you explain what you were thinking this does? organization: {
owners: {},
rooms: {
service: 'rooms', localField: '_id', foreignField: 'roomIds',
on: 'result', multiple: true, query: {},
expandBy: [populates.room, 'messages', 'owners'], // I don't get this part
}
}, From my understanding calling Maybe I'm getting tripped up by I was also thinking that maybe instead of |
I had the same thought as @ekryski re: |
Ya or what @corymsmith suggested. |
(1) @ekryski @corymsmith populates.room is essentially a "view" on the DB. My experience is that a reasonably sized app has many views, some similar to one another. You could end up with this populates.room including creators, messages, owners and then having populates.room1 including messages and owners, then having populates.room2 including just messages. So I'm looking for a syntax which says use the includes only from the following props.
Perhaps you can suggest a more intuitive syntax. (2) This syntax would not be compatible with the present (3) I like 'include' more than 'expandBy'. (4) |
(5) As an aside, we can have Perhaps you can suggest a different syntax for these multi-field situations. (6) The field names need to support dot notation ( |
Hmmmm 🤔 What we're doing is population + serialization schemes (to some extent). Sort of reminds me of Rails' active model serializers. Think I'm going to have to stew on this a bit. It's definitely headed in the right direction.
Ok let's go with
I'm good with deprecating and inserting a breaking change. It's not a brutal change to migrate I don't think and we can always set the old one to |
Perhaps |
Ideas from ekryski's Rails link:
The part related to populate: // to include associations
{ include: :posts }
// Second level and higher order associations
{ include: {
posts: {
only: :title,
include: { comments: { only: :body } }
},
}} I prefer using established keywords and syntax, so I'd go with The syntax above makes use of relationships between tables defined elsewhere, sorta like we've proposed.
I considered allowing nested properties in |
Here is a more Rails flavored syntax. const populates = {
post: {
author: {
service: 'users',
// the following 2 lines are mutually exclusive
localField: '_id', foreignField: 'userId', // can use when relation is a single field
getItems: (post) => ({ _id: post.userId, ifDeleted: false }), // more complicated relations
// the following 2 lines are mutually exclusive
nameItemAs: 'author', // save first item found as an object
nameItemsAs: 'authors', // save all items found as an array
placeWhere: 'result',
query: {},
// from Rails active model serializer
only: ['email', 'username', 'birthDate'],
except: ['password', 'verifyToken', 'verifyTokenShort'],
computed: {
// how do we later remove these when we want to 'update' the table?
// (a) add a _computed_: ['over18'] to item or (b) name the prop _over18
over18: (user, post) => Date.now() < (user.birthDate + 18 * 100000), // create a prop over18
},
},
comments: {
service: 'comments',
localField: 'postId', foreignField: '_id',
nameItemsAS: 'comments',
},
categories: {
service: 'categories',
localField: '_id', foreignField: 'postCategories', // categories is an array so use $in
nameItemsAs: 'categories',
},
},
favorites: {
posts: {
service: 'posts',
localField: '_id', foreignField: 'postIds',
nameItemsAs: 'posts',
include: {
post: {
// Override part of the post.author definition
include: { author: { only: ['username', 'birthDate'], } },
// Ignore post.categories. Don't include categories items for a post.
exclude: ['categories']
}
},
}
},
};
favorites.after({
find: [
hook.population(populates.favorites),
// or
hook.population({
posts: {
service: 'posts',
localField: '_id', foreignField: 'postIds',
nameItemsAs: 'posts',
include: { post: {} },
},
}),
]
}); |
Could I think those methods should just be passed the I was also looking at the Sail ORM solution and we already support everything they do. Mongoose's syntax is not super clear and also doesn't support deep populates well. Not much to draw from there. This last iteration feels like we are venturing into ORM territory. Not entirely because they aren't defined on the models, and may not need to rely on the models in order to do population. In my mind this style of Just brain dumping but I'm wondering if we are getting a little too complex with this function (trying to do too much). Including and excluding might be better defined as subsequent hooks in the hook chain. However, one thing I have been thinking a lot about is defining different serializations (or views) of the data based on the permissions of the client requesting that data (which this is starting to venture into). Basically "serialization policies". No one has solved that in a nice way yet. A couple notes that are more specific to the above example:
I'm going to think about this and put together an example or two. I already started but my hunch is that we are mixing serialization and population and that really they should be 2 different things and that in general when we populate we should bring in everything and then make a second pass with another hook to serialize data out or sanitize data in appropriately. |
About mixing serialization and population: Beeplin has made a comment that items could have a lot of fields (I agree) and its better to drop fields on the I really like the idea of including permissions in deciding on how to populate as they are one of the big reasons for different populate definitions. How are permissions exposed in feathers-permissions? I agree we are getting complex with this function. Perhaps the question is whether the function will be very simple to use for the simple cases and for the majority of cases? Further, will it be performant enough for the complex ones? I prefer About your (3) above. We need We need to know if we add an array of objects or add an object. Perhaps we have |
About this nested included stuff. Its just to temporarily override values within an object. We could just clone the object and let the user use dot notation to customize it. |
This is my stab at thinking through a realistic example for populating and serialization. This would support having data in different services that are backed by different data stores. By default everything is included so I'm only using Example 1: NoSQL Posts + SQL Comments// Example Data
const Favourites = [
{
userId: 'as61389dadhga62343hads6712',
postId: '1'
},
{
userId: 'as61389dadhga62343hads6712',
postId: '2'
},
{
userId: '167asdf3689348sdad7312131s',
postId: '1'
}
];
const Posts = [
{
id: '1',
title: 'Post 1',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: 'as61389dadhga62343hads6712',
comments: ['1', '3'],
createdAt: ''
},
{
id: '2',
title: 'Post 2',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: '167asdf3689348sdad7312131s',
comments: ['2'],
createdAt: ''
}
];
const Comments = [
{
id: '1',
title: 'Comment 1',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: 'as61389dadhga62343hads6712',
createdAt: ''
},
{
id: '2',
title: 'Comment 2',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: 'as61389dadhga62343hads6712',
createdAt: ''
},
{
id: '3',
title: 'Comment 3',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: '167asdf3689348sdad7312131s',
createdAt: ''
}
];
const Users = [
{
_id: 'as61389dadhga62343hads6712',
name: 'Author 1',
email: '[email protected]',
password: '2347wjkadhad8y7t2eeiudhd98eu2rygr',
age: 55
},
{
_id: '167asdf3689348sdad7312131s',
name: 'Author 2',
email: '[email protected]',
password: '2347wjkadhad8y7t2eeiudhd98eu2rygr',
age: 16
}
];
const serializers = {
favorites: {
exclude: ['userId'] // we don't need the userId as we likely queried for the user's favourites already
post: {
// Possibly we can use this as a short hand syntax? In this case comments.createdAt wouldn't do anything because that field wasn't included in the original query
exclude: ['comments.createdAt']
author: {
exclude: ['password', 'createdAt'],
computed: {
isChild: (author) => author.age < 18,
}
}
},
}
};
const populations = {
// NOTE: We're not populating the user on the favourite object
favorite: {
on: 'result', // where to place the populated item on the hook object
post: {
service: 'posts', // The service to populate from
localField: 'postId', // The field local to the favourite. Only needed if different than the key name.
foreignField: 'id', // The key of the populated object to use for comparison to the id stored
nameAs: 'post', // Key name to place the populated items on (optional. Defaults to the key name in the object)
author: {
service: 'users', // The service to populate from
foreignField: '_id' // The key of the populated object to use for comparison to the id stored
},
comments: {
service: 'comments', // The service to populate from
foreignField: 'id', // The key of the populated object to use for comparison to the id stored
query: { // Normal feathers query syntax. Get the title and body of the last 5 comments
$limit: 5,
$select: ['title', 'content'],
$sort: { createdAt: -1 }
},
}
}
}
};
favorites.after({
find: [
hook.populate(populations.favourite),
hook.serialize(serializers.favourite)
]
}); |
What if |
Personally I don't think we should optimize on that too early. There is definitely going to be a cost to ripping through objects in JS twice to exclude or include items as opposed to getting the DB to do it but I think that is an optimization we can introduce if we are finding it's not fast enough, as opposed to doing it up front and making the The problem with the current hooks was that previously until you added dot notation there wasn't an easy way include/exclude nested properties, so I think people gravitated to that external hook the other guy wrote. For permissions really they are defined wherever you want. Currently it is assumed in the database associated with the entity that has limitations (ie. user, device, organization, etc.) but I think we'd be able to cook up something that allows different serialization based on permissions. I think that would be easily supported by what I proposed above as it would just be a different serializer definition.
I might be missing something here but I think we can look at the object we are going to populate into (ie. |
That might be what @eddyystop is referring to. Let me modify and think about that. my damn nosql brain |
localField: 'postId', // field name in comments
foreignField: '_id', // field name in posts If an array in posts localField: '_id', // field name in comments
foreignField: 'commentIds', // name in posts. since this is an array at run time, a $in is used in the query |
Example 2: SQL Posts + Comments// Example Data
// Favourites stored in SQL DB
const Favourites = [
{
userId: 'as61389dadhga62343hads6712',
postId: 1
},
{
userId: 'as61389dadhga62343hads6712',
postId: 2
},
{
userId: '167asdf3689348sdad7312131s',
postId: 1
}
];
// Posts stored in SQL DB
const Posts = [
{
id: 1,
title: 'Post 1',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: 'as61389dadhga62343hads6712',
createdAt: ''
},
{
id: 2,
title: 'Post 2',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: '167asdf3689348sdad7312131s',
createdAt: ''
}
];
// Comments stored in SQL DB
const Comments = [
{
id: 1,
postId: 1,
title: 'Comment 1',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: 'as61389dadhga62343hads6712',
createdAt: ''
},
{
id: 2,
postId: 2,
title: 'Comment 2',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: 'as61389dadhga62343hads6712',
createdAt: ''
},
{
id: 3,
postId: 1,
title: 'Comment 3',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: '167asdf3689348sdad7312131s',
createdAt: ''
}
];
// Users stored in NoSQL DB
const Users = [
{
_id: 'as61389dadhga62343hads6712',
name: 'Author 1',
email: '[email protected]',
password: '2347wjkadhad8y7t2eeiudhd98eu2rygr',
age: 55
},
{
_id: '167asdf3689348sdad7312131s',
name: 'Author 2',
email: '[email protected]',
password: '2347wjkadhad8y7t2eeiudhd98eu2rygr',
age: 16
}
];
const serializers = {
favorites: {
exclude: ['userId'] // we don't need the userId as we likely queried for the user's favourites already
post: {
// Possibly we can use this as a short hand syntax? In this case comments.createdAt wouldn't do anything because that field wasn't included in the original query
exclude: ['comments.createdAt']
author: {
exclude: ['password', 'createdAt'],
virtuals: {
isChild: (author) => author.age < 18,
}
}
},
}
};
const populations = {
// NOTE: We're not populating the user on the favourite object
favorite: {
on: 'result', // where to place the populated item on the hook object
post: {
service: 'posts', // The service to populate from
localField: 'postId', // The field local to the favourite. Only needed if different than the key name.
foreignField: 'id', // The key of the populated object to use for comparison to the id stored
author: {
service: 'users', // The service to populate from
foreignField: '_id' // The key of the populated object to use for comparison to the id stored
},
comment: {
service: 'comments', // The service to populate from
localField: 'post.id', // compare the post's id field to the comments postId field
foreignField: 'postId',
nameAs: 'comments', // Key name to place the populated items on (optional. Defaults to the key name in the object)
asArray: true,
query: { // Normal feathers query syntax. Get the title and body of the last 5 comments
$limit: 5,
$select: ['title', 'content'],
$sort: { createdAt: -1 }
},
}
}
}
};
favorites.after({
find: [
hook.populate(populations.favourite),
hook.serialize(serializers.favourite)
]
}); |
(1) My meaning for localField and foreignField seem to be the reverse of yours. See my comment above yours. (2) In the post object, do we really want to consider every prop that's not a "reserved name" to define a relationship? Typos, etc? |
@eddyystop I'm not sure your suggestion covers that. Don't we need to be comparing the I was thinking it could be: // parsing the tree is always assumed to start from the root
// so local props you use `this.id` or `./id`. Sort of handlerbars-ish.
// don't really love that idea....
comment: {
localField: 'post.id'
}
// or this. Assume everything is references local to the object
// scope and you can back out.
// Also don't love this...
comment: {
localField: '../id'
}
// or this. I personally think this would be much
// easier to parse in the hook.
comment: {
localField: {
field: 'id',
from: 'post'
},
foreignField: 'postId'
} |
Ya I must be confused then. I don't think I follow which is which. Maybe you can update the your example with comments spelling out what the
Riiiiiiggght. Ya maybe we should keep that |
@eddyystop thanks for updating that. I think it might still be a bit confusing. I'd rather keep it so that @feathersjs/core-team suggestions? |
Since we have post: {
comment: {
ekryskiLocalField: '_id', // this is always from the item that is including comment i.e. post
ekryskiForeignField: 'postId', // this is alwys from the item be are including i.e. comment
}} People used to Entity Relation Diagrams, Third Normal Form and SQL will have a firm idea of what foreign key should mean. A foreign key is something you have which is the primary key of another entity. So in our syntax perhaps foreignField -- I avoided using foreignKey -- should mean something that's in another table, not in our table. BTW, that's what f_key and l_key mean in MichaelErmer/feathers-populate-hook |
@eddyystop yup I follow. Ok final example with those changes: Example 3: NoSQL Users, SQL Posts + Comments + Favourites// Example Data
// Favourites stored in SQL DB
const Favourites = [
{
userId: 'as61389dadhga62343hads6712',
postId: 1
},
{
userId: 'as61389dadhga62343hads6712',
postId: 2
},
{
userId: '167asdf3689348sdad7312131s',
postId: 1
}
];
// Posts stored in SQL DB
const Posts = [
{
id: 1,
title: 'Post 1',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: 'as61389dadhga62343hads6712',
createdAt: ''
},
{
id: 2,
title: 'Post 2',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: '167asdf3689348sdad7312131s',
createdAt: ''
}
];
// Comments stored in SQL DB
const Comments = [
{
id: 1,
postId: 1,
title: 'Comment 1',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: 'as61389dadhga62343hads6712',
createdAt: ''
},
{
id: 2,
postId: 2,
title: 'Comment 2',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: 'as61389dadhga62343hads6712',
createdAt: ''
},
{
id: 3,
postId: 1,
title: 'Comment 3',
content: 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Possimus, architecto!',
author: '167asdf3689348sdad7312131s',
createdAt: ''
}
];
// Users stored in NoSQL DB
const Users = [
{
_id: 'as61389dadhga62343hads6712',
name: 'Author 1',
email: '[email protected]',
password: '2347wjkadhad8y7t2eeiudhd98eu2rygr',
age: 55
},
{
_id: '167asdf3689348sdad7312131s',
name: 'Author 2',
email: '[email protected]',
password: '2347wjkadhad8y7t2eeiudhd98eu2rygr',
age: 16
}
];
const serializers = {
favorites: {
exclude: ['userId'] // we don't need the userId as we likely queried for the user's favourites already
post: {
// Possibly we can use this as a short hand syntax? In this case comments.createdAt wouldn't do anything because that field wasn't included in the original query
exclude: ['comments.createdAt']
author: {
exclude: ['password', 'createdAt'],
virtuals: {
isChild: (author) => author.age < 18,
}
}
},
}
};
const populations = {
// NOTE: We're not populating the user on the favourite object
favorite: {
on: 'result', // where to place the populated item on the hook object
include: {
post: {
service: 'posts', // The service to populate from
localField: 'postId', // The local field to the parent (ie. favourite). Only needed if different than the key name.
foreignField: 'id', // The field of populated object to use for comparison to the localField
include: {
author: {
service: 'users', // The service to populate from
localField: 'author', // The local field to the parent (ie. post)
foreignField: '_id' // The field of populated object to use for comparison to the localField
},
comment: {
service: 'comments', // The service to populate from
localField: 'id', // The local field to the parent (ie. post)
foreignField: 'postId', // The field of populated object to use for comparison to the localField
nameAs: 'comments', // Key name to place the populated items on (optional. Defaults to the key name in the object)
asArray: true,
query: { // Normal feathers query syntax. Get the title and body of the last 5 comments
$limit: 5,
$select: ['title', 'content'],
$sort: { createdAt: -1 }
},
}
}
}
}
}
};
favorites.after({
find: [
hook.populate(populations.favourite),
hook.serialize(serializers.favourite)
]
}); |
Test of initial populate hook, with trace logs. |
@eddyystop nice! Any chance you can add some performance timing? In the past, I've found a big issue with populating data on json and key value stores is the performance of making multiple queries, especially at the 3rd to Nth levels, as each population requires and additional call. This can be pretty heavy if the data structure is an array of arrays, so it would be ideal for us to have some consideration around bench marks when it comes to this so people are thinking about it up front when they design their data models. Food for thought, but great work here! |
@slajax that's a good idea and I'll keep it in mind. On the one hand, the data is presently being populated in place within the On the other hand, I've listened to fasinating, but a bit dated, talks On Wed, Nov 9, 2016 at 3:14 PM, Kyle Campbell [email protected]
|
One optimization for performance when populating an array of items would be to collect all It looks like the JSON schema is going to have a lot of options. One thought I had was to provide a more intuitive query builder interface to create a hook. Something like populate('users')
.from('userId')
.withService('/users')
.withQuery(function(hook) {
return {
active: true
}
}) Also, can we create a spec document for this? It's a pretty long thread. I almost feel that this is one of those cases I mentioned where it would make sense for it to live in its own module. To me the common hooks repo is kind of like a "Hooks Lodash" and this one needs a lot more context and documentation than the standard Lodash documentation style. |
@daffl The $in operator is used when the parent item contains an array of child ids, that is Check out https://github.com/eddyystop/feathers-test-populate-etc for the latest specs and implementation of both the populate and serialize hooks |
https://github.com/eddyystop/feathers-test-populate-etc is where this is being prototyped. |
An attempt at a design to start discussion.
I'm hoping for
The text was updated successfully, but these errors were encountered: