Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CachedReferenceField creates DBRefs on to_python, but can't save them on to_mongo #1047

Closed
amitlicht opened this issue Jun 28, 2015 · 6 comments

Comments

@amitlicht
Copy link
Contributor

import mongoengine
from mongoengine import Document
from mongoengine.fields import CachedReferenceField, BooleanField, IntField

mongoengine.connect('test')

class TestCachedDoc(Document):
    test = BooleanField(required=True)

class TestDoc(Document):
    cached_doc = CachedReferenceField(TestCachedDoc)
    other_field = IntField(required=True)

TestDoc.objects.delete()
TestDoc(cached_doc=TestCachedDoc(test=False).save(),
        other_field=10).save()
t = TestDoc.objects.get()
t.other_field = 1
t.save()   # --> raises mongoengine.errors.ValidationError: ValidationError (TestDoc:559003e21bacb5bcf87a3195) (A CachedReferenceField only accepts documents: ['cached_doc'])
@amitlicht amitlicht changed the title CachedReferenceField creates DBRef on to_python, but can't save them on to_mongo CachedReferenceField creates DBRefs on to_python, but can't save them on to_mongo Jun 28, 2015
amitlicht added a commit to amitlicht/mongoengine that referenced this issue Jun 28, 2015
…f on to_python, but can't save them on to_mongo.

Dereferencing DBRef to document type before returning it from to_python.
@MRigal MRigal added this to the 0.10.1 milestone Jun 29, 2015
amitlicht added a commit to amitlicht/mongoengine that referenced this issue Jun 30, 2015
MRigal added a commit that referenced this issue Jul 1, 2015
…ce_field_bugfix

Suggested fix for #1047: CachedReferenceField DBRefs bug
@MRigal MRigal closed this as completed Jul 1, 2015
@coffenbacher
Copy link

Queryset evaluation performance for me after this change is horrendous, unfortunately...

Item.objects.all() on 200 items with a few CachedReferenceFields apiece takes ~10 seconds to dereference everything before returning. It's not lazy, either, which makes this untenable for my uses at the moment.

I'm not 100% sure what the original design of CachedReferenceField called for, but this works perfectly for my use case and is highly performant since it's using the cached values:

def to_python(self, value):
    if isinstance(value, dict):
        return self.document_type._from_son(value)
    return value

@MRigal MRigal reopened this Jul 17, 2015
@MRigal
Copy link
Member

MRigal commented Jul 17, 2015

You're right @coffenbacher , we should add this. Are you or @amitlicht willing to submit a complementary PR?

@yarneo
Copy link

yarneo commented Jul 21, 2015

There are so many issues with CachedReferenceFields that it baffles me how it was put in the version to begin with. Basically unworkable, you can't fetch the object then save it without a validation error. You can't append a cachedreferencefield to a list without an attribute error. I would suggest to remove it entirely to not cause developers extra headache.

@coffenbacher
Copy link

I ended up abandoning CachedReferenceField and completing my project using a different approach, so unfortunately I'm not in a position to submit a PR on this one

@amcgregor
Copy link
Contributor

amcgregor commented Jun 29, 2017

[Edited for unintentional tone in places and to split up certain parts into more relevant tickets.]

In the process of executing on the Contentment ticket mentioned directly above I… accidentally my own ODM. A large piece to this was ensuring a) lazy transformation to native types with field data stored in the foreign representation for transformation-free direct use with PyMongo (you "pay" only for what you access, with no pre-caching of query sets, to-foreign transformations occur on assignment), and b) a viable cached reference field implementation usable as a projected instance of the referenced document without additional querying/dereferencing, allowing .reload() use (which accepts fields to reload, defaulting to all) to "complete" the cached record if desired.

Trying to not spam alternative whole packages—that's lame—but I do want to offer an alternative approach to CachedReferenceField which accomplishes the goals of performance and one of my long-ago filed tickets regarding direct use of the cached values as instances. My intent: It's MIT, so go nuts, I want to share any code you might like to the benefit of MongoEngine.

While the lazy access thing would require… some substantial refactoring in MongoEngine, a viable CachedRefernceField replacement is a far more tractable problem. It's entirely baked into the fundamental Reference field in my case, but _populate_cache and to_foreign should be relatively self-explanatory, and are tested.

As a note, whole-document construction from MongoDB types generally must happen explicitly; automatic dereferencing/deserialization is largely a discouraged idea across Marrow Mongo, instead using Document.from_mongo() where needed to strongly discourage inefficient multiple automatic conversion. Any such automatic behaviour extremely isolated into a mix-in class for active collection and queryable documents; base Document has no idea what a collection even is.

Oh, yeah, model closures (a parent document class wholly containing related embedded document classes, with natural subclassing or overriding—Embeds referencing local attributes e.g. ".ClassName") are totally a thing, too, as best utilized in core by the Localized trait. But I digress.

@bagerard
Copy link
Collaborator

Since the initial bug of this ticket got fixed, I created another ticket to track this. I'm also not a big fan of CachedReferenceField...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants