Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document registry in mongoengine: class name as a unique Document identifier leads to failing unit tests. #1319

Closed
BurkovBA opened this issue Jun 19, 2016 · 5 comments
Labels

Comments

@BurkovBA
Copy link

BurkovBA commented Jun 19, 2016

Hi, guys!

I'm trying to get Travis-CI up and running with django-rest-framework-mongoengine. Unfortunately, while running tests, I've stumbled on a problem with mongoengine's document registry.

Currently, document registry identifies a model just by its _class_name. So, I can't have 2 models with the same name in different modules.

I stumbled upon this, while running drf-mongoengine's unit-tests. Some tests define different Documents with identical names. For instance, both models.py and test_patching.py define DumbDocument, but with different fields. Unit-testing script, based on pytest, which, I suppose, uses Unittest's TestRunner and TestLoader, discovers and imports all my unittests at once, so that different definitions of DumbDocument clash with each other in documents registry. The latest import wins and this leads to failing unit tests.

Can you suggest a workaround for those tests?

Can Mongoengine users experience some problems with this? Could it make sense for Mongoengine to use full path to a Model (e.g. package.module.model) as its identifier in document registry, like it is done in pickle module, not just the model name?

@lafrech lafrech added the Bug label Jun 19, 2016
@amcgregor
Copy link
Contributor

amcgregor commented Jun 23, 2016

The concept of a central document class registry is used to facilitate inheritance, the name of the class is stored with the document, and MongoEngine uses the registry to then know which class to instantiate for the document. Unfortunately, a name is not uniquely identifying as you have noticed. Storing of the full import path to the class, as per pickle, would be a solution.

My own marrow.package utility package contains a 2.7+ portable canonical resolver to identify the name for a given object, as well as a short and sweet utility to load those references back out. The former is a surprisingly difficult proposition, and can't handle certain edge cases, such as closures, on Python < 3.3 where __qualname__ was added.

Unfortunately, right now, the only advise is bad advise: don't do that. In my own code this means I end up with some truly ludicrously descriptive class names, but no conflicts, remembering to override the collection name so as to avoid propagating the silliness to the DB side of things.

@BurkovBA
Copy link
Author

BurkovBA commented Jun 23, 2016

@lafrech, @amcgregor Thanks for discussion. So far I just renamed one of the conflicting documents and the tests passed.

On a second thought, in Django you can pass a model to a ForeignKey by its class name or module.class_name. So, django should have some mechanism to resolve names to classes. I didn't study it in detail, but may be, it makes sense to consider reusing or replicating that mechanism in Mongoengine?

@amcgregor Your module looks quite nice. I wonder, if so many projects need this problem solved, may be solution should be abstracted-out into a module of python standard library? May be you could propose your module for python standard library? I never investigated the problem myself, but it might turn out that there's an existing solution already. And if it won't cut it for pickle, django or mongoengine, may be you could extend it (or your module) with some of the features, required by them?

@amcgregor
Copy link
Contributor

amcgregor commented Jun 23, 2016

A solution is already in the standard library. In Python versions 3.3 or newer all objects have a __qualname__ attribute which identifies the object's origin. My own library uses this where available, falling back on exhaustive module search (via obj.__module__ then searching that object for obj.__name__, then comparing to ensure it actually is the same object) in older versions.

My library's sole purpose is to manage dynamic loading of objects, plugins, etc. with 2.7+ support, so you can consider, for the purposes of __qualname__, it to be a back-port… just one that can't resolve [edit to add:] functional closures like __qualname__ can. It can resolve most first-order attributes of classes at the module scope, including class closures of classes.

I don't quite understand what features are missing. load("some object path") or load("plugin name", "plugin namespace") and name(some_object) to get its import path suitable for passing to load. It's pretty simple, and fully tested. ;)

Always remember the Zen of Python: explicit is better than implicit.

Edited to add: the responsible code for name resolution isn't… a lot. But very few implement it this exhaustively, and being small, everyone and their dog tend to reimplement it. ;)

@ceorcham
Copy link

I found this same issue today, now that python 2.7 support was dropped from mongoengine, and python 3.4 have reached end of life

What about using __qualname__ as a breaking change for the next mayor version?

Is actually not a big change, just need to update some tests that rely on having the _class_name in the _document_registry and implement a new test that validates this new feature.

I can volunteer myself for implementing this

@bagerard
Copy link
Collaborator

bagerard commented Oct 2, 2024

There are pros and cons in both approach but the class name is not just used in MongoEngine codebase, it is also appearing in user code (e.g ReferenceField('User')) and is being burned in the databases (through "_cls" attribute) so although you have some valid argument, this is a too big breaking change to be introduced. Burning the module in the database isn't an option as it would imply a migration every time user move their class around...
Outside of test suites, it's a bad idea to re-use the same name for your Document classes. That's a rabbit hole I'd prefer not exploring. In test suite, you just need to drop the collection or unregister the class from the document_registry, this is used in MongoEngine test suite and isn't causing much trouble

@bagerard bagerard closed this as completed Oct 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants