Data corruption when using db_field #1730
Could you please add a simple failing test case and send a pull request?
th-erker added a commit to th-erker/mongoengine that referenced this issue on Feb 6, 2018
Problem regarding model/db field name mapping.
Done: #1741
I think I can work on the bug you reported.
I think that the issue described below is related to this:
(will raise assertion error)
Summary
The mongoengine code implicitly assumes that db field names and model field names overlap only when they refer to the same field. If this condition is violated, either by explicit model design (test cases 1 and 2) or by garbage or old data in the database (test cases 3 and 4), several kinds of data corruption occur.
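To make the assumption concrete, here is a minimal sketch in plain Python (it does not use mongoengine). The helper `find_name_overlaps` and the field mapping are hypothetical; the mapping x1 -> x2, x2 -> x3 is only what the analysis in this report implies for the test document.

```python
def find_name_overlaps(fields):
    """Return model names that are also the db name of a *different* field.

    `fields` maps model field name -> db field name (hypothetical layout).
    """
    # Invert the mapping: which model field owns each db name?
    db_owner = {db: model for model, db in fields.items()}
    return sorted(
        name for name in fields
        if name in db_owner and db_owner[name] != name
    )


# Assumed mapping: x1 is stored as x2, x2 is stored as x3.
example_fields = {"x1": "x2", "x2": "x3", "plain": "plain"}
overlaps = find_name_overlaps(example_fields)
print(overlaps)
```

A non-empty result means the model violates the assumption, which is the situation in test cases 1 and 2.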
How to reproduce
Run the attached file db_field_test.txt with `python2 -R db_field_test.txt`. The expected (bug-free) output would be that f, g, and h show the same values in all four test cases. But the actual output is like this:

Some bugs depend on the order in which iterators return dictionary items, so several runs might be necessary to see them (which is the reason for the -R flag).
The test program defines a strict / dynamic document with the following fields
In each of the four test cases it creates an object f and sets some of the fields to True or False, as shown in the output. The object is saved and loaded again (g). The object h has the same fields set to the same values as f, but through the constructor instead of attribute access (`h = Doc(x1=False, ...)`).

The first and third test cases use strict documents; the second and fourth test cases use dynamic documents.
In the last two test cases, the field 'w1' is set directly in the database to False after f is saved and before g is loaded.
Analysis
In the mongoengine code, two patterns are used which work if the above assumption holds, but break down if not:
In `base.document.BaseDocument.__init__`, field names are converted from db names to model names, but the field names should already be model names. This explains h1. The conversion only happens if the document is strict, so h2 does not show the bug.

In `base.document.BaseDocument._from_son`, field names are converted to db names. Again, this does not make sense, as the SON object already uses db names when objects are loaded from the database. (As this bug is "repaired" by `__init__` for strict documents, only g2.x1 shows the wrong value and not g1.x1.)

The names/values are copied to the `data` dictionary, where they are converted to model names by overwriting `data` in a loop and deleting the db name from it. At this point, the order of the items returned by `field.iteritems()` matters, as it might result in multiple conversions in the `data` dictionary. For example, y1 is saved in the database as y0. The conversion does not change it, so it is y0 in `data`. In the loop it is copied to y1. But if `field.iteritems()` returns y2 after y1, the loop treats y1 as the db name of the model field y2, and therefore copies it to `data['y2']`.

In the 3rd and 4th test cases, the (double) conversion to db names in `base.document.BaseDocument._from_son` is the culprit. In the database, both w1 and w2 exist: the latter from saving f, the former from setting it explicitly. Both are converted to the db name w2 in a loop over `son.iteritems()`. If w1 comes last, the wrong data is loaded. It is remarkable that this happens even with strict documents (g3), as fields not defined in a strict model are filtered only at the last moment.

If both x1 and x2 are set (test case not included) and saved, this conversion of the db names (x2 and x3) would copy both to x3, so that only x2 is present after loading, with the value of either the original x1 or x2 (depending on the order of `son.iteritems()`).
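The two order-dependent effects described above can be reproduced with plain dictionaries. The following is a simplified sketch, not the actual mongoengine code: `rename_in_place` and `to_db_names` are hypothetical helpers, the field mappings (y1 stored as y0, y2 stored as y1, w1 stored as w2) are inferred from the analysis, and the concrete True/False values are assumptions.

```python
def rename_in_place(data, fields):
    """Rename db keys to model names by mutating one dict in a loop.

    `fields` is a sequence of (model_name, db_name) pairs; its order
    stands in for the order returned by the field iterator.
    """
    data = dict(data)  # work on a copy so the caller's dict is untouched
    for model_name, db_name in fields:
        if db_name != model_name and db_name in data:
            # A key renamed in an earlier iteration can be picked up again here.
            data[model_name] = data.pop(db_name)
    return data


def to_db_names(son_items, model_to_db):
    """Treat every incoming key as a model name and map it to a db name."""
    out = {}
    for key, value in son_items:
        # Keys that are already db names pass through unchanged,
        # so two different keys can land on the same target.
        out[model_to_db.get(key, key)] = value
    return out


# Double conversion: y1 is stored as y0, y2 is stored as y1.
y_fields = [("y1", "y0"), ("y2", "y1")]
stored = {"y0": True}
y1_first = rename_in_place(stored, y_fields)        # y0 -> y1 -> y2
y2_first = rename_in_place(stored, y_fields[::-1])  # y0 -> y1 only

# Collision: the database holds w2 (written by save) and a stray w1.
w_map = {"w1": "w2"}
w1_last = to_db_names([("w2", True), ("w1", False)], w_map)
w2_last = to_db_names([("w1", False), ("w2", True)], w_map)

print(y1_first, y2_first, w1_last, w2_last)
```

In both cases the result changes with iteration order alone, which matches the observation that several runs with -R may be needed to see the corruption.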