Related types yield duplicate nodes and distinct filter fails #651

Open
Eraldo opened this issue Oct 25, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@Eraldo
Contributor

Eraldo commented Oct 25, 2024

Bug Description

Observation in my app:
I have a contacts page where contacts can have tags, and the same tag is shown multiple times.

I checked the dev API and saw that it only happens when using the related type (TagListConnection via the contact's tags field).

Querying tags directly (not via the related type) works.
Using the related type with no filter yields duplicate nodes.
Using the related type while a filter is active works.
Using the DISTINCT filter does not work (it still shows the duplicates).

Example:

The following query reproduces the problem:

query ContactsTags {
  contacts(last: 1) {
    totalCount
    edges {
      node {
        id
        name
        tags(filters: {DISTINCT: true}) {
          totalCount
          edges {
            node {
              id
              name
            }
          }
        }
      }
    }
  }
}

The result shows 8 duplicate tags.

{
  "data": {
    "contacts": {
      "totalCount": 63,
      "edges": [
        {
          "node": {
            "id": "SOMEID==",
            "name": "Carmen",
            "tags": {
              "totalCount": 8,
              "edges": [
                {
                  "node": {
                    "id": "VGFnOjYy",
                    "name": "target"
                  }
                },
                {
                  "node": {
                    "id": "VGFnOjYy",
                    "name": "target"
                  }
                },
                ...  // 5 more here
                {
                  "node": {
                    "id": "VGFnOjYy",
                    "name": "target"
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}

Using the django debug toolbar while running the query shows the following SQL information:
[Image: Django Debug Toolbar SQL panel]

System Information

  • Operating system: macOS
  • Strawberry version (if applicable):
    strawberry-graphql-django = "^0.49.1"
    strawberry-graphql = "^0.247.0"

Additional Context

According to @bellini666 the general underlying issue of duplicates due to SQL is known.

Conversation snippet for context:

Basically, if you try to filter the relation through the original model, the join will generate spurious tuples. What you need to do in this case is to filter through a subquery with exists
BUT, having said that, I see that you used DISTINCT, which should also work
Can you check in the generated SQL if the distinct was not applied there? If not, then you just found a bug 😛
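To illustrate the point about spurious tuples, here is a minimal sketch using Python's built-in sqlite3 module rather than Django. The table names, the extra contacts, and the sample rows are assumptions made up for this example (only tag 62, "target", and the contact "Carmen" come from the output above), and the SQL is hand-written, not what Strawberry Django generates. It shows a tag repeated once per matching join row, and the same filter written as an EXISTS subquery returning each tag once.

```python
import sqlite3

# Hypothetical schema: a many-to-many between contact and tag.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contact_tags (contact_id INTEGER, tag_id INTEGER);
    INSERT INTO contact VALUES (1, 'Carmen'), (2, 'Alex'), (3, 'Sam');
    INSERT INTO tag VALUES (62, 'target'), (63, 'other');
    INSERT INTO contact_tags VALUES (1, 62), (2, 62), (3, 62), (1, 63);
""")

# Filtering through a join: each tag appears once per matching through row.
joined = conn.execute("""
    SELECT tag.id, tag.name
    FROM tag
    JOIN contact_tags ON contact_tags.tag_id = tag.id
    WHERE contact_tags.contact_id IN (1, 2, 3)
    ORDER BY tag.id
""").fetchall()

# Filtering through EXISTS: each tag appears at most once.
via_exists = conn.execute("""
    SELECT tag.id, tag.name
    FROM tag
    WHERE EXISTS (
        SELECT 1 FROM contact_tags
        WHERE contact_tags.tag_id = tag.id
          AND contact_tags.contact_id IN (1, 2, 3)
    )
    ORDER BY tag.id
""").fetchall()

print(joined)
print(via_exists)
```

In the joined result, tag 62 is listed three times because three through rows match it; the EXISTS version only asks whether at least one match exists, so no row multiplication happens.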

@Eraldo added the bug label Oct 25, 2024
@SupImDos
Contributor

Hi @Eraldo

Coincidentally, I think we ran into the same issue a few days ago in #650. Our understanding is that the duplicate results are caused by the extra LEFT OUTER JOIN added by Strawberry Django's window pagination approach when prefetching related m2m types.

For reference, we don't think that using DISTINCT will be enough to solve the issue, as it will only remove the duplicates from the final result set, but the calculated total_count will still be incorrect.
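A rough sketch of why the count can stay wrong, again using sqlite3 with a made-up schema and hand-written SQL (an assumption for illustration, not the query Strawberry Django actually emits). In SQL, window functions are evaluated before DISTINCT, so a window-based total counts the joined rows, duplicates included. This requires an SQLite build with window function support (3.25 or later).

```python
import sqlite3

# Hypothetical m2m data: tag 62 is linked to three contacts, tag 63 to one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contact_tags (contact_id INTEGER, tag_id INTEGER);
    INSERT INTO tag VALUES (62, 'target'), (63, 'other');
    INSERT INTO contact_tags VALUES (1, 62), (2, 62), (3, 62), (1, 63);
""")

# COUNT(*) OVER () is computed over the joined rows (4 of them);
# DISTINCT is applied afterwards and only collapses identical output rows.
rows = conn.execute("""
    SELECT DISTINCT tag.id, tag.name, COUNT(*) OVER () AS total_count
    FROM tag
    JOIN contact_tags ON contact_tags.tag_id = tag.id
    WHERE contact_tags.contact_id IN (1, 2, 3)
    ORDER BY tag.id
""").fetchall()

distinct_rows = len(rows)
reported_total = rows[0][2]
print(distinct_rows, reported_total)
```

Here two distinct tags come back, but the window-based total reports the number of joined rows instead.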

At the moment, we think the two options are to either:

  1. Use Django's inbuilt prefetch slicing to do nested pagination
  2. Prefetch the through model and then prefetch the other side of the m2m as part of handling m2m relations.

See #650 for a more in-depth explanation.
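As a plain-SQL analogue of option 2, the sketch below fetches the through rows first and then the tags on the other side, so no join can multiply rows. It uses sqlite3 with a hypothetical schema and sample data; it is not the Django prefetch API and not the approach proposed in #650, and it assumes each (contact_id, tag_id) pair appears only once in the through table.

```python
import sqlite3
from collections import defaultdict

# Hypothetical m2m schema with a through table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contact_tags (contact_id INTEGER, tag_id INTEGER);
    INSERT INTO tag VALUES (62, 'target'), (63, 'other');
    INSERT INTO contact_tags VALUES (1, 62), (2, 62), (3, 62), (1, 63);
""")

contact_ids = [1, 2, 3]

# Step 1: fetch the through rows for the contacts on the current page.
marks = ",".join("?" * len(contact_ids))
through = conn.execute(
    f"SELECT contact_id, tag_id FROM contact_tags "
    f"WHERE contact_id IN ({marks}) ORDER BY contact_id, tag_id",
    contact_ids,
).fetchall()

# Step 2: fetch each referenced tag exactly once.
tag_ids = sorted({tag_id for _, tag_id in through})
tag_marks = ",".join("?" * len(tag_ids))
tags = dict(
    conn.execute(
        f"SELECT id, name FROM tag WHERE id IN ({tag_marks})", tag_ids
    ).fetchall()
)

# Step 3: assemble per-contact tag lists; the per-contact total is the list length.
tags_by_contact = defaultdict(list)
for contact_id, tag_id in through:
    tags_by_contact[contact_id].append((tag_id, tags[tag_id]))

for contact_id in contact_ids:
    print(contact_id, len(tags_by_contact[contact_id]), tags_by_contact[contact_id])
```

Because the per-contact lists are built from the through rows directly, both the nodes and the totals reflect the actual links.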

@Eraldo
Contributor Author

Eraldo commented Nov 7, 2024

Thank you @SupImDos for making me aware of the related issue detailing the challenge and possible solution paths. 🙏
I will keep an eye on it.
