Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align SearchFilter behaviour to django.contrib.admin search #9017

Merged
merged 7 commits into from
Jul 25, 2023

Conversation

sevdog
Copy link
Contributor

@sevdog sevdog commented Jun 22, 2023

Description

Make SearchFilter behave more like django.contrib.admin search (which is already stated in docs).

  • Allow use any lookup by appending it the fields in search_fields
  • Perform a query-only distinct filtering
  • Handle quotes surrounded terms with escapes (ie: "the text", "the \"text\"")
  • Split by spaces only by spaces, considering quoteing (breaking change)
  • Raise a validation error if any null-character is provided in search (breaking change)

See #8205

@auvipy auvipy self-requested a review June 23, 2023 02:51
@auvipy
Copy link
Member

auvipy commented Jun 23, 2023

looks like a nice improvment

tests/test_filters.py Outdated Show resolved Hide resolved
@auvipy
Copy link
Member

auvipy commented Jun 23, 2023

can we avoid at least one breaking change

  • Split by spaces only by spaces, considering quoteing (breaking change) --- and keep both? if it is not too hard?

@sevdog
Copy link
Contributor Author

sevdog commented Jun 23, 2023

can we avoid at least one breaking change

  • Split by spaces only by spaces, considering quoteing (breaking change) --- and keep both? if it is not too hard?

This could be interesting, however it should be defined the order of split logics due to how smart_split works and which kind of user experience is expected. Just to give an example, consider the following search:

'"hello, world" can I,help you'

With only smart_split this becomes

['hello, world', 'can', 'I,help', 'you']

Maybe a good compromise would be:

  • if comma is inside quotes, do not split
  • otherwhise split

With this logic the first search whould become

['hello, world', 'can', 'I' , 'help', 'you']

There is an edge case: what to do if there are one or more commas before or after a quoted sentence?

',"hello, world", what to,,do,?'

I belive that in this case it would be nice to strip commas from tokens extracted by smart_split and then yield quoted sentences as the are and split other by comma.

The generator to handle this iteration whould be something like:

def search_smart_split(search_terms):
    for term in smart_split(search_terms):
        term = term.strip(',')
        if term.startswith(('"', "'")) and term[0] == term[-1]:
            yield unescape_string_literal(term)
        else:
            yield from (sub_term for sub_term in term.split(',') if sub_term)

This should parse the latter term into:

['hello, world', 'what', 'to', 'do', '?']

@sevdog
Copy link
Contributor Author

sevdog commented Jun 23, 2023

Testing the previous approach I encountered an other case which I did not considered before. What to do when there is a quoted phrase followed by other characters:

'"hello, world",found'

The smart_split keeps this phrase togheter, so splitting by comma results in

['"hello', ' world"', 'found']

The whitespace is easy and intuitive to remove (IMO), but I am not sure if in this case also the quotes should be removed (thus using .split(' \'"') instead of .split()).

@auvipy auvipy requested a review from a team June 24, 2023 06:05
@auvipy
Copy link
Member

auvipy commented Jun 24, 2023

are we going to preserve the old way or going to have option for both new and old ways?

@sevdog
Copy link
Contributor Author

sevdog commented Jun 28, 2023

are we going to preserve the old way or going to have option for both new and old ways?

It depends on which mechanism are we willing to provide to select this behaviour. I can think these:

  • an attribute on filter class
  • an attribute on view
  • a settings (which would be global for every view/filter)

@auvipy
Copy link
Member

auvipy commented Jul 13, 2023

from the tests it seems that both type of search is supported in this PR, right? I would like to consider this improvement for 3.15 release

@sevdog
Copy link
Contributor Author

sevdog commented Jul 13, 2023

from the tests it seems that both type of search is supported in this PR, right? I would like to consider this improvement for 3.15 release

Yes, search_smart_split does first the split like django.contrib.admin and then splits by comma. The only case in which this would give different results is where the search input is a mixed combination of quoted strings and commas. A simple CSV vaue (ie: an,example,search) would be splitted like it was before.

@auvipy auvipy added this to the 3.15 milestone Jul 13, 2023
@@ -14,13 +13,6 @@ def unicode_http_header(value):
return value


def distinct(queryset, base):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we change this in a separate PR? or it is part of the change in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be splitted, however this method was introduced in b4b2dc1 only for the search class and putted into compact package in the hope of being re-used.

IMO this should be kept togheter with the rest of the PR since this change goes in the same direction of "align the behaviour with django".

I just noticed that the code I used (which was introduced in django/django@1871182) has just been removed from the main branch in django/django@d569c1d. So the whole thing may just be changed to use .distinct() like django is going to do.

PS: I really do not understand why 8 years ago there was the need to have a different implementation of distinct only for Oracle engine and why this was implemented here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS: I really do not understand why 8 years ago there was the need to have a different implementation of distinct only for Oracle engine and why this was implemented here.

at that time django oracle back end was not properly maintained. but now it is

For [JSONField][JSONField] and [HStoreField][HStoreField] fields you can filter based on nested values within the data structure using the same double-underscore notation:

search_fields = ['data__breed', 'data__owner__other_pets__0__name']

By default, searches will use case-insensitive partial matches. The search parameter may contain multiple search terms, which should be whitespace and/or comma separated. If multiple search terms are used then objects will be returned in the list only if all the provided terms are matched.
By default, searches will use case-insensitive partial matches. The search parameter may contain multiple search terms, which should be whitespace separated. If multiple search terms are used then objects will be returned in the list only if all the provided terms are matched.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need further enhancement of the docs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add some examples about search term and how it will get parsed.

docs/api-guide/filtering.md Outdated Show resolved Hide resolved
@auvipy auvipy self-requested a review July 25, 2023 07:37
@auvipy
Copy link
Member

auvipy commented Jul 25, 2023

do you intend to add anything else here or improve anything else?

@sevdog
Copy link
Contributor Author

sevdog commented Jul 25, 2023

As of now I belive that this should be fine.

I tried to figure out what could have been added to the docs, but I felt that it was just copy-pasting fron django admin docs without adding any value.

@auvipy auvipy merged commit b99df0c into encode:master Jul 25, 2023
math-a3k pushed a commit to math-a3k/django-rest-framework that referenced this pull request Jul 31, 2023
)

* Use subquery to remove duplicates in SearchFilter

* Align SearchFilter behaviour to django.contrib.admin

* Add compatibility with older django/python versions

* Allow search to split also by comma after smart split

* Use generator to build search conditions to reduce iterations

* Improve search documentation

* Update docs/api-guide/filtering.md

---------

Co-authored-by: Asif Saif Uddin <[email protected]>
@tomchristie
Copy link
Member

Really nice bit of functionality, thanks... have followed up with #9338.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants