URL sanitization is overly aggressive on URLs containing percent-encoded data
Run the following on v1.1:
>>> gazpacho.utils.sanitize('https://en.wikipedia.org/wiki/M%26M%27s')
'https://en.wikipedia.org/wiki/M%2526M%2527s'
Alternatively, on 7cc9488 with a tweak to get the package loadable, and logging str(url):
>>> gazpacho.get('https://en.wikipedia.org/wiki/M%26M%27s')
url http://https://en.wikipedia.org/wiki/M%2526M%2527s
...
(i.e., no change)
This results in certain URLs not being get-able. In my case en.wikipedia.org serves a 404 response.
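Decoding both forms of the URL shows why the server cannot find the page. This is a minimal sketch that uses only the standard library's `urllib.parse.unquote`, not gazpacho itself; the variable names are just for illustration.

```python
from urllib.parse import unquote

original = "M%26M%27s"       # path segment from the URL in this report
sanitized = "M%2526M%2527s"  # the same segment after sanitization, as reported

# One round of decoding turns the original into the article title.
print(unquote(original))

# One round of decoding turns the sanitized form back into the
# still-encoded text, which is not the title of any article.
print(unquote(sanitized))
```

The first call gives `M&M's`, while the second gives `M%26M%27s`, so a server that decodes the path once ends up looking for a page with that literal name.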
To get 7cc9488 importable I changed line 2 of gazpacho/gazpacho/__init__.py (as of 7cc9488):
from .soup2 import Soup
Expected behavior: the valid URL I give is left unchanged by sanitization and is fetched successfully.
Environment: Python 3.11.3
Seems like occurrences of % are getting rewritten as %25.
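The same rewriting can be reproduced with the standard library alone. The sketch below assumes, without having confirmed it against gazpacho's source, that sanitization passes the URL path through `urllib.parse.quote`. The helper `sanitize_sketch` is hypothetical and only shows one possible way to leave existing percent-escapes alone, by treating `%` as a safe character.

```python
from urllib.parse import quote, urlsplit, urlunsplit

path = "/wiki/M%26M%27s"

# By default quote() treats only "/" as safe, so each "%" becomes "%25".
double_encoded = quote(path)

# Marking "%" as safe leaves existing escapes untouched.
preserved = quote(path, safe="/%")


def sanitize_sketch(url: str) -> str:
    """Hypothetical helper, not gazpacho's API: quote the path but keep '%'."""
    parts = urlsplit(url)
    quoted_path = quote(parts.path, safe="/%")
    return urlunsplit(
        (parts.scheme, parts.netloc, quoted_path, parts.query, parts.fragment)
    )


print(double_encoded)
print(preserved)
print(sanitize_sketch("https://en.wikipedia.org/wiki/M%26M%27s"))
```

With this approach an already-encoded URL passes through unchanged, and characters such as spaces are still escaped. The trade-off is that a stray `%` that is not part of a valid escape would also be left as-is, so a more careful version might check that each `%` is followed by two hex digits.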