Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EmbedAPIv3: updates after QA with sphinx-hoverxref #8521

Merged
merged 12 commits into from
Sep 27, 2021

Conversation

humitos
Copy link
Member

@humitos humitos commented Sep 22, 2021

Updates required by readthedocs/sphinx-hoverxref#146 found while doing QA.

@humitos humitos requested a review from a team September 27, 2021 09:06
@humitos humitos marked this pull request as ready for review September 27, 2021 09:06
Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a few small suggestions.

@@ -71,12 +70,15 @@ def _download_page_content(self, url):

response = requests.get(url, timeout=settings.RTD_EMBED_API_DEFAULT_REQUEST_TIMEOUT)
if response.ok:
# NOTE: we use ``response.content`` to get its binary
# representation. Then ``selectolax`` is in charge to auto-detect
# its encoding. We trust more in selectolax for this than in requests.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we trust it more? Is there a specific issue we've seen? I'm guessing that requests has a lot more eyes on it, but perhaps we're doing something different?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't have a good amount of data tested yet. However, I've already find myself doing the solution from this SO question: https://stackoverflow.com/a/52615216

r = requests.get("https://martin.slouf.name/")
# override encoding by real educated guess as provided by chardet
r.encoding = r.apparent_encoding
# access the data
r.text

Then, I read that selectolax also uses the meta-tags from the HTML (https://selectolax.readthedocs.io/en/latest/parser.html#selectolax.parser.HTMLParser) to guess the encoding and it always worked without manual interaction as with requests.

Anyways, I don't have strong opinions here. I suppose whatever we choose here, it will fail in some case that the other library wouldn't 🙃

readthedocs/embed/v3/views.py Show resolved Hide resolved
readthedocs/embed/v3/views.py Outdated Show resolved Hide resolved
readthedocs/embed/v3/views.py Outdated Show resolved Hide resolved
@@ -818,6 +818,8 @@ def DOCKER_LIMITS(self):
r'docs\.python\.org',
r'docs\.scipy\.org',
r'docs\.sympy\.org',
r'www.sphinx-doc.org',
r'numpy\.org',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is likely something that should be defined in the DB, or somewhere else that doesn't require a deploy to modify. It's definitely something we're going to need to iterate on pretty frequently I'm guessing, and perhaps even support customization at a project or organization level.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. We do have an issue opened for Django settings in the DB at https://github.com/readthedocs/readthedocs-ops/issues/1004 but I don't think that will progress too much. If that's the case, we can create a simple EmbedDomain model that we can relate to a Project and/or Organization as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #8530 to continue the conversation there.

readthedocs/settings/docker_compose.py Show resolved Hide resolved
tox.embedapi.ini Show resolved Hide resolved
@humitos humitos enabled auto-merge September 27, 2021 18:40
@humitos humitos merged commit d8651c1 into master Sep 27, 2021
@humitos humitos deleted the humitos/embed-api-v3-updates branch September 27, 2021 18:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants