Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Page#wait_for_loaded not reliable? #51

Open
PandaWhisperer opened this issue Apr 25, 2020 · 4 comments
Open

Page#wait_for_loaded not reliable? #51

PandaWhisperer opened this issue Apr 25, 2020 · 4 comments

Comments

@PandaWhisperer
Copy link

PandaWhisperer commented Apr 25, 2020

Background

I use Apparition to automatically detect the presence of certain JavaScript frameworks on different websites. Basically, I load the site (using #visit), and then I run a script (using #evaluate_script) that tries to obtain the version number (for instance, jQuery.fn.jquery would return the jQuery version on a page that has jQuery installed).

Issue

I've noticed that this method is not always reliable. Specifically, I can sometimes re-run the same script (on a site where jQuery is present) up to three times and get different results each time (either a JavascriptError, nil, or the expected result).

Discussion

I've looked around the source code and noticed that Apparition tries to detect page load by periodically attempting to run JavaScript (in Page#wait_for_loaded), and when it succeed, it assumes the page is loaded.

Before using Apparition, I've used the ChromeRemote gem to interface with Chrome, but for many reasons, Capybara with Apparition seems preferable. However, using the other gem, I was able to detect page loads more reliably by listening for the Page.loadEventFired event.

I'm curious as to why Apparition does not use this approach, especially since a comment in Page#wait_for_loaded seems to indicate that it is known that the current method is not reliable?

@twalpole
Copy link
Owner

twalpole commented Apr 25, 2020

Because the Page.loadEventFired event hasn't been reliable either. PRs with a fully reliable method (that doesn't also hang if something in the page can't be loaded) would be great. One of the other things to consider is what exactly does page "loaded" mean??

@PandaWhisperer
Copy link
Author

PandaWhisperer commented Apr 26, 2020

Ah, I see. I'm assuming that Page.domContentEventFired' corresponds to JavaScript's DOMContentLoadedevent, andPage.loadEventFiredcorresponds to JavaScript'swindow.onload` event, so probably the same distinction applies here, i.e. the former is fired as soon as the DOM has finished parsing, and the latter fires when all content has finished loading.

I've tried replacing Page.domContentEventFired on page.rb:501 with Page.loadEventFired and changed Page#wait_for_loaded to simply block until the main frame has been marked as loaded (or the timeout is hit), and I was able to run the integration tests successfully, but when I turned on debugging, I noticed that in most cases, that handler isn't even hit, it's Page.frameStoppedLoading on page:482 that marks the frame as being ready. So from my limited understanding of the code, this change does not seem to make a huge difference.

For right now, my solution is using evaluate_async_script to set a up a window.onload handler that resolves the promise with the desired value, for instance:

evaluate_async_script(<<-JS)
    window.addEventListener('load', () => {
        arguments[0]('jQuery' in window ? jQuery.fn.jquery : null);
    });
JS

However, this does not appear to be 100% reliable in all situations either.

@andrewr224
Copy link

Similar situation here. Mostly consistent is the best I can get, here's my solution:

def wait_for_page_load
  Timeout.timeout(Capybara.default_max_wait_time) do
    loop until page.evaluate_script("window.jQuery != undefined")
  end
end

@twalpole
Copy link
Owner

twalpole commented Jun 14, 2020

@andrewr224 using Timeout.timeout (the most dangerous method in Ruby) is asking for issues - Because of the way it works (second thread aborting the main thread) it can leave the network stack in an unpredictable state causing all further tests to fail too. This may be acceptable in your use case, but should definitely not be recommended as a general solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants