Page#wait_for_loaded not reliable? #51

PandaWhisperer · 2020-04-25T06:57:34Z

Background

I use Apparition to automatically detect the presence of certain JavaScript frameworks on different websites. Basically, I load the site (using #visit), and then I run a script (using #evaluate_script) that tries to obtain the version number (for instance, jQuery.fn.jquery would return the jQuery version on a page that has jQuery installed).

Issue

I've noticed that this method is not always reliable. Specifically, I can sometimes re-run the same script (on a site where jQuery is present) up to three times and get different results each time (either a JavascriptError, nil, or the expected result).

Discussion

I've looked around the source code and noticed that Apparition tries to detect page load by periodically attempting to run JavaScript (in Page#wait_for_loaded), and when it succeed, it assumes the page is loaded.

Before using Apparition, I've used the ChromeRemote gem to interface with Chrome, but for many reasons, Capybara with Apparition seems preferable. However, using the other gem, I was able to detect page loads more reliably by listening for the Page.loadEventFired event.

I'm curious as to why Apparition does not use this approach, especially since a comment in Page#wait_for_loaded seems to indicate that it is known that the current method is not reliable?

The text was updated successfully, but these errors were encountered:

twalpole · 2020-04-25T20:38:20Z

Because the Page.loadEventFired event hasn't been reliable either. PRs with a fully reliable method (that doesn't also hang if something in the page can't be loaded) would be great. One of the other things to consider is what exactly does page "loaded" mean??

PandaWhisperer · 2020-04-26T06:03:38Z

Ah, I see. I'm assuming that Page.domContentEventFired' corresponds to JavaScript's DOMContentLoadedevent, andPage.loadEventFiredcorresponds to JavaScript'swindow.onload` event, so probably the same distinction applies here, i.e. the former is fired as soon as the DOM has finished parsing, and the latter fires when all content has finished loading.

I've tried replacing Page.domContentEventFired on page.rb:501 with Page.loadEventFired and changed Page#wait_for_loaded to simply block until the main frame has been marked as loaded (or the timeout is hit), and I was able to run the integration tests successfully, but when I turned on debugging, I noticed that in most cases, that handler isn't even hit, it's Page.frameStoppedLoading on page:482 that marks the frame as being ready. So from my limited understanding of the code, this change does not seem to make a huge difference.

For right now, my solution is using evaluate_async_script to set a up a window.onload handler that resolves the promise with the desired value, for instance:

evaluate_async_script(<<-JS)
    window.addEventListener('load', () => {
        arguments[0]('jQuery' in window ? jQuery.fn.jquery : null);
    });
JS

However, this does not appear to be 100% reliable in all situations either.

andrewr224 · 2020-06-09T06:53:19Z

Similar situation here. Mostly consistent is the best I can get, here's my solution:

def wait_for_page_load
  Timeout.timeout(Capybara.default_max_wait_time) do
    loop until page.evaluate_script("window.jQuery != undefined")
  end
end

twalpole · 2020-06-14T17:49:34Z

@andrewr224 using Timeout.timeout (the most dangerous method in Ruby) is asking for issues - Because of the way it works (second thread aborting the main thread) it can leave the network stack in an unpredictable state causing all further tests to fail too. This may be acceptable in your use case, but should definitely not be recommended as a general solution.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Page#wait_for_loaded not reliable? #51

Page#wait_for_loaded not reliable? #51

PandaWhisperer commented Apr 25, 2020 •

edited

Loading

twalpole commented Apr 25, 2020 •

edited

Loading

PandaWhisperer commented Apr 26, 2020 •

edited

Loading

andrewr224 commented Jun 9, 2020

twalpole commented Jun 14, 2020 •

edited

Loading

Page#wait_for_loaded not reliable? #51

Page#wait_for_loaded not reliable? #51

Comments

PandaWhisperer commented Apr 25, 2020 • edited Loading

Background

Issue

Discussion

twalpole commented Apr 25, 2020 • edited Loading

PandaWhisperer commented Apr 26, 2020 • edited Loading

andrewr224 commented Jun 9, 2020

twalpole commented Jun 14, 2020 • edited Loading

PandaWhisperer commented Apr 25, 2020 •

edited

Loading

twalpole commented Apr 25, 2020 •

edited

Loading

PandaWhisperer commented Apr 26, 2020 •

edited

Loading

twalpole commented Jun 14, 2020 •

edited

Loading