Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hit_api #20

Closed
mikemahoney218 opened this issue Feb 12, 2021 · 5 comments
Closed

hit_api #20

mikemahoney218 opened this issue Feb 12, 2021 · 5 comments
Labels
reviewer comment wontfix This will not be worked on

Comments

@mikemahoney218
Copy link
Member

hit_national_map_api.R

* I suggest using `httr::RETRY("GET", ...)`  rather then a tryCatch and counters, this will greatly simplify your logic. For example:
#Set UP
url = httr::parse_url("https://elevation.nationalmap.gov/arcgis/rest/services/3DEPElevation/ImageServer/exportImage")

query_arg <- list(
      bboxSR = 4326,
      imageSR = 4326,
      size = paste(8000, 8000, sep = ","),
      format = "tiff",
      pixelType = "F32",
      noDataInterpretation = "esriNoDataMatchAny",
      interpolation = "+RSP_BilinearInterpolation",
      f = "json"
    )

## 
first_corner <- bbox@bl
second_corner <- bbox@tr

bbox_arg <- list(bbox = paste(
    min(first_corner@lng, second_corner@lng),
    min(first_corner@lat, second_corner@lat),
    max(second_corner@lng, first_corner@lng),
    max(second_corner@lat, first_corner@lat),
    sep = ","
  ))

####======
#^all the above could be replaces with paste(st_bbox(...), collapse = ",")

You have a lot of logic wrapping your httr calls which is essentially:

res <- httr::GET(url, query = c(bbox_arg, query_arg))
content = httr::content(res, type = "application/json")

An identical, but way to build in some of that error catching and retrying would be to change it to:

res2 <- httr::RETRY("GET", url, 
                   query = c(bbox_arg, query_arg), 
                   times = 10, pause_cap = 10)
@mikemahoney218
Copy link
Member Author

I am almost entirely certain that httr::RETRY never returns for one of these endpoints, for reasons I never got to the bottom of -- worth finding the commit instead of running into the same issue again

@mikemahoney218
Copy link
Member Author

Found the commit at least, but past me was not kind enough to leave comments about the issue
f535f4b

@mikemahoney218
Copy link
Member Author

Screenshot from 2021-02-15 10-56-24

Ok, found (part of) why I removed httr::RETRY initially. There's a real performance impact when using RETRY for reasons I don't entirely understand -- there's a huge performance hit to RETRY vs my manual backoff implementation, and RETRY occasionally fails with very unfriendly (to me and the debugger) stacks (while it's downright rare to see similar failures in the current version)

@mikemahoney218
Copy link
Member Author

(top is retry rewrite, bottom is mine)

@mikemahoney218
Copy link
Member Author

Implemented httr::RETRY on branch:
https://github.com/mikemahoney218/terrainr/tree/httr_retry

CI has not been kind to this implementation (which I'm legitimately surprised by, since "it ran on my machine"). Notably, the tests are failing on different steps -- the code essentially "works", just is incredibly flaky:

Even ignoring the test suite failures, it's worth noting the time difference between the partial test suite runs with httr::RETRY versus without. I pushed a quick commit to reviewer_comments in order to get closer comparisons for the timing (adjust for server load etc) -- while it feels rude to benchmark someone else's public API, I'm fine with engineering a "natural experiment" :)

The timings below only reflect R CMD check, which isn't cached between builds.

I run the same backoff protocol as httr does, so it shouldn't be related to that.

I'm pretty sure that part of the reason for both the additional failures and the increased run time is that get_href is solving for both res and body -- the two combined get 15 tries, as if the first stage fails there's no point in advancing to the second. That said, you still see a significant time hit just by changing the last GET to RETRY (which really should be an in-place substitution) so 🤷 I honestly don't know. I'm going to stick with the thing that works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
reviewer comment wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

1 participant