get_characteristics_metadata() failed to find the file #365

Closed
lindsayplatt opened this issue Oct 30, 2023 · 20 comments

Comments

@lindsayplatt

I believe this is similar to #343 as it has to do with the file paths.

The issue

When I try to list the metadata attributes, I get the following error:

library(nhdplusTools)
get_characteristics_metadata()

Error in curl::curl_fetch_disk(url, x$path, handle = handle): Failed to open file C:\Users\CRAN\AppData\Roaming\R\data\R\nhdplusTools\metadata_table.tsv.
Request failed [ERROR]. Retrying in 1 seconds...
Error in curl::curl_fetch_disk(url, x$path, handle = handle): Failed to open file C:\Users\CRAN\AppData\Roaming\R\data\R\nhdplusTools\metadata_table.tsv.
Request failed [ERROR]. Retrying in 2.8 seconds...
NULL
Warning messages:
1: In dir.create(dirname(f), recursive = TRUE) :
  cannot create dir 'C:\Users\CRAN', reason 'Permission denied'
2: In get_characteristics_metadata() :
  Problem getting metadata, no internet?

I think the key here is this chunk: Failed to open file C:\Users\CRAN\AppData\Roaming\R\data\R\nhdplusTools\metadata_table.tsv. That file path does not exist on my computer.

I saw that you mentioned the code is using whatever is returned from tools::R_user_dir("nhdplusTools") but my nhdplusTools dir and that one are different:

> tools::R_user_dir("nhdplusTools")
[1] "C:\\Users\\lrtta\\AppData\\Roaming/R/data/R/nhdplusTools"
> nhdplusTools_data_dir()
[1] "C:\\Users\\CRAN\\AppData\\Roaming/R/data/R/nhdplusTools"
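
A quick way to check for this mismatch on another machine, as a minimal sketch: it only uses the two calls shown above plus base R, and the variable names (user_dir, pkg_dir, same_dir) are illustrative.

```r
# Default per-user data directory that R computes for this package name.
user_dir <- tools::R_user_dir("nhdplusTools", which = "data")

# Only query the package if it is installed; otherwise skip the comparison.
if (requireNamespace("nhdplusTools", quietly = TRUE)) {
  pkg_dir <- nhdplusTools::nhdplusTools_data_dir()

  # Normalize separators so "\\" vs "/" does not cause a false mismatch.
  same_dir <- identical(
    normalizePath(pkg_dir, winslash = "/", mustWork = FALSE),
    normalizePath(user_dir, winslash = "/", mustWork = FALSE)
  )

  if (!same_dir) {
    message("Data directory mismatch:\n  package:    ", pkg_dir,
            "\n  R user dir: ", user_dir)
  }
}
```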

Current workaround

For now, I can run this before my command to get around this issue.

nhdplusTools::nhdplusTools_data_dir(tools::R_user_dir("nhdplusTools"))
head(get_characteristics_metadata())

Version info

I am using nhdplusTools v1.0.0 in R version 4.2.1 (2022-06-23) and RStudio 2023.09.0 Build 463.

@dblodgett-usgs
Collaborator

Sorry to be slow here -- I've been flustered by this issue for a while as it's been tricky to reproduce.

I'm going to try something and see if it fixes it for others.

@dblodgett-usgs
Collaborator

I now see:

Restarting R session...

> nhdplusTools::nhdplusTools_data_dir()
[1] "C:\\Users\\dblodgett\\AppData\\Roaming/R/data/R/nhdplusTools"
> library(nhdplusTools)
> nhdplusTools_data_dir()
[1] "C:\\Users\\dblodgett\\AppData\\Roaming/R/data/R/nhdplusTools"

dblodgett-usgs added a commit to dblodgett-usgs/nhdplusTools that referenced this issue Dec 8, 2023
@lindsayplatt
Author

I installed v1.0.1 using remotes::install_github() and now those two commands produce the same output 🎉 I can run get_characteristics_metadata() without needing to set nhdplusTools_data_dir() first. Thanks for figuring this out!

@dblodgett-usgs
Copy link
Collaborator

Great!! It will be in the next CRAN release then.

@lindsayplatt
Author

I am once again running up against the file path issue, only this time it is with get_flowline_index(..., flines = "download_nhdplusv2"). There seems to be a hardcoded path that refers to a "CRAN" user, and I can't get around it. This is blocking me pretty severely right now, as I need to match a new set of USGS sites to COMIDs.

I am using v1.1.0

The setup:

library(nhdplusTools)
library(sf)

example_pt_sf <- structure(list(site_no = "01095220", tz_cd = "EST", 
                                geometry = structure(list(
                                  structure(c(-71.790667835997, 42.410900479658), class = c("XY", "POINT", "sfg"))), 
                                  class = c("sfc_POINT", "sfc"), precision = 0, 
                                  bbox = structure(c(xmin = -71.790667835997, ymin = 42.410900479658, xmax = -71.790667835997, ymax = 42.410900479658), class = "bbox"), 
                                  crs = structure(list(input = "EPSG:4326", wkt = "GEOGCRS[\"WGS 84\",\n    ENSEMBLE[\"World Geodetic System 1984 ensemble\",\n        MEMBER[\"World Geodetic System 1984 (Transit)\"],\n        MEMBER[\"World Geodetic System 1984 (G730)\"],\n        MEMBER[\"World Geodetic System 1984 (G873)\"],\n        MEMBER[\"World Geodetic System 1984 (G1150)\"],\n        MEMBER[\"World Geodetic System 1984 (G1674)\"],\n        MEMBER[\"World Geodetic System 1984 (G1762)\"],\n        MEMBER[\"World Geodetic System 1984 (G2139)\"],\n        ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n            LENGTHUNIT[\"metre\",1]],\n        ENSEMBLEACCURACY[2.0]],\n    PRIMEM[\"Greenwich\",0,\n        ANGLEUNIT[\"degree\",0.0174532925199433]],\n    CS[ellipsoidal,2],\n        AXIS[\"geodetic latitude (Lat)\",north,\n            ORDER[1],\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n        AXIS[\"geodetic longitude (Lon)\",east,\n            ORDER[2],\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n    USAGE[\n        SCOPE[\"Horizontal component of 3D system.\"],\n        AREA[\"World.\"],\n        BBOX[-90,-180,90,180]],\n    ID[\"EPSG\",4326]]"), class = "crs"),
                                  n_empty = 0L), 
                                tar_group = 1L), 
                           sf_column = "geometry", 
                           agr = structure(c(site_no = NA_integer_, tz_cd = NA_integer_, tar_group = NA_integer_), 
                                           levels = c("constant", "aggregate", "identity"), class = "factor"), 
                           row.names = 1L, class = c("sf", "data.frame"))

The failing command:

# This is failing because it is attempting to look in a folder under a user called "CRAN"
get_flowline_index(points = example_pt_sf, 
                   flines = "download_nhdplusv2")

Spherical geometry (s2) switched off
although coordinates are longitude/latitude, st_intersects assumes that they are planar
Spherical geometry (s2) switched on
Error in file(file, mode) : cannot open the connection
In addition: Warning message:
In file(file, mode) :
  cannot open file 'C:\Users\CRAN\AppData\Roaming\R\data\R\nhdplusTools/e983b9af29673b1e': No such file or directory

I attempted to use the previous workaround, where I force the data directory to change, but that did not seem to work. It appears to still be looking for a file path under "CRAN".

nhdplusTools::nhdplusTools_data_dir(tools::R_user_dir("nhdplusTools"))

# But re-running doesn't seem to do the trick
get_flowline_index(points = example_pt_sf, 
                   flines = "download_nhdplusv2")

Spherical geometry (s2) switched off
although coordinates are longitude/latitude, st_intersects assumes that they are planar
Spherical geometry (s2) switched on
Error in file(file, mode) : cannot open the connection
In addition: Warning message:
In file(file, mode) :
  cannot open file 'C:\Users\CRAN\AppData\Roaming\R\data\R\nhdplusTools/e983b9af29673b1e': No such file or directory

@dblodgett-usgs
Collaborator

Can you try an install from github? I must have missed the fix for this in this case.

@lindsayplatt
Author

That slightly changed the behavior! Still erroring, though. I think it is because the path points to a directory, not a file.

Error in gzfile(file, "rb") : cannot open the connection
In addition: Warning message:
In gzfile(file, "rb") :
  cannot open file 'C:\Users\lrtta\AppData\Roaming\R\data\R\nhdplusTools/e983b9af29673b1e': it is a directory
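
That reading can be checked directly with base R. The sketch below assumes the cache entry sits under tools::R_user_dir("nhdplusTools"), as in the warning above; the key e983b9af29673b1e is copied from that warning and may differ on another machine, and cache_entry and is_dir are illustrative names.

```r
# Path reported in the warning: <user data dir>/<cache key>.
cache_entry <- file.path(tools::R_user_dir("nhdplusTools"), "e983b9af29673b1e")

# TRUE means the "key" is a directory, so readRDS() cannot open it as a file.
is_dir <- dir.exists(cache_entry)
print(is_dir)

# If it is a directory, list what is inside to see what created it.
if (is_dir) {
  print(list.files(cache_entry, recursive = TRUE))
}
```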

@dblodgett-usgs
Collaborator

what in the heck... I'm working on nhdplusTools now and will see if I can get this fixed up quick here. Do you have a deeper traceback to where the error is coming from?

@lindsayplatt
Author

Is this helpful?

> traceback()
9: gzfile(file, "rb")
8: readRDS(file = file.path(path, key))
7: x$get(key)
6: encl$`_cache`$get(key)
5: query_usgs_geoserver(AOI = AOI, ids = comid, type = "nhd", filter = streamorder_filter(streamorder), 
       t_srs = t_srs)
4: get_nhdplus(AOI = sf::st_transform(req, 4326), realization = "flowline")
3: align_nhdplus_names(get_nhdplus(AOI = sf::st_transform(req, 4326), 
       realization = "flowline"))
2: sf::st_transform(align_nhdplus_names(get_nhdplus(AOI = sf::st_transform(req, 
       4326), realization = "flowline")), sf::st_crs(points))
1: get_flowline_index(points = example_pt_sf, flines = "download_nhdplusv2")

@dblodgett-usgs
Collaborator

very.

@dblodgett-usgs
Collaborator

OK, this is actually a subtly different bug from before.

remotes::install_github("doi-usgs/nhdplusTools@memoise-bug")

See if that does it?

@lindsayplatt
Author

Hmmm I still seem to be getting the same error.

Error in gzfile(file, "rb") : cannot open the connection
In addition: Warning message:
In gzfile(file, "rb") :
  cannot open file 'C:\Users\lrtta\AppData\Roaming\R\data\R\nhdplusTools/e983b9af29673b1e': it is a directory

9: gzfile(file, "rb")
8: readRDS(file = file.path(path, key))
7: x$get(key)
6: encl$`_cache`$get(key)
5: query_usgs_geoserver(AOI = AOI, ids = comid, type = "nhd", filter = streamorder_filter(streamorder), 
       t_srs = t_srs)
4: get_nhdplus(AOI = sf::st_transform(req, 4326), realization = "flowline")
3: align_nhdplus_names(get_nhdplus(AOI = sf::st_transform(req, 4326), 
       realization = "flowline"))
2: sf::st_transform(align_nhdplus_names(get_nhdplus(AOI = sf::st_transform(req, 
       4326), realization = "flowline")), sf::st_crs(points))
1: get_flowline_index(points = example_pt_sf, flines = "download_nhdplusv2")

@lindsayplatt
Author

Not sure if these versions matter but I am having trouble updating fastmap, cachem, and digest. I kept choosing to update when I installed this package but for some reason these three won't do it.

fastmap (1.1.1  -> 1.2.0 ) [CRAN]
cachem  (1.0.8  -> 1.1.0 ) [CRAN]
digest  (0.6.34 -> 0.6.35) [CRAN]

I'll keep trying but I don't know if any of them have a feature that is necessary for the updates you made to work.

@dblodgett-usgs
Collaborator

hmmm.... I can't reproduce. Working on it here.

Does C:\Users\lrtta\AppData\Roaming\R\data\R\nhdplusTools/ exist?

Try setting

NHDPLUSTOOLS_MEMOISE_TIMEOUT=3600
NHDPLUSTOOLS_MEMOISE_CACHE=memory

in .Renviron and see what you get after a restart?
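
R reads .Renviron only at startup, so the restart matters. To confirm afterwards that both values were picked up, a base R check along these lines should do; the variable names come from the lines above, and vars and vals are illustrative.

```r
# Environment variables named in the suggestion above.
vars <- c("NHDPLUSTOOLS_MEMOISE_TIMEOUT", "NHDPLUSTOOLS_MEMOISE_CACHE")

# unset = NA makes a missing variable show up as NA instead of "".
vals <- Sys.getenv(vars, unset = NA)
print(vals)

if (anyNA(vals)) {
  message("Not set in this session: ",
          paste(names(vals)[is.na(vals)], collapse = ", "),
          ". Check the .Renviron location and restart R.")
}
```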

@dblodgett-usgs
Collaborator

One other question -- what do you get when you do: (settings <- nhdplusTools_cache_settings()) ?

@lindsayplatt
Author

Does C:\Users\lrtta\AppData\Roaming\R\data\R\nhdplusTools/ exist?

Yes, it does exist and there is a folder and one file in that directory. Inside the folder is the zip of the NHD+ Geodatabase


Try setting
NHDPLUSTOOLS_MEMOISE_TIMEOUT=3600
NHDPLUSTOOLS_MEMOISE_CACHE=memory
in .Renviron and see what you get after a restart?

I just added a .Renviron file with those settings to my current project folder and VOILA! 🎉

> get_flowline_index(points = example_pt_sf, 
+                    flines = "download_nhdplusv2")
Spherical geometry (s2) switched off
although coordinates are longitude/latitude, st_intersects assumes that they are planar
Spherical geometry (s2) switched on
  id   COMID      REACHCODE REACH_meas       offset
1  1 6078267 01070004000476    50.8154 0.0005745375

One other question -- what do you get when you do: (settings <- nhdplusTools_cache_settings()) ?

> (settings <- nhdplusTools_cache_settings())
$mode
$mode$digest
function (...) 
digest::digest(..., algo = algo)
<bytecode: 0x000001d1b8ea9af0>
<environment: 0x000001d1b8ea9620>

$mode$reset
function () 
{
    cache_files <- list.files(path, full.names = TRUE)
    file.remove(cache_files)
}
<bytecode: 0x000001d1b8e9b0e8>
<environment: 0x000001d1b8ea9620>

$mode$set
function (key, value) 
{
    saveRDS(value, file = file.path(path, key), compress = compress)
}
<bytecode: 0x000001d1b8e9acc0>
<environment: 0x000001d1b8ea9620>

$mode$get
function (key) 
{
    readRDS(file = file.path(path, key))
}
<bytecode: 0x000001d1b8e9a7f0>
<environment: 0x000001d1b8ea9620>

$mode$has_key
function (key) 
{
    file.exists(file.path(path, key))
}
<bytecode: 0x000001d1b8e9a470>
<environment: 0x000001d1b8ea9620>

$mode$drop_key
function (key) 
{
    file.remove(file.path(path, key))
}
<bytecode: 0x000001d1b8ea9e70>
<environment: 0x000001d1b8ea9620>

$mode$keys
function () 
list.files(path)
<bytecode: 0x000001d1b8ea98c0>
<environment: 0x000001d1b8ea9620>


$timeout
[1] 86400

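
For context on the "it is a directory" error: the has_key and get functions printed above test with file.exists() and then read with readRDS(). In base R, file.exists() also returns TRUE for directories, so a directory that shares a key's name passes has_key and then fails in get. The sketch below reproduces that in a throwaway temp directory. The names cache_dir, cache_get, and has_key_strict are hypothetical, and the stricter check is only one possible guard, not necessarily what the package ended up doing.

```r
# Throwaway cache directory containing a *directory* named like a cache key.
cache_dir <- file.path(tempdir(), "cache-demo")
dir.create(cache_dir, showWarnings = FALSE)
key <- "e983b9af29673b1e"
dir.create(file.path(cache_dir, key), showWarnings = FALSE)

# Same shape as the has_key/get functions printed above.
has_key   <- function(key) file.exists(file.path(cache_dir, key))
cache_get <- function(key) readRDS(file = file.path(cache_dir, key))

# file.exists() is TRUE for directories, so the cache reports a hit.
found <- has_key(key)

# Reading the "hit" fails; capture the error message instead of stopping.
read_result <- tryCatch(
  suppressWarnings(cache_get(key)),
  error = function(e) conditionMessage(e)
)

# Hypothetical stricter check: the key must exist and must not be a directory.
has_key_strict <- function(key) {
  p <- file.path(cache_dir, key)
  file.exists(p) && !dir.exists(p)
}
found_strict <- has_key_strict(key)

print(c(has_key = found, has_key_strict = found_strict))
print(read_result)

# Clean up the demo directory.
unlink(cache_dir, recursive = TRUE)
```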
dblodgett-usgs added a commit that referenced this issue May 20, 2024
@dblodgett-usgs
Collaborator

OK -- one more commit incoming that might fix this? I'm super confused why the file system cache isn't working. When you get a sec, can you install from that github branch again and try it without setting the environment variables?

@lindsayplatt
Author

Yes, that is working with your latest commit and not setting any environment variables!

@dblodgett-usgs
Collaborator

[GIF: Brad Pitt happy dance]

@dblodgett-usgs
Collaborator

I'll get this on CRAN soon -- I'm getting something new wrapped up and will release both at the same time.
