-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
findNLDI returns error when API return empty json #623
Comments
Possibly related, or at least could be resolved in the same place in the code internetofwater/nldi-services#315 |
@dblodgett-usgs or @mikejohnson51 can one of you take a look? |
Here is a working option that I can send as a PR: library(dataRetrieval)
findNLDI(wqp = "TCEQMAIN-10016",
nav = "UM",
find = "nwissite",
distance_km = 2,
no_sf = FALSE)
#> Simple feature collection with 1 feature and 6 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -101.3427 ymin: 35.7422 xmax: -101.3427 ymax: 35.7422
#> Geodetic CRS: WGS 84
#> # A tibble: 1 × 7
#> sourceName identifier comid name X Y geometry
#> <chr> <chr> <chr> <chr> <dbl> <dbl> <POINT [°]>
#> 1 Water Quality Po… TCEQMAIN-… 1997… DIXO… -101. 35.7 (-101.3427 35.7422) Created on 2022-07-08 by the reprex package (v2.0.1) The "design" choice I used in this NLDI client was that if data didn't exist it would not be returned. So in the above case, since no upstream gages were found in 2km, there is no What do you think? |
@mikejohnson51 |
Awesome - now we're looking at something like this: library(dataRetrieval)
xx = findNLDI(wqp = "TCEQMAIN-10016",
nav = "UM",
find = "nwissite",
distance_km = 2,
no_sf = FALSE)
#> No data found for: nwissite?distance=2
xx
#> Simple feature collection with 1 feature and 6 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -101.3427 ymin: 35.7422 xmax: -101.3427 ymax: 35.7422
#> Geodetic CRS: WGS 84
#> # A tibble: 1 × 7
#> sourceName identifier comid name X Y geometry
#> <chr> <chr> <chr> <chr> <dbl> <dbl> <POINT [°]>
#> 1 Water Quality Po… TCEQMAIN-… 1997… DIXO… -101. 35.7 (-101.3427 35.7422) Created on 2022-07-08 by the reprex package (v2.0.1) |
The response of {
"type": "FeatureCollection",
"features": []
} I would expect findNLDI to handle it normally from there. I will create an issue in the NLDI repository to correct this. |
@mps9506 and @EthanGrahn , do you all want to try installing like this: remotes::install_github("USGS-R/dataRetrieval") and let us know if you think we can close this issue? |
I've updated the NLDI to output my previous example for empty results. This seems to introduce a different set of errors. It may be an issue with my tests because I don't typically work with R. Can anybody else confirm this? From what I can tell, |
I updated to the development version and received the following error: > findNLDI(wqp = "TCEQMAIN-10016",
+ nav = "UM",
+ find = "nwissite",
+ distance_km = 2,
+ no_sf = FALSE)
Error in if (sf::st_geometry_type(tmp)[1] == "POINT") { :
missing value where TRUE/FALSE needed As @EthanGrahn notes, I believe this is because > df <- readLines("https://labs.waterdata.usgs.gov/api/nldi/linked-data/wqp/TCEQMAIN-10016/navigation/UM/nwissite?f=json&distance=2")
> sf::read_sf(df)
Simple feature collection with 0 features and 0 fields
Bounding box: xmin: NA ymin: NA xmax: NA ymax: NA
Geodetic CRS: WGS 84
# A tibble: 0 × 1
# … with 1 variable: geometry <GEOMETRY [°]>
> sf::read_sf(df) |> sf::st_coordinates()
[,1] [,2] sessionInfo: > sessionInfo()
R version 4.2.1 (2022-06-23 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dataRetrieval_2.7.11.1
loaded via a namespace (and not attached):
[1] Rcpp_1.0.8.3 compiler_4.2.1 pillar_1.7.0 class_7.3-20 remotes_2.4.2
[6] tools_4.2.1 jsonlite_1.8.0 lifecycle_1.0.1 tibble_3.1.7 pkgconfig_2.0.3
[11] rlang_1.0.3 DBI_1.1.3 cli_3.3.0 rstudioapi_0.13 curl_4.3.2
[16] e1071_1.7-11 dplyr_1.0.9 httr_1.4.3 xml2_1.3.3 generics_0.1.3
[21] vctrs_0.4.1 classInt_0.4-7 grid_4.2.1 tidyselect_1.1.2 glue_1.6.2
[26] sf_1.0-7 R6_2.5.1 fansi_1.0.3 purrr_0.3.4 magrittr_2.0.3
[31] ellipsis_0.3.2 units_0.8-0 assertthat_0.2.1 KernSmooth_2.23-20 utf8_1.2.2 |
This raises another question on my end. Would it be better for the end user to receive this empty collection, or be returned a 404 instead? As a comparison, searching for an invalid source or ID will return a 404 with an error message. https://labs.waterdata.usgs.gov/api/nldi/linked-data/wqp/fake-id?f=json |
@mps9506 I'll need to dig into why you get that error, on my end I get the desired output and dont see any glaring differences in the package versions: library(dataRetrieval)
findNLDI(wqp = "TCEQMAIN-10016",
nav = "UM",
find = "nwissite",
distance_km = 2,
no_sf = FALSE)
#> No data found for: nwissite?f=json&distance=2
#> Simple feature collection with 1 feature and 6 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -101.3427 ymin: 35.7422 xmax: -101.3427 ymax: 35.7422
#> Geodetic CRS: WGS 84
#> # A tibble: 1 × 7
#> sourceName identifier comid name X Y geometry
#> <chr> <chr> <chr> <chr> <dbl> <dbl> <POINT [°]>
#> 1 Water Quality Po… TCEQMAIN-… 1997… DIXO… -101. 35.7 (-101.3427 35.7422)
sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur 10.16
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] dataRetrieval_2.7.11.1
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.8.3 pillar_1.7.0 compiler_4.1.0 highr_0.9
#> [5] class_7.3-20 R.methodsS3_1.8.2 R.utils_2.12.0 tools_4.1.0
#> [9] digest_0.6.29 jsonlite_1.8.0 evaluate_0.15 lifecycle_1.0.1
#> [13] tibble_3.1.7 R.cache_0.15.0 pkgconfig_2.0.3 rlang_1.0.3
#> [17] reprex_2.0.1 DBI_1.1.3 cli_3.3.0 rstudioapi_0.13
#> [21] curl_4.3.2 yaml_2.3.5 xfun_0.31 fastmap_1.1.0
#> [25] e1071_1.7-11 dplyr_1.0.9 withr_2.5.0 styler_1.7.0
#> [29] stringr_1.4.0 httr_1.4.3 knitr_1.39 xml2_1.3.3
#> [33] generics_0.1.3 fs_1.5.2 vctrs_0.4.1 tidyselect_1.1.2
#> [37] grid_4.1.0 classInt_0.4-7 glue_1.6.2 sf_1.0-7
#> [41] R6_2.5.1 fansi_1.0.3 rmarkdown_2.14 purrr_0.3.4
#> [45] magrittr_2.0.3 units_0.8-0 ellipsis_0.3.2 htmltools_0.5.2
#> [49] assertthat_0.2.1 KernSmooth_2.23-20 utf8_1.2.2 proxy_0.4-27
#> [53] stringi_1.7.6 crayon_1.5.1 R.oo_1.25.0 Created on 2022-07-11 by the reprex package (v2.0.1) |
I was just testing this and I think I can reproduce both results. Initially I didn't have > findNLDI(wqp = "TCEQMAIN-10016",
+ nav = "UM",
+ find = "nwissite",
+ distance_km = 2,
+ no_sf = FALSE)
sourceName identifier comid name
1 Water Quality Portal TCEQMAIN-10016 19970652 DIXON CREEK NE OF BORGER
X Y
1 -101.3427 35.7422 When I installed > xx = findNLDI(wqp = "TCEQMAIN-10016",
+ nav = "UM",
+ find = "nwissite",
+ distance_km = 2,
+ no_sf = FALSE)
Error in if (sf::st_geometry_type(tmp)[1] == "POINT") { :
missing value where TRUE/FALSE needed |
I have this on todays todo list. I have sf installed though so I think I'm missing something else 😭 Here is a reprex using when
|
Work with 0 length responses more appropriatly in #623
@mps9506 and @EthanGrahn can you try installing again and see if it looks better? remotes::install_github("USGS-R/dataRetrieval") |
I think this works well. I hesitate to ask since I'm not doing any of the work here. But should Valid result: findNLDI(wqp = "TCEQMAIN-10032",
nav = "UM",
find = "nwissite",
distance_km = 2,
no_sf = TRUE) |>
str()
# List of 2
# $ origin :'data.frame': 1 obs. of 6 variables:
# ..$ sourceName: chr "Water Quality Portal"
# ..$ identifier: chr "TCEQMAIN-10032"
# ..$ comid : chr "19972126"
# ..$ name : chr "CANADIAN RIVER AT US 60-83"
# ..$ X : num -100
# ..$ Y : num 35.9
# $ UM_nwissite:'data.frame': 1 obs. of 8 variables:
# ..$ sourceName: chr "NWIS Surface Water Sites"
# ..$ identifier: chr "USGS-07228000"
# ..$ comid : chr "19972126"
# ..$ measure : num 79.5
# ..$ reachcode : chr "11090106000027"
# ..$ name : chr "Canadian Rv nr Canadian, TX"
# ..$ X : num -100
# ..$ Y : num 35.9 compared to: findNLDI(wqp = "TCEQMAIN-10016",
nav = "UM",
find = "nwissite",
distance_km = 2,
no_sf = FALSE) |>
str()
# sf [1 × 7] (S3: sf/tbl_df/tbl/data.frame)
# $ sourceName: chr "Water Quality Portal"
# $ identifier: chr "TCEQMAIN-10016"
# $ comid : chr "19970652"
# $ name : chr "DIXON CREEK NE OF BORGER"
# $ X : num -101
# $ Y : num 35.7
# $ geometry :sfc_POINT of length 1; first list element: 'XY' num [1:2] -101.3 35.7
# - attr(*, "sf_column")= chr "geometry"
# - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA
# ..- attr(*, "names")= chr [1:6] "sourceName" "identifier" "comid" "name" ...
# Warning message:
# No data returned for: https://labs.waterdata.usgs.gov/api/nldi/linked-data/wqp/TCEQMAIN-10016/navigation/UM/nwissite?f=json&distance=2
|
I think it probably should -- I've only ever worked with no_sf TRUE or FALSE in isolation though so never really thought about the difference. Good catch. What do you think @ldecicco-USGS ? |
This issue here is that in |
Right -- but why not return a list with one element when you just have that origin to return? |
🤷 personal preference haha. I don't like the extra structure when its not needed. I'm happy to change it though if there consensus it makes more sense. |
Sorry I didn't really voice an opinion on this one. I think @dblodgett-usgs 's suggestion of the list of one makes sense because then scripts won't break if you have one or many returns. @mikejohnson51 does that sound OK? (and more importantly, can you make that change?) |
@dblodgett-usgs @mikejohnson51 I'm hoping to do an update to dataRetrieval in the next couple of weeks. Should we switch this to a list of 1 element so that the user doesn't can assume a consistent type of return no matter what the size? |
I think that is the right approach, yeah. |
PR incoming 😃 library(dataRetrieval)
findNLDI(wqp = "TCEQMAIN-10016",
nav = "UM",
find = "nwissite",
distance_km = 20,
no_sf = FALSE,
warn = FALSE)
#> $origin
#> Simple feature collection with 1 feature and 6 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -101.3427 ymin: 35.7422 xmax: -101.3427 ymax: 35.7422
#> Geodetic CRS: WGS 84
#> # A tibble: 1 × 7
#> sourceName ident…¹ comid name X Y geometry
#> <chr> <chr> <chr> <chr> <dbl> <dbl> <POINT [°]>
#> 1 Water Quality Portal TCEQMA… 1997… DIXO… -101. 35.7 (-101.3427 35.7422)
#> # … with abbreviated variable name ¹identifier
#>
#> $UM_nwissite
#> Simple feature collection with 1 feature and 8 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -101.351 ymin: 35.66476 xmax: -101.351 ymax: 35.66476
#> Geodetic CRS: WGS 84
#> # A tibble: 1 × 9
#> sourceName identifier comid measure reach…¹ name X Y
#> <chr> <chr> <chr> <dbl> <chr> <chr> <dbl> <dbl>
#> 1 NWIS Surface Water Sites USGS-07227920 1997… 56.0 110901… Dixo… -101. 35.7
#> # … with 1 more variable: geometry <POINT [°]>, and abbreviated variable name
#> # ¹reachcode
findNLDI(wqp = "TCEQMAIN-10016",
nav = "UM",
find = "nwissite",
distance_km = 2,
no_sf = TRUE,
warn = FALSE)
#> $origin
#> sourceName identifier comid name
#> 1 Water Quality Portal TCEQMAIN-10016 19970652 DIXON CREEK NE OF BORGER
#> X Y
#> 1 -101.3427 35.7422 Created on 2022-11-01 by the reprex package (v2.0.1) |
… USGS-R/dataRetrieval DOI-USGS#623
Hi,
This isn't fully a bug.
Currently
findNLDI
returns a somewhat cryptic message and errors when making a query that returns nothing (valid query but nothing is found).For example I know the following will return nothing:
https://labs.waterdata.usgs.gov/api/nldi/linked-data/wqp/TCEQMAIN-10016/navigation/UM/nwissite?f=json&distance=2
Here is what
findNLDI
returns when it encounters the same thing:I think for a user expecting a response, the current error and warning messages are hard to troubleshoot. The previous iteration of
navigate_nldi
in nhdPlusTools returned an empty tibble in a similar scenario: DOI-USGS/nhdplusTools#97. Having a message indicating that no features were returned and returning a NULL or empty object/list might be beneficial for users.As always, thanks for this awesome package!!! Happy to make a pull request if there is a way you prefer this handled.
The text was updated successfully, but these errors were encountered: