Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor to fix erase_water() #172 + #151 + #184 + improve error checking w/ rlang #173

Open
wants to merge 93 commits into
base: master
Choose a base branch
from

Conversation

elipousson
Copy link

@elipousson elipousson commented Aug 4, 2023

This is the updated scope of changes in this pull request:

Run `roxygen2md::roxygen2md()` to apply markdown formatting
For consistency w/ docs + other functions
Also adds is_sf helper
- Add set_tigris_year and check_tigris_year functions to set default year + validate input year parameters
- Add check_tigris_resolution to consolidate checks for resolution parameter
- Add allow_null parameter to validate_state and validate_county
- Reverse prior commit exposing the cb parameter in erase_water
- Pass integer values to sprintf and substr (appears to work fine)
@elipousson
Copy link
Author

After catching the two outdated default years in erase_water() and coastline(), I realized that it might be easier to maintain the package if you didn't need to update that number in so many places.

After a close look at the code, I think I was able to consolidate as of the basic year validation into two small helper functions (set_tigris_year() and check_tigris_year()). I also eliminated the distinction between the cyear variable and the year variable since sprintf and substr both seem to handle integer inputs fine and check_tigris_year is now handling validation so it should always be a valid integer.

I also added the allow_null parameter to validate_state() and validate_county() to remove duplicative error checking code and the check_tigris_resolution() helper function to do the same thing.

Hope this doesn't complicate review for the pull request! Let me know if you want me to make an changes or restart with a clean pull request that only addresses the issue with erase_water().

@elipousson
Copy link
Author

The one other thing worth mentioning is that since you are already importing {dplyr} it may be worth swapping all of your stop() functions for cli::cli_abort() or rlang::abort().

You could also potentially integrate the standalone input checking functions that can be imported with usethis::use_standalone("r-lib/rlang", "types-check").

I've adopted those type checking functions in several packages now and they are really handy – happy to open an issue if you wanted to discuss or make the changes on this pull request if it is a welcome suggestion.

@walkerke
Copy link
Owner

Awesome! A lot of the tigris infrastructure is old, dating back to when I first wrote the package several years ago. I like what you are doing to modernize it. I should have some time to review later this week.

Also swap misnamed check_resolution for check_tigris_resolution
Simplifies logic in `input_to_wkt()` since filters were all converted to sfc regardless
@elipousson
Copy link
Author

I caught a typo in my original pull request but couldn't leave well enough alone and did some more (hopefully welcome) refactoring. I noticed there was a lot of duplication in the way the TIGER URLs were built so I added a helper function url_tiger() that I think makes the URL logic a bit easier to read.

I also spotted that input_to_wkt() could be modified to support sfc input object pretty easily (both bbox and sf objects were converted to sfc before being converted to well-known text) so I went ahead and made that tweak as well.

Also add call parameter to internal helper functions (and start incorporating abort, warn, and inform from rlang)

- set_tigris_year
- check_tigris_year
- input_to_wkt
Also:

- drop uuid from Imports (unused) + add rlang to imports
- update version + date to match main
- add Eli Pousson as contributor
Note that tigris-package replaces tigris but tigris is used as alias
Also add pad_str helper function and replace str_trim w/ trimws (allows dropping stringr as dependency)
- Replace stop, warning, and message throughout
- Use check_cb_year function to reduce duplication
Sets max year and default year to 2021 for core_based_statistical_areas
Adds validation for benchmark and vintage
Only check resolution in congressional_districts when cb=TRUE

Also fix possible regression for state_legislative_district when cb=TRUE and year is not 2010 (url was being set and then over-written)

Also validate year in voting_districts w/ min of 2012, max of 2020, and default of 2020
Use new arguments to improve checks w/in legislative functions
@elipousson
Copy link
Author

I went a bit over-board for what started as a small pull request but I'm hoping to review and address the merge conflicts next week. The hopefully welcome addition is that I'm building out the tests to double-check that the refactored error checking is all working. I'll update the PR description and name with a better summary.

I think it actually should be pretty helpful in making maintenance easier but I'd welcome feedback or suggestions.

Fix urban_areas to handle 2020 data when cb = TRUE
Adds new test for shift_geometry and swaps basic class tests for snapshot tests for states + landmarks
@elipousson
Copy link
Author

I thought it was ready for review before but now I really think it is ready for review.

I updated NEWS to include a summary of changes in this PR (but not a full summary of all changes since the 2.0.0 version).

I also continued to expand test coverage up to >50%. This was helpful in catching some regressions or bugs in the new features.

Also fix bug created by using `is_coverage = TRUE`

Also swap warn for inform
Add handling for invalid geometry

Also prep filter_by geometry before passing to county and area_water repeatedly (for modest performance benefit, I think)

Also expose arg and call parameter for erase_water

And minor clean-up for error in load_tiger + fix spacing for arguments
@elipousson
Copy link
Author

@walkerke Did you have a chance to take a look at this?

Happy to make further changes or just open a new PR with a more limited scope if you wanted part but not all of these changes. I could open a PR with just the tests, for example, and then you could use them to make sure the larger PR doesn't break anything that works with the existing package as it is. Thanks for your work on this package as always!

Clean up excess line lengths, single quotes, and spacing issues flagged by lintr
Excess line lengths, single quotes, and spacing issues flagged by lintr
@walkerke
Copy link
Owner

Thanks @elipousson - just need to get a couple projects behind me and I should be able to go through this in early June. Thanks for your work on this!

@elipousson
Copy link
Author

Thanks for the update! I’d seen online that you had a ton of workshops and events this spring but wanted to make sure I didn’t cross wires with your review if I caught any more bugs or adding any more tests.

elipousson added 16 commits May 20, 2024 23:57
Allow multiple states or multiple counties w/in a single state

Allow mixing FIPS codes, names and abbreviations in a single input parameter

Improve errors on invalid inputs using arg_match

Also fix missing match for side parameter in pad_str
Also fix issue w/ how validate_county would warn in case of invalid inputs

Also fix typo (used error_call instead of caller_env) and add `check_tigris_arg()` helper function
Add `tiger_download()` and `tiger_read()` helper functions to reduce duplication
Also add list_check_all_crs + st_is_all_valid helper functions

Update docs for additional arguments to include crs and query arguments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants