frictionless: Read and Write Frictionless Data Packages #495

Closed
15 of 28 tasks
peterdesmet opened this issue Jan 3, 2022 · 52 comments

@peterdesmet
Member

peterdesmet commented Jan 3, 2022

Date accepted: 2022-02-10
Submitting Author Name: Peter Desmet
Submitting Author Github Handle: @peterdesmet
Other Package Authors Github handles: @damianooldoni
Repository: https://github.com/frictionlessdata/frictionless-r
Version submitted: 0.9.0
Submission type: Standard

Editor: @melvidoni
Reviewers: @zambujo, @beatrizmilz

Due date for @zambujo: 2022-02-06

Due date for @beatrizmilz: 2022-02-09
Archive: TBD
Version accepted: TBD
Language: en


  • Paste the full DESCRIPTION file inside a code block below:
Package: frictionless
Title: Read and Write Frictionless Data Packages
Version: 0.9.0.9000
Authors@R: c(
    person("Peter", "Desmet", , "[email protected]", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0002-8442-8025")),
    person("Damiano", "Oldoni", , "[email protected]", role = "aut",
           comment = c(ORCID = "0000-0003-3445-7562")),
    person("Research Institute for Nature and Forest (INBO)", , , 
           "[email protected]", role = c("cph"))
  )
Description: Read and write Frictionless Data Packages. A Data Package 
  (<https://specs.frictionlessdata.io/data-package/>) is a simple container 
  format and standard to describe and package a collection of (tabular) data. 
  It is typically used to publish FAIR and open datasets.
License: MIT + file LICENSE
URL: https://github.com/frictionlessdata/frictionless-r,
    https://frictionlessdata.github.io/frictionless-r/
BugReports: https://github.com/frictionlessdata/frictionless-r/issues
Imports:
    assertthat,
    dplyr,
    glue,
    httr,
    jsonlite,
    purrr,
    readr (>= 2.1.0),
    stringr
Suggests:
    knitr,
    hms,
    lubridate,
    testthat (>= 3.0.0),
    rmarkdown
Config/testthat/edition: 3
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.2
VignetteBuilder: knitr

Scope

  • Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • data validation and testing (listed as category, but not in issue template)
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • field and lab reproducibility tools
    • database software bindings
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

frictionless allows users to read and write Frictionless Data Packages, an open and general-purpose standard to structure and describe (tabular) datasets, typically used to publish FAIR datasets. The package allows users to read (local and remote) Data Packages (data retrieval), load their data resources into data frames (data extraction), return errors if the Data Package is malformed (data validation and testing), add data frames as new resources (data munging) and write Data Packages back to disk (data deposition).

  • Who is the target audience and what are scientific applications of this package?

Anyone who wants to read or create datasets structured as Frictionless Data Packages. The community is referred to as the Frictionless Data community and typically includes researchers, data scientists and data engineers, often interested in (publishing) open data.

  • Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?

Yes, datapackage.r: it has an object-oriented design (using a Package class) and offers validation. frictionless on the other hand allows users to quickly read and write Data Package data to and from R data frames, getting out of your way for the rest of your analysis. It is designed to be lightweight, follows tidyverse principles and supports piping. The main functionality (reading data into a data frame, adding a data frame as a resource to a package, writing a Data Package to disk) is offered as functions, rather than the class properties in datapackage.r.

Not applicable

  • If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

Not applicable

Technical checks

Confirm each of the following by checking the box.

Note that the link to the guide for authors above (in the issue template) returns a 404. It should be https://devguide.ropensci.org/authors-guide.html. I tried to use pkgcheck, but I got the error "package ‘pkgcheck’ is not available for this version of R".

This package:

Publication options

  • Do you intend for this package to go on CRAN?

  • Do you intend for this package to go on Bioconductor?

  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

Note that this package falls under the Frictionless Data Code of Conduct.

@ropensci-review-bot
Collaborator

Missing values: author1, repourl, submission-type, language

@peterdesmet
Member Author

@ropensci-review-bot I have now included the missing <!--> tags in the issue body.

@mpadge
Member

mpadge commented Jan 3, 2022

@ropensci-review-bot check package

@ropensci-review-bot
Collaborator

Thanks, about to send the query.

@ropensci-review-bot
Collaborator

🚀

Editor check started

👋

@ropensci-review-bot
Collaborator

Checks for frictionless (v0.9.0.9000)

git hash: dc9daa6a

  • ✔️ Package name is available
  • ✔️ has a 'CITATION' file.
  • ✖️ does not have a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✔️ All functions have examples.
  • ✔️ Package has continuous integration checks.
  • ✔️ Package coverage is 100%.
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: MIT + file LICENSE


1. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in R (100% in 15 files) and
  • 2 authors
  • 1 vignette
  • 1 internal data file
  • 8 imported packages
  • 8 exported functions (median 18 lines of code)
  • 26 non-exported functions in R (median 24 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure value percentile noteworthy
files_R 15 73.0
files_vignettes 1 68.4
files_tests 10 90.7
loc_R 528 49.2
loc_vignettes 119 31.1
loc_tests 1192 89.2
num_vignettes 1 64.8
data_size_total 1364 61.3
data_size_median 1364 66.0
n_fns_r 34 44.1
n_fns_r_exported 8 38.3
n_fns_r_not_exported 26 48.5
n_fns_per_file_r 1 21.7
num_params_per_fn 2 11.9
loc_per_fn_r 20 59.8
loc_per_fn_r_exp 18 42.5
loc_per_fn_r_not_exp 24 70.4
rel_whitespace_R 13 42.4
rel_whitespace_vignettes 41 38.1
rel_whitespace_tests 13 80.4
doclines_per_fn_exp 30 34.5
doclines_per_fn_not_exp 0 0.0 TRUE
fn_call_network_size 46 64.6

1a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


2. goodpractice and other checks

Details of goodpractice and other checks (click to open)

3a. Continuous Integration Badges

R-CMD-check

GitHub Workflow Results

name conclusion sha date
pages build and deployment success 96a3d1 2022-01-03
pkgdown success dc9daa 2022-01-03
R-CMD-check success dc9daa 2022-01-03
test-coverage success dc9daa 2022-01-03

3b. goodpractice results

R CMD check with rcmdcheck

rcmdcheck found no errors, warnings, or notes

Test coverage with covr

Package coverage: 100

Cyclocomplexity with cyclocomp

No functions have cyclocomplexity >= 15

Static code analyses with lintr

lintr found the following 12 potential issues:

message number of times
Lines should not be more than 80 characters. 12


Package Versions

package version
pkgstats 0.0.3.59
pkgcheck 0.0.2.205


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with ✖️ have been resolved.

@stefaniebutland
Member

@lwinfree just fyi, here's the (start of the) rOpenSci software peer review thread for the "frictionless" R package.
Folks, Lilly Winfree is Product Manager @ Frictionless Data.

@peterdesmet
Member Author

A codemeta.json file has now been added.

@ldecicco-USGS

@ropensci-review-bot check package

@ropensci-review-bot
Collaborator

Thanks, about to send the query.

@ropensci-review-bot
Collaborator

🚀

Editor check started

👋

@ropensci-review-bot
Collaborator

Checks for frictionless (v0.9.0.9000)

git hash: 794ca7f6

  • ✔️ Package name is available
  • ✔️ has a 'CITATION' file.
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✔️ All functions have examples.
  • ✔️ Package has continuous integration checks.
  • ✔️ Package coverage is 100%.
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Package License: MIT + file LICENSE


1. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in R (100% in 15 files) and
  • 2 authors
  • 1 vignette
  • 1 internal data file
  • 8 imported packages
  • 8 exported functions (median 18 lines of code)
  • 26 non-exported functions in R (median 24 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure value percentile noteworthy
files_R 15 73.0
files_vignettes 1 68.4
files_tests 10 90.7
loc_R 528 49.2
loc_vignettes 119 31.1
loc_tests 1192 89.2
num_vignettes 1 64.8
data_size_total 1364 61.3
data_size_median 1364 66.0
n_fns_r 34 44.1
n_fns_r_exported 8 38.3
n_fns_r_not_exported 26 48.5
n_fns_per_file_r 1 21.7
num_params_per_fn 2 11.9
loc_per_fn_r 20 59.8
loc_per_fn_r_exp 18 42.5
loc_per_fn_r_not_exp 24 70.4
rel_whitespace_R 13 42.4
rel_whitespace_vignettes 41 38.1
rel_whitespace_tests 13 80.4
doclines_per_fn_exp 30 34.5
doclines_per_fn_not_exp 0 0.0 TRUE
fn_call_network_size 46 64.6

1a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


2. goodpractice and other checks

Details of goodpractice and other checks (click to open)

3a. Continuous Integration Badges

R-CMD-check

GitHub Workflow Results

name conclusion sha date
pages build and deployment success 48cca6 2022-01-04
pkgdown success 794ca7 2022-01-04
R-CMD-check success 794ca7 2022-01-04
test-coverage success 794ca7 2022-01-04

3b. goodpractice results

R CMD check with rcmdcheck

rcmdcheck found no errors, warnings, or notes

Test coverage with covr

Package coverage: 100

Cyclocomplexity with cyclocomp

No functions have cyclocomplexity >= 15

Static code analyses with lintr

lintr found the following 12 potential issues:

message number of times
Lines should not be more than 80 characters. 12


Package Versions

package version
pkgstats 0.0.3.72
pkgcheck 0.0.2.205


Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

@mpadge
Member

mpadge commented Jan 4, 2022

@peterdesmet Thanks for the submission - an editor will be assigned as soon as possible, but it may take a few days.

@mpadge
Member

mpadge commented Jan 6, 2022

@ropensci-review-bot assign @melvidoni as editor

@ropensci-review-bot
Collaborator

Assigned! @melvidoni is now the editor

@melvidoni
Contributor

Hello @peterdesmet, I'll be the handling editor. I'll start looking for reviewers, and let you know once they are assigned. Please bear with me for a bit.

@peterdesmet
Member Author

Hi @melvidoni, 2 questions:

  1. I'm about to merge a PR with updated functionality into the package. Would it be ok if the reviewers review the resulting 0.10.0 version?
  2. Can I add the peer review badge? rOpenSci

@melvidoni
Contributor

Hello @peterdesmet . 1) They will. None of those contacted replied yet, so they will review the latest once they accept. 2) Not yet, once the reviewing process has finished.

@melvidoni
Contributor

@ropensci-review-bot assign @zambujo as reviewer

@ropensci-review-bot
Collaborator

@zambujo added to the reviewers list. Review due date is 2022-02-06. Thanks @zambujo for accepting to review! Please refer to our reviewer guide.

@ropensci-review-bot
Collaborator

@zambujo: If you haven't done so, please fill this form for us to update our reviewers records.

@melvidoni
Contributor

melvidoni commented Jan 16, 2022

Hello @peterdesmet I'm still searching for another reviewer. The reviewing deadline for @zambujo is 2022-02-06

@peterdesmet
Member Author

@melvidoni @zambujo Thanks!

Version 0.10.0 of the package has just been released, which would be the preferred version for review.

@melvidoni
Contributor

Version 0.10.0 of the package has just been released, which would be the preferred version for review.

Yes, that would be the version to review. Could you please make the link clearer and/or merge to master?

@peterdesmet
Member Author

@melvidoni version 0.10.0 has been merged to the default branch (main), but that branch is also used for further development.

To install 0.10.0 specifically (recommended):

devtools::install_github("frictionlessdata/[email protected]")

To install the latest development version (0.10.0.9000):

devtools::install_github("frictionlessdata/frictionless-r")

@melvidoni
Contributor

@ropensci-review-bot assign @beatrizmilz as reviewer

@zambujo

zambujo commented Feb 8, 2022

Dear all, apologies for the delay. Please find my comments below:

Package Review

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s): demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions
  • Examples: (that run successfully locally) for all exported functions
  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software have been confirmed.
  • Performance: Any performance claims of the software have been confirmed.
  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing: 4h

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

The package is part of the frictionlessdata collection of libraries, which facilitates the packaging of tabular text data along with their schemas across different programming languages. The framework provides a set of tools intended to facilitate the creation of "FAIR-compliant" datasets. The umbrella project is led by the Open Knowledge Foundation and the framework is best known for its command-line tool written in Python.

Regarding the R package, the documentation is well organised and complete. The same applies to the unit tests. (Apropos, I like seeing how the authors handle errors with abundant assertions directly on the main code.) I was unable to find any relevant issues and have only a few minor optional suggestions as well as some open points/questions for discussion. All in all, the package has been beautifully crafted. Well done!

Optional suggestions/questions:

(in no particular order)

  • check_path(path) (in utils.R): would it make sense to include protocols other than http by using something like grepl("^http://|^https://|^ftp://|^sftp://", path) instead of starts_with("http")?
  • in unique_sorted() (in utils.R), I was wondering whether stats::aggregate() is necessary. Intuitively I would have used table() from base R: names(sort(table(x), decreasing = TRUE)), provided it passes the unit test.
  • Is there a reason to pass a default value to the file name parameter in read_package()? Certain functions such as bmp() provide names by default with automatic file numbering, but I think it is more common not to provide any default value.
  • would it make sense to replace {readr} with {vroom} to improve reading and writing times for large files?
  • if I understand correctly, when creating a data package, factors are converted to character strings. Would it make sense to package extra dedicated lookup tables to account for the information contained in factor levels and their order?
  • in the future, would it make sense to develop {frictionless-r} more towards being a frictionless-py wrapper (somewhat similarly to spacyr/spacy), to make it easier to keep the R package in sync with the Python project?
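
The first two suggestions above can be sketched in base R as follows. This is only an illustration under stated assumptions: `is_remote_path()` and `unique_sorted_tbl()` are hypothetical helper names, not functions from the package, and the regular expression anchors every protocol at the start of the string.

```r
# Hypothetical helper: TRUE when `path` starts with one of the listed protocols.
is_remote_path <- function(path) {
  grepl("^(https?|s?ftp)://", path)
}

# Hypothetical helper: unique values ordered from most to least frequent.
# table() drops NA values by default, so an all-NA input needs separate handling.
unique_sorted_tbl <- function(x) {
  names(sort(table(x), decreasing = TRUE))
}

is_remote_path(c("https://example.org/datapackage.json", "data/datapackage.json"))
unique_sorted_tbl(c("b", "a", "b", "c", "b", "a"))
```

Whether protocols beyond http and https should be accepted at all depends on what the Data Package specification allows for a "URL or path".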

@melvidoni
Contributor

@ropensci-review-bot submit review #495 (comment) time 3

@ropensci-review-bot
Collaborator

Logged review for beatrizmilz (hours: 3)

@melvidoni
Contributor

@ropensci-review-bot submit review #495 (comment) time 4

@ropensci-review-bot
Collaborator

Logged review for zambujo (hours: 4)

@melvidoni
Contributor

Thank you both @beatrizmilz and @zambujo for the thoughtful reviews!

@peterdesmet please proceed with the outstanding changes whenever you have time. I'll ask both reviewers to stay tuned to see how your changes are being addressed.

@peterdesmet
Member Author

Thanks @zambujo for your review. My feedback:

  • Other protocols for check_path(): Other protocols like FTP could indeed be implemented, but the specs for "URL or path" state that only http, https, and local POSIX paths are allowed. It makes sense though to allow (S)FTP, so I have asked the Frictionless community for guidance.
  • Using table() in unique_sorted(): Nice! More elegant indeed. Only had to update to handle all NA_character_ values.
  • No default value for read_package(): Although it is unlikely that a datapackage.json will be present in the working directory, I'm tempted to keep it, because according to the specs a descriptor file must be named datapackage.json, so the default name hints that the user should provide a path to such a file.
  • vroom: {readr} 2.0.0 and up use {vroom} under the hood. frictionless-r requires readr >= 2.1.0 and thus {vroom}, so I don't think there is going to be a speed difference using {vroom} directly. Even though many functions in {readr} can be exchanged for {vroom} functions, I'll keep {readr} because 1) I communicate in the documentation that {readr} is used by some functions and that is a more well-known package to end users than {vroom} and 2) I rely on readr::guess_encoding().
  • factors: when reading a data package, str/integer/numeric values that have an enum in their schema are converted to factors, with the enum values as levels, in the order they are listed in enum. When writing a data package, factors keep their data type and the levels are written in an enum field. No (re)ordering is done.
    {
      "name": "str_factor",
      "type": "string",
      "constraints": {
        "enum": ["foo", "bar"]
      }
    },
    {
      "name": "num_factor",
      "type": "number",
      "constraints": {
        "enum": [3.1, 3.2, 3.3]
      }
    },
    {
      "name": "int_factor",
      "type": "integer",
      "constraints": {
        "enum": [3, 4, -1]
      }
    }
  • r package as wrapper around Python: An option indeed, but:
    1. Currently out of my area of expertise
    2. Python dev is going fast and currently a bit of a moving target
    3. The main target to keep in sync with are the specs and they luckily don't change that fast.
    4. In my opinion the previous R package {datapackage.r} suffers from being un-R-like, introducing OO concepts that are foreign to most R users, which might happen again on a wrapper.
      However, for Data Package validation specifically (currently not in scope for {frictionless-r}), wrapping around the Python toolbox is likely useful!
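
The factor handling described in the list above can be illustrated with base R alone. This is a minimal sketch rather than the package's implementation: the `enum` and `values` vectors are made up for illustration and mirror the `str_factor` field shown earlier.

```r
# Levels as they would be listed in the field's `enum` constraint (illustrative).
enum <- c("foo", "bar")
# Raw string values as they might be read from a CSV file (illustrative).
values <- c("bar", "foo", "bar", NA)

# Reading: levels follow the enum order, not alphabetical order.
str_factor <- factor(values, levels = enum)
levels(str_factor)

# Writing: the levels can be written back as the enum, without reordering.
enum_out <- levels(str_factor)
identical(enum_out, enum)
```

Because the levels are taken directly from `enum`, a read followed by a write leaves the order of the enum values unchanged.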

@zambujo

zambujo commented Feb 9, 2022

Many thanks @peterdesmet. You have addressed all my comments and questions. Impressive work!

Ps. I have to confess that I had to update my packages to be able to review frictionless-r. Incidentally, I did notice a huge improvement in the performance of {readr} when I ran some code this morning. 🤓

@peterdesmet
Member Author

Thanks @zambujo. The suggested change for unique_values() is implemented in frictionlessdata/frictionless-r#101.

@melvidoni, the comments suggested by @beatrizmilz are addressed in #495 (comment) and where actionable, all implemented in the latest version of the package. Both reviewers were included with rev roles: thanks to you both!

One lingering question I have for the reviewers is the use of the word package. I'm copy/pasting my question from higher up:

  • Masking of usethis::create_package(): Yeah, it is a bit unfortunate that the term package is used for different things in Frictionless vs R (as explained at the start of the vignette). Luckily in R it is often referred to as pkg in function names, reducing masking. In the Frictionless Community "Data Package" does seem to be consistently referred to as package in implementations in other languages, not dp, seldom as datapackage, which is why I adopted that term for frictionless functions and parameters. I think alternatives like create_datapackage(), create_data_package(), create_dataset() are less desirable, but 👉 feedback welcome 👈. Was the term package confusing in any way?

Since you both didn't remark on that, I assume that the word package was not confusing in read_package(), create_package() or write_package(), but I want to make sure.

@melvidoni
Contributor

Okay, given that @zambujo gave the okay, we are only missing @beatrizmilz's comments on the latest changes, and the answer for your question. Let's wait for her, then.

@beatrizmilz

beatrizmilz commented Feb 10, 2022

Hi! Peter, the word package was not confusing for me since there was an explanation in the documentation! I pointed out the masking with usethis because users who only use library() and are not familiar with possible conflicts between functions with the same name can eventually encounter errors and ask for help (and that is not a problem with the package!). I think I thought about that because I answer a lot of questions in forums in Portuguese, and questions about errors caused by masking are frequent.

You have addressed all the questions, and as @zambujo said, this is impressive work. Congratulations!

@melvidoni
Contributor

@ropensci-review-bot approve frictionless

@ropensci-review-bot
Collaborator

Approved! Thanks @peterdesmet for submitting and @zambujo, @beatrizmilz for your reviews! 😁

To-dos:

  • Transfer the repo to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. I have invited you to a team that should allow you to do so.
  • After transfer write a comment @ropensci-review-bot finalize transfer of <package-name> where <package-name> is the repo/package name. This will give you admin access back.
  • Fix all links to the GitHub repo to point to the repo under the ropensci organization.
  • Delete your current code of conduct file if you had one since rOpenSci's default one will apply, see https://devguide.ropensci.org/collaboration.html#coc-file
  • If you already had a pkgdown website and are ok relying only on rOpenSci central docs building and branding,
    • deactivate the automatic deployment you might have set up
    • remove styling tweaks from your pkgdown config but keep that config file
    • replace the whole current pkgdown website with a redirecting page
    • replace your package docs URL with https://docs.ropensci.org/package_name
    • In addition, in your DESCRIPTION file, include the docs link in the URL field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar (website) https://github.com/ropensci/foobar
  • Fix any links in badges for CI and coverage to point to the new repository URL.
  • Increment the package version to reflect the changes you made during review. In NEWS.md, add a heading for the new version and one bullet for each user-facing change, and each developer-facing change that you think is relevant.
  • We're starting to roll out software metadata files to all rOpenSci packages via the Codemeta initiative, see https://docs.ropensci.org/codemetar/ for how to include it in your package, after installing the package - should be easy as running codemetar::write_codemeta() in the root of your package.
  • You can add this installation method to your package README install.packages("<package-name>", repos = "https://ropensci.r-universe.dev") thanks to R-universe.

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent).

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions.

We maintain an online book with our best practices and tips; this chapter starts the 3rd section, which is about guidance for after onboarding (with advice on releases, package marketing, GitHub grooming); the guide also features CRAN gotchas. Please tell us what could be improved.

Last but not least, you can volunteer as a reviewer via filling a short form.

@peterdesmet
Member Author

peterdesmet commented Feb 10, 2022

@melvidoni Is it required to transfer the {frictionless-r} repository to rOpenSci? Because this is also a question of branding. Ping @lwinfree @sapetti9 @roll from Frictionless Data.

  • Package repository URL: I would prefer to keep https://github.com/frictionlessdata/frictionless-r. That way 1) it is clearer that it is endorsed/maintained by Frictionless Data and 2) it keeps living alongside other implementations of Frictionless Data standards, such as Python and JS. -> Keep under frictionlessdata
  • Package website: currently https://frictionlessdata.github.io/frictionless-r/ with generic pkgdown (Bootstrap v5). We could adopt the rOpenSci central docs building and branding, so it looks more like https://docs.ropensci.org/wateRinfo/. That will visually tie it to rOpenSci and require an update of its URL (including a redirect of the old URL). Branding wise, my opinion is that it is fine to adopt the rOpenSci branding, because I hope to add a logo to the package soon, which will visually tie it to Frictionless Data.
  • rOpenSci package family: how can I get the package listed under https://ropensci.org/packages/all/ ?
  • rOpenSci R universe: how can I get the package added to the rOpenSci R universe?
  • Code of conduct: Which code of conduct should we adopt, Frictionless Data or rOpenSci? I think Frictionless Data, since the package is maintained there and conflicts should be resolved there. -> Use Frictionless

Once those questions are answered I can make the necessary changes and then hopefully submit to CRAN! 🎉🤞

TODO based on #495 (comment)

  • Transfer the repo to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. I have invited you to a team that should allow you to do so.
  • After transfer write a comment @ropensci-review-bot finalize transfer of <package-name> where <package-name> is the repo/package name. This will give you admin access back.
  • Fix all links to the GitHub repo to point to the repo under the ropensci organization.
  • Delete your current code of conduct file if you had one since rOpenSci's default one will apply, see https://devguide.ropensci.org/collaboration.html#coc-file
  • Customize sidebar so COC appears there.
  • If you already had a pkgdown website and are ok relying only on rOpenSci central docs building and branding,
    • deactivate the automatic deployment you might have set up
    • remove styling tweaks from your pkgdown config but keep that config file
    • replace the whole current pkgdown website with a redirecting page
    • replace your package docs URL with https://docs.ropensci.org/package_name
    • In addition, in your DESCRIPTION file, include the docs link in the URL field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar (website) https://github.com/ropensci/foobar
  • Fix any links in badges for CI and coverage to point to the new repository URL.
  • Increment the package version to reflect the changes you made during review. In NEWS.md, add a heading for the new version and one bullet for each user-facing change, and each developer-facing change that you think is relevant. Done in https://frictionlessdata.github.io/frictionless-r/news/index.html
  • We're starting to roll out software metadata files to all rOpenSci packages via the Codemeta initiative, see https://docs.ropensci.org/codemetar/ for how to include it in your package, after installing the package - should be easy as running codemetar::write_codemeta() in the root of your package.
  • You can add this installation method to your package README install.packages("<package-name>", repos = "https://ropensci.r-universe.dev") thanks to R-universe.

@lwinfree

Hi all! First of all, it has been really lovely to watch this process unfold, so a big thank you to everyone that has been involved!

Speaking as product manager of Frictionless Data:

  • we would like to keep the repo under the Frictionless organization as Peter suggests
  • we would like to keep the Frictionless code of conduct to not confuse potential users if that is OK

everything else looks great to me!

Thanks!

@peterdesmet
Member Author

Thanks @lwinfree!

@melvidoni What would be the instructions to do the remaining points in #495 (comment) i.e. using the rOpenSci CI and website building for a repo not under rOpenSci (cf. https://github.com/CornellLabofOrnithology/auk/)?

@melvidoni
Contributor

Hello all. Please, bear with me while I discuss with the other Associate Editors. In the meantime, complete what you can, please.

@melvidoni
Contributor

Update @peterdesmet @lwinfree. We are discussing the CoC issue. Will get back to you soon-ish, please bear with us.

@maelle
Member

maelle commented Feb 15, 2022

Thanks for your work on this package. 😸

@maelle
Member

maelle commented Feb 15, 2022

screenshot from rOpenSci website where one gets a card for the frictionless package

https://ropensci.org/packages/all/

@peterdesmet
Member Author

Thanks @maelle!

  • I've added a small .github/CODE_OF_CONDUCT.md page that points to Frictionless Data COC. This makes it appear in sidebar.
  • Rather than setting up a frictionlessdata.github.io repository (required when the repository moves organizations - not the case here), I have placed a redirect index.html on the gh-pages branch of the repository, disabled automatic pkgdown building, but kept GitHub Pages active. https://frictionlessdata.github.io/frictionless-r/ now successfully redirects
  • I have kept some minor styling tweaks to _pkgdown.yml for local pkgdown building. I don't think it will affect building at rOpenSci docs.
  • I have kept the other GitHub Actions (e.g. test-coverage.yaml) intact. Or are those checks provided by rOpenSci CI and thus not necessary in the repository?

@maelle
Member

maelle commented Feb 15, 2022

Thank you!

I have kept some minor styling tweaks to _pkgdown.yml for local pkgdown building. I don't think it will affect building at rOpenSci docs.

Indeed, we override those in https://github.com/ropensci-org/rotemplate

I have kept the other GitHub Actions (e.g. test-coverage.yaml) intact. Or are those checks provided by rOpenSci CI and thus not necessary in the repository?

It is good to keep them indeed. R-universe does build the package but you wouldn't get notified and you can't share credentials for instance. You can see the R-universe status of your package at https://ropensci.r-universe.dev/ui#builds
