Go SDK: Update memfs to parse the List() pattern as a glob, not a regexp #21943

Merged: 2 commits, Jul 3, 2022

Conversation

gonzojive
Contributor

This makes the memfs implementation consistent with the filesystem
implementation of List().

The docstring for filesystem.Interface.List does not specify how the pattern
should be interpreted. It says: "List expands a pattern to a list of
filenames." Perhaps that docstring should be updated to be more specific.
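To make the proposed behavior concrete, here is a minimal sketch of a glob-based listing over an in-memory map. It is not the Beam implementation: `listGlob` and the sample keys are hypothetical, and the choice of `filepath.Match` is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
)

// listGlob returns the keys of files that match the glob pattern.
// Hypothetical stand-in for an in-memory List(); not the Beam code.
func listGlob(files map[string][]byte, pattern string) ([]string, error) {
	var out []string
	for name := range files {
		ok, err := filepath.Match(pattern, name)
		if err != nil {
			return nil, err // malformed pattern, e.g. an unclosed "["
		}
		if ok {
			out = append(out, name)
		}
	}
	sort.Strings(out) // map iteration order is random; sort for stable output
	return out, nil
}

func main() {
	files := map[string][]byte{
		"memfs://logs/a.txt": nil,
		"memfs://logs/b.txt": nil,
		"memfs://logs/c.csv": nil,
	}
	got, err := listGlob(files, "memfs://logs/*.txt")
	fmt.Println(got, err) // only the two .txt keys are listed
}
```

A malformed pattern surfaces as an error rather than an empty result, which mirrors how `filepath.Match` reports `ErrBadPattern`.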


@asf-ci

asf-ci commented Jun 18, 2022

Can one of the admins verify this patch?

@codecov

codecov bot commented Jun 18, 2022

Codecov Report

Merging #21943 (ee8521d) into master (525a169) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master   #21943      +/-   ##
==========================================
+ Coverage   73.98%   74.00%   +0.02%     
==========================================
  Files         702      703       +1     
  Lines       92845    92934      +89     
==========================================
+ Hits        68687    68777      +90     
+ Misses      22903    22892      -11     
- Partials     1255     1265      +10     
| Flag | Coverage Δ |
|---|---|
| go | 50.98% <100.00%> (+0.16%) ⬆️ |


| Impacted Files | Coverage Δ |
|---|---|
| sdks/go/pkg/beam/io/filesystem/memfs/memory.go | 96.15% <100.00%> (+4.15%) ⬆️ |
| sdks/go/pkg/beam/io/fhirio/execute_bundles.go | 56.71% <0.00%> (ø) |
| sdks/go/pkg/beam/runners/dataflow/dataflow.go | 59.77% <0.00%> (+1.11%) ⬆️ |
| sdks/go/pkg/beam/core/runtime/graphx/serialize.go | 27.51% <0.00%> (+1.23%) ⬆️ |
| sdks/go/pkg/beam/io/fhirio/read.go | 82.75% <0.00%> (+2.75%) ⬆️ |
| sdks/go/pkg/beam/core/runtime/graphx/user.go | 42.30% <0.00%> (+42.30%) ⬆️ |
| sdks/go/pkg/beam/io/fhirio/common.go | 59.45% <0.00%> (+59.45%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 525a169...ee8521d.

@github-actions
Contributor

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @damccorm for label go.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

Contributor

@damccorm damccorm left a comment

I'll ask @lostluck to weigh in here since I'm not super up to date on how aggressively we want to work to unify our IO behavior, but I don't think we can take this since it will break any existing users (especially since it's breaking in a silent, potentially data-lossy way).

In a vacuum I agree that unifying behavior is a good idea, so this could be a good candidate if we eventually upgrade major versions, or if we can find a less breaking way to do this (which seems kinda unlikely).
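The silent-breakage concern can be illustrated with a small sketch that evaluates the same pattern string both ways. It assumes the old behavior amounted to an unanchored `regexp` match, which the thread does not spell out; `regexMatch`, `globMatch`, and the sample names are hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
)

// regexMatch treats pattern as a regular expression (assumed old behavior).
// A pattern that does not compile matches nothing.
func regexMatch(pattern, name string) bool {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false
	}
	return re.MatchString(name)
}

// globMatch treats pattern as a shell-style glob (proposed behavior).
func globMatch(pattern, name string) bool {
	ok, err := path.Match(pattern, name)
	return err == nil && ok
}

func main() {
	pattern := "memfs://data/file.*"
	names := []string{
		"memfs://data/file.txt",
		"memfs://data/fileX",
		"memfs://data/files/deep/x",
	}
	for _, n := range names {
		fmt.Printf("%-28s regexp=%-5v glob=%v\n", n, regexMatch(pattern, n), globMatch(pattern, n))
	}
	// A typical glob is not even a valid regular expression.
	fmt.Println(regexMatch("*.txt", "a.txt"), globMatch("*.txt", "a.txt"))
}
```

Under these assumptions, `memfs://data/fileX` matches as a regexp but not as a glob, so a caller relying on the old interpretation would get fewer files without any error. Conversely, `*.txt` does not compile as a regexp at all.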

@lostluck
Contributor

In this instance, I think we should change it.

  1. We don't really demonstrate usage of the memfs anywhere. https://github.com/apache/beam/search?l=Go&q=memfs
  2. We don't have properly defined behavior.
  3. Regex in this case is surprising, especially if coming from the local or gcs file systems.

Looking at the local and gcs implementations, they use filepath.Match/Glob instead of plain path.Match. filepath is OS separator aware.
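As a rough illustration of the separator point, the sketch below runs one pattern through both packages. `matchBoth` is a hypothetical helper; the two standard library calls are real, and the `filepath` result depends on the operating system the program runs on.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

// matchBoth reports how the same pattern matches under path.Match, where
// '/' is always the separator, and filepath.Match, which uses the OS one.
// Pattern errors are ignored here to keep the sketch short.
func matchBoth(pattern, name string) (slashOnly, osAware bool) {
	slashOnly, _ = path.Match(pattern, name)
	osAware, _ = filepath.Match(pattern, name)
	return slashOnly, osAware
}

func main() {
	// '*' never crosses a separator, so what counts as a separator matters.
	name := `dir\file.txt`
	slashOnly, osAware := matchBoth("*", name)
	fmt.Println("path.Match:    ", slashOnly) // backslash is an ordinary character here
	fmt.Println("filepath.Match:", osAware)   // depends on the OS separator
}
```

On Unix-like systems both calls agree for this input; on Windows the backslash is a separator for `filepath.Match`, so `*` stops at it.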

A GitHub search doesn't reveal any usage, and neither does the pkg page for v2. Checking the pre-modules version shows 1 package that hasn't been updated in 3 years and doesn't use memfs explicitly (but does use local and gcs).

As a result, I think a move towards consistency, and documenting the behavior is the right one.

Contributor

@lostluck lostluck left a comment

Small change request, but otherwise this LGTM.

sdks/go/pkg/beam/io/filesystem/memfs/memory.go (outdated, resolved)
Also, accept input without a "memfs://" prefix.
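One way to accept keys with or without the prefix is to normalize them before lookup. This is only a sketch of the idea in the commit title: `normalize` and `scheme` are hypothetical names, and the actual change may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// scheme is the prefix named in the commit message.
const scheme = "memfs://"

// normalize is a hypothetical helper: it adds the scheme when it is
// missing, so prefixed and unprefixed keys refer to the same entry.
func normalize(key string) string {
	if strings.HasPrefix(key, scheme) {
		return key
	}
	return scheme + key
}

func main() {
	fmt.Println(normalize("dir/a.txt"))
	fmt.Println(normalize("memfs://dir/a.txt"))
}
```

Both calls yield the same key, and applying `normalize` twice changes nothing, so it is safe to call on patterns and filenames alike.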
@gonzojive
Contributor Author

I'm not sure what the etiquette is for resolving comments on github, but I think I addressed the comment, so I resolved it.

@github-actions
Contributor

github-actions bot commented Jul 2, 2022

Reminder, please take a look at this pr: @damccorm

Contributor

@lostluck lostluck left a comment

LGTM
Thank you for your patience and sorry for the delay! This one slipped through the cracks.

Beam's usual standard is, if the comment is addressed, to mark the comment resolved, which at least also emails folks subscribed to the issue.

@lostluck lostluck merged commit 385ee7f into apache:master Jul 3, 2022
4 participants