-
Notifications
You must be signed in to change notification settings - Fork 808
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
paths: Revisit path parsing, fix trailing slash behavior #1046
Conversation
This should be a superset of the previous test_paths test suite, while covering a couple more things (more APIs, more path synonyms, utf8, non-printable ascii, non-utf8, etc). Not yet tested are some corner cases with known bugs, mainly around trailing slashes.
As expected these are failing and will need some work to pass. The issue with lfs_file_open allowing trailing slashes was found by rob-zeno, and the issue with lfs_mkdir disallowing trailing slashes was found by XinStellaris, PoppaChubby, pavel-kirienko, inf265, Xywzel, steverpalmer, and likely others.
aaa7a7d
to
09d8aae
Compare
Tests passed ✓, Code: 17188 B (+0.7%), Stack: 1440 B (+0.0%), Structs: 812 B (+0.0%)
|
09d8aae
to
5b75910
Compare
Tests passed ✓, Code: 17188 B (+0.7%), Stack: 1440 B (+0.0%), Structs: 812 B (+0.0%)
|
Overall, this seems like good stuff, based on the description.
This one doesn't seem right to me. Consider the following cases:
I think maybe all of these should return the same error? |
Yeah, I noticed this too and thought it was a bit funny:
But it is consistent with lfs_stat("reg_a/") => LFS_ERR_NOTDIR If Actually, I think
Then it would be consistent with the only other function that can "create" files, lfs_rename("reg_a", "missing_a/") => LFS_ERR_NOTDIR For these weird corner cases I was mostly going off what my Linux + ext4 machine does, but reading the POSIX spec, I think this might be one of the cases where, as a retroactive spec, POSIX leaves this up to the implementation. But I'll be honest, sometimes it's hard to know what exactly the spec is saying:
Though POSIX Will update. |
But
Fair.
|
This would be LFS_ERR_NOENT. To make things more confusing, it depends on which flags you specify: lfs_file_open("missing_a/", LFS_O_RDWR | LFS_O_CREAT) => LFS_ERR_NOTDIR
lfs_file_open("missing_a/", LFS_O_RDWR) => LFS_ERR_NOENT A more complete table:
My understanding of POSIX/Linux/Unix is that EINVAL is generally left as a last resort, otherwise it's too easy to use as a blanket error when more info would be useful to the user.
That's true, though they do share the code path that decides if a trailing-slash is LFS_ERR_NOTDIR. Thoughts on I think if we change
This doesn't really explain the motivation though. It could have just been an implementation quirk. I need to think about this a bit more, I'm not entirely opposed to Another way of looking at LFS_ERR_NOTDIR is that it's the file that you're attempting to create that is not a directory. Though I can understand where this could be confusing. On the other hand, these are realistically going to be pretty rare errors for users to encounter unintentionally... |
- lfs_mkdir now accepts trailing slashes: - before: lfs_mkdir("a/") => LFS_ERR_NOENT - after: lfs_mkdir("a/") => 0 - lfs_stat, lfs_getattr, etc, now reject trailing slashes if the file is not a directory: - before: lfs_stat("reg_a/") => 0 - after: lfs_stat("reg_a/") => LFS_ERR_NOTDIR Note trailing slashes are accepted if the file is a directory: - before: lfs_stat("dir_a/") => 0 - after: lfs_stat("dir_a/") => 0 - lfs_file_open now returns LFS_ERR_NOTDIR if the file exists but the path contains trailing slashes: - before: lfs_file_open("reg_a/") => LFS_ERR_NOENT - after: lfs_file_open("reg_a/") => LFS_ERR_NOTDIR To make these work, the internal lfs_dir_find API required some interesting changes: - lfs_dir_find no longer sets id=0x3ff on not finding a parent entry in the path. Instead, lfs_path_islast can be used to determine if the modified path references a parent entry or child entry based on the remainder of the path string. Note this is only necessary for functions that create new entries (lfs_mkdir, lfs_rename, lfs_file_open). - Trailing slashes mean we can no longer rely on the modified path being NULL-terminated. lfs_path_namelen provides an alternative to strlen that stops at slash or NULL. - lfs_path_isdir also tells you if the modified path must reference a dir (contains trailing slashes). I considered handling this entirely in lfs_dir_find, but the behavior of entry-creating functions is too nuanced. At least lfs_dir_find returns LFS_ERR_NOTDIR if the file exists on disk. Like strlen, lfs_path_namelen/islast/isdir are all O(n) where n is the name length. This isn't great, but if you're using filenames large enough for this to actually matter... uh... open an issue on GitHub and we might improve this in the future. --- There are a couple POSIX incompatibilities that I think are not worth fixing: - Root modifications return EINVAL instead of EBUSY: - littlefs: remove("/") => EINVAL - POSIX: remove("/") => EBUSY Reason: This would be the only use of EBUSY in the system. - We accept modifications of directories with trailing dots: - littlefs: remove("a/.") => 0 - POSIX: remove("a/.") => EBUSY Reason: Not worth implementing. - We do not check for existence of directories followed by dotdots: - littlefs: stat("a/missing/..") => 0 - POSIX: stat("a/missing/..") => ENOENT Reason: Difficult to implement non-recursively. - We accept modifications of directories with trailing dotdots: - littlefs: rename("a/b/..", "c") => 0 - POSIX: rename("a/b/..", "c") => EBUSY Reason: Not worth implementing. These are at least now documented in tests/test_paths.toml, which isn't the greatest location, but it's at least something until a better document is created. Note that these don't really belong in SPEC.md because path parsing is a function of the driver and has no impact on disk.
This changes the behavior of paths that attempt to navigate above root to now return LFS_ERR_INVAL: - before: lfs_stat("/../a") => 0 - after: lfs_stat("/../a") => LFS_ERR_INVAL This is a bit of an opinionated change while making other path resolution tweaks. In terms of POSIX-compatibility, it's a bit unclear exactly what dotdots above the root should do. POSIX notes: > As a special case, in the root directory, dot-dot may refer to the > root directory itself. But the word choice of "may" implies it is up to the implementation. I originally implement this as a root-loop simply because that is what my Linux machine does, but I now think that's not the best option. Since we're making other path-related tweaks, we might as well try to adopt behavior that is, in my opinion, safer and less... weird... This should also help make paths more consistent with future theoretical openat-list APIs, where saturating at the current directory is sort of the least expected behavior.
Unlike normal files, dots (".") should not change the depth when attempting to skip dotdot ("..") entries. A weird nuance in the path parser, but at least it had a relatively easy fix. Added test_paths_dot_dotdots to prevent a regression.
Before this, the empty path ("") was treated as an alias for the root. This was unintentional and just a side-effect of how the path parser worked. Now, the empty path should always result in LFS_ERR_INVAL: - before: lfs_stat("") => 0 - after: lfs_stat("") => LFS_ERR_INVAL
These flags change the behavior of open quite significantly. It's useful to cover these in our path tests so the behavior is locked down.
- test_paths_noent_trailing_slashes - test_paths_noent_trailing_dots - test_paths_noent_trailing_dotdots These managed to slip through our path testing but should be tested, if anything just to know exactly what errors these return.
- before: lfs_file_open("missing/") => LFS_ERR_ISDIR - after: lfs_file_open("missing/") => LFS_ERR_NOTDIR As noted by bmcdonnell-fb, returning LFS_ERR_ISDIR here was inconsistent with the case where the file exists: case before after lfs_file_open("dir_a") => LFS_ERR_ISDIR LFS_ERR_ISDIR lfs_file_open("dir_a/") => LFS_ERR_ISDIR LFS_ERR_ISDIR lfs_file_open("reg_a/") => LFS_ERR_NOTDIR LFS_ERR_NOTDIR lfs_file_open("missing_a/") => LFS_ERR_ISDIR LFS_ERR_NOTDIR Note this is consistent with the behavior of lfs_stat: lfs_file_open("reg_a/") => LFS_ERR_NOTDIR lfs_stat("reg_a/") => LFS_ERR_NOTDIR And the only other function that can "create" files, lfs_rename: lfs_file_open("missing_a/") => LFS_ERR_NOTDIR lfs_rename("reg_a", "missing_a/") => LFS_ERR_NOTDIR There is some ongoing discussion about if these should return NOTDIR, ISDIR, or INVAL, but this is at least an improvement over the rename/open mismatch.
21de7dc
to
999ef66
Compare
I've gone ahead and adopted LFS_ERR_NOTDIR for missing + trailing slashes, since it's at least an improvement over the open/rename mismatch. Also updated tests, added the missing tests over NOENT /EXIST + trailing slashes, and updated the PR comment to match changes. |
Tests passed ✓, Code: 17188 B, Stack: 1440 B, Structs: 812 B
|
I think I'm going to leave POSIX has baggage and sometimes doesn't make the most sense, but it's what users are expecting. If there are no other comments I'll bring this in and make a v2.10 release soon. |
From experiece I'd expect both |
Experience with what? |
Fair.
I don't love the Thanks as always @geky for your work and care on all this. :) |
With other filesystems/OS. At least this would be my expectation. First test if it exists, then if it's also what you expect it to be. |
@BenBE To be clear LFS_ERR_NOENT does take priority. See the big table in #1046 (comment). But clearly my explanation of the change is not worded well. Will update to hopefully make it more clear. |
Should be updated now. I added an explicit note that |
LGTM. Thanks all for the feedback. It's useful to know what people's thoughts are on these sort of changes. |
This PR includes a number of tweaks to path parsing behavior to better align with POSIX. Better aligning with POSIX should make things less surprising for users, which is always a good thing.
This is the relevant section in POSIX: 4.16 Pathname Resolution.
Though note there are still some differences, listed below.
In addition to these changes, this PR includes a rewrite of the
tests/test_paths.toml
(+7269/-207 lines, +20 cases, +260 perms) to more aggressively cover path parsing corner cases. That being said, it's unreasonable to expect me to think of every possible corner case on the first (or nth?) try, so if you notice anything missing, please let me know.I believe this also fixes all path-related issues reported so far (#1040, #729, #428), though it's likely I've overlooked some.
My current thinking is that most users are expecting POSIX behavior anyways, so it is reasonable to make these changes in a minor release, as long as there is sufficient documentation in the release notes. I haven't heard of any complaints related to the last set of tiny breaking changes (in v2.9) yet, though feedback is welcome.
Path changes
lfs_mkdir
now accepts trailing slashes:lfs_stat
,lfs_getattr
, etc, now reject trailing slashes if the file is not a directory:Note trailing slashes are accepted if the file is a directory:
lfs_file_open
now returns LFS_ERR_NOTDIR if the path contains trailing slashes and the file is not a directory, or LFS_O_CREAT is specified and the file does not exist:Note trailing slashes return LFS_ERR_ISDIR if the file is a directory:
Note LFS_ERR_NOENT takes priority if LFS_O_CREAT is not specified:
Attempting to navigate above the root directory now returns LFS_ERR_INVAL:
This actually moves away from the behavior of other filesystems, but towards, in my opinion, more reasonable behavior. It should also be more consistent with future theoretical
openat
-like APIs.The empty path ("") no longer aliases the root directory, now returns LFS_ERR_INVAL:
POSIX deviations
While this gets us closer to POSIX, there are still some subtle differences in behaviors. These shouldn't affect most users, but are worth documenting. Most of these are due to 1. littlefs not actually creating
.
and..
entries on disk and 2. not using recursion during path resolution.In my opinion these simply aren't worth trying to match POSIX exactly, but I'm curious if others disagree.
Root modifications return EINVAL instead of EBUSY:
Reason: This would be the only use of EBUSY in the system.
We accept modifications of directories with trailing dots:
Reason: Not worth implementing.
We do not check for existence of directories followed by dotdots:
Reason: Difficult to implement non-recursively.
We accept modifications of directories with trailing dotdots:
Reason: Not worth implementing.
Related issues: #1040, #729, #428
Should supersede: #679
Related path issues found by: @rob-zeno, @XinStellaris, @PoppaChubby, @pavel-kirienko, @inf265, @Xywzel, @steverpalmer, and likely others
Feedback: Welcome
EDIT: Changed filenames in examples to hint the filetype.
EDIT: Tried to make the root ascension example a bit more clear.
EDIT: Changed
lfs_file_open("missing/")
to return LFS_ERR_NOTDIREDIT: Noted
lfs_file_open("dir_a/")
returns LFS_ERR_ISDIREDIT: Clarified that
lfs_file_open("missing/")
without LFS_O_CREAT returns LFS_ERR_NOENT