Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

On non-Windows OS, hold file lock while renaming WAL directory #6269

Merged
merged 1 commit into from
Aug 26, 2016

Conversation

aaronlehmann
Copy link
Contributor

Windows requires this lock to be released before the directory is
renamed. But on unix-like operating systems, releasing the lock and
trying to reacquire it immediately can be flaky if a process is forked
around the same time. The file descriptors are marked as close-on-exec
by the Go runtime, but there is a window between the fork and exec where
another process will be holding the lock.

@aaronlehmann
Copy link
Contributor Author

Seems like this made TestOpenOnTornWrite flaky in CI, but I can't reproduce locally.

This change just restores the code that existed before 5991209 for non-Windows platforms. Any ideas?

w.Close()
if err := os.Rename(tmpdirpath, dirpath); err != nil {
return nil, err
if runtime.GOOS == "windows" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Split these separate code paths into the files wal_windows.go and wal_unix.go with windows and !windows build constraints? Each file would provide a platform specific implementation of func (w *WAL) renameWal(tmpdirpath string) (*WAL, error) for Create to call.

@heyitsanthony
Copy link
Contributor

fn := w.tail().Name() in TestOpenOnTornWrite is probably getting a path with the old tmpdirpath since the reopen is gone. Try fn := path.Join(p, path.Base(w.tail().Name()))

@aaronlehmann aaronlehmann force-pushed the hold-lock-while-renaming branch from 5a39edf to 1f5347c Compare August 26, 2016 02:31
@aaronlehmann
Copy link
Contributor Author

Updated with these changes, thanks.

I named the files wal_windows.go and wal_other.go so that the non-Windows file is not unix-specific, but I'm happy to change this if desired.

@heyitsanthony
Copy link
Contributor

A suffix that's used by go is preferable...

find /usr/lib/go/src/ | grep '_' | grep -v test | cut -f2 -d'_' | sort | grep "\.go" | uniq -c | sort -n | tail
     14 freebsd.go
     14 posix.go
     15 nacl.go
     15 openbsd.go
     16 darwin.go
     19 solaris.go
     26 linux.go
     37 unix.go
     45 plan9.go
     51 windows.go

So, _posix.go?

@@ -0,0 +1,23 @@
// +build !windows
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the contents of $REPO_ROOT/.header here or it won't pass CI

also keep the build constraint a line separate from the package declaration because it confuses godoc

<HEADER>

// +build !windows

package wal

@heyitsanthony
Copy link
Contributor

Please fix the last few comments and change the commit title to something like "wal: hold file lock while renaming WAL directory on non-windows" so it fits the commit title format for CI. Should be good to merge after that. Thanks!

@aaronlehmann
Copy link
Contributor Author

In the Go source, when there's a *_windows.go file, they seem to use _unix.go and _plan9.go as counterpart suffixes. For example src/syscall/env_{windows,unix,plan9}.go and src/os/path_{windows,unix,plan9}.go. So I guess wal_unix.go seems like the best choice after all. I'll change it to that name if that sounds good. I'll also make the other modifications you suggested.

@heyitsanthony
Copy link
Contributor

Either _unix.go or _posix.go is fine. Thanks!

Windows requires this lock to be released before the directory is
renamed. But on unix-like operating systems, releasing the lock and
trying to reacquire it immediately can be flaky if a process is forked
around the same time. The file descriptors are marked as close-on-exec
by the Go runtime, but there is a window between the fork and exec where
another process will be holding the lock.
@aaronlehmann aaronlehmann force-pushed the hold-lock-while-renaming branch from 1f5347c to af4f822 Compare August 26, 2016 16:28
@aaronlehmann
Copy link
Contributor Author

Updated the PR, thanks.

@heyitsanthony
Copy link
Contributor

lgtm

// See the License for the specific language governing permissions and
// limitations under the License.

// +build !windows
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we still need this build tag? the file name is already _unix.go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The *_unix.go files in the Go standard library all seem to have build tags, but the *_windows.go files do not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. thanks.

@xiang90
Copy link
Contributor

xiang90 commented Aug 26, 2016

lgtm

@xiang90
Copy link
Contributor

xiang90 commented Aug 26, 2016

The test failure is unrelated to this pr. Merging.

@xiang90 xiang90 merged commit 3a49cbb into etcd-io:master Aug 26, 2016
@aaronlehmann aaronlehmann deleted the hold-lock-while-renaming branch August 26, 2016 17:19
@heyitsanthony heyitsanthony mentioned this pull request Aug 31, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants