Alternative to wake_with() #22
Guy suggested that:
So basically the only change we need to make to the code is to improve the scheduler's handling of this case. NOTE: in the condvar case, using this new idiom will mean regaining the lock at the end of the wait. Since the mutex code also uses atomic increments (and more), this will offset any performance saving from not using wake_with(). So I don't believe this new idiom will give any better performance; it will perhaps just be easier for users to use.
If a crashed OSv guest is restarted, ZFS mount causes a GPF in early startup:

VFS: mounting zfs at /usr
zfs: mounting osv/usr from device /dev/vblk1
Aborted

GDB backtrace points the finger at zfs_rmnode():

#0 processor::halt_no_interrupts () at ../../arch/x64/processor.hh:212
#1 0x00000000003e7f2a in osv::halt () at ../../core/power.cc:20
#2 0x000000000021cdd4 in abort (msg=0x636df0 "Aborted\n") at ../../runtime.cc:95
#3 0x000000000021cda2 in abort () at ../../runtime.cc:86
#4 0x000000000044c149 in osv::generate_signal (siginfo=..., ef=0xffffc0003ffe7008) at ../../libc/signal.cc:44
#5 0x000000000044c220 in osv::handle_segmentation_fault (addr=72, ef=0xffffc0003ffe7008) at ../../libc/signal.cc:55
#6 0x0000000000366df3 in page_fault (ef=0xffffc0003ffe7008) at ../../core/mmu.cc:876
#7 <signal handler called>
#8 0x0000000000345eaa in zfs_rmnode (zp=0xffffc0003d1de400) at ../../bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:611
#9 0x000000000035650c in zfs_zinactive (zp=0xffffc0003d1de400) at ../../bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1355
#10 0x0000000000345be1 in zfs_unlinked_drain (zfsvfs=0xffffc0003ddfe000) at ../../bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:523
#11 0x000000000034f45c in zfsvfs_setup (zfsvfs=0xffffc0003ddfe000, mounting=true) at ../../bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:881
#12 0x000000000034f7a4 in zfs_domount (vfsp=0xffffc0003de02000, osname=0x6b14cb "osv/usr") at ../../bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1016
#13 0x000000000034f98c in zfs_mount (mp=0xffffc0003de02000, dev=0x6b14d7 "/dev/vblk1", flags=0, data=0x6b14cb) at ../../bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1415
#14 0x0000000000406852 in sys_mount (dev=0x6b14d7 "/dev/vblk1", dir=0x6b14a3 "/usr", fsname=0x6b14d3 "zfs", flags=0, data=0x6b14cb) at ../../fs/vfs/vfs_mount.c:171
#15 0x00000000003eff97 in mount_usr () at ../../fs/vfs/main.cc:1415
#16 0x0000000000203a89 in do_main_thread (_args=0xffffc0003fe9ced0) at ../../loader.cc:215
#17 0x0000000000448575 in pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t, pthread_private::thread_attr const*)::{lambda()#1}::operator()() const () at ../../libc/pthread.cc:59
#18 0x00000000004499d3 in std::_Function_handler<void(), pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t, const pthread_private::thread_attr*)::__lambda0>::_M_invoke(const std::_Any_data &) (__functor=...) at ../../external/gcc.bin/usr/include/c++/4.8.1/functional:2071
#19 0x000000000037e602 in std::function<void ()>::operator()() const (this=0xffffc0003e170038) at ../../external/gcc.bin/usr/include/c++/4.8.1/functional:2468
#20 0x00000000003bae3e in sched::thread::main (this=0xffffc0003e170010) at ../../core/sched.cc:581
#21 0x00000000003b8c92 in sched::thread_main_c (t=0xffffc0003e170010) at ../../arch/x64/arch-switch.hh:133
#22 0x0000000000399c8e in thread_main () at ../../arch/x64/entry.S:101

The problem is that ZFS tries to check if the znode is an attribute directory and trips over zp->z_vnode being NULL. However, as explained in commit b7ee91e ("zfs: port vop_lookup"), we don't even support extended attributes, so drop the check completely for OSv.
Fix mincore() to deal with unmapped addresses like msync() does.

This fixes a SIGSEGV in libunwind's access_mem() when the leak detector is enabled:

(gdb) bt
#0 page_fault (ef=0xffffc0003ffe7008) at ../../core/mmu.cc:871
#1 <signal handler called>
#2 ContiguousSpace::block_start_const (this=<optimized out>, p=0x77d2f3968) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/oops/oop.inline.hpp:411
#3 0x00001000008ae16c in GenerationBlockStartClosure::do_space (this=0x2000001f9100, s=<optimized out>) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/memory/generation.cpp:242
#4 0x00001000007f097c in DefNewGeneration::space_iterate (this=0xffffc0003fb68c00, blk=0x2000001f9100, usedOnly=<optimized out>) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/memory/defNewGeneration.cpp:480
#5 0x00001000008aca0e in Generation::block_start (this=<optimized out>, p=<optimized out>) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/memory/generation.cpp:251
#6 0x0000100000b06d2f in os::print_location (st=st@entry=0x2000001f9560, x=32165017960, verbose=verbose@entry=false) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/runtime/os.cpp:868
#7 0x0000100000b11b5b in os::print_register_info (st=0x2000001f9560, context=0x2000001f9740) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:839
#8 0x0000100000c6cde8 in VMError::report (this=0x2000001f9610, st=st@entry=0x2000001f9560) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:551
#9 0x0000100000c6da3b in VMError::report_and_die (this=this@entry=0x2000001f9610) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:984
#10 0x0000100000b1109f in JVM_handle_linux_signal (sig=11, info=0x2000001f9bb8, ucVoid=0x2000001f9740, abort_if_unrecognized=<optimized out>) at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:528
#11 0x000000000039f242 in call_signal_handler (frame=0x2000001f9b10) at ../../arch/x64/signal.cc:69
#12 <signal handler called>
#13 0x000000000057d721 in access_mem ()
#14 0x000000000057cb1d in dwarf_get ()
#15 0x000000000057ce51 in _ULx86_64_step ()
#16 0x00000000004315fd in backtrace (buffer=0x1ff9d80 <memory::alloc_tracker::remember(void*, int)::bt>, size=20) at ../../libc/misc/backtrace.cc:16
#17 0x00000000003b8d99 in memory::alloc_tracker::remember (this=0x1777ae0 <memory::tracker>, addr=0xffffc0004508de00, size=54) at ../../core/alloctracker.cc:59
#18 0x00000000003b0504 in memory::tracker_remember (addr=0xffffc0004508de00, size=54) at ../../core/mempool.cc:43
#19 0x00000000003b2152 in std_malloc (size=54) at ../../core/mempool.cc:723
#20 0x00000000003b259c in malloc (size=54) at ../../core/mempool.cc:856
#21 0x0000100001615e4c in JNU_GetStringPlatformChars (env=env@entry=0xffffc0003a4dc1d8, jstr=jstr@entry=0xffffc0004591b800, isCopy=isCopy@entry=0x0) at ../../../src/share/native/common/jni_util.c:801
#22 0x000010000161ada6 in Java_java_io_UnixFileSystem_getBooleanAttributes0 (env=0xffffc0003a4dc1d8, this=<optimized out>, file=<optimized out>) at ../../../src/solaris/native/java/io/UnixFileSystem_md.c:111
#23 0x000020000021ed8e in ?? ()
#24 0x00002000001faa58 in ?? ()
#25 0x00002000001faac0 in ?? ()
#26 0x00002000001faa50 in ?? ()
#27 0x0000000000000000 in ?? ()

Spotted by Avi Kivity.
Right now, code like condvar_wake_one/all() uses wake_with() to allow waking a thread which might exit as soon as we wake it.
Avi doesn't like wake_with() for two reasons: it is an unorthodox API (so users need to know to use it, and when and how), and it slows down every wake() with two atomic fetch-adds.
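For context, here is a minimal sketch of the use-after-free race that wake_with() is meant to close. The interleaving is illustrative (using the wr/waiter names from the snippet quoted below), not the actual OSv source:

// One possible interleaving with a plain wake(), when the waiter does
// not regain the lock before returning:
waiter = wr->t;
wr->t = nullptr;     // waker: signal consumed
                     // waiter: wakes on its own (e.g. a timeout), sees
                     // wr->t == nullptr, returns from condvar_wait()
                     // and exits; its sched::thread object is freed
waiter->wake();      // waker: touches a dead thread - use-after-free

Roughly speaking, wake_with() avoids this by keeping the waiter from exiting until the wake completes, which is what costs the two atomic fetch-adds.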
Why does condvar_wake_one/all() need to use wake_with() at all?
It already does:
m.lock();
waiter = wr->t;    // remember the waiting thread
wr->t = nullptr;   // mark the wait record as consumed
waiter->wake();
m.unlock();
If the waiter would regain the lock when waking up, then we would know the thread cannot exit before wake_one/all() releases this lock, and the problem would be solved. But we didn't regain the lock (except in the timeout case), because we feared that the waiter would often go back to sleep immediately after waking, to wait for the lock which the waker is still holding.
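To make the proposed idiom concrete, here is a hedged sketch of the waiter side; it is simplified, and the use of sched::thread::wait_until() here is an assumption for illustration rather than the real condvar code:

// Waiter side (sketch). wr lives on this thread's stack, so the waker
// must never touch it after we return from here.
m.lock();
// ... link wr into the condvar's wait list ...
m.unlock();
sched::thread::wait_until([&] { return wr->t == nullptr; }); // sleep until woken
m.lock();   // NEW: regain the lock before returning. The waker still holds m
            // while it calls waiter->wake(), so we cannot pass this point
            // (and exit) before the waker's m.unlock().
m.unlock();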
The idea (suggested by Avi) is, instead of keeping wake_with() as a replacement for the above solution, to just make the above solution efficient.
As explained above, the waiter will regain the lock after waiting. The waker will do:
m.lock();
preempt_disable();   // keep the woken thread off this CPU until we are done
waiter = wr->t;      // remember the waiting thread
wr->t = nullptr;     // mark the wait record as consumed
waiter->wake();
m.unlock();
preempt_enable();
If the waiter thread is on the same CPU as the waker, this is efficient: the waiter will not start to run (and try to regain the lock) until the preempt_enable() after we release the lock. This is more efficient than the wake_with() case because now we have a thread-local increment (preempt_disable()) instead of the atomic increments used by wake_with().
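The cost difference is roughly the following. A minimal sketch, assuming preemption is tracked in a per-thread counter and wake_with() keeps the thread alive with an atomic reference count; preempt_counter and refcount are illustrative names, not the actual OSv variables:

// preempt_disable()/enable(): plain increments of a counter that only the
// current CPU touches - no lock prefix, no cache-line contention.
__thread unsigned preempt_counter;
inline void preempt_disable() { ++preempt_counter; }
inline void preempt_enable() { --preempt_counter; /* and maybe reschedule */ }

// wake_with(), by contrast, must be visible across CPUs, so every wake()
// pays for two atomic read-modify-write operations:
std::atomic<unsigned> refcount;   // from <atomic>
refcount.fetch_add(1);   // before clearing wr->t and waking
// ... perform the action and the wake ...
refcount.fetch_sub(1);   // after - the waiter may only exit once this drops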
If the waiter thread is on a different CPU, the preempt_disable() won't help, and it can start running before we finish m.unlock(). In the usual case it will take only a short time until m.unlock() completes, so all we need to do is add to mutex_lock() the ability to do a bit of spinning before going to sleep. We want this feature anyway. This will be a performance improvement over wake_with() only if, in the typical case, m.unlock() finishes more quickly than the waiter thread starts to run, so that there are no spinning iterations at all. We'll need to verify that this is actually the case; otherwise, all of this won't be an improvement at all.
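A hedged sketch of that spin-then-sleep addition to mutex_lock(); spin_limit and lock_slow_path() are illustrative names, and the real OSv mutex is considerably more involved:

void mutex::lock()
{
    // Spin briefly first: if the holder is just finishing m.unlock() on
    // another CPU (the common case in the wake scenario above), we avoid
    // a full sleep/wake round trip.
    for (int i = 0; i < spin_limit; ++i) {
        if (try_lock()) {
            return;
        }
        __builtin_ia32_pause();   // tell the CPU we are in a spin-wait loop
    }
    lock_slow_path();   // enqueue ourselves and go to sleep, as today
}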