-
-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
intermittent segmentation faults after upgrading to 2.5.3 #251
Comments
See #248 |
Do you have any other information you can provide? Maybe a repro or a core file? The message #248 seems like a fix that should be applied (it looks like IOW 507c199 could cause SEGVs, but I'm not convinced it's causing this one. |
@tenderlove - I solved the issue in This means that the Ruby interpreter assumes that the Ruby Object is valid immediately after allocation, even before initialization... which might not be true for C extension objects. Personally I believe only an initialized object should be marked as an "existing" object unless all I didn't read the Ruby code base except in bits and pieces, but perhaps a C-data structure should have a bit set to mark that it was initialized before a Just a meandering thought of post-incident review. I might be completely wrong and maybe my patch solved the issue only only specific machine. Good luck! |
@boazsegev sorry, this doesn't make sense. The error clearly states
If applying the patch fixes this error, great. But you didn't explain how introducing |
I don't think that My working theory was (is?) that the issue occurs when (in multi-threaded environments):
P.S. If we're looking at Ruby C extensions, maybe a documentation note is all we need - something to better explain the new GC behavior and the desired use of |
Are you saying that nio4r is allocating objects inside a thread that isn't holding the GVL lock? |
Nope. From what I know of the |
@tenderlove - this is the only place where the GVL is released, which is the polling system call inside The threads allocate objects safely, within the GVL. |
Okay, I have released v2.5.4 can you please check it. |
@tenderlove it looks like There was another error involving a SEGV inside of pure Ruby code: I'm not sure by what mechanism object allocation in a GC pass might manifest, but I think it's safe to say there was some rather severe memory corruption at play here. |
Personally I think it's a documentation issue, where we might need to clarify this part of the documentation for C extension developers. It should be explicitly stated that C object initialization MUST be performed during the allocation function rather than the Ruby object initialization function. i.e., that It would be nice (but perhaps redundant) if developers knew the reason was that the GC mark callback (this code, in our case) might be called immediately after memory allocation and before Ruby object initialization. Then again, I find that the documentation for C extensions is generally scarce. Half the time I am digging in the Ruby source code to find what I want to know. I totally understand this, it's no big deal. Documentation for C extension development is a huge undertaking... but there's already a section here about |
We should probably migrate nio4r to use |
@Willianvdv can you please try the latest release and report back? |
I work with @Willianvdv. I bumped nio4r to v2.5.4 this afternoon when I saw this fix up hoping it was related. With v2.5.3, it was roughly 50% failure rate on our full CI run and with v2.5.4 I've had 7 full runs without any failures. That gives me enough confidence to say that it's fix 😀 Thanks for the quick turn around! 🙏 |
@boazsegev should take all the credit for this, but thanks everyone for working so quickly to get this resolved. When I woke up and saw this, I had a moment of stress (unknown segv X millions of installs = bad day), but the team really came together to help out. What a fantastic example of collaboration. |
Update ruby-nio4r to 2.5.4. ## 2.5.4 (2020-09-16) * [#251](socketry/nio4r#251) Intermittent SEGV during GC. ([@boazsegev])
What operating system + architecture are you using?
Alpine 3.11
Ubuntu 18.04 (Bionic)
Ruby 2.6
What version of nio4r are you using?
v2.5.3
What did you do?
This error happens intermittently while running selenium tests through rspec. We’re importing nio4r through the “puma” gem. This incident began occurring immediately after updating nio4r from 2.5.2 to 2.5.3.
This version included this commit (507c199) which appears to be related.
What error did you see?
Segmentation Fault Examples
/.../activesupport-5.2.4.3/lib/active_support/notifications/instrumenter.rb:25: [BUG] object allocation during garbage collection phase
/.../nokogiri-1.10.9/lib/nokogiri/xml/node.rb:767: [BUG] Segmentation fault at 0x0000000000000000
/.../nokogiri-1.10.9/lib/nokogiri/xml/node.rb:767: [BUG] Segmentation fault at 0x0000000000000080
/.../nokogiri-1.10.9/lib/nokogiri/xml/node.rb:767: [BUG] Segmentation fault at 0x00007f42744d464b
/.../nokogiri-1.10.9/lib/nokogiri/html/document.rb:205: [BUG] Segmentation fault at 0x0000000000000000
/.../nokogiri-1.10.9/lib/nokogiri/html/document.rb:205: [BUG] Segmentation fault at 0x0000000000000080
/.../nokogiri-1.10.9/lib/nokogiri/html/document.rb:205: [BUG] Segmentation fault at 0x00000000000000a0
/.../nokogiri-1.10.9/lib/nokogiri/xml/document.rb:219: [BUG] Segmentation fault at 0x0000000000000000
/.../puma-4.3.3/lib/puma/reactor.rb:166: [BUG] Segmentation fault at 0x0000000000000000
/.../nokogiri-1.10.9/lib/nokogiri/xml/searchable.rb:190: [BUG] Segmentation fault at 0x0000000000000080
Example Stacktrace (from [BUG] object allocation during garbage collection phase)
[2020-09-09T18:23:27.748Z] -- Control frame information -----------------------------------------------
[2020-09-09T18:23:27.748Z] c:0067 p:---- s:0469 e:000468 CFUNC :name
[2020-09-09T18:23:27.748Z] c:0066 p:0029 s:0465 e:000462 RESCUE /home/docker/.gem/2.6.6/gems/activesupport-5.2.4.3/lib/active_support/notifications/instrumenter.rb:25
[2020-09-09T18:23:27.748Z] c:0065 p:0019 s:0459 e:000458 METHOD /home/docker/.gem/2.6.6/gems/activesupport-5.2.4.3/lib/active_support/notifications/instrumenter.rb:24
[2020-09-09T18:23:27.748Z] c:0064 p:0027 s:0451 e:000450 METHOD /home/docker/.gem/2.6.6/gems/activesupport-5.2.4.3/lib/active_support/notifications.rb:168
[2020-09-09T18:23:27.748Z] c:0063 p:0076 s:0445 e:000444 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/renderer/abstract_renderer.rb:43
[2020-09-09T18:23:27.748Z] c:0062 p:0013 s:0439 e:000438 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/renderer/partial_renderer.rb:319
[2020-09-09T18:23:27.748Z] c:0061 p:0083 s:0435 e:000434 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/renderer/partial_renderer.rb:310
[2020-09-09T18:23:27.748Z] c:0060 p:0023 s:0428 e:000427 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/renderer/renderer.rb:49
[2020-09-09T18:23:27.748Z] c:0059 p:0017 s:0421 e:000420 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/renderer/renderer.rb:23
[2020-09-09T18:23:27.748Z] c:0058 p:0082 s:0415 e:000414 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/helpers/rendering_helper.rb:33
[2020-09-09T18:23:27.749Z] c:0057 p:0010 s:0408 e:000407 BLOCK /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/middleware/debug_exceptions.rb:46
[2020-09-09T18:23:27.749Z] c:0056 p:0033 s:0405 e:000404 METHOD /home/docker/.gem/2.6.6/gems/activesupport-5.2.4.3/lib/active_support/logger_silence.rb:21
[2020-09-09T18:23:27.749Z] c:0055 p:0033 s:0399 e:000398 METHOD /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/middleware/debug_exceptions.rb:46
[2020-09-09T18:23:27.749Z] c:0054 p:0013 s:0393 e:000391 METHOD /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/routing/inspector.rb:198
[2020-09-09T18:23:27.749Z] c:0053 p:0064 s:0387 e:000386 METHOD /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/routing/inspector.rb:73
[2020-09-09T18:23:27.749Z] c:0052 p:0195 s:0379 e:000377 METHOD /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/middleware/templates/rescues/routing_error.html.erb:28
[2020-09-09T18:23:27.749Z] c:0051 p:0025 s:0370 e:000369 BLOCK /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/template.rb:159
[2020-09-09T18:23:27.749Z] c:0050 p:0039 s:0367 e:000366 METHOD /home/docker/.gem/2.6.6/gems/activesupport-5.2.4.3/lib/active_support/notifications.rb:170
[2020-09-09T18:23:27.749Z] c:0049 p:0024 s:0361 e:000360 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/template.rb:354
[2020-09-09T18:23:27.749Z] c:0048 p:0008 s:0356 e:000355 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/template.rb:157
[2020-09-09T18:23:27.749Z] c:0047 p:0051 s:0347 e:000346 METHOD /usr/src/app/config/initializers/i18n.rb:264
[2020-09-09T18:23:27.749Z] c:0046 p:0050 s:0340 e:000339 METHOD /usr/src/app/config/initializers/template_streaming.rb:114
[2020-09-09T18:23:27.749Z] c:0045 p:0014 s:0333 e:000332 BLOCK /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/renderer/template_renderer.rb:54
[2020-09-09T18:23:27.749Z] c:0044 p:0005 s:0330 e:000329 BLOCK /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/renderer/abstract_renderer.rb:44
[2020-09-09T18:23:27.749Z] c:0043 p:0011 s:0326 e:000325 BLOCK /home/docker/.gem/2.6.6/gems/activesupport-5.2.4.3/lib/active_support/notifications.rb:168
[2020-09-09T18:23:27.749Z] c:0042 p:0018 s:0323 e:000322 METHOD /home/docker/.gem/2.6.6/gems/activesupport-5.2.4.3/lib/active_support/notifications/instrumenter.rb:23
[2020-09-09T18:23:27.749Z] c:0041 p:0027 s:0315 e:000314 METHOD /home/docker/.gem/2.6.6/gems/activesupport-5.2.4.3/lib/active_support/notifications.rb:168
[2020-09-09T18:23:27.749Z] c:0040 p:0076 s:0309 e:000308 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/renderer/abstract_renderer.rb:43
[2020-09-09T18:23:27.749Z] c:0039 p:0020 s:0303 e:000302 BLOCK /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/renderer/template_renderer.rb:53
[2020-09-09T18:23:27.749Z] c:0038 p:0032 s:0299 e:000298 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/renderer/template_renderer.rb:61
[2020-09-09T18:23:27.749Z] c:0037 p:0030 s:0290 e:000289 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/renderer/template_renderer.rb:52
[2020-09-09T18:23:27.749Z] c:0036 p:0093 s:0282 e:000281 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/renderer/template_renderer.rb:16
[2020-09-09T18:23:27.749Z] c:0035 p:0020 s:0275 e:000274 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/renderer/renderer.rb:44
[2020-09-09T18:23:27.749Z] c:0034 p:0026 s:0269 e:000268 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/renderer/renderer.rb:25
[2020-09-09T18:23:27.749Z] c:0033 p:0082 s:0263 e:000262 METHOD /home/docker/.gem/2.6.6/gems/actionview-5.2.4.3/lib/action_view/helpers/rendering_helper.rb:33
[2020-09-09T18:23:27.749Z] c:0032 p:0010 s:0256 e:000255 BLOCK /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/middleware/debug_exceptions.rb:46
[2020-09-09T18:23:27.749Z] c:0031 p:0033 s:0253 e:000252 METHOD /home/docker/.gem/2.6.6/gems/activesupport-5.2.4.3/lib/active_support/logger_silence.rb:21
[2020-09-09T18:23:27.749Z] c:0030 p:0033 s:0247 e:000246 METHOD /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/middleware/debug_exceptions.rb:46
[2020-09-09T18:23:27.749Z] c:0029 p:0068 s:0241 e:000240 METHOD /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/middleware/debug_exceptions.rb:102
[2020-09-09T18:23:27.749Z] c:0028 p:0078 s:0231 e:000230 METHOD /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/middleware/debug_exceptions.rb:87
[2020-09-09T18:23:27.749Z] c:0027 p:0039 s:0222 e:000221 RESCUE /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/middleware/debug_exceptions.rb:71
[2020-09-09T18:23:27.749Z] c:0026 p:0127 s:0218 e:000217 METHOD /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/middleware/debug_exceptions.rb:59
[2020-09-09T18:23:27.749Z] c:0025 p:0024 s:0207 e:000206 METHOD /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/middleware/show_exceptions.rb:33
[2020-09-09T18:23:27.749Z] c:0024 p:0042 s:0200 E:0007f0 METHOD /home/docker/.gem/2.6.6/gems/railties-5.2.4.3/lib/rails/rack/logger.rb:38
[2020-09-09T18:23:27.749Z] c:0023 p:0051 s:0190 e:000189 METHOD /home/docker/.gem/2.6.6/gems/railties-5.2.4.3/lib/rails/rack/logger.rb:28
[2020-09-09T18:23:27.749Z] c:0022 p:0053 s:0184 e:000183 METHOD /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/middleware/remote_ip.rb:81
[2020-09-09T18:23:27.749Z] c:0021 p:0019 s:0178 e:000177 METHOD /home/docker/.gem/2.6.6/gems/request_store-1.5.0/lib/request_store/middleware.rb:19
[2020-09-09T18:23:27.749Z] c:0020 p:0091 s:0171 e:000170 METHOD /usr/src/app/app/middleware/request_context_generator.rb:49
[2020-09-09T18:23:27.749Z] c:0019 p:0091 s:0160 e:000159 METHOD /home/docker/.gem/2.6.6/gems/rack-2.2.3/lib/rack/method_override.rb:24
[2020-09-09T18:23:27.749Z] c:0018 p:0059 s:0154 e:000153 METHOD /usr/src/app/app/middleware/prevent_non_multipart_parse.rb:33
[2020-09-09T18:23:27.749Z] c:0017 p:0020 s:0149 e:000148 METHOD /home/docker/.gem/2.6.6/gems/rack-2.2.3/lib/rack/runtime.rb:22
[2020-09-09T18:23:27.749Z] c:0016 p:0033 s:0139 e:000138 METHOD /home/docker/.gem/2.6.6/gems/activesupport-5.2.4.3/lib/active_support/cache/strategy/local_cache_middleware.rb:29
[2020-09-09T18:23:27.749Z] c:0015 p:0016 s:0132 e:000131 METHOD /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/middleware/executor.rb:14
[2020-09-09T18:23:27.749Z] c:0014 p:0085 s:0124 e:000123 METHOD /home/docker/.gem/2.6.6/gems/actionpack-5.2.4.3/lib/action_dispatch/middleware/static.rb:127
[2020-09-09T18:23:27.749Z] c:0013 p:0008 s:0116 e:000115 METHOD /home/docker/.gem/2.6.6/gems/rack-2.2.3/lib/rack/sendfile.rb:110
[2020-09-09T18:23:27.749Z] c:0012 p:0020 s:0104 e:000103 METHOD /home/docker/.gem/2.6.6/gems/railties-5.2.4.3/lib/rails/engine.rb:524 [FINISH]
[2020-09-09T18:23:27.749Z] c:0011 p:---- s:0098 e:000097 CFUNC :public_send
[2020-09-09T18:23:27.749Z] c:0010 p:0028 s:0092 e:000091 METHOD /home/docker/.gem/2.6.6/gems/railties-5.2.4.3/lib/rails/railtie.rb:190
[2020-09-09T18:23:27.749Z] c:0009 p:0009 s:0085 e:000084 BLOCK /usr/src/app/spec/selenium/test_setup/selenium_driver_setup.rb:445
[2020-09-09T18:23:27.749Z] c:0008 p:0022 s:0082 e:000081 METHOD /usr/local/lib/ruby/2.6.0/monitor.rb:235
[2020-09-09T18:23:27.749Z] c:0007 p:0054 s:0078 e:000077 BLOCK /usr/src/app/spec/selenium/test_setup/selenium_driver_setup.rb:445
[2020-09-09T18:23:27.749Z] c:0006 p:0144 s:0072 e:000071 BLOCK /usr/src/app/spec/selenium/test_setup/selenium_driver_setup.rb:470
[2020-09-09T18:23:27.749Z] c:0005 p:0263 s:0065 e:000064 METHOD /home/docker/.gem/2.6.6/gems/puma-4.3.3/lib/puma/server.rb:682
[2020-09-09T18:23:27.749Z] c:0004 p:0029 s:0040 e:000039 METHOD /home/docker/.gem/2.6.6/gems/puma-4.3.3/lib/puma/server.rb:472
[2020-09-09T18:23:27.749Z] c:0003 p:0065 s:0026 e:000025 BLOCK /home/docker/.gem/2.6.6/gems/puma-4.3.3/lib/puma/server.rb:328
[2020-09-09T18:23:27.749Z] c:0002 p:0107 s:0016 e:000015 BLOCK /home/docker/.gem/2.6.6/gems/puma-4.3.3/lib/puma/thread_pool.rb:134 [FINISH]
[2020-09-09T18:23:27.749Z] c:0001 p:---- s:0003 e:000002 (none) [FINISH]
Current workaround?
We were able to circumvent this error by pinning back to v2.5.2
The text was updated successfully, but these errors were encountered: