-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The 1.10 Android mystery crash thread! #13057
Comments
For Download, we just do this: if (thread_.joinable())
thread_.join(); I figured this was equivalent to converting it to a pointer and doing a nullcheck, but maybe there's some difference? As long as STL is behaving as advertised, this makes me worry about a double free or memory corruption? -[Unknown] |
I don't know if we maybe have to do |
Yeah, we start both downloads immediately, so there should always be a thread. And nothing else joins it. I think we still have exceptions off - I wonder what happens if the thread fails to start... but I mean, according to the trace thread was destructing after Download destructed, and STL should:
And a postcondition of join(), which we call, is that joinable() becomes false. -[Unknown] |
Indeed, weird. Don't really have a lot of data yet (rollout at 1.5%), got 4 reports so far of it, 4 that got coalesced and 2 separate ones (only those two got auto-marked with NEW). I'll take a look at the code myself tomorrow but I don't expect to figure out much more than you did... Not seeing anything else dramatic so far. I'm gonna ramp up the rollout to 5%. |
The Download thing continues to be the most common new crash. This one in PPGe is the runner-up so far:
|
I have a theory about the Download thing. Apparently we send in a shared_ptr to the thread, to keep "self" alive. But if that is the only owner of self.. the destruction seems to contain contradictions. It seems in that case the destructor would run on the thread itself when the thread function exits, and a thread joining itself doesn't sound healthy... |
Oh, that's a good theory. Actually sounds likely it would become the last owner... PPGe - the DisplaySaveDataInfo1 crash is new? I guess it must be related to the text drawer, maybe it's some save list with 100 saves and we ran out of space? Hm. -[Unknown] |
Yeah, that one's new. Far less common than the Download one. |
Here's one that's not new, but it's pretty high in the chart:
That's in the Adhoc matchmaking code it seems. |
Here's another interesting one, pretty rare though:
|
And an old one that seems suspicious, though can also be a driver bug. Don't completely trust our descriptor set handling code....
Anyway, should already be at an acceptable level, will probably release 1.10.1 tomorrow or the day after. |
Did a quiet 10% rollout of 1.10.1 on Android. Here's one I hadn't spotted before, 19 times on a single device "A44 (itel-A44)" (so probably not much to worry about, but got to be annoying for that person):
Download crashes are gone. This oldie but goodie is rising through the ranks:
This guy is still around:
Another new one I spotted while going through stuff that was buried before:
And a real oddball that's also not new:
|
Here's an ANR that might be slightly serious, seems to be from using joypad on android while downloads are cancelling. one thread:
Another thread:
Third thread:
Downloader::Do:
Not all of these are necessarily involved in the likely deadlock though.
|
Hm, I think it's possible that the above one is simply a Buffer::Read blocking for a long time, making cancellation kinda not happen. |
Agreed, that's what it looks like. We'd need to convert to non-blocking IO or send a signal to cancel. -[Unknown] |
InstallZipScreen???
|
Found another oddity, I think things have gone pretty wrong already when we get something like this:
|
Here's an interesting one that I got a single one of:
|
Closing this for #14082. -[Unknown] |
Here we go again, analyzing new-looking crashes from Google Play to see if it's worth doing a 1.10.1 release.
First up is this ~Download crash:
This one feels like we still have some Vulkan mistake on shutdown, so not so critical (or it's just a driver bug, god knows there are enough of those):
To be continued!
The text was updated successfully, but these errors were encountered: