Retry delete when deadlocked #91

aaronjensen · 2014-05-16T16:19:03Z

Fixes #63

Unlike #90, this does not retry reservations because that is handled by delayed_job: https://github.com/collectiveidea/delayed_job/blob/master/lib/delayed/worker.rb#L281

joshuapinter · 2014-05-16T20:17:45Z

Have been using this in production for the last few hours. Seems to be working great.

Fixes collectiveidea#63

betamatt · 2014-07-11T13:38:45Z

This fix looks specific to one database. MySQL perhaps?

aaronjensen · 2014-07-11T15:16:42Z

Yes, it is. It should be safe on other databases, and other databases do not have the problem (to my knowledge). If we want to retry on deadlocks with other databases, we can look for these: https://github.com/mperham/deadlock_retry/blob/master/lib/deadlock_retry.rb#L16-L20
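
For illustration only, a minimal sketch of what matching deadlocks across databases could look like. The constant and method names are hypothetical, and the message fragments are assumptions based on the deadlock error text that MySQL and PostgreSQL emit; this is not the code in this PR or in deadlock_retry.

```ruby
# Hypothetical helper, not part of delayed_job or this PR.
# The fragments are assumed from the usual MySQL and PostgreSQL error text.
DEADLOCK_MESSAGE_FRAGMENTS = [
  "Deadlock found when trying to get lock", # MySQL
  "deadlock detected"                       # PostgreSQL
].freeze

# Returns true when the exception message looks like a database deadlock.
def deadlock_error?(error)
  DEADLOCK_MESSAGE_FRAGMENTS.any? { |fragment| error.message.include?(fragment) }
end
```

Matching on message text is fragile across adapter versions, so a real implementation would probably prefer adapter-specific exception classes where they exist.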

joshgoebel · 2014-07-30T07:38:37Z

Are 100 retries really necessary? What about a small sleep? Are there any plans to merge this?

aaronjensen · 2014-07-30T07:56:06Z

> Are 100 retries really necessary? What about a small sleep? Are there any plans to merge this?

No, probably not. I don't remember why I picked 100, but it shouldn't matter: if it's a deadlock, it should go through eventually, and it should have every opportunity to do so without retrying forever. A small sleep would be OK.
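
A rough sketch of a bounded retry with a small sleep, as discussed above. `DeadlockError`, `retry_on_deadlock`, and the default values are hypothetical stand-ins for illustration, not the code in this PR.

```ruby
# Hypothetical stand-in for the adapter-specific deadlock exception.
class DeadlockError < StandardError; end

# Runs the block, retrying when it raises DeadlockError.
# Gives up and re-raises after max_attempts total attempts.
def retry_on_deadlock(max_attempts: 100, pause: 0.01)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue DeadlockError
    raise if attempts >= max_attempts
    sleep(pause) # brief pause so the competing transaction can finish
    retry
  end
end
```

Usage would look something like `retry_on_deadlock { job.destroy }`, where `job` is whatever record is being deleted. Keeping the limit finite means a persistent failure still surfaces as an exception instead of looping forever.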

joshgoebel · 2014-07-30T08:02:59Z

@danielmorrison Any chance of getting this or something similar merged?

klausmeyer · 2014-08-18T14:58:45Z

+1 We also had problems with this in the past and had to downgrade to v0.3.3 which is working fine.

snackycracky · 2014-12-10T20:14:57Z

Did you all see this approach? ndbroadbent@d311b55, as discussed in #63.

klausmeyer · 2014-12-10T20:21:34Z

Since #89 was merged, we have been using a monkey-patch to use plain SQL instead of the "optimized" MySQL query version, and everything works just fine for us. No more deadlocks.

snackycracky · 2014-12-10T20:48:45Z

So then close this one, no?

klausmeyer · 2014-12-10T20:56:42Z

👍 It's a hack that maybe solves the symptom but not the actual problem.
It may be OK to use in certain circumstances, but IMO it shouldn't be shipped with the gem. A better approach would be to provide a clean option to use the code path created in #89 without monkey-patching.

aaronjensen · 2014-12-10T21:40:58Z

I'm not sure that I would agree that this is a hack. If the method of reservation is faster but occasionally deadlocks (and the deadlocks could not be prevented because locking order cannot be specified for the particular locks that are being acquired) then retrying is the right thing to do. If there were a solution that were faster and did not have the deadlock issue then that would be better, of course, but if not then this solution is best. Retrying on deadlock in general is not a hack.

snackycracky · 2014-12-10T21:47:56Z

Well, it's not a hack, but it's no solution, as @klausmeyer said. The PR feels like a "try-catch" because you don't trust the code. This responsibility should be implemented by the host application.

klausmeyer · 2014-12-10T21:48:35Z

Or at least there should be a switch to turn it on or off.

snackycracky · 2014-12-10T21:56:50Z

The failed job will automatically be retried as usual with DJ.
So if you think this through to the end, you would also want such a switch for every special exception.
That's just not good.

And I don't like PRs that just cure the symptoms, like you said.

klausmeyer · 2014-12-10T22:07:09Z

That's true. I just wanted to say that if this were included in the official gem, I'd like to have a way to turn it off, but I would still prefer not to have it included at all.

aaronjensen · 2014-12-11T03:14:43Z

> Well, it's not a hack, but it's no solution, as @klausmeyer said. The PR feels like a "try-catch" because you don't trust the code. This responsibility should be implemented by the host application.

Sorry, but I still disagree. Either the optimization for selecting records needs to be changed completely to something that cannot possibly deadlock or we need to just accept the fact that this particular query can and will deadlock, and that is OK. You cannot always prevent deadlocks and the thing to do when you cannot is to retry. This is how it works. This should not be monkey patched in on the host application. The only solutions I know of are:

1. Leave the job locking code as is and retry on deadlock.
2. Replace the job locking code with something that cannot possibly deadlock.

There are no other solutions. If you have any others I'm open to them. Making the code easier to monkey patch is not a solution. The job locking algorithm is either susceptible to deadlocks or it isn't. This one is.

> The failed job will automatically be retried as usual with DJ.

I'm not sure what you're referring to with this, but if the job were retried at this point, that would be a bad thing, since the job already happened. This retry is on the delete. I don't actually remember the effect of this delete failing (whether the job retries, the row just gets stuck, or something else).

> That's true. I just wanted to say that if this were included in the official gem, I'd like to have a way to turn it off, but I would still prefer not to have it included at all.

Would you mind saying what you're concerned about here? This retry logic is specifically around deleting the delayed job record and only responds to deadlocks (deadlocks which are known to happen and are, to my knowledge, unpreventable given the order in which MySQL acquires locks and the way that this gem selects the next job).

snackycracky · 2014-12-11T13:37:01Z

I agree that deleting the job after successful execution is a concern of this gem.

defeated · 2015-05-02T20:48:28Z

+1 for this fix, we've applied it as a monkeypatch and it seems to have smoothed out our issue w/ deadlocks-on-delete crashing workers. (Our environment is multiple workers on multiple machines, with a large queue of fast-running jobs.)

MaciekLesiczka · 2020-06-24T09:14:03Z

+1 for the PR. This is a 100% valid strategy for deadlocks, not a hack. For heavy throughput with many workers, this is a big deal. In my experience, this problem is one of the reasons people move away from DJ.

aaronjensen mentioned this pull request May 16, 2014

Mysql2::Error: Deadlock when attempting to lock a job #63

Open

Retry delete when deadlocked

cc268de

Fixes collectiveidea#63

sferik force-pushed the master branch 2 times, most recently from b3f2a9b to b3adaee Compare October 8, 2014 16:39

albus522 force-pushed the master branch from 17eeb1e to 7b7dddb Compare October 24, 2014 20:47

sferik force-pushed the master branch 2 times, most recently from 1240e8d to 639c9e5 Compare December 22, 2014 17:16

Merge tag 'v4.0.3' into retry-deadlock-on-delete

b88c313

Version 4.0.3
