
Reduce unnecessary producerStateTable publish notifications #258

Open · jipanyang wants to merge 4 commits into master from producerState_publish

Conversation

@jipanyang (Contributor) commented Feb 1, 2019

Signed-off-by: Jipan Yang [email protected]

The typical use case this PR is trying to address is route flapping.

Assuming such a scenario:

<1> 4K routes learned on an interface.

<2> Interface goes down, the 4K routes are withdrawn.

<3> fpmsyncd inserts the 4K route keys into the producerStateTable keyset and generates 4K redis publish notifications.

<4> Interface goes up

<4.A> The interface may come up and learn the 4K routes back before orchagent has had a chance to pick up any notification and process the keyset. That case is fine: the change in https://github.com/Azure/sonic-swss-common/pull/257 handles it, and no new redis publish notification is created since the route keys already exist in the producerStateTable keyset and are still waiting to be processed.

<4.B> The interface may come up and learn the 4K routes back after orchagent has processed some but not all of the redis notifications (say, 1K notifications processed). With the first notification, the consumer in orchagent batch-popped all 4K route keys from the producerStateTable keyset (orchagent sets popBatchSize to 8192); for the other 999 notifications processed, orchagent just called consumer_state_table_pops.lua 999 times without seeing any real data.
4K - 1K = 3K more redis notifications are now pending in the redis queue.

<5> Interface goes down

<5.A> If <4.A> was hit during the previous interface up, good: no new redis publish notifications are generated.

<5.B> If <4.B> was hit during the previous interface up, 4K more redis notifications are put into the redis queue.

<6> Interface goes up, loop back to <4>

The exact scenarios vary, but from <4> to <6> the number of pending notifications keeps increasing.

The issue is that orchagent is typically able to handle all the route changes in a single batch pop (popBatchSize = 8192), yet a redis publish notification was generated for each key, and under the current select framework orchagent has to respond to every one of those notifications even though most of them carry no data.
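To make the waste concrete, here is a minimal Python/redis-py sketch of the pre-change behavior. It is not the actual swss C++/Lua code; the keyset/channel names, the route keys, and the local redis instance are placeholder assumptions. Every SET publishes, but only the first consumer wake-up finds any data:

# Sketch of the pre-change behavior (assumption: local redis, placeholder names).
import redis

r = redis.Redis(host="127.0.0.1", port=6379, db=0)
KEYSET = "ROUTE_TABLE_KEY_SET"       # placeholder keyset name
CHANNEL = "ROUTE_TABLE_CHANNEL"      # placeholder channel name
POP_BATCH_SIZE = 8192                # orchagent's batch size

# Producer side: one publish per key, so 4K routes -> 4K notifications.
for i in range(4000):
    r.sadd(KEYSET, "1.1.%d.%d/32" % (i // 256, i % 256))
    r.publish(CHANNEL, "G")

# Consumer side: each notification triggers one batch pop of the keyset.
def pop_batch():
    return r.spop(KEYSET, POP_BATCH_SIZE)

print(len(pop_batch()))  # first notification: all 4000 keys popped
print(len(pop_batch()))  # remaining 3999 notifications: nothing left to pop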

The changes in this PR have two parts:

<1> Only generate a redis publish notification when absolutely necessary.
In producerStateTable, generate a notification only for the first key added to an empty keyset.
In consumerStateTable, consume all the data in the keyset upon a redis notification. If the number of keys in the keyset happens to be larger than popBatchSize, generate one notification to itself to come back again. (A sketch of this flow appears right after this list.)

<2> consumerStateTable.pop() processing
The pop() method is not used by orchagent, only by some unit test cases. A user of consumerStateTable.pop() processes one key per notification, so the workaround is to explicitly create m_buffer.size() redis publish notifications so that the select() framework picks up all the buffered keys.
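As referenced above, a minimal Python/redis-py sketch of the gated behavior in <1>, with the same placeholder names as before. The real implementation performs these steps atomically inside the embedded Lua scripts, so treat this only as a model of the control flow:

# Sketch of the gated publish (<1>); the real code does this atomically in Lua.
import redis

r = redis.Redis(host="127.0.0.1", port=6379, db=0)
KEYSET = "ROUTE_TABLE_KEY_SET"       # placeholder keyset name
CHANNEL = "ROUTE_TABLE_CHANNEL"      # placeholder channel name
POP_BATCH_SIZE = 8192

def producer_set(key):
    # Publish only when this SET made the keyset go from empty to one key.
    added = r.sadd(KEYSET, key)
    if added > 0 and r.scard(KEYSET) == 1:
        r.publish(CHANNEL, "G")

def consumer_on_notification():
    # Drain up to POP_BATCH_SIZE keys; if more remain, notify self to come back.
    keys = r.spop(KEYSET, POP_BATCH_SIZE)
    if r.scard(KEYSET) > 0:
        r.publish(CHANNEL, "G")
    return keys

With this gating, the route-flap scenario above generates one notification per keyset transition instead of one per key.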

@jipanyang (Contributor, Author)

This is the implementation, done together with @zhenggen-xu.

@lguohan requested a review from qiluo-msft on February 1, 2019.
@lguohan (Contributor) commented Feb 2, 2019

@jipanyang, can you tell us which one you prefer?

@jipanyang (Contributor, Author) commented Feb 2, 2019

@lguohan This one will probably be able to handle the most extreme redis usage scenario, but the change requires more careful review.

I need to check and fix the pyext part; I forgot to run that test locally.

@qiluo-msft (Contributor)

Could you explain the problem and the high-level design in this PR? That may help us understand your code.

@jipanyang force-pushed the producerState_publish branch from 29d86b7 to c3341c3 on February 6, 2019.
@jipanyang (Contributor, Author)

@qiluo-msft problem description and high-level design updated.

"for i = 0, #KEYS - 3 do\n"
" redis.call('HSET', KEYS[3 + i], ARGV[3 + i * 2], ARGV[4 + i * 2])\n"
"end\n"
" if added > 0 then \n"
"local num = redis.call('SCARD', KEYS[2])\n"
Contributor review comment:

It is probably better to move this check before we add the current key itself (i.e., before "redis.call('SADD', KEYS[2], ARGV[2])\n") and check whether num == 0. With the code as is, in the case where the current key is the same as a key that already exists in the keyset, we could end up publishing more notifications than needed.
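In other words, the suggested ordering samples the keyset size before the SADD and publishes only when the keyset was empty. A minimal sketch in the same simplified Python model used earlier in this thread (the actual change would live inside the Lua script so the check-and-add stays atomic; names are placeholders):

# Sketch of the suggested ordering: check emptiness before SADD (illustrative,
# not the PR's actual Lua); avoids publishing again for an already-present key.
import redis

r = redis.Redis(host="127.0.0.1", port=6379, db=0)
KEYSET = "ROUTE_TABLE_KEY_SET"       # placeholder keyset name
CHANNEL = "ROUTE_TABLE_CHANNEL"      # placeholder channel name

def producer_set_check_first(key):
    was_empty = r.scard(KEYSET) == 0   # check before adding the current key
    added = r.sadd(KEYSET, key)
    if added > 0 and was_empty:
        r.publish(CHANNEL, "G")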

"redis.call('SADD', KEYS[4], ARGV[2])\n"
"redis.call('DEL', KEYS[3])\n"
"if added > 0 then \n"
"local num = redis.call('SCARD', KEYS[2])\n"
Contributor review comment:

Same comment as before: check the keyset count before we add the new key, to be more precise.

"for i = 0, #KEYS - 3 do\n"
" redis.call('HSET', KEYS[3 + i], ARGV[3 + i * 2], ARGV[4 + i * 2])\n"
"end\n"
" if added > 0 then \n"
"local num = redis.call('SCARD', KEYS[2])\n"
" if num == 1 then \n"
Contributor review comment:

Also, since we now have only one notification on the producer side per keyset, we might want to be cautious about the case where that notification gets dropped. In such a case the keyset would never be served to the consumer even though we keep updating keys/values in it, because we will never generate another notification at the producer. Some mechanism might be needed here to prevent such cases.

@jipanyang (Contributor, Author) commented Feb 25, 2019

It is unlikely that redis will drop the notification; if that happens, a lot of bad things probably occurred before the error.

But I agree this is a valid concern; we should avoid a single point of failure here. Currently I'm considering this extra check:

1. On the producerStateTable side, together with the notification, it also leaves one field/value pair under _PRODUCESTATE_PUBLISH_REC, e.g.:

127.0.0.1:6380> hset _PRODUCESTATE_PUBLISH_REC ROUTE_TABLE_CHANNEL 1
(integer) 1
127.0.0.1:6380> hset _PRODUCESTATE_PUBLISH_REC PORT_TABLE_CHANNEL 1
(integer) 1
127.0.0.1:6380> hgetall _PRODUCESTATE_PUBLISH_REC
1) "ROUTE_TABLE_CHANNEL"
2) "1"
3) "PORT_TABLE_CHANNEL"
4) "1"

2. On the consumerStateTable side, after consuming all the data, it deletes the field/value pair:

127.0.0.1:6380> hdel _PRODUCESTATE_PUBLISH_REC ROUTE_TABLE_CHANNEL
(integer) 1
127.0.0.1:6380> hdel _PRODUCESTATE_PUBLISH_REC PORT_TABLE_CHANNEL
(integer) 1

3. The caller of consumerStateTable, which is orchagent, does a check of the _PRODUCESTATE_PUBLISH_REC key after each Select::TIMEOUT event:

127.0.0.1:6380> hgetall _PRODUCESTATE_PUBLISH_REC
(empty list or set)

If it is not empty, the notification has likely been lost; generate a new channel notification for each FV under the key so that the pending data can be picked up.

Due to a race condition, there is a very small possibility of a false positive. In that case we generate one extra notification, which is ok.
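A minimal Python/redis-py sketch of this proposed record/check cycle; the function names and the split of responsibilities are illustrative assumptions, not the actual swss/orchagent code, and the port simply mirrors the redis-cli examples above:

# Sketch of the proposed _PRODUCESTATE_PUBLISH_REC safety net (illustrative only).
import redis

r = redis.Redis(host="127.0.0.1", port=6380, db=0)
PUBLISH_REC = "_PRODUCESTATE_PUBLISH_REC"

def producer_publish(channel):
    # Producer: record the outstanding notification, then publish it.
    r.hset(PUBLISH_REC, channel, 1)
    r.publish(channel, "G")

def consumer_drained(channel):
    # Consumer: clear the record once the keyset has been fully consumed.
    r.hdel(PUBLISH_REC, channel)

def on_select_timeout():
    # Orchagent, after a Select::TIMEOUT: any channel still recorded here most
    # likely had its notification lost, so publish a fresh one for it.
    for channel in r.hgetall(PUBLISH_REC):
        r.publish(channel.decode(), "G")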

Contributor replied:

The mechanism looks ok to me. A few questions:

1. "If it is not empty, the notification has likely been lost; generate a new channel notification for each FV under the key so that the pending data can be picked up."
Is there any reason we do a notification per FV? I would think we should generate the notification per keyset, as before, for the consumer to pick up.

2. "Due to a race condition"
Where exactly is the race condition? If, after the time-out, there could be a notification arriving before we check _PRODUCESTATE_PUBLISH_REC, could we check the notification together with it there atomically?

@jipanyang (Contributor, Author) replied:

The FV here is the FV under _PRODUCESTATE_PUBLISH_REC, e.g.:

1) "ROUTE_TABLE_CHANNEL"
2) "1"
3) "PORT_TABLE_CHANNEL"
4) "1"

Each FV represents one keyset which has pending data.

For the race condition, consider such a scenario:

1. Orchagent has been idle for 1 second, so the timeout signal will kick in.

2. neighsyncd has discovered a new neighbor and programmed NEIGH_TABLE, then generated a redis notification.

3. At that exact moment, if the timeout signal is processed by orchagent before the redis notification while neighsyncd has already put data under the _PRODUCESTATE_PUBLISH_REC key, orchagent will create one extra redis publish notification for NEIGH_TABLE, even though immediately afterwards the real notification from neighsyncd will be processed.

This is a really rare scenario, and the extra notification is harmless other than causing orchagent to perform one more check.
