
Activator should wait for active requests to drain before terminating #4654

Closed
mattmoor opened this issue Jul 9, 2019 · 4 comments · Fixed by #4671
Labels: area/autoscale, area/networking, kind/feature (Well-understood/specified features, ready for coding.)
Milestone: Serving 0.8

Comments
@mattmoor (Member) commented Jul 9, 2019

In what area(s)?

/area autoscale
/area networking
/kind good-first-issue

Describe the feature

The activator receives the stop signal here; we should hook in a new activation handler here that manages a sync.WaitGroup{}, incrementing/decrementing on new requests, and then does a wg.Wait() after the stopCh signal is received.

@mattmoor mattmoor added the kind/feature Well-understood/specified features, ready for coding. label Jul 9, 2019
@mattmoor mattmoor added this to the Serving 0.8 milestone Jul 9, 2019
@vagababov (Contributor)
From the Server.Shutdown docs:

Shutdown gracefully shuts down the server without interrupting any active connections. Shutdown works by first closing all open listeners, then closing all idle connections, and then waiting indefinitely for connections to return to idle and then shut down. 

So, the way I read it, once we call Shutdown:

  1. we won't get any new requests
  2. all the current ones will be processed

Thus, I think this issue is unnecessary?

@markusthoemmes (Contributor)
Agree with @vagababov. Have we seen this not working as expected?

@mattmoor (Member, Author) commented Jul 9, 2019

You may be right, but two concerns come to mind:

  1. What happens with streaming connections (e.g. websocket or GRPC)?
  2. Even once the code handles graceful termination, I'm concerned that terminationGracePeriodSeconds is too low for our activator pods (read: it is the default of 30s), which is much shorter than even our default request timeout.

What I describe above is likely handled internally by Shutdown (great!), but we likely still need some configuration for 2.
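For 2, the grace period is a pod-spec setting on the activator's Deployment; raising it would look something along these lines (the value here is illustrative, not what the fix actually chose):

```yaml
# Fragment of the activator Deployment's pod spec (illustrative value).
spec:
  template:
    spec:
      # Give in-flight requests longer than the 30s default to drain;
      # this should be at least the configured request timeout.
      terminationGracePeriodSeconds: 300
```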

vagababov added a commit to vagababov/serving that referenced this issue Jul 9, 2019
This will permit the activator to linger longer before K8s forcefully kills it, so that it can finish processing requests that might take more time (e.g. streaming).

/assign @mattmoor

For knative#4654
@tcnghia (Contributor) commented Jul 9, 2019

/assign @vagababov

looks like this will be closed by #4671

knative-prow-robot pushed a commit that referenced this issue Jul 10, 2019