Activator Pod stuck in CrashLoopBackOff #4407
The logs of the activator are here:
@zxDiscovery can you help check this?
@yuxiaoba
@zxDiscovery Thank you for your help.
@yuxiaoba Please show the information about
@zxDiscovery My environment doesn't seem to have VirtualService
The activator is attempting to reach the autoscaler, and won't report healthy until it does. If you are restricting network traffic, then you have to allow this connection for it to ever become healthy. That is what the activator logs tell me. You should also share a `kubectl describe` of that pod, which will hopefully corroborate that hypothesis.
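For readers hitting the same symptom, the two checks suggested here might look like this (the namespace and label are assumptions for a default knative-serving install):

```shell
# Sketch: inspect the failing activator pod and its recent logs
# (namespace and label values assume a default knative-serving install).
kubectl describe pod -n knative-serving -l app=activator
kubectl logs -n knative-serving -l app=activator --tail=100
```

The `describe` output shows the readiness-probe failure events; the logs show what the activator is failing to reach.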
@mattmoor Thank you for your help.
I'm new to Istio and Knative, so I don't know where I might have restricted network traffic. Is this it?
As I suspected, the activator's readiness probes are failing, apparently because it cannot open a websocket to the autoscaler's metrics endpoint.
I'm not sure why...
@mattmoor I am also confused about this. Is this a code bug, or is there an error in the Knative Serving deployment YAML?
I don't see any indication that this is a Knative problem (yet). I'd launch an
We are seeing a similar issue when Knative is under some load (around 450 ksvc). Currently we are running Knative 0.6.1. Some results of the commands mentioned above:
We can see error logs from the Knative autoscaler container, which failed to talk to the kube-apiserver, but at the same time we were able to talk to the kube-apiserver without problems locally, and all other parts of the cluster were running fine.
After looking further into our issue, it seems to be related to the sidecar container resource limits for
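If the injected istio-proxy sidecar is hitting its resource limits, one hedged workaround is raising them per workload via Istio's sidecar-injection annotations; a sketch, with illustrative values (the deployment name, namespace, and values are assumptions):

```shell
# Sketch: raise the injected istio-proxy's CPU/memory requests via pod
# annotations (values are illustrative; the Istio sidecar injector reads
# these at injection time, so existing pods must be recreated).
kubectl patch deployment activator -n knative-serving --type merge -p '
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/proxyCPU: "500m"
        sidecar.istio.io/proxyMemory: "512Mi"
'
```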
@yuxiaoba Any chance you are doing this at scales similar to @patrickshan?
@yuxiaoba could you try starting a container in the
I notice it doesn't have a readiness probe itself, which is troubling.
@mattmoor I have tried this, but I am new to this, so I may have done something incorrectly. First, start a container.
Second, show the status of the pods.
Third, curl the autoscaler and show the result.
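The three steps above might look roughly like this (a sketch; the debug image, namespace, and the autoscaler's port 8080 are all assumptions):

```shell
# Sketch of the debugging steps above.
# 1. Start a throwaway container in the same namespace:
kubectl run debug --rm -it --image=curlimages/curl -n knative-serving -- sh

# 2. In another terminal, show the status of the pods:
kubectl get pods -n knative-serving

# 3. From inside the debug pod, curl the autoscaler service
#    (port 8080 is an assumption for the metrics/websocket endpoint):
curl -v http://autoscaler.knative-serving.svc.cluster.local:8080
```

If the curl hangs or is refused from inside the mesh but works outside it, that points at a sidecar or network-policy problem rather than the autoscaler itself.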
What if you add
@mattmoor Do you mean changing the YAML like this?
Actually, never mind; the autoscaler doesn't respond to network probes.
@mattmoor I think this problem may be in the sidecar, because when I install Istio without the sidecar, the activator pod runs normally.
@yuxiaoba did you enable authorization in your Istio installation? You may want to try using https://istio.io/docs/concepts/security/#enabling-authorization and exclude the namespace
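In the Istio 1.1 era that this thread covers, namespace exclusions for authorization were expressed through the `ClusterRbacConfig` resource; a sketch, assuming the namespace to exclude is `knative-serving`:

```shell
# Sketch: exclude the knative-serving namespace from Istio RBAC
# enforcement (Istio 1.1-era rbac.istio.io API; the namespace name
# is an assumption).
kubectl apply -f - <<'EOF'
apiVersion: rbac.istio.io/v1alpha1
kind: ClusterRbacConfig
metadata:
  name: default
spec:
  mode: ON_WITH_EXCLUSION
  exclusion:
    namespaces: ["knative-serving"]
EOF
```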
@tcnghia # cat /etc/resolv.conf
Thanks. I believe https://github.com/knative/serving/blob/master/pkg/network/domain.go#L69 didn't handle this right. We need to strip the trailing dot.
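The underlying problem: a search domain read from /etc/resolv.conf can carry a trailing dot (a fully-qualified root), which must be stripped before comparing domain names. A minimal sketch of the normalization in shell terms:

```shell
# Sketch: strip a single trailing dot from a DNS search domain,
# as a resolv.conf entry like "svc.cluster.local." would need.
domain="svc.cluster.local."
domain="${domain%.}"
echo "$domain"   # → svc.cluster.local
```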
@yuxiaoba You were able to fix this? I am still getting this locally on macOS.
@majuansari I have fixed this by changing
I am having this issue with OpenShift 3.11. I should note that I am using the multitenant CNI from Red Hat, which won't allow project-to-project communication except from the default namespace. Could this be the issue?
Can you report the actual error you're getting when the activator fails?
Yup, that was it. I ran the following command to join the project networks, and both pods are now running. Again, this worked because I am using the multitenant SDN from Red Hat:
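With the OpenShift 3.x multitenant SDN, joining project networks is done with `oc adm pod-network`; a sketch, where the project names are assumptions:

```shell
# Sketch: join the Istio project's network to the Knative project's,
# so their pods can reach each other under the multitenant SDN
# (project names are assumptions).
oc adm pod-network join-projects --to=knative-serving istio-system
```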
Just want to note that this ended up being our solution as well; we have tested our system with over 2000 ksvcs. Fortunately our service mesh didn't require full service discovery, so we limited the egress via Sidecar definitions. But it's probably not scalable to keep increasing the resources on the activator pod, so we're hoping that in the future we can reduce the number of routes kept in the sidecar.
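Limiting egress via a Sidecar definition, as described above, might look like this sketch: an Istio `Sidecar` resource restricting proxies in one namespace to local traffic plus istio-system, so they no longer track every route in the mesh (the namespace name is an assumption):

```shell
# Sketch: restrict sidecar egress to the local namespace and istio-system
# (networking.istio.io/v1alpha3 Sidecar; namespace name is an assumption).
kubectl apply -f - <<'EOF'
apiVersion: networking.istio.io/v1alpha3
kind: Sidecar
metadata:
  name: default
  namespace: my-app
spec:
  egress:
  - hosts:
    - "./*"
    - "istio-system/*"
EOF
```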
In what area(s)?
What version of Knative?
Knative 0.6
Expected Behavior
Install the components of Knative Serving successfully
Actual Behavior
Steps to Reproduce the Problem
Install Kubernetes 1.13
Install Istio 1.1.3
Install Knative Serving