Error Enabling Addon "metallb" #3530

Open

zenhighzer opened this issue Oct 27, 2022 · 11 comments

zenhighzer commented Oct 27, 2022

Summary

  • 3 × Raspberry Pi running Ubuntu 22.10, MicroK8s installed via snap

Enabling the metallb addon throws errors:

deployment.apps/controller condition met
Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "ipaddresspoolvalidationwebhook.metallb.io": failed to call webhook: Post "https://webhook-service.metallb-system.svc:443/validate-metallb-io-v1beta1-ipaddresspool?timeout=10s": context deadline exceeded
Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "l2advertisementvalidationwebhook.metallb.io": failed to call webhook: Post "https://webhook-service.metallb-system.svc:443/validate-metallb-io-v1beta1-l2advertisement?timeout=10s": context deadline exceeded

Services of type "LoadBalancer" are stuck in state "pending":
k get svc:
default test LoadBalancer 10.152.183.52 <pending> 80:31110/TCP 44m
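
A quick way to check whether the validating webhook has anything answering behind it (service name and namespace taken from the error message above):

microk8s kubectl -n metallb-system get svc webhook-service
microk8s kubectl -n metallb-system get endpoints webhook-service
# an empty ENDPOINTS column would mean the controller pod is not backing the service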

What Should Happen Instead?

No errors while enabling the metallb addon, and Services of type LoadBalancer should receive an IP from the configured pool.
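
For illustration, with the address pool configured below (192.168.80.20-192.168.80.30), a healthy setup would show an assigned EXTERNAL-IP along the lines of:

default test LoadBalancer 10.152.183.52 192.168.80.20 80:31110/TCP 44m

(192.168.80.20 is simply the first address of the configured range, used here as an example.)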

Reproduction Steps

Following this guide: https://ubuntu.com/tutorials/how-to-kubernetes-cluster-on-raspberry-pi#1-overview

  1. Install Ubuntu on the Pis
  2. Edit the cmdline file, adding
    cgroup_enable=memory cgroup_memory=1, so that the whole line looks like:
    cgroup_enable=memory cgroup_memory=1 console=serial0,115200 dwc_otg.lpm_enable=0 console=tty1 root=LABEL=writable rootfstype=ext4 rootwait fixrtc quiet splash
    (verified in the sketch after this list)
  3. Reboot
  4. Install MicroK8s via snap
  5. Build the cluster via MicroK8s
  6. Enable MicroK8s addons:

  • dns
  • rbac
  • metallb (range 192.168.80.20-192.168.80.30)
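
To verify the cgroup change from step 2 took effect after the reboot (a quick sketch, not part of the tutorial):

# the kernel command line should contain the two cgroup flags
cat /proc/cmdline
# the memory controller should be listed as enabled (last column = 1)
grep memory /proc/cgroups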

Pods seem to run fine:

NAMESPACE        NAME                                       READY   STATUS    RESTARTS   AGE   IP              NODE   NOMINATED NODE   READINESS GATES
kube-system      calico-kube-controllers-5d7dbf4c7d-vtgzs   1/1     Running   0          65m   10.1.166.193    k8s1   <none>           <none>
kube-system      coredns-d489fb88-8wtb9                     1/1     Running   0          53m   10.1.219.1      k8s3   <none>           <none>
ingress          nginx-ingress-microk8s-controller-bf7r7    1/1     Running   0          48m   10.1.219.2      k8s3   <none>           <none>
ingress          nginx-ingress-microk8s-controller-cc8c5    1/1     Running   0          48m   10.1.166.194    k8s1   <none>           <none>
ingress          nginx-ingress-microk8s-controller-9sv6t    1/1     Running   0          48m   10.1.109.66     k8s2   <none>           <none>
default          test-75d6d47c7f-rrrgx                      1/1     Running   0          43m   10.1.109.67     k8s2   <none>           <none>
default          test-75d6d47c7f-8r9s2                      1/1     Running   0          43m   10.1.219.3      k8s3   <none>           <none>
default          test-75d6d47c7f-75ttz                      1/1     Running   0          43m   10.1.166.195    k8s1   <none>           <none>
metallb-system   controller-56c4696b5-gsxpc                 1/1     Running   0          16m   10.1.109.69     k8s2   <none>           <none>
metallb-system   speaker-bx9c8                              1/1     Running   0          16m   192.168.80.12   k8s2   <none>           <none>
metallb-system   speaker-h5h4x                              1/1     Running   0          16m   192.168.80.11   k8s1   <none>           <none>
metallb-system   speaker-xstvh                              1/1     Running   0          16m   192.168.80.13   k8s3   <none>           <none>
kube-system      calico-node-pm6cw                          1/1     Running   0          57m   192.168.80.11   k8s1   <none>           <none>
kube-system      calico-node-krhc2                          1/1     Running   0          56m   192.168.80.12   k8s2   <none>           <none>
kube-system      calico-node-bszf9                          1/1     Running   0          56m   192.168.80.13   k8s3   <none>           <none>

Introspection Report

After running microk8s inspect there is an error:
The memory cgroup is not enabled, but it should be
even though the cgroup flags were added to the kernel command line (see the Reproduction Steps above).

Can you suggest a fix?

I tried the same setup with Ubuntu 20.04.5: no errors, and Services of type LoadBalancer receive an IP.
So the error must have something to do with Ubuntu 22.10.

Are you interested in contributing with a fix?

I would like to help, but I don't know how.

inspection-report-20221027_133229.tar.gz

@zacbayhan

I was recently working on a similar issue, and I determined the proxy wasn't set correctly in /etc/environment or /etc/profile.d/proxy.sh. Do you get anything from curl -vvv https://webhook-service.metallb-system.svc or dig webhook-service.metallb-system.svc?
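
Note that webhook-service.metallb-system.svc only resolves through the cluster DNS, so the check has to run inside a pod; a sketch (the curl image is an arbitrary choice):

# start a throwaway pod and probe the webhook endpoint from inside the cluster
microk8s kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- \
  curl -kv --max-time 5 https://webhook-service.metallb-system.svc:443/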

@panlinux

I'm seeing the same issue on two Ubuntu 22.10 systems; I opened a thread on Discourse: https://discuss.kubernetes.io/t/error-enabling-metallb-internal-error-context-deadline-exceeded/22092/

panlinux commented Nov 27, 2022

I repeated the same steps on an Ubuntu 22.04 LTS install, and this time it all worked.

More specifically, I first retried a simpler case in VMs, without involving metallb, and found that the connection to a service IP was flaky and only worked quickly when the endpoint it was hitting happened to be on the same node. I retested that scenario with Ubuntu 22.10 and 22.04, and it consistently failed when the OS was Ubuntu 22.10.
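
A minimal way to reproduce that flaky-service scenario (editor's sketch; deployment name and image are illustrative):

# a small deployment with pods spread across nodes, plus a ClusterIP service
microk8s kubectl create deployment web --image=nginx --replicas=3
microk8s kubectl expose deployment web --port=80
# time requests to the service IP from each node; in the scenario described
# above, only same-node endpoints answered quickly
SVC_IP=$(microk8s kubectl get svc web -o jsonpath='{.spec.clusterIP}')
time curl -s -o /dev/null http://$SVC_IP/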

@gcraenen

I'm having the same issues with three Minisforum NUCs and Ubuntu 22.10.

@zacbayhan

Looking at the discussion panlinux posted, it looks like you were having the webhook errors:

Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "ipaddresspoolvalidationwebhook.metallb.io": failed to call webhook: Post "https://webhook-service.metallb-system.svc:443/validate-metallb-io-v1beta1-ipaddresspool?timeout=10s": context deadline exceeded
Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "l2advertisementvalidationwebhook.metallb.io": failed to call webhook: Post "https://webhook-service.metallb-system.svc:443/validate-metallb-io-v1beta1-l2advertisement?timeout=10s": context deadline exceeded

Try running:

kubectl get validatingwebhookconfiguration -o yaml

(validatingwebhookconfigurations are cluster-scoped, so no -n flag is needed) and see if failurePolicy is set to Fail; I believe you can set it to Ignore.

It also looks like it might be getting hung on a proxy, so that might be another option to look into.
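
To make the failurePolicy check concrete, a sketch (the name metallb-webhook-configuration is an assumption and may differ per install; note that Ignore means validation is silently skipped whenever the webhook is unreachable):

# list each webhook with its failurePolicy
kubectl get validatingwebhookconfiguration metallb-webhook-configuration \
  -o jsonpath='{range .webhooks[*]}{.name}{": "}{.failurePolicy}{"\n"}{end}'
# flip the first webhook's policy from Fail to Ignore
kubectl patch validatingwebhookconfiguration metallb-webhook-configuration \
  --type=json -p '[{"op":"replace","path":"/webhooks/0/failurePolicy","value":"Ignore"}]'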

@panlinux

> and see if failurePolicy is set to Fail; I believe you can set it to Ignore.

It's Fail indeed, but before ignoring an error it's important to understand why it's happening, and why only on Ubuntu kinetic (22.10) when it works on jammy (22.04).

> It also looks like it might be getting hung on a proxy, so that might be another option to look into.

No proxy here. This can be easily replicated in a kinetic VM; I just did it now, with MicroK8s 1.25.4 and two kinetic VMs.
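
For anyone wanting to reproduce, one way to get that two-VM setup (using multipass is an assumption; any two kinetic VMs joined into one cluster should do):

# launch two Ubuntu 22.10 (kinetic) VMs
multipass launch kinetic --name k1
multipass launch kinetic --name k2
# inside each VM, install MicroK8s from the 1.25 channel
sudo snap install microk8s --classic --channel=1.25/stable
# inside k1, print a join command, then run the printed command inside k2
microk8s add-node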

@neoaggelos (Contributor)

Hi @panlinux, this could be related to a vxlan bug that breaks checksum calculation.

Could you try to see whether:

microk8s kubectl patch felixconfigurations default --patch '{"spec":{"featureDetectOverride":"ChecksumOffloadBroken=true"}}' --type=merge

helps with your issue?
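
A quick way to confirm the patch landed (editor's sketch):

# the override should now appear in the FelixConfiguration spec
microk8s kubectl get felixconfigurations default -o yaml | grep -i featureDetectOverride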

kathoef commented Feb 17, 2023

It seems to help to put e.g. metallb-system.svc into the set of no_proxy variables:

$ cat /etc/environment
...
NO_PROXY=127.0.0.1,::1,localhost,10.152.183.0/24,10.1.0.0/16,metallb-system.svc
no_proxy=127.0.0.1,::1,localhost,10.152.183.0/24,10.1.0.0/16,metallb-system.svc
...

as suggested somewhere above and in this metallb repo issue. (I had activated the DNS addon before enabling the metallb addon, i.e. microk8s enable dns, if that is important.)
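
If you go this route, the MicroK8s daemons still have to pick up the new environment; a sketch of the final step (restarting via snap is an assumption, a full reboot also works):

# after editing /etc/environment as above, restart the MicroK8s services
sudo snap restart microk8s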

@natarajmb

@neoaggelos thanks for the workaround. I just tried your fix and it works. My setup:

4 x RPi 4B running Ubuntu 22.10 server with the DNS addon.

I patched felixconfigurations and enabled metallb, and I'm not seeing any errors.
I had existing BGP configurations and all are working as expected. Thank you 👍

risha700 commented Jan 8, 2024

Setup:
KUBE_VER="v1.29"
METALLB_VER="v0.13.12"
CALICO_VER="v3.27.0"
Ubuntu 22.04.3 LTS

I hit the same unreachable error. Cause: networking misconfiguration. Check your firewall and connectivity before proceeding with any installation. In my case, ICMP to the direct IP was unreachable and traffic wasn't routable from the master to the worker nodes. That answers the "why", @panlinux.
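
A basic pre-flight check along those lines (editor's sketch; the IP is one of this thread's example node addresses):

# from each node, confirm the other nodes answer ICMP
ping -c 3 192.168.80.12
# and confirm no firewall rule is silently dropping cluster traffic
sudo ufw status verbose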


stale bot commented Dec 5, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

The stale bot added the inactive label on Dec 5, 2024.