bug: Arg --disable-component-controllers still not working properly #1212

Closed
misoknr opened this issue Jun 18, 2024 · 18 comments · Fixed by #1222

Comments

@misoknr

misoknr commented Jun 18, 2024

Describe the issue

When the Helm value "operator.disableComponentControllers" is set while installing fluent-operator, the chart does not pass it to the operator runtime correctly. The following error appears in the operator log:

2024-06-18T15:20:52Z ERROR setup {"error": "incorrect value for -disable-component-controllers and it will not be proceeded (possible values are: fluent-bit, fluentd)"}

To Reproduce

Install or upgrade fluent-operator with the following value in the values file, for example:

operator:
  disableComponentControllers: "fluentd"

Expected behavior

When a valid value is provided for operator.disableComponentControllers, it is propagated to the operator correctly.

Your Environment

- Fluent Operator version: 2.9.0
- Container Runtime: 
- Operating system: centos rhel fedora
- Kernel version: 5.10.214-202.855.amzn2.x86_64

How did you install fluent operator?

Via helm chart:

helm upgrade --install fluent-operator fluent/fluent-operator --version 2.9.0  --namespace fluent --create-namespace -f k8s_addons/fluent/values.yml  --set fluentbit.image.tag=v3.0.7

Values file contents:

containerRuntime: docker
Kubernetes: false
fluentd:
  crdsEnable: false
fluentbit:
  enable: true
  image:
    repository: ***
  imagePullSecrets:
  - name: artifactory-connection-docker-secret
operator:
  disableComponentControllers: "fluentd"
  container:
    repository: ***
  resources:
    requests:
      cpu: 100m
      memory: 450Mi
    limits:
      cpu: 100m
      memory: 450Mi
  imagePullSecrets:
  - name: artifactory-connection-docker-secret
  initcontainer:
    repository: ***

Additional context

No response

@cw-Guo
Collaborator

cw-Guo commented Jun 21, 2024

I tried but couldn't reproduce it in my local environment. Some debugging suggestions:

  1. Can you check the manifests generated by Helm?
  2. Can you check the manifest of the running fluent-operator pod?

The correct ones should have the following:
args:
  - "--disable-component-controllers=fluentd"
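
One way to act on the two checks above is to scan the rendered YAML for flag values that still carry literal quote characters. The following is a minimal sketch only: the helper name `find_quoted_flag_values` and the sample manifest strings are hypothetical and not part of fluent-operator or Helm, and the pattern covers just the simple one-line list items seen in this thread.

```python
import re

# Matches a YAML list item whose decoded value would be
#   --some-flag="value"
# i.e. either a plain scalar, or a single-quoted scalar that wraps the
# whole argument. In both cases the double quotes are literal characters.
QUOTED_FLAG_VALUE = re.compile(r"""^\s*-\s+'?(--[\w-]+)=(".*")'?\s*$""")


def find_quoted_flag_values(manifest_text):
    """Return (line_number, flag, raw_value) for each suspicious arg."""
    hits = []
    for number, line in enumerate(manifest_text.splitlines(), start=1):
        match = QUOTED_FLAG_VALUE.match(line)
        if match:
            hits.append((number, match.group(1), match.group(2)))
    return hits


# Illustrative inputs: two renderings of the same argument.
bad = 'args:\n  - --disable-component-controllers="fluentd"\n'
good = 'args:\n  - "--disable-component-controllers=fluentd"\n'

print(find_quoted_flag_values(bad))
print(find_quoted_flag_values(good))
```

The text to scan could come from `helm template` or from `kubectl get deployment -o yaml`, both of which appear later in this thread.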

@misoknr
Author

misoknr commented Jun 21, 2024

This is in operator deployment manifest:

containers:
  - name: fluent-operator
    image: >-
      srsng-docker.artifactory.healthcare.siemens.com/kubesphere/fluent-operator:v2.9.0
    args:
      - '--disable-component-controllers="fluentd"'
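
Kubernetes hands each `args` entry to the container as one argv element without going through a shell, so any quote characters that survive YAML decoding reach the operator literally. The sketch below illustrates this with a deliberately simplified scalar decoder. It is not a full YAML parser, and the function names are hypothetical.

```python
def decode_yaml_scalar(text):
    """Decode a simple one-line YAML scalar (simplified sketch)."""
    text = text.strip()
    if len(text) >= 2 and text[0] == text[-1] == "'":
        # Single-quoted: only '' is special (an escaped single quote).
        return text[1:-1].replace("''", "'")
    if len(text) >= 2 and text[0] == text[-1] == '"':
        # Double-quoted: this sketch assumes no backslash escapes.
        return text[1:-1]
    return text  # plain scalar: every character is literal


def flag_value(arg):
    """Return the part of a --flag=value argument after the first '='."""
    return arg.split("=", 1)[1]


# The list item from the deployment manifest above.
from_issue = decode_yaml_scalar("'--disable-component-controllers=\"fluentd\"'")
# The form suggested earlier in the thread.
suggested = decode_yaml_scalar('"--disable-component-controllers=fluentd"')

print(repr(flag_value(from_issue)))
print(repr(flag_value(suggested)))
```

In the first case the value still contains the two double-quote characters, which is not one of the accepted values listed in the error message.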

@benjaminhuo
Member

      - '--disable-component-controllers="fluentd"'

Try - '--disable-component-controllers=fluentd'

@misoknr
Author

misoknr commented Jun 21, 2024

      - '--disable-component-controllers="fluentd"'

Try - '--disable-component-controllers=fluentd'

Thanks, but that's not the point. The point is that the Helm template is probably wrong, so the rendered manifest ends up with the wrong value.

No matter if I set this

operator:
  disableComponentControllers: fluentd

or this

operator:
  disableComponentControllers: "fluentd"

in my values file, the result is the same.

@SvenThies
Contributor

Hey,

same issue here. The following appears in the deployment manifest:

containers:
      - args:
        - --disable-component-controllers="fluentd

@cw-Guo
Collaborator

cw-Guo commented Jun 24, 2024

I did see a recent change to this feature; see templates/fluent-operator-deployment.yaml.

Can you please check whether your template matches the latest one?

@SvenThies
Contributor

Hey,

thanks for the swift reply. Judging by my Helm release version, the fix you mentioned should already be included.

helm-release version: 2.9.0

values.yaml:

disableComponentControllers: "fluentd"

Rendered manifest:

args:
    - --disable-component-controllers="fluentd"

Error:

2024-06-24T20:28:46Z	ERROR	setup		{"error": "incorrect value for `-disable-component-controllers` and it will not be proceeded (possible values are: fluent-bit, fluentd)"}

Using your suggestion, patching the deployment manifest with

args:
  - "--disable-component-controllers=fluentd"

works fine.

@cw-Guo
Collaborator

cw-Guo commented Jun 26, 2024

~/playground/fluent-operator master* ❯ helm version
version.BuildInfo{Version:"v3.15.2", GitCommit:"1a500d5625419a524fdae4b33de351cc4f58ec35", GitTreeState:"clean", GoVersion:"go1.22.4"}
~/playground/fluent-operator master* ❯ helm template fluent-operator charts/fluent-operator --version 2.9.0 --namespace fluent -f charts/fluent-operator/values.yaml | grep args -A 5 -B 5
          - name: NAMESPACE
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
        args:
          - "--disable-component-controllers=fluentd"
        volumeMounts:
        - name: env
          mountPath: /fluent-operator
      serviceAccountName: fluent-operator

I just tried again with the latest version of Helm, and the generated manifest is correct.

@SvenThies
Contributor

Hmm, that's weird.

Especially because the fix you mentioned was released with 2.9.0 but is missing from the deployment template of the 2.9.0 chart on Artifact Hub, which still shows:

        args:
          - --disable-component-controllers={{ .Values.operator.disableComponentControllers | quote }}

Any idea where this comes from?
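
The difference between the two renderings seen in this thread can be reproduced with a small simulation. This is a sketch under assumptions: `helm_quote` only approximates Helm's `quote` function for plain ASCII strings, and `render_whole_arg_quoted` is a guess at a template form that would produce the output from the repository chart shown earlier; it is not copied from the chart source.

```python
import json


def helm_quote(value):
    """Approximate Helm's `quote` for plain ASCII strings (assumption)."""
    return json.dumps(str(value))


def render_published(value):
    # Mirrors the template line quoted above: the flag name sits outside
    # the quotes, so the quotes become part of the argument itself.
    return "- --disable-component-controllers=" + helm_quote(value)


def render_whole_arg_quoted(value):
    # Assumed alternative: the entire argument is wrapped in quotes, so
    # YAML strips them and the operator sees the bare value.
    return '- "--disable-component-controllers=' + str(value) + '"'


for value in ("fluentd", "fluent-bit", ""):
    print(render_published(value))
    print(render_whole_arg_quoted(value))
```

With an empty value, the first form renders `=""`, so even an unset option reaches the operator as two quote characters rather than as an empty string.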

@mritunjaysharma394
Contributor

I am facing the same issue as @SvenThies, but in my case the quotes themselves look fine and the value between them is empty:

kubectl get deployment.apps/fluent-operator -n fluent -o yaml | grep args -A 5 -B 5                        
      labels:
        app.kubernetes.io/component: operator
        app.kubernetes.io/name: fluent-operator
    spec:
      containers:
      - args:
        - --disable-component-controllers=""
        env:
        - name: NAMESPACE
          valueFrom:
            fieldRef:

Is this expected? Shouldn't it be fluentd, @cw-Guo?

Apart from this error, the rest of my operator logs look fine:

kubectl logs -n fluent pod/fluent-operator-58ff575d4c-twg7w
Defaulted container "fluent-operator" out of: fluent-operator, setenv (init)
2024-06-28T13:03:00Z    INFO    controller-runtime.metrics      Metrics server is starting to listen    {"addr": ":8080"}
2024-06-28T13:03:00Z    ERROR   setup           {"error": "incorrect value for `-disable-component-controllers` and it will not be proceeded (possible values are: fluent-bit, fluentd)"}
main.main
        /workspace/main.go:121
runtime.main
        /usr/local/go/src/runtime/proc.go:267
2024-06-28T13:03:00Z    INFO    setup   starting manager
2024-06-28T13:03:00Z    INFO    Starting server {"path": "/metrics", "kind": "metrics", "addr": "[::]:8080"}
2024-06-28T13:03:00Z    INFO    Starting server {"kind": "health probe", "addr": "[::]:8081"}
2024-06-28T13:03:00Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.FluentBit"}
2024-06-28T13:03:00Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.FluentBit"}
2024-06-28T13:03:00Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1.Secret"}
2024-06-28T13:03:00Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1.ServiceAccount"}
2024-06-28T13:03:00Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.ClusterFluentBitConfig"}
2024-06-28T13:03:00Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1.DaemonSet"}
2024-06-28T13:03:00Z    INFO    Starting Controller     {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit"}
2024-06-28T13:03:00Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.FluentBitConfig"}
2024-06-28T13:03:00Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.ClusterInput"}
2024-06-28T13:03:00Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.ClusterFilter"}
2024-06-28T13:03:00Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.ClusterOutput"}
2024-06-28T13:03:00Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.ClusterParser"}
2024-06-28T13:03:00Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.ClusterMultilineParser"}
2024-06-28T13:03:00Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.Filter"}
2024-06-28T13:03:00Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.Output"}

@mritunjaysharma394
Contributor

Also, an update: if I run:

helm upgrade fluent-operator fluent/fluent-operator --version 2.9.0 --namespace fluent --set operator.disableComponentControllers="fluent-bit"

I do get the updated deployment:

kubectl get deployment.apps/fluent-operator -n fluent -o yaml | grep args -A 5 -B 5
      labels:
        app.kubernetes.io/component: operator
        app.kubernetes.io/name: fluent-operator
    spec:
      containers:
      - args:
        - --disable-component-controllers="fluent-bit"
        env:
        - name: NAMESPACE
          valueFrom:
            fieldRef:

However, the error still surfaces in logs:

kubectl logs -n fluent pod/fluent-operator-85f648d4cb-j7hp8      
Defaulted container "fluent-operator" out of: fluent-operator, setenv (init)
2024-06-28T14:28:50Z    INFO    controller-runtime.metrics      Metrics server is starting to listen    {"addr": ":8080"}
2024-06-28T14:28:50Z    ERROR   setup           {"error": "incorrect value for `-disable-component-controllers` and it will not be proceeded (possible values are: fluent-bit, fluentd)"}
main.main
        /workspace/main.go:121
runtime.main
        /usr/local/go/src/runtime/proc.go:267
2024-06-28T14:28:50Z    INFO    setup   starting manager
2024-06-28T14:28:50Z    INFO    Starting server {"path": "/metrics", "kind": "metrics", "addr": "[::]:8080"}
2024-06-28T14:28:50Z    INFO    Starting server {"kind": "health probe", "addr": "[::]:8081"}
2024-06-28T14:28:50Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.FluentBit"}
2024-06-28T14:28:50Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1.ServiceAccount"}
2024-06-28T14:28:50Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1.DaemonSet"}
2024-06-28T14:28:50Z    INFO    Starting Controller     {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit"}
2024-06-28T14:28:50Z    INFO    Starting EventSource    {"controller": "fluentd", "controllerGroup": "fluentd.fluent.io", "controllerKind": "Fluentd", "source": "kind source: *v1alpha1.Fluentd"}
2024-06-28T14:28:50Z    INFO    Starting EventSource    {"controller": "fluentd", "controllerGroup": "fluentd.fluent.io", "controllerKind": "Fluentd", "source": "kind source: *v1.ServiceAccount"}
2024-06-28T14:28:50Z    INFO    Starting EventSource    {"controller": "fluentd", "controllerGroup": "fluentd.fluent.io", "controllerKind": "Fluentd", "source": "kind source: *v1.DaemonSet"}
2024-06-28T14:28:50Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.FluentBit"}
2024-06-28T14:28:50Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1.Secret"}
2024-06-28T14:28:50Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.ClusterFluentBitConfig"}
2024-06-28T14:28:50Z    INFO    Starting EventSource    {"controller": "fluen
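
The double quotes typed in the `--set` command above are consumed by the shell before Helm ever sees them, so the quotes that show up in the deployment have to come from the chart template rather than from the command line. A minimal sketch using Python's `shlex`, which follows POSIX shell quoting rules:

```python
import shlex

command = (
    "helm upgrade fluent-operator fluent/fluent-operator --version 2.9.0 "
    '--namespace fluent --set operator.disableComponentControllers="fluent-bit"'
)

# shlex.split tokenizes the way a POSIX shell would, so the result is the
# argv that the helm process receives.
argv = shlex.split(command)
print(argv[-1])
```

The last argument comes out as `operator.disableComponentControllers=fluent-bit`, with no quote characters left.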

@SvenThies
Contributor

I think there is a problem with how the latest release (v2.9.0) was published to the Helm registry. As mentioned by @cw-Guo, the argument should look like this in the deployment template:

  - '--disable-component-controllers=fluentd'

As far as I could see, there were also some problems with the release of v2.8.0.

@mritunjaysharma394
Contributor

I think I have identified the problem and am working on a fix. It seems the value from Helm is being parsed incorrectly.
I added a small logging change to the code and built a custom image to test with the chart. The manager binary itself worked fine when run without the chart, but when the value came from the chart, this was logged:

2024-06-28T15:10:37Z    INFO    setup   Value of disabledControllers    {"value": "\"fluentd\""}
2024-06-28T15:10:37Z    ERROR   setup           {"error": "incorrect value for `-disable-component-controllers` and it will not be proceeded (possible values are: fluent-bit, fluentd)"}
main.main
        /workspace/main.go:122
runtime.main

That is why the error is reported: the operator reads the value as "\"fluentd\"", i.e. with the quote characters included.
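
To make the failure mode concrete, here is a minimal sketch of a check that rejects the quoted value, together with a tolerant variant that strips one pair of surrounding quotes first. Both functions are hypothetical stand-ins. They are not the operator's Go code and not necessarily what #1222 changes, and treating an empty value as "nothing disabled" is an assumption.

```python
ALLOWED = ("fluent-bit", "fluentd")


def validate(value):
    """Hypothetical stand-in for the operator's flag check."""
    if value == "" or value in ALLOWED:
        return value
    raise ValueError(
        "incorrect value for `-disable-component-controllers` "
        "(possible values are: fluent-bit, fluentd)"
    )


def strip_matching_quotes(value):
    """Remove one pair of matching surrounding quotes, if present."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        return value[1:-1]
    return value


received = '"fluentd"'  # the logged value above, unescaped
try:
    validate(received)
except ValueError as err:
    print("rejected:", err)

print("accepted:", validate(strip_matching_quotes(received)))
```

Whether such tolerance belongs in the operator at all, or whether the chart should simply stop emitting the quotes, is a separate design question.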

@mritunjaysharma394
Contributor

Created a fix, #1222, and it now works fine with a local helm install:

kubectl logs -n fluent pod/fluent-operator-7df5b4d96b-scphc
Defaulted container "fluent-operator" out of: fluent-operator, setenv (init)
2024-06-28T15:45:05Z    INFO    controller-runtime.metrics      Metrics server is starting to listen    {"addr": ":8080"}
2024-06-28T15:45:05Z    INFO    setup   starting manager
2024-06-28T15:45:05Z    INFO    Starting server {"path": "/metrics", "kind": "metrics", "addr": "[::]:8080"}
2024-06-28T15:45:05Z    INFO    Starting server {"kind": "health probe", "addr": "[::]:8081"}
2024-06-28T15:45:05Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.FluentBit"}
2024-06-28T15:45:05Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1.ServiceAccount"}
2024-06-28T15:45:05Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1.DaemonSet"}
2024-06-28T15:45:05Z    INFO    Starting Controller     {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit"}
2024-06-28T15:45:05Z    INFO    Starting EventSource    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "source": "kind source: *v1alpha2.FluentBit"}
2024-06-28T15:45:05Z    INFO    Starting EventSource    {"controller": "fluentd", "controllerGroup": "fluentd.fluent.io", "controllerKind": "Fluentd", "source": "kind source: *v1alpha1.Fluentd"}

@SvenThies
Contributor

IMHO fixing this in the code base doesn't make any sense. The chart from the repo (main and tag v2.9.0) works just fine. We need to fix the release.

@cw-Guo
Collaborator

cw-Guo commented Jun 28, 2024

I do think this is a release issue, but I am not familiar with the release process. @benjaminhuo, can you please take a look at the Helm release for v2.9? Thanks!

@SvenThies
Contributor

I think this issue needs to be prioritised. Is there a process for that in this repo? Effectively, v2.9.0 has not been released yet, as it does not differ from v2.8.0:
Screenshot 2024-07-01 at 20 18 00

@benjaminhuo
Member

IMHO fixing this in the code base doesn't make any sense. The chart from the repo (main and tag v2.9.0) works just fine. We need to fix the release.

@wenchajun You'll need to sync the fluent operator chart to https://github.com/fluent/helm-charts/tree/main/charts/fluent-operator without any modification
