systemd: start netdata after network is online #17906

Merged
merged 1 commit into from
Jun 17, 2024

k0ste
Contributor

@k0ste k0ste commented Jun 15, 2024

Added network-online.target conditions, because the Netdata web server is expected to be available on two interfaces:

[web]
  bind to = 127.0.0.1 192.168.250.1
  allow connections from = 127.0.0.1 localhost 192.168.250.1

Netdata simply skips any IP address that is not available yet:

Jun 15 19:42:03 netdata.example.com netdata[496]: LISTENER: IPv4 bind() on ip '192.168.250.1' port 19999, socktype 1 failed.
Jun 15 19:42:03 netdata.example.com netdata[496]: LISTENER: Cannot bind to ip '192.168.250.1', port 19999

nss-lookup.target covers DNS availability. This matters for the go.d/ping module when a host is given as a hostname.


Before

netdata.service +295ms
└─network.target @3.642s
  └─frr.service @2.691s +950ms
    └─network-pre.target @2.673s
      └─iptables.service @2.535s +134ms
        └─ipset.service @2.420s +101ms
          └─basic.target @2.396s
            └─dbus-broker.service @2.372s +19ms
              └─dbus.socket @2.342s
                └─sysinit.target @2.339s
                  └─systemd-update-utmp.service @2.312s +26ms
                    └─systemd-tmpfiles-setup.service @2.202s +96ms
                      └─local-fs.target @2.186s
                        └─boot.mount @1.925s +259ms
                          └─systemd-fsck@dev-disk-by\x2duuid-f0151639\x2d7418\x2d4ac6\x2db1c6\x2d68d378c40519.service @1.755s +156ms
                            └─dev-disk-by\x2duuid-f0151639\x2d7418\x2d4ac6\x2db1c6\x2d68d378c40519.device @1.734s

After

netdata.service +293ms
└─network-online.target @7.576s
  └─systemd-networkd-wait-online.service @3.716s +3.859s
    └─systemd-networkd.service @3.306s +269ms
      └─network-pre.target @3.280s
        └─iptables.service @3.112s +166ms
          └─ipset.service @2.966s +133ms
            └─basic.target @2.942s
              └─dbus-broker.service @2.919s +19ms
                └─dbus.socket @2.895s
                  └─sysinit.target @2.892s
                    └─systemd-update-done.service @2.872s +19ms
                      └─systemd-journal-catalog-update.service @2.802s +49ms
                        └─systemd-tmpfiles-setup.service @2.660s +128ms
                          └─local-fs.target @2.635s
                            └─boot.mount @2.249s +385ms
                              └─systemd-fsck@dev-disk-by\x2duuid-f0151639\x2d7418\x2d4ac6\x2db1c6\x2d68d378c40519.service @2.115s +113ms
                                └─dev-disk-by\x2duuid-f0151639\x2d7418\x2d4ac6\x2db1c6\x2d68d378c40519.device @2.093s

P.S.: Wants= is a weak (not hard) dependency.
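
For anyone who wants to try the same ordering on an installed system, a drop-in along these lines should be close. This is a minimal sketch and not the PR's actual diff: the drop-in path and file name are hypothetical, and the directives are inferred from the description above (ordering after network-online.target and nss-lookup.target, with Wants= as the weak dependency).

```ini
# /etc/systemd/system/netdata.service.d/wait-online.conf  (hypothetical path)
[Unit]
# Start netdata only after the network is reported online and
# name lookups are expected to work.
After=network-online.target nss-lookup.target
# Wants= pulls network-online.target into the boot transaction, but
# netdata still starts if that target fails to activate.
Wants=network-online.target
```

After a `systemctl daemon-reload`, `systemd-analyze critical-chain netdata.service` prints a chain in the same format as the Before/After listings above.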

Added network-online.target conditions, because Netdata can currently start before all the IP addresses it binds to are available

```
Jun 15 19:42:03 netdata.example.com netdata[496]: LISTENER: IPv4 bind() on ip '192.168.250.1' port 19999, socktype 1 failed.
Jun 15 19:42:03 netdata.example.com netdata[496]: LISTENER: Cannot bind to ip '192.168.250.1', port 19999
```

P.S.: `wants` is a weak (not hard) dependency
@k0ste k0ste requested a review from a team as a code owner June 15, 2024 17:40
@github-actions github-actions bot added the area/packaging Packaging and operating systems support label Jun 15, 2024
Member

@Ferroin Ferroin left a comment

I would argue that in this case the correct configuration is to bind to 0.0.0.0 and rely on Netdata’s access control mechanisms instead of binding to specific IP addresses, but I think this is probably still a reasonable change since it’s going to be more likely that users want the network active when Netdata starts up than that they want Netdata started as early as possible on startup.
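
The alternative suggested in this review could look roughly like the following netdata.conf fragment. It is a sketch under assumptions: the `192.168.250.*` pattern is a hypothetical client subnet chosen for illustration, since `allow connections from` matches the addresses of connecting clients rather than the local interface addresses.

```ini
[web]
  # Listen on all IPv4 addresses, so startup does not depend on a
  # specific address already being configured on an interface.
  bind to = 0.0.0.0
  # Restrict access by client address instead of by bind address.
  # 192.168.250.* is an assumed subnet, not taken from the PR.
  allow connections from = 127.0.0.1 localhost 192.168.250.*
```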

@Ferroin Ferroin merged commit de9acbb into netdata:master Jun 17, 2024
147 checks passed
@k0ste
Contributor Author

k0ste commented Jun 17, 2024

> I would argue that in this case the correct configuration is to bind to 0.0.0.0 and rely on Netdata’s access control mechanisms instead of binding to specific IP addresses, but I think this is probably still a reasonable change since it’s going to be more likely that users want the network active when Netdata starts up than that they want Netdata started as early as possible on startup.

I have not found a way to restrict access to the Netdata dashboard to 127.0.0.1 and the Prometheus endpoint to 192.168.250.1.
It seems the Prometheus endpoint is the same as the dashboard.

@vkalintiris
Collaborator

@Ferroin So an observability, debugging tool/service should be started after everything works correctly?

@Ferroin
Member

Ferroin commented Jun 18, 2024

@vkalintiris We fail to work correctly right now if the agent starts up before the network is online. This should of course be fixed in the C code, but unless and until that happens, delaying startup until we can work correctly is a perfectly reasonable workaround.
