Netbox HA #7065

vladsker · 2021-08-30T15:24:55Z


Hello, folks,

I'm relying on Netbox for my automation, and different scripts and robots are using it more and more frequently.

So the natural question came up: how are you managing highly available installations? There doesn't seem to be much information on this.

Thanks for creating such a great product!

Any advice is much appreciated!

Kind regards,
Vlad

Answered by candlerb

candlerb · 2021-09-04T17:10:46Z


80% of making Netbox HA is making Postgres HA. Once you've done that, you can have multiple frontends all pointing to the same Postgres database or cluster, stick them behind a load-balancer, and they'll happily share the load or take over from each other.

10% is making Redis HA - since multiple netbox and netbox-rq workers will need to point to the same Redis instance.
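One way to avoid depending on a single Redis instance is Redis Sentinel. Here is a minimal sketch of the `REDIS` section of configuration.py, assuming your Netbox release supports the `SENTINELS` and `SENTINEL_SERVICE` keys. The hostnames and the service name `netbox` are placeholders, and the helper `sentinel_redis` is only there to avoid repetition, so check the key names against the documentation for your version.

```python
# Hypothetical Sentinel endpoints; replace with your own hosts.
SENTINELS = [
    ("sentinel1.example.com", 26379),
    ("sentinel2.example.com", 26379),
    ("sentinel3.example.com", 26379),
]


def sentinel_redis(database):
    """Build one REDIS sub-section that finds the current master via Sentinel."""
    return {
        "SENTINELS": SENTINELS,
        "SENTINEL_SERVICE": "netbox",  # assumed master name configured in Sentinel
        "PASSWORD": "",
        "DATABASE": database,
        "SSL": False,
    }


REDIS = {
    "tasks": sentinel_redis(0),    # background jobs and webhooks (netbox-rq)
    "caching": sentinel_redis(1),  # cache data, kept in a separate Redis database
}
```

Every netbox and netbox-rq host would carry the same block, so they all follow the same master after a failover.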

The remaining 10% is sharing the media directory (/opt/netbox/netbox/media), and the reports and scripts directories. The media directory can be shared using NFS (in which case you will need an HA NFS server), or you can configure Netbox to store its media on S3 instead. The reports and scripts directories can be periodically rsync'd from a master location, or pushed out using your configuration management system.
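As a rough illustration of the shared pieces, every frontend's configuration.py could contain something like the sketch below. It assumes Netbox 3.x style settings (`STORAGE_BACKEND` and `STORAGE_CONFIG`, with django-storages and boto3 installed; newer releases configure storage differently). The hostname, bucket, region, and the `NETBOX_DB_PASSWORD` variable name are all placeholders.

```python
import os

# Every frontend uses the same database endpoint (a VIP or cluster address).
DATABASE = {
    "NAME": "netbox",
    "USER": "netbox",
    "PASSWORD": os.environ.get("NETBOX_DB_PASSWORD", ""),  # hypothetical variable name
    "HOST": "pg-cluster.example.com",  # placeholder for the HA Postgres endpoint
    "PORT": "",
    "CONN_MAX_AGE": 300,
}

# Store uploaded media in S3 instead of a local or NFS-mounted directory.
STORAGE_BACKEND = "storages.backends.s3boto3.S3Boto3Storage"
STORAGE_CONFIG = {
    "AWS_ACCESS_KEY_ID": os.environ.get("AWS_ACCESS_KEY_ID", ""),
    "AWS_SECRET_ACCESS_KEY": os.environ.get("AWS_SECRET_ACCESS_KEY", ""),
    "AWS_STORAGE_BUCKET_NAME": "netbox-media",  # placeholder bucket
    "AWS_S3_REGION_NAME": "eu-west-1",          # placeholder region
}
```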

If you just want disaster recovery rather than HA, then it's somewhat simpler. Configure your master postgres server to replicate to a backup postgres server in another data centre, and sync the media/reports/scripts directories. When you need to cut over in a disaster, promote the replica database to master. You may lose any in-flight background tasks or webhook events in Redis; if that's important you could configure Redis leader-follower replication too. However, to be honest, if the whole primary data centre goes down, and you're only doing asynchronous replication between sites, you may lose the most recently-written data anyway.
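The promotion step could be scripted along these lines. This is only a sketch: it assumes PostgreSQL 12 or later (where `pg_promote()` exists), a role that is allowed to call it, and a DB-API connection to the replica such as one from psycopg2. The function name `promote_if_replica` is made up for illustration.

```python
def promote_if_replica(conn):
    """Promote the server behind `conn` if it is currently a standby.

    `conn` is any DB-API connection to the replica (for example psycopg2).
    Returns True if the server was a standby and promotion completed,
    False if it was already a primary or promotion did not finish in time.
    """
    cur = conn.cursor()
    cur.execute("SELECT pg_is_in_recovery()")
    (in_recovery,) = cur.fetchone()
    if not in_recovery:
        return False
    # pg_promote() waits (60 seconds by default) and returns true once
    # the standby has finished promotion.
    cur.execute("SELECT pg_promote()")
    (promoted,) = cur.fetchone()
    conn.commit()
    return bool(promoted)
```

Checking `pg_is_in_recovery()` first makes the script safe to re-run after the cutover has already happened.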

Alternatively, you could outsource this, and use one of the cloud-managed Postgres services (e.g. AWS RDS) and Redis services (e.g. AWS MemoryDB). Soon you'll be able to outsource your entire Netbox instance to NS1 :-)

2 replies

mayuresh82 Aug 29, 2024

This is factually incorrect. You can totally have multiple Netbox instances fronted by an LB, pointing to a single instance of Postgres AND Redis, and it will work just fine. Of course you need a beefy Postgres and Redis machine. You don't need to share any media directories or anything. Each Netbox instance acts as an independent standalone install, with its own WSGI workers.

candlerb Aug 29, 2024

Having multiple Netbox front-ends behind an LB is fine, but does not give you "high availability" in any useful sense of the term unless Postgres and Redis are themselves HA.

You certainly do need to share media/script/report directories, unless you are pulling those from S3 or git.
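For the rsync route, a minimal sketch of pushing those directories from one host to a peer might look like this. The media path comes from this thread; the reports and scripts paths are the usual defaults under /opt/netbox and are an assumption here, as are the function names and the peer hostname in the usage note.

```python
import subprocess

# Assumed default locations; adjust if MEDIA_ROOT, REPORTS_ROOT or
# SCRIPTS_ROOT are overridden in configuration.py.
SHARED_DIRS = [
    "/opt/netbox/netbox/media",
    "/opt/netbox/netbox/reports",
    "/opt/netbox/netbox/scripts",
]


def rsync_commands(peer, dirs=SHARED_DIRS):
    """Return one rsync command per directory, pushing local content to `peer`."""
    # Trailing slashes copy directory contents; --delete removes files on
    # the peer that no longer exist locally.
    return [["rsync", "-a", "--delete", f"{d}/", f"{peer}:{d}/"] for d in dirs]


def push_shared_dirs(peer, run=subprocess.run):
    """Run the rsync commands, raising CalledProcessError on the first failure."""
    for cmd in rsync_commands(peer):
        run(cmd, check=True)
```

Calling `push_shared_dirs("netbox2.example.com")` from cron on the master location would keep a second frontend in step; note that `--delete` makes the sync one-way, so uploads made on the peer would be removed.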

robertlynch3 · 2022-05-18T16:42:17Z


I was looking through this and I don't mind an active/backup setup, so I am really only looking to get Postgres in HA mode. That being said, how does this work when upgrading the two instances? I am sure instance one will upgrade with no problem, but what about instance two?

2 replies

candlerb May 18, 2022

As long as you don't upgrade both instances at once, you should be fine. The second one will notice that the database migrations have already been applied, and not apply them again.

Alternatively, you could modify the ./upgrade.sh script on the second host to comment out those steps which touch the database.

If the upgrade is going to take a while, then make sure you shut down the second instance before you start - because you don't want to be running an old version of Netbox which is talking to a database which has been migrated to a newer schema.
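Before starting an instance again after its code has been upgraded, you could confirm that the database has no pending migrations for that code version. This is a sketch, assuming the default install paths under /opt/netbox and a Django version that supports `migrate --check`; the function name is made up. It only catches code that is ahead of the schema, and it cannot detect old code running against a newer schema, which is why the old instance should be stopped first.

```python
import subprocess

MANAGE_PY = "/opt/netbox/netbox/manage.py"  # assumed default install path
PYTHON = "/opt/netbox/venv/bin/python"      # assumed virtualenv interpreter


def schema_matches_code(run=subprocess.run):
    """Return True when the database has no unapplied migrations for this code.

    `manage.py migrate --check` exits non-zero if migrations are pending
    and does not apply anything itself.
    """
    result = run([PYTHON, MANAGE_PY, "migrate", "--check"])
    return result.returncode == 0
```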

robertlynch3 May 18, 2022

Awesome, thanks for this info!

robertlynch3 · 2022-05-23T17:21:22Z


For any future readers: if you configure Netbox with Postgres in a hot/standby config, you need to add the following config so that the backup system will allow user sign-in (credits to #3118 and #3196).

In netbox/netbox/netbox/configuration.py:

    MAINTENANCE_MODE = True  # this allows the server to connect to the local read-only db
    SESSION_FILE_PATH = "/opt/netbox/netbox/sessionFile"  # this will have netbox write sessions to local storage instead of the database

You will need to run `mkdir /opt/netbox/netbox/sessionFile`, and the directory needs to be owned by the netbox user.

If you have LDAP configured, you will also need to add the following in netbox/netbox/netbox/ldap_config.py:

    AUTH_LDAP_ALWAYS_UPDATE_USER = False  # stops ldap from updating the database, needed for read-only databases
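The directory step can also be scripted. A minimal sketch, assuming the service account and its group are both called `netbox` and that the script runs as root when an owner is given; `ensure_session_dir` is a made-up helper name.

```python
import os
import shutil


def ensure_session_dir(path, owner=None):
    """Create the file-based session directory and optionally set its owner."""
    os.makedirs(path, mode=0o700, exist_ok=True)
    if owner is not None:
        # Changing ownership normally requires root.
        shutil.chown(path, user=owner, group=owner)
    return path


# On the standby host this would be called (as root) with the path from
# the post above:
#   ensure_session_dir("/opt/netbox/netbox/sessionFile", owner="netbox")
```

Because `exist_ok=True` makes the call idempotent, it can be re-run after each upgrade to make sure the directory is still in place.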

4 replies

chicks-net May 23, 2022

Is this config only for the backup host and not the primary host? If so, does anything special need to be configured on the primary host?

If /opt/netbox/netbox/sessionFile is a directory it might be less confusing to name it /opt/netbox/netbox/sessions or something else that leaves the File out of a directory name.

candlerb May 23, 2022

> Is this config only for the backup host and not the primary host?

He is describing a situation where:

- There is postgres-level replication from a primary database to a secondary database (and therefore the secondary database is read-only)
- The primary host points at the primary database
- The secondary host points at the secondary database
- You want to be able to login to the secondary host (e.g. for testing DR)

In that situation, you make those special config changes to the secondary host only. In the event of a disaster, you'd promote the secondary database to primary, and turn off maintenance mode.

I'd suggest a more conventional approach would be to point the secondary Netbox host at the primary database. In this case, active/active works fine. In the event of a disaster, you'd repoint the secondary host to the secondary database, and promote the secondary database to primary.
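One way to make that repointing a small change is to read the database host from the environment in configuration.py. This is a sketch only: the hostnames and the `NETBOX_DB_HOST` and `NETBOX_DB_PASSWORD` variable names are invented for illustration.

```python
import os

# Hypothetical hostnames for the two database servers.
PRIMARY_DB = "pg-primary.example.com"
SECONDARY_DB = "pg-secondary.example.com"


def database_settings(env=os.environ):
    """Build the DATABASE dict, defaulting to the primary database.

    Setting NETBOX_DB_HOST (a made-up variable name) to SECONDARY_DB and
    restarting Netbox repoints this host after the replica is promoted.
    """
    return {
        "NAME": "netbox",
        "USER": "netbox",
        "PASSWORD": env.get("NETBOX_DB_PASSWORD", ""),
        "HOST": env.get("NETBOX_DB_HOST", PRIMARY_DB),
        "PORT": "",
        "CONN_MAX_AGE": 300,
    }


DATABASE = database_settings()
```

With this in place both hosts run active/active against the primary database day to day, and only the environment of the surviving host changes during a disaster.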

robertlynch3 May 23, 2022

> If /opt/netbox/netbox/sessionFile is a directory it might be less confusing to name it /opt/netbox/netbox/sessions or something else that leaves the File out of a directory name.

Yes, you can call it whatever you'd like. You need to make sure that it is copied between updates or it will complain.

> He is describing a situation where:
>
> - There is postgres-level replication from a primary database to a secondary database (and therefore the secondary database is read-only)
> - The primary host points at the primary database
> - The secondary host points at the secondary database
> - You want to be able to login to the secondary host (e.g. for testing DR)

Sorry I didn't make this clearer before. My use case is a failure at the main site (where Netbox is located), where we need Netbox up to bring that site back online. In my case, we are only going to use the backup image for that one situation; all day-to-day interactions with Netbox will be with the primary instance.

I didn't want to overcomplicate it as my use case is strictly for disaster recovery, but mainly that critical time to bring the main core infrastructure up.

alihamidzadeh Jun 26, 2024

> For any future readers. If you configure Netbox with Postgres in a hot/standby config, you need to add the following config so that the backup system will allow user sign in. (credits to #3118 and #3196)
>
> In netbox/netbox/netbox/configuration.py
>
>     MAINTENANCE_MODE=True # this allows the server to connect to the local read-only db
>     SESSION_FILE_PATH="/opt/netbox/netbox/sessionFile" # this will have netbox write sessions to local storage instead of the database. you will need to run `mkdir /opt/netbox/netbox/sessionFile` which needs to be owned by the netbox user
>
> In netbox/netbox/netbox/ldap_config.py
> And if you have LDAP configured, you will need to add
>
>     AUTH_LDAP_ALWAYS_UPDATE_USER = False #stops ldap from updating the database, needed for readonly databases

Hi there, does this method work for v3.2.7?
I set this up for our HA, but it doesn't work and still says "cannot execute UPDATE in a read-only transaction"!

@jeremystretch @robertlynch3
