-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Postgres schema updates #3860
base: master
Are you sure you want to change the base?
Postgres schema updates #3860
Conversation
- Conversion of UNIQUE KEY constraints to PRIMARY KEY - in pg a PRIMARY KEY is a UNIQUE KEY where all columns are non-null, so this patch makes it more explicit. - Removal of useless indexes - a btree index of (A,B,C) implies fast indexing of (A,B) and (A) but directly specifying these indexes causes much IO overhead on record modification - Allow more than 2b items in spool/pubsub by switching SERIAL to BIGSERIAL - Add some TODOs where it looks like columns should be marked NOT NULL but are not actually. As I'm not particularly familiar with the ejabberd I don't know if this should change or not.
This reduces 3 IO's per update to 1 IO per update in a typical case.
Hi @mzealey, many thanks for your contribution! In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes. |
Do tell more. 😉 LE: also, why remove so many indexes? |
You did it @mzealey! Thank you for signing the ProcessOne Contribution License Agreement. We will have a look at your contribution! |
The key ones are
Sorry only saw this now - see commit message for 22d49ad about the rationale behind this; let me know if you need more detail? |
The most obviously beneficial part of this PR is the elimination of redundant indexes. Those changes are applicable to ALL schemas - old and new, and for all DB types. Commit 06ffe99 in PR #3980 addressed those. It could be argued either way which of UNIQUE INDEX and PRIMARY KEY constraints is more explicit/logical/understandable, but again any change should be made across all schemas for consistency. I'd argue that there are limited benefits and sticking with the status-quo is preferred here. Any change would also have knock-on effects on the schema upgrade scripts and ongoing maintenance for existing installations. Changing columns from SERIAL to BIGSERIAL is another reasonable suggestion (and again needs to apply to both old and 'new' schemas). Commit 4f0e426 in PR #3980 addressed this. I'm not convinced the fillfactor performance changes are useful to include in a general purpose schema - premature optimization can be less than helpful, and I feel these might be workload dependent. |
Looks reasonable thanks. I don't think the |
To follow |
@mzealey: Can you look for pg.new.sql too? There is a second PR too: |
Please see the individual commit messages. I have not attempted to apply this to the 'new' schema and there are a couple of TODOs where I think something should be NOT NULL but I'm not 100% sure.
Combined with some server-side postgres performance optimizations this schema is supporting a cluster containing millions of simultaneous active users on relatively small hardware.