Postgres schema updates #3860

mzealey · 2022-07-08T09:23:17Z

Please see the individual commit messages. I have not attempted to apply this to the 'new' schema and there are a couple of TODOs where I think something should be NOT NULL but I'm not 100% sure.

Combined with some server-side postgres performance optimizations this schema is supporting a cluster containing millions of simultaneous active users on relatively small hardware.

- Conversion of UNIQUE KEY constraints to PRIMARY KEY: in pg a PRIMARY KEY is a UNIQUE KEY where all columns are non-null, so this patch makes it more explicit.
- Removal of useless indexes: a btree index on (A,B,C) already gives fast lookups on (A,B) and (A), and declaring those indexes separately causes a lot of extra IO on every record modification.
- Allow more than 2 billion items in spool/pubsub by switching SERIAL to BIGSERIAL.
- Add some TODOs where it looks like columns should be marked NOT NULL but are not. As I'm not particularly familiar with ejabberd, I don't know whether this should change.
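As a rough illustration of the first three points, here is a minimal sketch on a made-up table. `example_items`, its columns, and the index names are hypothetical and are not taken from the ejabberd schema; the actual changes are in the PR's commits.

```sql
-- Hypothetical table for illustration only (not part of the ejabberd schema).
CREATE TABLE example_items (
    username text NOT NULL,
    server   text NOT NULL,
    item     text NOT NULL,
    -- BIGSERIAL is a 64-bit sequence-backed column; SERIAL is 32-bit and
    -- tops out at 2147483647.
    seq      BIGSERIAL,
    payload  text,
    -- Previously this would have been a separate statement such as:
    --   CREATE UNIQUE INDEX i_example_items_key
    --       ON example_items USING btree (username, server, item);
    PRIMARY KEY (username, server, item)
);

-- Redundant: the primary key's btree already serves lookups on the leading
-- prefixes (username) and (username, server), so these would only add
-- write IO on every INSERT/UPDATE/DELETE.
-- CREATE INDEX i_example_items_user        ON example_items (username);
-- CREATE INDEX i_example_items_user_server ON example_items (username, server);

-- Not redundant: item is not a leading column of the primary key, so a
-- lookup by item alone cannot use that btree efficiently.
CREATE INDEX i_example_items_item ON example_items USING btree (item);
```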

Enabling HOT updates on the tables that meet the criteria for them reduces 3 IOs per update to 1 IO per update in a typical case.

p1bot · 2022-07-08T09:23:18Z

Hi @mzealey, many thanks for your contribution!

In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes.

licaon-kter · 2022-07-08T09:25:54Z

@mzealey

> some server-side postgres performance optimizations

Do tell more. 😉

LE: also, why remove so many indexes?

p1bot · 2022-07-08T09:27:53Z

You did it @mzealey!

Thank you for signing the ProcessOne Contribution License Agreement.

We will have a look at your contribution!

coveralls · 2022-07-08T09:50:16Z

Coverage decreased (-0.02%) to 33.579% when pulling 02eab53 on binuadmin:mz/pg-schema-updates into 99d9e31 on processone:master.

mzealey · 2022-07-19T09:55:55Z

> @mzealey
>
> > some server-side postgres performance optimizations
>
> Do tell more. 😉

The key ones are commit_delay=10000 and synchronous_commit=off which accept a little bit of potential data loss for a massive reduction in latency/iops.
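For readers who want to try these two settings, a minimal sketch follows. It assumes superuser access and that applying them cluster-wide is acceptable, which the comment above does not spell out. Two caveats from the PostgreSQL documentation as I understand it: with `synchronous_commit` off, a crash can lose the most recent commits (without corrupting the database), and `commit_delay` is ignored for asynchronous commits, so it only affects transactions that still commit synchronously.

```sql
-- Trade a small window of possible lost commits for lower commit latency.
ALTER SYSTEM SET synchronous_commit = 'off';

-- commit_delay is expressed in microseconds: 10000 = 10 ms.
ALTER SYSTEM SET commit_delay = 10000;

-- Both settings take effect on reload; no restart is required.
SELECT pg_reload_conf();

-- Check the values from a new session.
SHOW synchronous_commit;
SHOW commit_delay;
```

If only some workloads can tolerate the data-loss window, `SET LOCAL synchronous_commit = off;` inside a transaction limits the setting to that transaction.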

> LE: also, why remove so many indexes?

Sorry only saw this now - see commit message for 22d49ad about the rationale behind this; let me know if you need more detail?

nosnilmot · 2023-01-20T22:07:34Z

The most obviously beneficial part of this PR is the elimination of redundant indexes. Those changes are applicable to ALL schemas - old and new, and for all DB types. Commit 06ffe99 in PR #3980 addressed those.

It could be argued either way which of UNIQUE INDEX and PRIMARY KEY constraints is more explicit/logical/understandable, but again any change should be made across all schemas for consistency. I'd argue that there are limited benefits and sticking with the status-quo is preferred here. Any change would also have knock-on effects on the schema upgrade scripts and ongoing maintenance for existing installations.
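To make the upgrade-script concern concrete, here is a minimal sketch of what promoting an existing unique index to a primary key could look like on a running installation. The table and index names are hypothetical, and this is not the migration used by ejabberd.

```sql
-- State of a hypothetical existing installation: a unique index, no primary key.
CREATE TABLE example_items (
    username text NOT NULL,
    item     text NOT NULL,
    payload  text
);
CREATE UNIQUE INDEX i_example_items_key
    ON example_items USING btree (username, item);

-- Upgrade step: reuse the existing index as the primary key instead of
-- building a second one. PostgreSQL renames the index to the constraint name.
-- If a key column were not already NOT NULL, this would also mark it NOT NULL,
-- which needs a full table scan and fails if any NULLs exist.
BEGIN;
ALTER TABLE example_items
    ADD CONSTRAINT example_items_pkey
    PRIMARY KEY USING INDEX i_example_items_key;
COMMIT;
```

Every supported database type would need its own equivalent of this step, which is part of the maintenance cost mentioned above.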

Changing columns from SERIAL to BIGSERIAL is another reasonable suggestion (and again needs to apply to both old and 'new' schemas). Commit 4f0e426 in PR #3980 addressed this.
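For an existing installation, the SERIAL to BIGSERIAL change amounts to widening both the column and its backing sequence. The sketch below uses a hypothetical table (`example_spool`) and is not the upgrade script from PR #3980.

```sql
-- Hypothetical table created with a 32-bit SERIAL column.
CREATE TABLE example_spool (
    seq      SERIAL,
    username text NOT NULL,
    xml      text NOT NULL
);

-- Widen the column to 64 bits. This rewrites the table and holds an
-- ACCESS EXCLUSIVE lock while it runs, so it can be slow on large tables.
ALTER TABLE example_spool ALTER COLUMN seq TYPE bigint;

-- On PostgreSQL 10 and later, a SERIAL column's sequence is itself created
-- AS integer, so it must be widened too or it will still stop at 2147483647.
-- The default sequence name is <table>_<column>_seq.
ALTER SEQUENCE example_spool_seq_seq AS bigint;
```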

I'm not convinced the fillfactor performance changes are useful to include in a general purpose schema - premature optimization can be less than helpful, and I feel these might be workload dependent.

mzealey · 2023-01-21T10:39:19Z

Looks reasonable, thanks. I don't think the fillfactor change is particularly bad: Pg already sets it (I believe to 90%) on btree indexes, so setting it to 90% on a table with a high volume of UPDATEs just allows much less disk IO (via the HOT mechanism) at a slight cost in disk space. I don't see how it's ever going to be detrimental.
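A minimal sketch of the fillfactor setting being discussed, and one way to check whether updates actually take the HOT path. The table name and columns are hypothetical. Tables default to a fillfactor of 100 and btree indexes to 90.

```sql
-- Hypothetical update-heavy table: leave 10% of each heap page free so an
-- updated row version can be written to the same page.
CREATE TABLE example_last (
    username text PRIMARY KEY,
    seconds  text NOT NULL,
    state    text NOT NULL
) WITH (fillfactor = 90);

-- For a table that already exists. This only affects pages written from now
-- on; existing pages are repacked only when the table is rewritten
-- (for example by VACUUM FULL or CLUSTER).
ALTER TABLE example_last SET (fillfactor = 90);

INSERT INTO example_last VALUES ('alice', '1657272197', '');

-- No indexed column changes here, so the update is eligible for HOT as long
-- as the page has free space.
UPDATE example_last SET seconds = '1657272198' WHERE username = 'alice';

-- Compare total updates with HOT updates to see whether the setting helps
-- for a given workload (statistics are reported with a short delay).
SELECT relname, n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'example_last';
```

A low ratio of `n_tup_hot_upd` to `n_tup_upd` on a table would suggest that updates touch indexed columns or that pages are too full, which is one way to test the workload-dependence concern raised earlier.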

Neustradamus · 2023-01-30T20:34:29Z

To follow

Neustradamus · 2023-11-04T12:39:44Z

@mzealey: Can you look at pg.new.sql too?

There is a second PR too:

Convert indexes to primary keys #4113

mzealey added 2 commits July 8, 2022 10:15

Enable HOT updates on tables which meet criteria for them

02eab53

This reduces 3 IO's per update to 1 IO per update in a typical case.

p1bot added the cla-missing Contributor needs to sign Contribution License Agreement label Jul 8, 2022

p1bot removed the cla-missing Contributor needs to sign Contribution License Agreement label Jul 8, 2022

mremond self-assigned this Jul 8, 2022

nosnilmot mentioned this pull request Jan 20, 2023

SQL related fixes and updates #3980

Merged

mremond added this to the ejabberd 23.xx milestone Jul 20, 2023

badlop modified the milestones: ejabberd 23.10, ejabberd 23.xx Oct 17, 2023

mremond assigned jsautret and unassigned mremond Oct 26, 2023

badlop modified the milestones: ejabberd 24.xx, Parking Lot Jun 18, 2024
