Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Copy-on-Write forking (logic copied from Puma) #1160

Open
wants to merge 40 commits into
base: master
Choose a base branch
from

Conversation

johnnyshields
Copy link

@johnnyshields johnnyshields commented Jan 9, 2022

This PR further improves #1138 by adding copy-on-write forking which forks from the first worker child to the other children. After a configurable timeout (e.g. 1 hour), it will re-fork its children from the first child.

The reason is to reduce memory usage and our AWS bill at TableCheck :)

Much of the code is copied from Puma 5, with proper attribution. (Puma is MIT licensed.)

TODOS:

  • Fix phased restart (seems to be doing multiple stop/starts of worker1)
  • Fix TTIN/TTOU --> not being detected?
  • Determine why code is hot-reloading. Is this just in development mode?
  • Fix process names (showing as "ruby")
  • Write pidfile
  • Test single mode + fix bugs
  • Test cluster mode + fix bugs
  • "pooled" mode not yet supported. I am thinking to do N-reforks round robin, where the base worker of each pool is forked from once. This will ensure that all code loaded is compacted.
  • "pooled" mode TTIN and TTOU signals for worker scaling
  • Hooks for re-fork, shutdown, etc
  • Support the restart, stop, etc. script commands.
  • Cleanup logger (remove Loggable mixin)

Reason: The `script/delayed_job` process (described above) uses the
[Daemons](https://github.com/thuehlinger/daemons) gem which has several
undesirable behaviors when running inside a container (Docker, etc.):
- The parent process spawns a background child process then exits immediately,
which causes the container to shutdown.
- The worker processes are detached from the parent, which prevents logging to `STDOUT`.

The `--fork` option solves this by running workers in a foreground process tree.
When using `--fork` the `daemons` gem is not required.

The command script requires changing the following line to enable this:

# always daemonizes, never forks
Delayed::Command.new(ARGV).daemonize

must now be:

# daemonize by default unless --fork specified
Delayed::Command.new(ARGV).launch

In addition, there are several quality of life improvements added to the Command script:
- Command: Add --pools arg as an alias for --pool, and allow pipe-delimited pool definitions
- Command: Add --num-worker arg as an alias for -n / --number-of-workers (and deprecate --number-of-workers)
- Command: Pool parsing code has been extracted to PoolParser class
- Command: Add -v / --verbose switch which sets quiet=false on workers
- Command: Add -x switch as alias for --exit-on-complete
- Command: Add STDERR warning num-workers less than 1
- Command: Add STDERR warning if num-workers and pools both specified (pools overrides num-workers as per existing behavior)
- Command: Add STDERR warning if queues and pools both specified (pools overrides num-workers as per existing behavior)

The Rake script has also been enhanced:
- Rake: Uses Forking launcher under the hood
- Rake: Add support for NUM_WORKERS and POOLS args
- Add validation of min vs. max priority
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant