Action runner output streaming single collection approach #3729

Kami · 2017-09-11T09:20:41Z

This pull request builds on top of #3657 and replaces the two-collection approach with a single "generic" collection approach - 03c37d4.

For some reason the actual diff looks a bit messed up, but the change is located in 03c37d4.

It addresses the feedback people had and makes the whole approach a bit more generic so it can work with other / future runners which produce just a single output stream.

The main difference is that this approach utilizes a single "action_output_d_b" collection with the following fields:

execution_id
action_ref
runner_ref
timestamp
type
data

Most fields are self-explanatory, but two fields of interest are "data" and "type".

"data" contains the actual partial execution output. Depending on the runner, this could be a chunk, a line or similar. For all the existing runners for which we implement streaming right now, this is a line. I was thinking about a different name (e.g. chunk), but in the end I settled on data.

"type" contains the type of the output. Right now that is either stdout or stderr, but for runners with just a single output stream, this could simply be output or similar. I'm open to suggestions for a better name for this field.

I was thinking about "stream" / "stream_type" / "output_type", but in the end I settled on "type". If we can't come up with anything better, I will probably change it to "output_type" so it doesn't clash with the Python built-in type function.
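For illustration only, the document shape described above can be sketched as a plain Python dataclass. This is not the mongoengine model from the PR: the class name `OutputChunk`, the helper `utc_now` and the sample values are assumptions, and the field is called `output_type` here to sidestep the built-in clash mentioned above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


@dataclass
class OutputChunk:
    """Hypothetical stand-in for one document in the output collection."""

    execution_id: str
    action_ref: str
    runner_ref: str
    data: str  # one partial piece of output, e.g. a single line
    output_type: str = "output"  # "stdout", "stderr", or a single generic stream
    timestamp: datetime = field(default_factory=utc_now)


# Illustrative values only; a two-stream runner and a single-stream runner.
chunks = [
    OutputChunk("exec-1", "core.local", "local-shell-cmd", "hello\n", "stdout"),
    OutputChunk("exec-1", "core.local", "local-shell-cmd", "warning\n", "stderr"),
    OutputChunk("exec-2", "examples.single", "hypothetical-runner", "line\n"),
]
```

A runner with two streams tags each piece as stdout or stderr, while a single-stream runner can rely on the default value.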

For now, I have only implemented the model changes so I won't cause unnecessary additional work for myself in case people don't agree with this approach.

Once we agree that's the model we all like, I will go ahead and implement the rest of the changes.

If we agree on this approach, the actual API endpoint and CLI command will look like this:

/v1/executions/<execution id>/output[?type=stdout/stderr/etc]
st2 execution tail [--type=stdout/stderr/etc]
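As a rough sketch of how a client might build the proposed endpoint path, assuming the optional filter stays a `type` query parameter as written above (the helper name `build_output_path` is hypothetical and not part of st2):

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_output_path(execution_id: str, output_type: Optional[str] = None) -> str:
    """Build the proposed output endpoint path, with an optional type filter."""
    path = "/v1/executions/{}/output".format(quote(execution_id, safe=""))
    if output_type:
        # The parameter name "type" follows the endpoint as proposed in this PR.
        path += "?" + urlencode({"type": output_type})
    return path


all_output = build_output_path("abc123")
stderr_only = build_output_path("abc123", "stderr")
```

Leaving the filter off returns every output type for the execution, which matches the optional `[?type=...]` part of the proposal.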

TODO

Kami · 2017-09-11T09:21:14Z

@lakshmi-kannan @m4dcoder @bigmstone Let's please agree on this data model today if possible so I can finish the rest of the changes and get this feature out.

for storing the output instead of utilizing two collections. This approach makes the whole thing more generic and future-proof, and lets it work with other runners which don't necessarily have two streams (stdout, stderr). The "type" argument on the document now stores the type of the output, which means each runner execution can produce as few or as many output types as it needs. Most of the existing runners will produce two types of output (stdout, stderr), but it's likely that in the future we will have other runners which will produce just one type of output.

Mierdin · 2017-09-11T19:01:03Z

st2common/st2common/models/db/execution.py

-    tail behavior by periodically reading from this collection.
+    Stores output of a particular execution.
+
+    New document is inserted dynamically when a new chunk / line is received which means you can


From this, I gather that this model just represents one "write"? Meaning, for each instance of data that an execution wishes to stream about itself, each would be a unique instance of this model?

Correct - I mentioned the reason for that in the other PR (performance - updating a single document is not a good idea because of how locking works).
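The one-document-per-write idea, combined with tailing by periodically reading the collection, can be sketched roughly as follows. This uses an in-memory list rather than MongoDB, and the names `collection`, `insert_output` and `read_since` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

# In-memory stand-in for the output collection (hypothetical, not MongoDB).
collection: List[Dict] = []


def insert_output(execution_id: str, data: str, timestamp: datetime) -> None:
    """Append one new document per received line instead of updating one."""
    collection.append(
        {"execution_id": execution_id, "data": data, "timestamp": timestamp}
    )


def read_since(execution_id: str, after: Optional[datetime] = None) -> List[Dict]:
    """Return documents for one execution newer than `after`, oldest first."""
    rows = [
        doc for doc in collection
        if doc["execution_id"] == execution_id
        and (after is None or doc["timestamp"] > after)
    ]
    return sorted(rows, key=lambda doc: doc["timestamp"])


start = datetime(2017, 9, 11, 19, 1, 3, tzinfo=timezone.utc)
insert_output("exec-1", "line 1\n", start)
insert_output("exec-1", "line 2\n", start + timedelta(milliseconds=5))

# First poll sees everything so far; the second only sees what arrived later.
first_poll = read_since("exec-1")
insert_output("exec-1", "line 3\n", start + timedelta(milliseconds=9))
second_poll = read_since("exec-1", after=first_poll[-1]["timestamp"])
```

Each write only appends, so a reader never has to re-fetch a growing document; it just asks for entries newer than the last one it saw.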

Mierdin · 2017-09-11T19:03:15Z

st2common/st2common/models/db/execution.py

    timestamp = me.DateTimeField(required=True, default=date_utils.get_datetime_utc_now)
+    type = me.StringField(required=True, default='output')


I am not a fan of using names of builtins as variable names

I agree, will rename it to output_type.

m4dcoder · 2017-09-11T19:50:37Z

Do you need to rebase this branch? It's including some unrelated commits/fixes from master (unicode, travis, etc.)

m4dcoder · 2017-09-11T19:53:04Z

st2common/st2common/models/db/execution.py

+        execution_id: ID of the execution to which this output belongs.
+        action_ref: Parent action reference.
+        runner_ref: Parent action runner reference.
+        timestamp: Timestamp when this output has been produced / received.


Does it make sense to add an auto-increment sequence field in the collection? In case the timestamps are the same for one or more entries.

Actually, I will just change it to use the custom ComplexDateTimeField we wrote, which has microsecond precision and which we use for executions.

I believe microsecond precision should be fine.
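To make the tie concern concrete, here is a small sketch assuming the plain date field ends up with millisecond resolution (as BSON dates have). The helper `truncate_to_millis` and the sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def truncate_to_millis(ts: datetime) -> datetime:
    """Drop the sub-millisecond part, mimicking millisecond-precision storage."""
    return ts.replace(microsecond=(ts.microsecond // 1000) * 1000)


# Two output lines received 300 microseconds apart.
first = datetime(2017, 9, 11, 19, 53, 4, 123456)
second = first + timedelta(microseconds=300)

# At millisecond precision the two timestamps collide ...
same_at_millis = truncate_to_millis(first) == truncate_to_millis(second)
# ... while at microsecond precision their order is preserved.
ordered_at_micros = first < second
```

Microsecond precision narrows the window in which two entries can share a timestamp; a sequence field would be the stricter option if exact ordering ever matters.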

m4dcoder · 2017-09-11T19:57:50Z

LGTM except for a few minor questions regarding unrelated commits in this branch and adding an auto increment sequence field in the ActionExecutionOutput collection.

Kami · 2017-09-12T07:44:44Z

Do you need to rebase this branch? It's including some unrelated commits/fixes from master (unicode, travis, etc.)

It was already synced with master and with the other branch it is based on, so I have no idea why GitHub is showing the diff the way it is.

Kami · 2017-09-12T11:54:34Z

Will merge those changes into existing PR and work from there.

Kami and others added 29 commits September 1, 2017 11:44

Turn garbage collection on by default for action execution output objects. Those objects are deleted by default after 7 days.

3226389

Re-generate sample config file.

0d366a5

Add CI check which verifies that all the automatically generated files have been re-generated, committed and included in the changeset.

f71a4a8

Run new target on Travis CI.

acf173f

Remove out of order comment.

f3295df

Update the comment.

fe0623b

Add missing exit statement;

0daa95a

Update config gen script to use static values for config options which values change based on the environment where this script runs. This way we ensure consistent and stable output of this script, regardless of where it runs.

93c02fb

Merge pull request #3715 from StackStorm/generated_files_ci_check

55e3b02

CI check which verifies that all the automatically generated files are up to date

Fix mistral callback failure when result contains unicode

78af14d

When result of action execution contains unicode, Mistral callback fails and leads to orphaned workflow execution.

Add more unit tests to cover various unicode use cases

eb3c5c1

Fix call to keystone auth in mistral runner

238207c

The interface for the authenticate method has changed. Fix the unit tests associated with the interface change.

Debug Travis

5287893

Merge pull request #3721 from StackStorm/mistral-callback-unicode

0a51b75

Fix mistral callback failure when result contains unicode

Debug env for Travis

3a2bed7

Prevent TravisCI caching for master branch

ebd2d44

Merge pull request #3728 from StackStorm/fix/travis-caching

9177b5c

Fix Travis caching

Merge branch 'master' into fix-mistralclient-version

aa00588

Use $TRAVIS_PULL_REQUEST env var for Travis to remove the cache

8933deb

Fix missing ';' in travis.yml

Remove debug 'env' for Travis

2920f00

Fix missing then

2038476

Clean up unit tests for execution cancellation

eef8e3e

Clean up unit tests to be consistent in load runners and actions from st2test fixtures.

Set status to canceling instead of canceled for execution with parent

f01a109

On cancellation request, if the liveaction has a parent, set the status to canceling to trigger post_run for the liveaction.

Separate the imports in test_execution_cancellation

29cf5ad

Separate the imports for readability and searchability.

Add change log entry for cancellation of delayed action execution

fe81842

Merge pull request #3727 from StackStorm/fix-mistralclient-version

447bb72

Fix call to keystone auth in mistral runner

Merge pull request #3726 from StackStorm/fix-delayed-cancel

4ccf6e2

Fix cancellation of delayed action execution

Merge branch 'master' into python_runner_actions_real_time_output

bc73c99

Merge branch 'python_runner_actions_real_time_output' of github.com:StackStorm/st2 into python_runner_actions_real_time_output

319eb9c

Kami force-pushed the python_runner_actions_real_time_output_single_db_model branch from 46ba975 to 03c37d4 Compare September 11, 2017 09:24

Mierdin reviewed Sep 11, 2017

m4dcoder reviewed Sep 11, 2017

Kami added 10 commits September 12, 2017 10:10

Use complex date time field for larger precision.

61f0128

Change field name to output_type from type so it doesn't clash with Python built-in.

09eff50

Update affected streaming code to utilize one model approach.

bfdd1b1

Update affected runner tests.

ec46475

Update affected stream tests.

c8b2b9f

Add more tests / asserts for event stream API endpoint.

52a6552

Update changelog.

3b05bfb

Re-generate openapi.yaml file.

284e23e

Update affected garbage collection code.

15c4d6b

Implement output type filter on the executions output API endpoint.

7d2b219

Kami changed the title ~~[WIP] Action runner output streaming single collection approach~~ Action runner output streaming single collection approach Sep 12, 2017

Kami added 3 commits September 12, 2017 12:16

Update more affected code.

2f6be57

Re-generate sample config.

3389d6e

Add tests for action execution output API endpoint.

e04403a

Kami merged commit 9a24954 into python_runner_actions_real_time_output Sep 12, 2017

Kami deleted the python_runner_actions_real_time_output_single_db_model branch September 12, 2017 11:54

Kami restored the python_runner_actions_real_time_output_single_db_model branch September 12, 2017 12:13

LindsayHill deleted the python_runner_actions_real_time_output_single_db_model branch October 3, 2018 19:04
