Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Action runner output streaming single collection approach #3729

Conversation

Kami
Copy link
Member

@Kami Kami commented Sep 11, 2017

This pull request works on top off #3657 and changes two collection approach with single "generic" collection approach - 03c37d4.

For some reason the actual diff looks a bit messed up, but the change is located in 03c37d4.

It addresses the feedback people had and makes the whole approach a bit more generic so it can work with other / future runners which produce just a single output stream.

The main difference is that this approach utilizes single "action_output_d_b" collection with the following fields:

  • execution_id
  • action_ref
  • runner_ref
  • timestamp
  • type
  • data

Most fields are self explanatory, but two fields of interest are "data" and "type".

"data" contains actual partial execution output. Depending on the runner, this could either be chunk, line or similar. For all the existing runners for which we implement streaming right now this is a line. I was thinking about a different name (e.g. chunk), but in the end I settled with data.

type contains type of the output. Right now that is either stdout or stderr, but for runners with just a single output stream, this could simply be output or similar. I'm open to suggestions for a better name for this field.

I was thinking about "stream" / "stream_type" / "output_type", but in the end I settled with "type". If we can't come up with anything better, I will probably change it to "output_type" so it doesn't clash with Python built-in type function.

As for now, I only implemented model changes so I won't cause unnecessary additional work for myself in case people don't agree with this approach.

Once we agree that's the model we all like, I will go ahead and implement rest of the changes.

If we agree on this approach, actual API endpoint and CLI command will look like this:

  • /v1/executions/<execution id>/output[?type=stdout/stderr/etc]
  • st2 execution tail [--type=stdout/stderr/etc]

TODO

  • Update affected API endpoints
  • Update affected event stream code
  • Update affected cli docs
  • Update affected tests
  • Update affected docs

Kami and others added 29 commits September 1, 2017 11:44
objects.

Those objects are deleted by default after 7 days.
have been re-generated, committed and included in the changeset.
values change based on the environment where this script run.

This way we ensure consistent and stable output of this script,
regardgless of where it runs.
CI check which verifies that all the automatically generated files are up to date
When result of action execution contains unicode, Mistral callback fails and leads to orphaned workflow execution.
The interface for the authenticate method has changed. Fix the unit tests associated with the interface change.
Fix mistral callback failure when result contains unicode
Clean up unit tests to be consistent in load runners and actions from st2test fixtures.
On cancellation request, if the liveaction has a parent, set the status to canceling to trigger post_run for the liveaction.
Separate the imports for readability and searchability.
Fix call to keystone auth in mistral runner
Fix cancellation of delayed action execution
…tackStorm/st2 into python_runner_actions_real_time_output
@Kami
Copy link
Member Author

Kami commented Sep 11, 2017

@lakshmi-kannan @m4dcoder @bigmstone Let's please agree on this data model today if possible so I can finish rest of the changes and get this feature out.

for storing the output instead of utilizing two collections.

This approach makes whole thing more generic and makes it more
future-proof and work with other runner which doesn't necessary have two
streams (stdout, stderr).

Now "type" argument on the document stores type of the output and it
means each runner execution can produce as little or as many output
types as it needs.

Most of the existing runners will produce two type of output (stdout,
stderr), but it's likely that in the future we will have other runners
which will just produce one type of the output.
@Kami Kami force-pushed the python_runner_actions_real_time_output_single_db_model branch from 46ba975 to 03c37d4 Compare September 11, 2017 09:24
tail behavior by periodically reading from this collection.
Stores output of a particular execution.

New document is inserted dynamically when a new chunk / line is received which means you can
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From this, I gather that this model just represents one "write"? Meaning, for each instance of data that an execution wishes to stream about itself, each would be a unique instance of this model?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct - mentioned the reason for that in the other PR (performance - updating single document is not a good idea because of how locking works).

timestamp = me.DateTimeField(required=True, default=date_utils.get_datetime_utc_now)
type = me.StringField(required=True, default='output')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a fan of using names of builtins as variable names

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, will rename it to output_type.

@m4dcoder
Copy link
Contributor

Do you need rebase this branch? It's including some unrelated commits/fixes from master (unicode, travis, etc.)

execution_id: ID of the execution to which this output belongs.
action_ref: Parent action reference.
runner_ref: Parent action runner reference.
timestamp: Timestamp when this output has been produced / received.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to add an auto-increment sequence field in the collection? In case the timestamps are the same for one or more entries.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I will just change it to use a custom ComplexDateTimeField field we wrote which has micro-second precision and which we use for executions.

I believe microsecond precision should be fine.

@m4dcoder
Copy link
Contributor

LGTM except for a few minor questions regarding unrelated commits in this branch and adding an auto increment sequence field in the ActionExecutionOutput collection.

@Kami
Copy link
Member Author

Kami commented Sep 12, 2017

Do you need rebase this branch? It's including some unrelated commits/fixes from master (unicode, travis, etc.)

It was already synced with master and the other branch is based on so no idea why Github is showing this diff the way it is.

@Kami Kami changed the title [WIP] Action runner output streaming single collection approach Action runner output streaming single collection approach Sep 12, 2017
@Kami
Copy link
Member Author

Kami commented Sep 12, 2017

Will merge those changes into existing PR and work from there.

@Kami Kami merged commit 9a24954 into python_runner_actions_real_time_output Sep 12, 2017
@Kami Kami deleted the python_runner_actions_real_time_output_single_db_model branch September 12, 2017 11:54
@Kami Kami restored the python_runner_actions_real_time_output_single_db_model branch September 12, 2017 12:13
@LindsayHill LindsayHill deleted the python_runner_actions_real_time_output_single_db_model branch October 3, 2018 19:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants