
Instrument python runner to get more metrics around delayed executions #2974

Closed
lakshmi-kannan opened this issue Oct 25, 2016 · 5 comments
lakshmi-kannan (Contributor) commented Oct 25, 2016

Simple Python actions take a few seconds to complete execution.

(virtualenv)vagrant@st2dev /m/s/s/st2 ❯❯❯ st2 run examples.isprime value=7
..
id: 580fb666d9d7ed14ab36f33f
status: succeeded
parameters:
  value: 7
(virtualenv)vagrant@st2dev /m/s/s/st2 ❯❯❯ st2 execution get 580fb666d9d7ed14ab36f33f -dj
{
    "action": {
        "ref": "examples.isprime"
    },
    "context": {
        "user": "stanley"
    },
    "end_timestamp": "2016-10-25T19:45:44.263915Z",
    "id": "580fb666d9d7ed14ab36f33f",
    "liveaction": {
        "action": "examples.isprime",
        "action_is_workflow": false,
        "callback": {},
        "id": "580fb666d9d7ed14ab36f33e",
        "parameters": {
            "value": 7
        },
        "runner_info": {
            "hostname": "st2dev.stackstorm.net",
            "pid": 5224
        }
    },
    "parameters": {
        "value": 7
    },
    "start_timestamp": "2016-10-25T19:45:42.049665Z",
    "status": "succeeded"
}

It looks like the fork call takes a couple of seconds, but when I benchmarked fork in a standalone tester it did not appear expensive (https://gist.github.com/lakshmi-kannan/eb03a3f92d72b6f391a2f18fc8211883). So we need to instrument our code to get more fine-grained metrics.
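A minimal sketch of the kind of instrumentation the issue asks for: wrap each phase of a (hypothetical, simplified) runner in a timing context manager so every phase shows up as a separate metric instead of one opaque delay. The phase names and `run_action` helper here are illustrative, not the actual runner API.

```python
# Sketch: per-phase timing for a subprocess-based action run.
# "spawn" and "communicate" are illustrative phase names, not real
# StackStorm metric names.
import contextlib
import subprocess
import sys
import time

TIMINGS = {}


@contextlib.contextmanager
def timed(phase):
    """Record wall-clock duration of the enclosed block under `phase`."""
    start = time.time()
    try:
        yield
    finally:
        TIMINGS[phase] = time.time() - start


def run_action():
    """Spawn a trivial child interpreter and collect its output, timed per phase."""
    with timed("spawn"):
        proc = subprocess.Popen([sys.executable, "-c", "print('ok')"],
                                stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    with timed("communicate"):
        stdout, stderr = proc.communicate()
    return stdout


if __name__ == "__main__":
    run_action()
    for phase, duration in sorted(TIMINGS.items()):
        print("%s: %.3fs" % (phase, duration))
```

With timings split out like this, it becomes obvious whether the delay lives in process creation or in collecting output.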

enykeev (Member) commented Oct 28, 2016

I don't see the execution log we added in #2718. Did we forget to expose it in the detailed view of st2client?

Kami (Member) commented Oct 28, 2016

@enykeev Yeah, good point - we probably need to update the client to include it when the -d flag is used :)

Kami (Member) commented Oct 28, 2016

@lakshmi-kannan I did some more measuring, and it looks like the bulk of the time (around 1.5 seconds) is indeed spent in run_process (as assumed). The rest takes around 500 ms, so the total run time of an execution is around 2 seconds.

I checked eventlet implementation to see what is going on (e.g. if it's using some long poll timeout or similar) and I couldn't find anything offending - https://github.com/eventlet/eventlet/blob/master/eventlet/green/subprocess.py

eventlet patches .communicate(), which is the suspicious part. I tried replacing communicate() with just a call to .wait(), and this dropped the run time of that function to 700 ms. So if we want to optimize it, we need to dig in and see exactly what is going on with .communicate() (it's also quite likely that we can't do much about it).
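A hedged sketch of the comparison described above: time `Popen.communicate()` against `.wait()` plus direct pipe reads for the same small child process. This uses plain `subprocess`, not the eventlet-patched green version the runner goes through, so absolute numbers will differ; it only illustrates the measurement technique.

```python
# Sketch: compare communicate() vs wait() + direct pipe reads.
# Plain subprocess, not eventlet's green subprocess - numbers are indicative.
import subprocess
import sys
import time

CHILD = [sys.executable, "-c",
         "import sys; print('out'); print('err', file=sys.stderr)"]


def run_with_communicate():
    """Collect output via communicate(), returning (duration, stdout, stderr)."""
    start = time.time()
    proc = subprocess.Popen(CHILD, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = proc.communicate()
    return time.time() - start, stdout, stderr


def run_with_wait():
    """Collect output via wait() + direct reads. Caution: reading after wait()
    can deadlock if the child writes more than the pipe buffer holds; it is
    only safe here because the output is tiny."""
    start = time.time()
    proc = subprocess.Popen(CHILD, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    proc.wait()
    stdout = proc.stdout.read()
    stderr = proc.stderr.read()
    return time.time() - start, stdout, stderr


if __name__ == "__main__":
    for fn in (run_with_communicate, run_with_wait):
        duration, _, _ = fn()
        print("%s: %.3fs" % (fn.__name__, duration))
```

The deadlock caveat in `run_with_wait` is exactly why `communicate()` exists in the first place, which is why simply swapping it out is not a safe fix on its own.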

Another thing worth noting (and something we should write in the documentation) is that no one should expect sub-1-second run times for Python runner actions. If users need such response times, they will have to write their own runner; achieving that with the Python runner, which involves forking a subprocess with a Python interpreter, is simply not possible.

Kami (Member) commented Oct 28, 2016

I did some more testing - removing communicate() and calling process.wait() followed by process.stdout.read() and process.stderr.read().

It looks like the slow part is indeed reading from the stdout and stderr pipes (which is basically what communicate() does).

Kami (Member) commented Oct 28, 2016

I tried various things, including changing the buffer size, and I can't get the run time to decrease when reading from stdout and stderr.
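A sketch of the buffer-size experiment mentioned above: time the output collection for the same child while varying `Popen`'s `bufsize` argument. Again, plain `subprocess` rather than eventlet's green version, so this only shows the shape of the experiment, not the runner's actual numbers.

```python
# Sketch: does Popen's bufsize affect how fast we drain a child's stdout?
# Plain subprocess; the runner itself goes through eventlet's green
# subprocess, so treat the numbers as indicative only.
import subprocess
import sys
import time

# Child writes 64 KiB to stdout so there is something to buffer.
CHILD = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 65536)"]


def time_reads(bufsize):
    """Run the child with the given bufsize; return (duration, bytes_read)."""
    start = time.time()
    proc = subprocess.Popen(CHILD, bufsize=bufsize,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, _ = proc.communicate()
    return time.time() - start, len(stdout)


if __name__ == "__main__":
    for bufsize in (0, 4096, 65536, -1):  # -1 selects the default buffering
        duration, nbytes = time_reads(bufsize)
        print("bufsize=%6d read %d bytes in %.3fs" % (bufsize, nbytes, duration))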

So in short, there is not much we can do for the Python runner. One thing we should do, though, is document the performance profile of runners - StackStorm is primarily designed for longer-running actions, so if users need sub-second or Lambda-like performance, StackStorm is not a good fit (they could get run times down with a custom runner, though). We should make this very explicit in the documentation so there won't be any surprises.
