Support `dotnet trace` running in a sidecar container #810

MichaelSimons · 2020-02-07T23:34:26Z

Run a containerized ASP.NET Core app.

docker run -it -p 8000:80 --name aspnetapp mcr.microsoft.com/dotnet/core/samples:aspnetapp

Build an image that contains the dotnet-trace tool.

FROM mcr.microsoft.com/dotnet/core/sdk:3.1

RUN dotnet tool install --global dotnet-trace

ENV PATH="${PATH}:/root/.dotnet/tools"

docker build -t dotnet/trace .

Run the dotnet trace container

docker run -it --net=container:aspnetapp --pid=container:aspnetapp --cap-add ALL --privileged dotnet/trace

Try to collect a trace of the ASP.NET Core app.

Expected Results:

I should be able to collect a dotnet trace. It is a common technique to utilize sidecar containers to profile applications running in another container. This allows you to profile existing application images without having to modify them. You can package all of your tools into a completely separate tools image.

An example of running perfcollect with this pattern is documented in this blog post

Actual Results:

dotnet trace ps does not match ps -aux

root@17be50bbd0fd:/# ps -aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.1  2.1 21288796 87672 pts/0  SLsl+ 22:36   0:03 dotnet aspnetapp.dll
root      1747  0.0  0.0   3988  3036 pts/0    Ss   23:19   0:00 bash
root      2231  0.0  0.0   7640  2704 pts/0    R+   23:23   0:00 ps -aux
root@17be50bbd0fd:/# dotnet trace ps
      2254 dotnet     /usr/share/dotnet/dotnet
      2272 dotnet-trace /root/.dotnet/tools/dotnet-trace

The PID reported by dotnet trace cannot be found when running collect

root@17be50bbd0fd:/# dotnet trace collect -p 2254
No profile or providers specified, defaulting to trace profile 'cpu-sampling'

Provider Name                           Keywords            Level               Enabled By
Microsoft-DotNETCore-SampleProfiler     0xFFFFFFFFFFFFFFFF  Verbose(5)          --profile
Microsoft-Windows-DotNETRuntime         0x00000004C14FCCBD  Informational(4)    --profile

[ERROR] System.ArgumentException: Process with an Id of 2254 is not running.
   at System.Diagnostics.Process.GetProcessById(Int32 processId, String machineName)
   at System.Diagnostics.Process.GetProcessById(Int32 processId)
   at Microsoft.Diagnostics.Tools.Trace.CollectCommandHandler.Collect(CancellationToken ct, IConsole console, Int32 processId, FileInfo output, UInt32 buffersize, String providers, String profile, TraceFileFormat format, TimeSpan duration) in /_/src/Tools/dotnet-trace/CommandLine/Commands/CollectCommand.cs:line 89

The PID reported by ps is not a valid .NET app when running collect

root@17be50bbd0fd:/# dotnet trace collect -p 1
No profile or providers specified, defaulting to trace profile 'cpu-sampling'

Provider Name                           Keywords            Level               Enabled By
Microsoft-DotNETCore-SampleProfiler     0xFFFFFFFFFFFFFFFF  Verbose(5)          --profile
Microsoft-Windows-DotNETRuntime         0x00000004C14FCCBD  Informational(4)    --profile

[ERROR] System.PlatformNotSupportedException: Process 1 not running compatible .NET Core runtime
   at Microsoft.Diagnostics.Tools.RuntimeClient.DiagnosticsIpc.IpcClient.GetTransport(Int32 processId) in /_/src/Microsoft.Diagnostics.Tools.RuntimeClient/DiagnosticsIpc/IpcClient.cs:line 50
   at Microsoft.Diagnostics.Tools.RuntimeClient.DiagnosticsIpc.IpcClient.SendMessage(Int32 processId, IpcMessage message, IpcMessage& response) in /_/src/Microsoft.Diagnostics.Tools.RuntimeClient/DiagnosticsIpc/IpcClient.cs:line 84
   at Microsoft.Diagnostics.Tools.RuntimeClient.EventPipeClient.CollectTracing(Int32 processId, SessionConfiguration configuration, UInt64& sessionId) in /_/src/Microsoft.Diagnostics.Tools.RuntimeClient/Eventing/EventPipeClient.cs:line 80
   at Microsoft.Diagnostics.Tools.Trace.CollectCommandHandler.Collect(CancellationToken ct, IConsole console, Int32 processId, FileInfo output, UInt32 buffersize, String providers, String profile, TraceFileFormat format, TimeSpan duration) in /_/src/Tools/dotnet-trace/CommandLine/Commands/CollectCommand.cs:line 104

Notes:

I was using the dotnet trace version 3.1.57502+6767a9ac24bde3a58d7b51bdaff7c7d75aab9a65
dotnet trace worked as expected when I added it to my application container. This required me to install the .NET SDK, since it is a global tool.
It is possible I have messed up something obvious here and this is already supported 😄


hoyosjs · 2020-02-08T00:00:26Z

@josalem

josalem · 2020-02-08T00:06:12Z

This is something we are actively working on lighting up, but is unsupported right now. There are some PRs in flight that will enable this scenario in the 5.0 timeframe. dotnet/runtime#1600 and #770 when merged will make this easy to configure. We are thinking of changing the names of the configuration variables from what was merged in the runtime PR, but the functionality should be the same. Documentation on how we recommend setting this scenario up will come before release. It will involve a shared volume mount inside the pod.

CC @shirhatti

SidShetye · 2020-02-13T17:32:06Z

Our use case: when a container runs low on memory (e.g. host = 16 GB, each container = 2 GB), tools running inside that same container cannot take a memory dump because they don't have enough memory either. If the tools ran in another container, this would be much simpler.

bss-git · 2020-02-18T02:51:18Z

A .NET Core app on Linux creates domain socket files in the /tmp directory. You can establish an IPC session through such a file.
Start your target container with an option that maps its /tmp to a directory on the host, e.g.
-v /tmp/container_sockets:/tmp
Then start your tracing container with an option that maps the same host directory to /tmp in that container:
-v /tmp/container_sockets:/tmp
(along with your other options, such as --pid).
Then, if you start tracing, it should just work.

The PID in this case is nothing more than an abstraction. Internally, Microsoft.Diagnostics.NETCore.Client.IpcClient uses the PID only to find the socket file name and start an IPC session:

                string ipcPort;
                try
                {
                    ipcPort = Directory.GetFiles(IpcRootPath, $"dotnet-diagnostic-{processId}-*-socket") // Try best match.
                                .OrderByDescending(f => new FileInfo(f).LastWriteTime)
                                .FirstOrDefault();
                    if (ipcPort == null)
                    {
                        throw new ServerNotAvailableException($"Process {processId} not running compatible .NET Core runtime.");
                    }
                }

I've configured inter-container IPC in my monitoring app so that it collects counters from two other containers using only the socket file names. You don't even need the target PID in this scenario.
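The lookup quoted above can be sketched outside of C# as well. The following is only an illustration of the same file-name matching, not the library's implementation: the function name `find_diagnostic_socket` and its `ipc_root` parameter are hypothetical, and the only detail taken from the thread is the `dotnet-diagnostic-{processId}-*-socket` pattern with the most recently written file winning.

```python
import glob
import os
from typing import Optional


def find_diagnostic_socket(process_id: int, ipc_root: str = "/tmp") -> Optional[str]:
    """Return the newest diagnostic socket path for process_id, or None.

    Hypothetical helper: mirrors the glob-and-sort logic quoted above.
    """
    pattern = os.path.join(ipc_root, f"dotnet-diagnostic-{process_id}-*-socket")
    candidates = glob.glob(pattern)
    if not candidates:
        # The C# code throws ServerNotAvailableException at this point.
        return None
    # Prefer the most recently written file, like OrderByDescending(LastWriteTime).
    return max(candidates, key=os.path.getmtime)
```

Run inside the tracing container, a `None` result for the PID you pass to the tool would suggest that the shared /tmp mount does not contain the target's socket.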

glitch100 · 2020-05-14T13:28:06Z

@MichaelSimons is this still the primary issue for this? I saw in #1737 that you are using this for tracking?

Following the steps in the issue above still has issues with the PIDs being inaccurate between ps -aux and dotnet trace ps. I am not sure if I am missing something with the sidecar approach or if the socket based approach posted by @bss-git is the way it should be done. Is there any documentation for this?

MichaelSimons · 2020-05-14T13:44:16Z

@glitch100 - as far as I am aware, this is the primary issue.

@shirhatti - can you help @glitch100?

shirhatti · 2020-05-14T22:15:59Z

@glitch100 As noted in an earlier comment, access to the PID namespace isn't really required.

You just need access to the diagnostic server created by the runtime. As it stands today (3.0/3.1), this socket is always created in the /tmp directory.

Tooling and runtime changes are incoming for 5.0 that allow you to customize how the diagnostics server is created and how the tools attach.

@bss-git's suggestion of sharing the temp directory across both containers should suffice.

FROM mcr.microsoft.com/dotnet/core/sdk:3.1 AS tools
RUN dotnet tool install --tool-path /tools dotnet-trace

FROM mcr.microsoft.com/dotnet/core/aspnet:3.1 AS runtime

COPY --from=tools /tools /tools
ENV PATH="/tools:${PATH}"

# Applies to this tools image only; the app being traced must keep diagnostics enabled.
ENV COMPlus_EnableDiagnostics="0"

WORKDIR /tools

docker run -d -p 8000:80 -v /container:/tmp mcr.microsoft.com/dotnet/core/samples:aspnetapp
docker run --rm -u root -it -v /container:/tmp trace /bin/sh

# Inside the trace container
dotnet-trace collect -p 1
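Before running the collect command, it can help to confirm that the shared directory really exposes the app's diagnostic socket inside the trace container. The sketch below is an assumption-laden helper, not part of dotnet-trace: `list_diagnostic_pids` is a hypothetical name, and the only thing it relies on is the `dotnet-diagnostic-{pid}-*-socket` naming quoted earlier in this thread.

```python
import os
import re
from typing import Dict, List

# File-name shape taken from the IpcClient snippet quoted earlier in the thread.
SOCKET_NAME = re.compile(r"^dotnet-diagnostic-(\d+)-(.+)-socket$")


def list_diagnostic_pids(ipc_root: str = "/tmp") -> Dict[int, List[str]]:
    """Map each PID found in a socket file name to the matching file names."""
    found: Dict[int, List[str]] = {}
    for name in sorted(os.listdir(ipc_root)):
        match = SOCKET_NAME.match(name)
        if match:
            found.setdefault(int(match.group(1)), []).append(name)
    return found
```

If the result is empty inside the trace container, the volume mapping is the first thing to check; if it lists PID 1, that is the value to pass with `-p`.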

glitch100 · 2020-05-15T08:52:04Z

@shirhatti
Firstly, for this to work, will my dotnet-app also need:

ENV COMPlus_EnableDiagnostics="0"

in its Dockerfile? Otherwise we hit that issue around CoreCLR starting (due to the /tmp directory).

Once running, and having another container with the Dockerfile you provided I still hit an issue:

No profile or providers specified, defaulting to trace profile 'cpu-sampling'

Provider Name                           Keywords            Level               Enabled By
Microsoft-DotNETCore-SampleProfiler     0x0000000000000000  Informational(4)    --profile 
Microsoft-Windows-DotNETRuntime         0x00000014C14FCCBD  Informational(4)    --profile 

Unable to start a tracing session: Microsoft.Diagnostics.NETCore.Client.ServerNotAvailableException: Process 1 not running compatible .NET Core runtime.
No profile or providers specified, defaulting to trace profile 'cpu-sampling'

Provider Name                           Keywords            Level               Enabled By
Microsoft-DotNETCore-SampleProfiler     0x0000000000000000  Informational(4)    --profile 
Microsoft-Windows-DotNETRuntime         0x00000014C14FCCBD  Informational(4)    --profile 

Unable to start a tracing session: Microsoft.Diagnostics.NETCore.Client.ServerNotAvailableException: Process 1 not running compatible .NET Core runtime.
   at Microsoft.Diagnostics.NETCore.Client.IpcClient.GetTransport(Int32 processId) in /_/src/Microsoft.Diagnostics.NETCore.Client/DiagnosticsIpc/IpcClient.cs:line 63
   at Microsoft.Diagnostics.NETCore.Client.IpcClient.SendMessage(Int32 processId, IpcMessage message, IpcMessage& response) in /_/src/Microsoft.Diagnostics.NETCore.Client/DiagnosticsIpc/IpcClient.cs:line 104
   at Microsoft.Diagnostics.NETCore.Client.EventPipeSession..ctor(Int32 processId, IEnumerable`1 providers, Boolean requestRundown, Int32 circularBufferMB) in /_/src/Microsoft.Diagnostics.NETCore.Client/DiagnosticsClient/EventPipeSession.cs:line 30
   at Microsoft.Diagnostics.Tools.Trace.CollectCommandHandler.Collect(CancellationToken ct, IConsole console, Int32 processId, FileInfo output, UInt32 buffersize, String providers, String profile, TraceFileFormat format, TimeSpan duration, String clrevents, String clreventlevel) in /_/src/Tools/dotnet-trace/CommandLine/Commands/CollectCommand.cs:line 130
Unable to create session.

I have confirmed that both have volumes for /tmp mapped to the host.

Could you provide a working example? Also is this documented anywhere?

shirhatti · 2020-05-15T17:29:49Z

> Firstly for this to work my dotnet-app also will need:
> ENV COMPlus_EnableDiagnostics="0"

That's not going to work. If you disable the diagnostics server, you can't trace the application.
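One way to check for this from a sidecar is to look at the target's environment. This is a sketch under assumptions: it presumes the sidecar can read `/proc/<pid>/environ` for the target (for example, a shared PID namespace and sufficient permissions), and the helper names `diagnostics_disabled` and `read_environ` are hypothetical. The only variable it looks at is `COMPlus_EnableDiagnostics`, as discussed above.

```python
def diagnostics_disabled(environ_blob: bytes) -> bool:
    """Return True if COMPlus_EnableDiagnostics is set to 0 in a NUL-separated environ blob."""
    for entry in environ_blob.split(b"\0"):
        name, sep, value = entry.partition(b"=")
        if sep and name == b"COMPlus_EnableDiagnostics":
            return value.strip() == b"0"
    # Variable not set: nothing in the environment disables the diagnostics server.
    return False


def read_environ(pid: int) -> bytes:
    """Read the initial environment of a process (Linux only, needs permission)."""
    with open(f"/proc/{pid}/environ", "rb") as f:
        return f.read()
```

For example, `diagnostics_disabled(read_environ(1))` returning `True` would mean the app container was started with the setting that prevents tracing.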

If creation of pipes fails, do you mind creating a new issue for that? Let's work through that first.

> Could you provide a working example?

I'll publish a gist later today.

glitch100 · 2020-05-18T09:00:48Z

@shirhatti Any luck on that gist?

glitch100 · 2020-05-18T10:40:11Z

Worth noting that I have tried what you posted with a vanilla ASPNET app with docker support (As per Visual Studio), as well as the trace container. I have tried in both Docker Compose and via the CLI with the commands above.

Docker Compose gives me that /tmp CoreCLR issue, so I suspect I am doing something wrong there with regard to the volumes/mounts.

docker run... successfully starts both containers, but again I get the same error as I posted above.

I look forward to seeing the gist as I am hoping I have done something silly rather than this being an issue on the dotnet side.

shirhatti · 2020-05-18T18:48:42Z

EDIT: I made a small change to my earlier comment to include -u root on the trace container. I've just verified that my comment does indeed work and can be considered a complete example.

> Docker Compose gives me that tmp CORECLR issues so I suspect I am doing something wrong there regards to the volumes/mounts.

As I mentioned earlier, please create a separate issue for that.

glitch100 · 2020-05-18T19:47:15Z

I will give it a try. Where would you recommend I raise that issue?

shirhatti · 2020-05-19T06:27:23Z

> Where would you recommend I raise that issue?

https://github.com/dotnet/runtime

glitch100 · 2020-05-19T08:15:09Z

@shirhatti I appreciate you making the edit however the timeline of this conversation does now seem a bit strange. That said the edits you made did get it working which is good 🎉, so thanks.

I might be wrong on this one but it seems like the default Dockerfile in a new ASPNETAPP (template) was not compatible with that flow, however the differences are quite minor so I am really unsure if it's down to the way I am running the container or if I made some changes along the way.

I validated this by running the sample image as you did, seeing success, running mine, seeing failure, and then updating my Dockerfile to match.

Is it the aspnet:3.1-buster-slim image?

I will make a ticket on the CORECLR repo.

Final question - is this documented somewhere?

galvesribeiro · 2020-06-01T12:54:05Z

Hey folks!

After reading this issue among many others I was able to make it work. The YAML would look something like this:

apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  shareProcessNamespace: true
  volumes:
    - name: data
      emptyDir: {}
    - name: tmp
      emptyDir: {}
  containers:
    - name: sampleapp
      image: sampleapp:latest
      imagePullPolicy: Never
      volumeMounts:
        - name: data
          mountPath: /app/data
        - name: tmp
          mountPath: /tmp
      ports:
        - containerPort: 8080
    - name: profiler
      image: profiler:latest
      imagePullPolicy: Never
      stdin: true
      tty: true
      env:
        - name: PROFILER_TARGET_PROCESS
          value: "SampleApp"
        - name: PROFILER_BLOB_CONTAINER
          value: "dumps"
        - name: PROFILER_DATA_PATH
          value: /profiler/data
      volumeMounts:
        - name: data
          mountPath: /profiler/data
        - name: tmp
          mountPath: /tmp

The important pieces there are:

shareProcessNamespace must be set to true; otherwise, the process namespace isn't shared across the containers within the pod.
You must mount /tmp in a shared volume. dotnet-trace / dotnet-counters / dotnet-dump all rely on it to connect to the target process.
stdin and tty must be set to true. If they are false, the dotnet-xxx tools will fail to start.

The rest of the options are optional; they are only meant to simplify the profiler container and make it more generic. In my example, I'm also mapping the /profiler/data volume so I can pass it in the -o argument to the tools to write the files. Once the trace session is over, it will upload to a blob storage for further analysis.

So yes, it is a somewhat involved process, but it works just fine.
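The three requirements listed above can be checked mechanically. The following is a minimal sketch, assuming the pod manifest has already been loaded into a plain Python dict (for example with a YAML parser of your choice); `check_sidecar_pod` and its `tools_container` parameter are hypothetical names. It encodes the requirements exactly as stated in this comment, even though earlier comments in this thread suggest that sharing the PID namespace may not be strictly required.

```python
from typing import Any, Dict, List


def check_sidecar_pod(pod: Dict[str, Any], tools_container: str) -> List[str]:
    """Return a list of problems; an empty list means all three requirements hold."""
    problems: List[str] = []
    spec = pod.get("spec", {})

    # Requirement 1: the process namespace is shared within the pod.
    if spec.get("shareProcessNamespace") is not True:
        problems.append("spec.shareProcessNamespace is not true")

    tmp_volumes = []
    for container in spec.get("containers", []):
        name = container.get("name")
        # Requirement 2: every container mounts a volume at /tmp.
        mounts = {
            m.get("name")
            for m in container.get("volumeMounts", [])
            if m.get("mountPath") == "/tmp"
        }
        if not mounts:
            problems.append(f"container {name!r} does not mount a volume at /tmp")
        tmp_volumes.append(mounts)

        # Requirement 3: the tools container has stdin and tty enabled.
        if name == tools_container:
            for flag in ("stdin", "tty"):
                if container.get(flag) is not True:
                    problems.append(f"container {name!r} does not set {flag}: true")

    # The /tmp mounts must refer to the same volume in all containers.
    if tmp_volumes and all(tmp_volumes) and not set.intersection(*tmp_volumes):
        problems.append("containers mount different volumes at /tmp")
    return problems
```

Called with the manifest above and `tools_container="profiler"`, the function is expected to return an empty list.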

I hope it helps!

galvesribeiro · 2020-06-01T12:56:02Z

You can of course use the injection hooks on the admission controller to add that profiler container spec based on a label, rather than hardcoding it in the pod/deployment definition. I just meant to give an example and show the requirements for getting it working.

glitch100 · 2020-06-01T13:08:21Z

Thanks a bunch - I was having trouble with docker-compose, and it looks like the stdin and tty were the bits I was missing. I will give this a go thanks

StupidScience · 2020-11-14T23:44:51Z

You can also try our kubectl plugin that was created for gathering trace/gcdump results from dotnet apps running in k8s

baal2000 · 2020-12-01T18:59:38Z

@noahfalk is this issue related to #1720?

josalem · 2020-12-01T19:21:30Z

Sort of. That issue is tracking shipping a container that comes pre-installed with our tools.

This issue is tracking an experience where you can configure your multi-container Pod to have the tools in one container and your app in another. We have added the necessary features to the tools/runtime for 5.0 to do this but haven't documented the functionality fully yet.

There is a PR open on the docs repo that documents the flags for the tools, but not the end-to-end experience: dotnet/docs#21666

MichaelSimons mentioned this issue Feb 7, 2020

Add dotnet trace tooling in base image? dotnet/dotnet-docker#1672

Closed

josalem added the "containers" label (related to running/installing/configuring Diagnostics in a container) Feb 8, 2020

josalem added this to the 5.0 milestone Feb 8, 2020

MichaelSimons mentioned this issue Apr 1, 2020

Why are dotnet-* tools not out of the box available in the docker images for .NET CORE dotnet/dotnet-docker#1737

Closed

tomasdeml mentioned this issue May 27, 2020

Running dotnet-dump from another pod #1163

Closed

noahfalk added the "enhancement" label (New feature or request) Nov 6, 2020

tommcdon modified the milestones: 5.0, 6.0 Dec 18, 2020

tommcdon modified the milestones: 6.0.0, 7.0.0 Jun 21, 2021

tommcdon modified the milestones: 7.0.0, 8.0.0 Sep 12, 2022

Jongy mentioned this issue Oct 21, 2022

dotnet-trace on containers #3480

Open
