tracing: reduce disabled span `Drop` overhead (#1974)
## Motivation

Disabled spans introduce a non-trivial amount of overhead, even when no
`tracing` subscriber is in use. This is primarily due to the need to
create and then drop the empty `Span` struct even when the span is
disabled. While thinking about #1970 a bit, I noticed that one source of
overhead is that dropping a disabled span always causes a function call,
even when the span is empty. This could be avoided.

## Solution

In this branch, I've changed the `Drop` impls for `Span`, `Entered`, and
`EnteredSpan` to be `#[inline(always)]`. In the always-inlined
functions, we perform a check for whether or not the span is empty, and
if it is not empty, we call into the dispatcher method to drop the span
or guard. The dispatcher methods are no longer inlined. Now, the
function call only occurs when the span _is_ enabled, rather than always
occurring in the `Drop` call. This significantly reduces the overhead
for holding a disabled span, or a disabled `Entered` guard, in a scope.

Also, the `log` integration when dropping a span would always check if
the span had metadata, even when `log` is disabled. This means we would
do an extra branch that wasn't necessary. I moved that into the macro
that guards for whether or not the `log` crate is enabled, which also
significantly reduces overhead.

This change reduces the overhead of a disabled span by 50-70%, per the
`no_subscriber.rs` benchmarks. I also improved those benchmarks a bit to
test more cases, in order to find the precise difference in overhead
between just constructing `Span::none()` and the actual `span!` macros.

<details>
<summary><code>no_subscriber.rs</code> benchmark results</summary>

```
no_subscriber/span      time:   [696.37 ps 696.53 ps 696.73 ps]
                        change: [-50.599% -50.577% -50.544%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe
no_subscriber/span_enter
                        time:   [465.58 ps 466.35 ps 467.61 ps]
                        change: [-71.350% -71.244% -71.138%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe
no_subscriber/empty_span
                        time:   [226.15 ps 226.73 ps 227.36 ps]
                        change: [-84.404% -84.045% -83.663%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe
no_subscriber/empty_struct
                        time:   [693.32 ps 693.76 ps 694.30 ps]
                        change: [+1.7164% +1.9701% +2.2540%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe
no_subscriber/event     time:   [294.32 ps 301.68 ps 310.85 ps]
                        change: [+0.3073% +2.1111% +4.1919%] (p = 0.03 < 0.05)
                        Change within noise threshold.
Found 16 outliers among 100 measurements (16.00%)
  2 (2.00%) high mild
  14 (14.00%) high severe
no_subscriber/relaxed_load
                        time:   [463.24 ps 463.74 ps 464.33 ps]
                        change: [+1.4046% +1.6735% +1.9366%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
  1 (1.00%) low severe
  6 (6.00%) high mild
  9 (9.00%) high severe
no_subscriber/acquire_load
                        time:   [465.28 ps 465.68 ps 466.08 ps]
                        change: [+0.6837% +1.1755% +1.6034%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe
no_subscriber/log       time:   [231.11 ps 231.27 ps 231.45 ps]
                        change: [-4.4700% -2.3810% -0.9164%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 17 outliers among 100 measurements (17.00%)
  3 (3.00%) low mild
  8 (8.00%) high mild
  6 (6.00%) high severe
no_subscriber_field/span
                        time:   [1.6334 ns 1.6343 ns 1.6354 ns]
                        change: [-12.401% -12.337% -12.279%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
no_subscriber_field/event
                        time:   [461.54 ps 461.84 ps 462.14 ps]
                        change: [-0.3654% +0.1235% +0.5557%] (p = 0.62 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
no_subscriber_field/log time:   [463.52 ps 463.98 ps 464.49 ps]
                        change: [+0.3011% +0.8645% +1.6355%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 18 outliers among 100 measurements (18.00%)
  4 (4.00%) low mild
  10 (10.00%) high mild
  4 (4.00%) high severe
```

</details>

Signed-off-by: Eliza Weisman <[email protected]>
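
For readers unfamiliar with the pattern described under "Solution", here is a minimal, self-contained sketch of it. It is not the code from this commit: `SketchSpan`, `notify_dispatcher_of_drop`, and the `SLOW_PATH_CALLS` counter are hypothetical stand-ins, and `#[inline(never)]` on the slow path is an assumption made to keep the effect visible (the commit only says the dispatcher methods are no longer inlined).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter standing in for the dispatcher, so that slow-path
// calls can be observed.
static SLOW_PATH_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a span: `inner` is `None` when the span is
/// disabled, and holds an ID when it is enabled.
pub struct SketchSpan {
    inner: Option<u64>,
}

impl SketchSpan {
    /// A disabled (empty) span.
    pub fn none() -> Self {
        SketchSpan { inner: None }
    }

    /// An enabled span with the given ID.
    pub fn enabled(id: u64) -> Self {
        SketchSpan { inner: Some(id) }
    }
}

// Out-of-line slow path. It is only reached when the span is enabled, so a
// disabled span never pays for this function call.
#[inline(never)]
fn notify_dispatcher_of_drop(_id: u64) {
    SLOW_PATH_CALLS.fetch_add(1, Ordering::Relaxed);
}

impl Drop for SketchSpan {
    // Always inlined: at the drop site, a disabled span costs one cheap
    // emptiness check and nothing else.
    #[inline(always)]
    fn drop(&mut self) {
        if let Some(id) = self.inner.take() {
            notify_dispatcher_of_drop(id);
        }
    }
}

/// Number of times the slow path has run so far.
pub fn slow_path_calls() -> usize {
    SLOW_PATH_CALLS.load(Ordering::Relaxed)
}
```

Dropping `SketchSpan::none()` leaves the counter unchanged, while dropping `SketchSpan::enabled(1)` increments it once. The design choice is to keep the inlined part as small as possible (a single branch) and push everything else behind a real function call.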