
Background Coverage Data Reporting #124

Closed · Kallin opened this issue Sep 28, 2018 · 31 comments · Fixed by #138
@Kallin (Contributor) commented Sep 28, 2018

We're trying out Coverband in production, and New Relic shows that the Redis calls account for the majority of the time spent in some of our web requests.

Digging into the redis_store, it seems that, for every file, we first fetch all of the existing values with existing = redis.hgetall(key), then, after incrementing the line counts, we write back with redis.mapped_hmset(key, values).

In our application this results in hundreds of gets and hundreds of sets to Redis in the middleware. I have a couple of ideas and I'm curious what the devs here think:

  1. Fork this call: https://github.com/danmayer/coverband/blob/master/lib/coverband/middleware.rb#L14. The web request doesn't need to wait for anything from this report generation, so why not let it run in another process?

  2. Pipeline the calls instead of doing things one file at a time. In our case, that would mean 2 Redis commands instead of ~300 on some requests.

i.e., here: https://github.com/danmayer/coverband/blob/master/lib/coverband/adapters/redis_store.rb#L26, we iterate through every file in the report, and for each one we fetch the existing values, then increment and update here: https://github.com/danmayer/coverband/blob/master/lib/coverband/adapters/redis_store.rb#L53. Redis supports pipelining (https://redis.io/topics/pipelining), which would allow all of these requests to be grouped together. I've tried this and it seems to work:

def save_report(report)
  store_array(base_key, report.keys)

  keys = report.keys.map { |file| "#{base_key}.#{file}" }

  # one round trip to fetch every file's existing line counts
  existing_records = redis.pipelined do
    keys.each do |key|
      redis.hgetall(key)
    end
  end

  combined_report = {}
  keys.each_with_index do |key, i|
    combined_report[key] = {
      new: report.values[i],
      existing: existing_records[i]
    }
  end

  # one round trip to write every file's merged line counts
  redis.pipelined do
    combined_report.each do |file, values|
      existing = values[:existing]
      new_values = values[:new]
      unless new_values.empty?
        # in redis all keys are strings
        new_string_values = Hash[new_values.map { |k, val| [k.to_s, val] }]
        new_string_values.merge!(existing) { |_k, old_v, new_v| old_v.to_i + new_v.to_i }
        redis.mapped_hmset(file, new_string_values)
        redis.expire(file, @ttl) if @ttl
      end
    end
  end

  # previous per-file implementation:
  # report.each do |file, lines|
  #   store_map("#{base_key}.#{file}", lines)
  # end
end

(Extra credit) I'm not sure if Redis supports this, but perhaps the update could be done with one command? Could you tell Redis "here are all the keys (files) and their values (arrays of line# counts), and if you find existing values for those keys, increment by the count given"?
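
For what it's worth, Redis does support server-side per-field increments via HINCRBY (https://redis.io/commands/hincrby), which removes the read-modify-write cycle entirely when combined with pipelining. A rough sketch, assuming the same report shape as above (file => {line => hit count}):

redis.pipelined do
  report.each do |file, lines|
    key = "#{base_key}.#{file}"
    lines.each do |line_number, hits|
      # HINCRBY initializes a missing field to 0 before incrementing,
      # so brand-new lines and previously stored lines are handled identically
      redis.hincrby(key, line_number.to_s, hits.to_i)
    end
    redis.expire(key, @ttl) if @ttl
  end
end

It's still one command per line rather than one per request, but pipelined HINCRBY needs no HGETALL at all.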

@kbaum (Collaborator) commented Sep 28, 2018

I have run into the same problem. For now I am running in production on a branch that supports caching via the MemoryCacheStore.

#119

Note that you need to enable the config.memory_caching attribute.

I also want to note that, as of yesterday, we started writing to Redis in a background thread that sits completely outside of the web request. After calling configure on Coverband, we kick off the thread:

Thread.new do
  loop do
    # flush the collected coverage to the store
    ::Coverband::Collectors::Base.instance.save
    # report every 5 minutes
    sleep 300
  end
end

Note this works only with the default configuration, the Ruby coverage collector, and not with trace. With this enabled, Ruby is collecting coverage information all of the time, and the reporting to Redis can be done periodically in a completely separate thread. If you go with this approach, make sure to either set config.percentage to 0 or remove the Coverband::Middleware.
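
In configuration terms, the first option is just this (a sketch; config.percentage is the attribute mentioned above):

Coverband.configure do |config|
  # 0% means the middleware never reports; the background thread owns all saves
  config.percentage = 0
end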

Hope that makes sense.

@Kallin (Contributor, Author) commented Oct 1, 2018

Thanks, @kbaum.
I opened this PR to support forked reporting: #125

@Kallin (Contributor, Author) commented Oct 1, 2018

I also added a PR for batching Redis operations: #126

@danmayer (Owner) commented Oct 6, 2018

What reporting percentage are you using, @Kallin? I would check that you have a low report rate as one of the first things...

After that, I think moving away from a get and a set per file to a single get and a single set of the full hash might be a better way to make this change; we could avoid a good amount of complexity. I believe this issue popped up because Coverage lists all files in every report, while the tracepoint collector previously listed only the files with changes, which is what @kbaum's memory cache adds back.

@danmayer (Owner) commented Oct 7, 2018

See the comments here, @Kallin, as I think just enabling the memory cache might be good enough to significantly improve your performance.

#102 (comment)

@kbaum (Collaborator) commented Oct 7, 2018

The memory cache would definitely improve performance. One problem is that it takes away the ability to count how many times a line was hit. Another problem is that it assumed full coverage was being sent as opposed to doubles. The current implementation filters out file coverage that matches the previous result for that file. This mostly works (I think) but takes away line-count reliability and does not cache optimally. Once I realized these faults, I created this PR:

#119

That PR knowingly removes the line-counting ability but caches optimally, and it is the one my company is currently using. As @danmayer points out though, some folks are using the line counting, so it is probably not a good candidate for merging.

@danmayer (Owner) commented Oct 9, 2018

Yeah, I guess that kind of change would either need to be made optional or done as part of a major version release... I think I will just keep pushing on the 3.0 branch as opposed to trying to figure out a good caching layer or an improved Redis structure.

@danmayer (Owner)

Benchmarks now allow for local or remote Redis...

local:

Warming up --------------------------------------
 store_redis_reports     1.000  i/100ms
 store_cache_reports     1.000  i/100ms
Calculating -------------------------------------
 store_redis_reports      0.676  (± 0.0%) i/s -     11.000  in  16.447342s
 store_cache_reports      5.519  (±18.1%) i/s -     77.000  in  15.045604s

Comparison:
 store_cache_reports:        5.5 i/s
 store_redis_reports:        0.7 i/s - 8.16x  slower

remote (against Heroku Redis):

REDIS_TEST_URL=redis://xyz.com:35439 rake benchmarks:cache_reporting
runs benchmarks on reporting large sets of files to redis using memory cache
Warming up --------------------------------------
 store_redis_reports     1.000  i/100ms
 store_cache_reports     1.000  i/100ms
Calculating -------------------------------------
 store_redis_reports      0.348  (± 0.0%) i/s -      6.000  in  17.228485s
 store_cache_reports      1.908  (± 0.0%) i/s -     27.000  in  15.155553s

Comparison:
 store_cache_reports:        1.9 i/s
 store_redis_reports:        0.3 i/s - 5.48x  slower

This shows the Redis round trip as a significantly larger factor if you aren't on local Redis... A Heroku app talking to Heroku Redis should be much faster than my local machine talking to Heroku Redis, but I think this still helps give a better idea of the perf wins from various Redis / memory-cache changes.

Note that without the cache, local Redis runs at 0.676 iterations per second, while remote slows to 0.348.

@danmayer (Owner)

OK, I released Coverband version 2.0.3.alpha... I added an AB-test benchmark against coverband-demo and ran it against the app deployed on Heroku... This shows a significant improvement in perf. @Kallin / @kbaum, feel free to give the alpha a shot. I am going to push it to our staging and then production servers, and as long as everything looks good I will make the non-alpha release tomorrow.

Against https://coverband-demo.herokuapp.com/

2.0.2

Time per request:       109.566 [ms] (mean, across all concurrent requests)
Percentage of the requests served within a certain time (ms)
  50%    504
  66%    550
  75%    613
  80%    629
  90%    716
  95%    856
  98%   1004
  99%   1047
 100%   1141 (longest request)

2.0.3

Time per request:       63.451 [ms] (mean, across all concurrent requests)
Percentage of the requests served within a certain time (ms)
  50%    306
  66%    321
  75%    338
  80%    344
  90%    368
  95%    388
  98%    426
  99%    448
 100%    473 (longest request)

2.0.3 with memory cache on (note: I think my benchmark doesn't really exercise this feature)

Time per request:       71.091 [ms] (mean, across all concurrent requests)
Percentage of the requests served within a certain time (ms)
  50%    328
  66%    349
  75%    373
  80%    393
  90%    460
  95%    515
  98%    710
  99%    725
 100%    825 (longest request)

@kbaum (Collaborator) commented Oct 10, 2018

Interesting benchmarks. I think we would see a much bigger delta for large Rails apps with thousands of files. One thought: we should probably just remove the memory cache, since it might cause inaccurate data and at this point gives limited to no performance win now that we have reduced the number of calls to Redis.

@danmayer (Owner)

OK, I will remove the memory cache in my 3.0 branch, @kbaum. Yeah, this Rails app is very small. I will update the benchmark script, move it from the demo repo to the coverband repo, and try to make it easy for anyone to run the benchmark against any Coverband app. I'm trying to think whether I could simplify the collection of data, like grabbing the Coverband version, the number of files in Coverage.peek_results, the number in Redis... or something like that... At the very least, I can get the basics down, and it would require some manual work by an operator to note specifics related to their configuration.

@Kallin (Contributor, Author) commented Oct 10, 2018

In my local benchmarking, using a fairly large legacy Rails app, I found that the memory cache was still a big improvement, even with the Redis pipelining. I'm not sure we would be able to run comfortably in production without it, unless one of the background threading/forking solutions was in place as a replacement.

@kbaum (Collaborator) commented Oct 10, 2018

@Kallin - try master without memory caching, as I think this change will be a big improvement:

#127

@Kallin (Contributor, Author) commented Oct 10, 2018

@kbaum, thanks, I'll give it a test.

@danmayer (Owner)

Yeah, I think that while memory caching breaks the exact line counts on the 2.x branch (which perhaps should be optional), it is a significant improvement for larger apps. That is shown by the microbenchmarks; I just don't have a full benchmark against a large app that demonstrates it. I think that along with the other changes in 3.0 it will not be needed at all, but I will ensure we have some solid benchmarks showing that 3.0 significantly reduces the overhead for large file sets.

@Kallin / @kbaum: the change in #127 is in the 2.0.3.alpha release, so there's no need for master if you want to test that pre-release gem. I think the memory cache, in the current code, could still help apps with large file sets.

@danmayer (Owner)

OK, the 2.0.3 release is live, so I am going to focus back in on 3.0 and see where we end up on threaded/forked reporting.

@danmayer (Owner)

OK, wow, @Kallin @kbaum

I just got the very first version of 3.0 to the point where I could run it in coverband_demo and get some of the tests / benchmarks working... I guess we will see how much a background thread or forked process matters: just by redoing the Redis storage mechanism to avoid the N+1 issue, the large-report store benchmarks appear to be nearly 60x faster, and that is without some of the other improvements that should still be possible. My current estimate is that it would add less than 25 ms per reporting request (still controlled with a reporting frequency), even with the old middleware-style reporting.

2.0.3

rake benchmarks:redis_reporting
runs benchmarks on reporting large sets of files to redis
Warming up --------------------------------------
       store_reports     1.000  i/100ms
Calculating -------------------------------------
       store_reports      0.814  (± 0.0%) i/s -     13.000  in  15.982911s

3.0.0 branch

rake benchmarks:redis_reporting
runs benchmarks on reporting large sets of files to redis
Warming up --------------------------------------
       store_reports     4.000  i/100ms
Calculating -------------------------------------
       store_reports     47.111  (± 4.2%) i/s -    708.000  in  15.066259s

There's lots of cleanup to do and various things are still broken, but I believe the change will be very significant.

@Kallin (Contributor, Author) commented Oct 11, 2018

That's amazing and looks really promising! Could you link to the change to the Redis storage that you think is responsible for such a boost? I'm curious to see it.

@danmayer (Owner) commented Oct 12, 2018

Sure, @Kallin, the branch is here; it is still very messy and a work in progress. I wanted to prove some things out, so this isn't what I would call standard or clean refactoring ;)

https://github.com/danmayer/coverband/tree/feature/coverband_3

The specific Redis change is here, but it doesn't (or wouldn't) work without some of the related changes to the collector and the reporting code. The big change is that there is no longer a good reason to work file by file, which made more sense with the tracepoint "sparse array" format and only a few files per report. Now, since the Coverage format is never actually a sparse array but a hash that represents every line, and since Coverage includes all Ruby file coverage, including Rails, gems, and so on, the per-file approach has much more overhead. One option would be to do a lot more filtering, but I figured it is only 200-400k, so let's just do this as 3 steps:

  • get stored coverage
  • merge current and stored coverage
  • store the merged coverage

new Redis adapter: https://github.com/danmayer/coverband/blob/feature/coverband_3/lib/coverband/adapters/redis_store.rb
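
In rough form, those three steps amount to something like the following (a minimal sketch with a hypothetical key name and merge helper, not the actual adapter code; report maps each file to a hash of line => hit count, with line keys stored as strings in JSON):

require 'json'

def save_report(report)
  stored = JSON.parse(redis.get(coverage_key) || '{}') # 1. get stored coverage
  merged = merge_reports(stored, report)               # 2. merge current and stored
  redis.set(coverage_key, JSON.dump(merged))           # 3. store the merged coverage
end

def merge_reports(stored, current)
  current.each_with_object(stored.dup) do |(file, lines), merged|
    # union of lines per file, summing hit counts where both sides have data
    merged[file] = (merged[file] || {}).merge(lines) do |_line, old_hits, new_hits|
      old_hits.to_i + new_hits.to_i
    end
  end
end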

This results in a much simpler design and has allowed me to remove a bunch of conversions back and forth between line-hit arrays ([0, nil, 1, 2]) and hashes ({'0' => 0, '2' => 1, '3' => 2}), which was the format things came in from tracepoint due to it being a "sparse array"...

Let me know how it is looking, @Kallin / @kbaum. I will need some more time to sort out some known issues, like full vs. relative paths, etc.

@danmayer (Owner)

OK, all tests are passing and there is full functional support on the 3.0 branch, @Kallin @kbaum:

#129

A bit more cleanup and work to do, but it could be getting close enough for an alpha release soon.

danmayer changed the title from "redis performance" to "Background Coverage Data Reporting" on Nov 3, 2018
@danmayer (Owner) commented Nov 3, 2018

changed the title from "redis performance"

@danmayer (Owner) commented Nov 3, 2018

As we are about to get Coverband 3.0.0 out, we think we can possibly add this as an option in the 3.0.x line. I will look at implementing the threaded option sometime soon and review the related suggested PRs.

danmayer assigned danmayer and kbaum and unassigned danmayer on Nov 9, 2018
@danmayer (Owner) commented Nov 9, 2018

Cool, assigned over to you, @kbaum. Thanks for taking a look at this...

@kbaum (Collaborator) commented Nov 11, 2018

I'm realizing this can be pretty tricky to do automatically within Puma or Unicorn, as the thread has to run within the forked workers.

@danmayer (Owner)

Hmmm, good point... It especially breaks the idea of just configuring and starting a different integration...

What if the integration always required the middleware, but, based on config settings, all the middleware does is check whether there is a Coverband background thread started for the currently accessible thread... if there isn't, it starts one... after that, the middleware basically no-ops.
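
As a sketch (all names hypothetical), that middleware could be as small as:

# first request in each process starts the reporter; every later call no-ops
class BackgroundReportingMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    Coverband.start_background_reporting unless Coverband.background_reporting_running?
    @app.call(env)
  end
end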

Either that, or we would need to figure out a fairly generic way to hook into the various servers' after_fork methods.
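
For reference, the per-server hooks would look roughly like this (start_background_reporting is the same hypothetical helper as in the middleware sketch above):

# config/puma.rb -- Puma runs this block in each worker after forking
on_worker_boot do
  Coverband.start_background_reporting
end

# config/unicorn.rb -- Unicorn's equivalent per-worker hook
after_fork do |_server, _worker|
  Coverband.start_background_reporting
end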

@kbaum (Collaborator) commented Nov 11, 2018

I figured out how to hook into Puma:
470b643#diff-a637af9972be5b34783d41387d25e0bfL45
Unicorn seems more difficult.

I like your idea about the middleware starting the thread, but I'm blanking on how I would check whether a Coverband background thread has been started.

I also started looking into the newrelic/rpm library to see how they start their agent in the worker. There are some comments and logic referencing delaying the start of the agent until the workers fork, but I ultimately couldn't figure out how they do it.

@danmayer (Owner) commented Nov 11, 2018 via email

@kbaum (Collaborator) commented Nov 11, 2018

I really like that idea. I think we would use some kind of global process-level state as opposed to a thread-local, as there should be only one background thread per process.
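
A sketch of what that process-level guard could look like (hypothetical module; the save call is from the thread example earlier in this thread):

module Coverband
  module BackgroundReporter
    def self.start(interval: 300)
      # remembering the PID (rather than using a thread-local) means a forked
      # child, which gets a new PID, starts its own reporter, while repeat
      # calls within the same process no-op
      return if @pid == Process.pid
      @pid = Process.pid
      Thread.new do
        loop do
          ::Coverband::Collectors::Base.instance.save
          sleep interval
        end
      end
    end
  end
end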

@kbaum (Collaborator) commented Nov 11, 2018

This looks much better and should work with both Unicorn and Puma. So far it's tested with Puma.

https://github.com/danmayer/coverband/compare/background_thread_reporting

I think we can follow this approach if we detect Puma, Unicorn, or any other forking framework within the current stacktrace. Otherwise, we can just start the background thread within Coverband.start. New Relic and Sqreen seem to follow this approach. I like it because we would still work out of the box with other libraries like sidekiq, delayed_job, thin, etc. Basically, we either start the thread immediately or wait until the first request.
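
As a sketch, that start-up decision might look like this (detection and names hypothetical, reusing the BackgroundReporter sketch from above):

def start_background_reporting_or_defer
  if caller.any? { |frame| frame =~ /puma|unicorn/ }
    # forking server detected: defer; the first request in each worker starts the thread
    @defer_background_reporting = true
  else
    # sidekiq, delayed_job, thin, etc.: safe to start immediately
    Coverband::BackgroundReporter.start
  end
end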

@danmayer (Owner)

Nice, looks good so far, @kbaum, thanks.

Some thoughts:

  • really like that the first thread running kicks things off at the process level
  • good call on a process-level var vs. a thread-local
  • especially while first rolling this out, before we have railties integration, I would make it a configurable option
    * I do think with railties we can detect forking servers, but for now this would let folks configure and try it out
  • would still like to see it in /integrations/ vs. the root Coverband file directly
  • we should figure out a way to test this; it does seem a bit tricky to test, any ideas?

@kbaum (Collaborator) commented Nov 11, 2018

Agreed on all points. I will move it to a separate class under the integrations module. Re: testing, I was going to give that a shot now, first starting with unit testing. It might be tough to do an integration test with threads and infinite loops involved, but we might be able to come up with something.
