Background Coverage Data Reporting #124
I have run into the same problem. For now I am running in production using a branch that supports caching via the MemoryCacheStore. Note that you need to enable the … I also want to note that as of yesterday, we started writing to Redis in a background thread that sits completely outside of the web request. After calling configure on Coverband, we kick off the thread:

```ruby
Thread.new do
  loop do
    ::Coverband::Collectors::Base.instance.save
    sleep 300
  end
end
```

Note this works only with the default configuration of the Ruby coverage collector, not trace. With coverage enabled, Ruby is actually collecting coverage information all of the time, so the reporting to Redis can be done periodically in a completely separate thread. If you go with this approach, make sure to either set the …

Hope that makes sense. |
Also added this PR for batching Redis operations: #126 |
What percentage are you reporting, @Kallin? I would check that you have a low report rate as one of the first things... After that, I think moving away from a get and post per file to a single get and post of the full hash might be a better way to make this change; we can avoid a good amount of complexity. I believe this issue popped up because Coverage lists all files in every report, whereas the previous tracepoint collector only listed files with changes, which is what @kbaum's memory cache adds back. |
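(For illustration, here is a hedged sketch of the "single get and post the full hash" idea. The `SingleKeyStore` class, the key name, and the JSON serialization are assumptions for the sketch, not Coverband's actual adapter.)

```ruby
require 'redis'
require 'json'

# Hypothetical store: coverage for all files lives under one Redis key,
# so a report costs one read and one write instead of a pair per file.
class SingleKeyStore
  COVERAGE_KEY = 'coverband_coverage'.freeze # assumed key name

  def initialize(redis = Redis.new)
    @redis = redis
  end

  # `report` is assumed Coverage-style: { "path.rb" => [nil, 1, 0, ...] }
  def save_report(report)
    # One GET for every file's existing counts...
    existing = JSON.parse(@redis.get(COVERAGE_KEY) || '{}')
    # ...merge the new hits in memory...
    report.each do |file, lines|
      merged = existing[file] || []
      lines.each_with_index do |hits, i|
        merged[i] = merged[i].to_i + hits if hits # nil lines stay nil
      end
      existing[file] = merged
    end
    # ...and one SET to write the whole hash back.
    @redis.set(COVERAGE_KEY, JSON.dump(existing))
  end
end
```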
See the comments here, @Kallin, as I think just enabling the memory cache might be good enough to significantly improve your performance. |
Memory cache would definitely improve performance. One problem is that it takes away the ability to count how many times a line was hit. Another problem is that it assumed full coverage was being sent as opposed to doubles. The current implementation filters out file coverage that matches the previous result for that file. This mostly works (I think), but it takes away line-count reliability and does not cache optimally. Once I realized the faults, I created this PR: This PR knowingly removes the line-counting ability but caches optimally, and it is the one my company is currently using. As @danmayer points out, though, some folks are using the line counting, so this is probably not good for a merge. |
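(A minimal sketch of the filtering behavior described above, under assumed names; this shows the shape of the idea, not the actual MemoryCacheStore.)

```ruby
# Hypothetical filter: remember the last coverage sent per file and drop
# any file whose coverage is unchanged before hitting Redis.
class MemoryCacheFilter
  def initialize
    @last_sent = {}
  end

  # Returns only the files whose coverage differs from the previous report.
  # Note the trade-off from the comment above: once unchanged files are
  # skipped, cumulative hit counts are no longer reliable.
  def filter(report)
    changed = report.reject { |file, lines| @last_sent[file] == lines }
    @last_sent.merge!(changed)
    changed
  end
end
```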
Yeah, I guess that kind of change would either need to be made optional or done as part of a major version release... I think I will just keep pushing on the 3.0 branch as opposed to trying to figure out a good caching or improved Redis structure. |
Benchmarks now allow for local or remote Redis... local:

remote (against Heroku Redis):

This shows the Redis round trip as a significantly larger factor if you aren't on local Redis... A Heroku app talking to Heroku Redis should be much faster than my local machine talking to Heroku Redis, but I think this still helps give a better idea of the perf wins from various Redis / memory-cache changes. Note that without the cache, local Redis runs at 0.676 iterations per second, while remote slows to 0.348. |
OK, I released Coverband version … |
Interesting benchmarks. I think we would see a much bigger delta for large Rails apps with thousands of files. One thought is that we should probably just remove the memory cache, since it might cause inaccurate data and at this point gives limited to no performance win now that we have reduced the number of calls to Redis. |
OK, I will remove the memory cache in my 3.0 branch, @kbaum. Yeah, this Rails app is very small. I will update the benchmark script, move it from the demo repo to the coverband repo, and try to make it easy for anyone to run the benchmark against any Coverband app. I am trying to think whether I could simplify the collection of data, like grabbing the Coverband version, the number of files in Coverage.peek_results, the number in Redis... or something like that. At the very least, I can get the basics down, and it would require some manual work by an operator to note specifics related to their configuration. |
In my local benchmarking, using a fairly large legacy Rails app, I found that the memory cache was still a big improvement, even with the Redis pipelining. I'm not sure we would be able to run comfortably in production without it, unless one of the background threading/forking solutions was in place as a replacement. |
@kbaum, thanks, I'll give it a test. |
Yeah, I think that while memory caching breaks the exact line counts (which perhaps should be optional) on the 2.x branch, it is a significant improvement for larger apps. That is shown by microbenchmarks; I just don't have a full benchmark against a large app that shows it. I think that along with the other changes in 3.0 it will not be needed at all, but I will ensure we have some solid benchmarks showing that 3.0 significantly reduces the overhead for large file sets. @Kallin / @kbaum: the change in #127 is in the 2.0.3.alpha release, so no need for master if you want to test that pre-release gem. I think that memory cache, in the current code base, could still help apps with large file sets. |
OK, the 2.0.3 release is live, so I am going to focus back in on 3.0 and see where we end up on thread/forking reporting. |
I just got the very first version of 3.0 to the point where I could run it in coverband_demo and get some of the tests / benchmarks working... I guess we will see how much a background thread or forked process matters; just by redoing the storage mechanism for Redis to avoid the N+1 issue, the store-large-Redis benchmarks appear to be nearly 60X faster, and that is without some of the other improvements that should be possible. The current estimate means it would add less than 25ms per reporting request (still controlled with a reporting frequency), even with the old middleware-style reporting.

2.0.3:

3.0.0 branch:

Lots of cleanup and various things are still broken, but I believe the change will be very significant. |
That's amazing and looks really promising! Could you link to the change to the redis storage that you think is responsible for such a boost? I'm curious to see it. |
Sure, @Kallin, the branch is here; it is still very messy and a work in progress. I wanted to prove some things out, so this isn't what I would call standard or clean refactoring ;) https://github.com/danmayer/coverband/tree/feature/coverband_3

The specific Redis change is below, but it doesn't (or wouldn't) work without some of the related changes to the collector and the reporting code. The big change is that there is no longer a good reason to work file by file, which made more sense with the tracepoint "sparse array" format and only a few files being tracked. The way Coverage formats its results, they are never actually a sparse array; it produces a hash that represents every line, and it includes coverage for all Ruby files, Rails and gems included, so there is much more overhead. One option would be to do a lot more filtering, but I figured it is only 200-400k, so let's just do this as 3 steps: …

New Redis adapter: https://github.com/danmayer/coverband/blob/feature/coverband_3/lib/coverband/adapters/redis_store.rb

This results in a much simpler design and has allowed me to remove a bunch of conversions back and forth from line-hit arrays. Let me know how it is looking, @Kallin / @kbaum. I will need some more time to sort out some known issues like full path vs. relative path, etc. |
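(To make the format contrast above concrete, here is an illustration with made-up values; these literals are not Coverband's actual internal structures.)

```ruby
# The old TracePoint collector produced a sparse hash of only the lines
# that fired, and only for changed files:
tracepoint_style = { 'app/models/user.rb' => { 14 => 3, 22 => 1 } }

# Ruby's Coverage API instead reports every file it has seen, with an
# entry for every line (nil = not executable, integer = hit count):
coverage_style = {
  'app/models/user.rb' => [1, nil, 3, 0, nil, 1]
}

# With thousands of files reported on every pass, per-file Redis round
# trips multiply, which is why the 3.0 adapter moves to whole-report
# reads and writes.
```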
Changed the title from "redis performance" to "Background Coverage Data Reporting". |
As we are about to get Coverband 3.0.0 out, we think we can possibly add this as an option in the 3.0.x line. I will look at implementing the threaded option sometime soon and review the related suggested PRs. |
Cool, assigned over to you, @kbaum. Thanks for taking a look at this... |
Realizing this can be pretty tricky to do automatically within Puma or Unicorn, as the thread has to run within the forked workers. |
Hmmm, good point... It especially breaks the idea of just configuring and starting a different integration... What if the integration always required the middleware, but, based on config settings, all the middleware does is check whether the currently accessible thread has a Coverband background thread started; if it doesn't, it starts one, and after that the middleware basically no-ops. Otherwise we would need to figure out a fairly generic way to hook into all the various servers. |
Figured out how to hook into Puma: 470b643#diff-a637af9972be5b34783d41387d25e0bfL45. Unicorn seems more difficult. I like your idea about the middleware starting the thread, but I'm blanking on how I would check whether a Coverband background thread has been started. I also started looking into the newrelic/rpm library to see how they start their agent in the worker. There are some comments and logic (https://github.com/newrelic/rpm/blob/master/lib/new_relic/agent/agent.rb#L436) referencing delaying the start of the agent until the workers fork, but I ultimately couldn't figure out how they do it. |
Thinking that when starting the background thread we set a thread-local variable... The middleware could check the presence or absence of that: if it exists, do nothing; otherwise start the background thread runner. |
I really like that idea. I think we would use some kind of global process state as opposed to a thread local as there should only be one background thread per process. |
Looks much better and should work with Unicorn and Puma; so far tested with Puma. https://github.com/danmayer/coverband/compare/background_thread_reporting I think we can follow this approach if we detect Puma, Unicorn, or any other forking framework within the current stack trace; otherwise we can just start the background thread within Coverband.start. New Relic and Sqreen seem to follow this approach. I like it because we would still work out of the box with other libraries like sidekiq, delayed_job, thin, etc. Basically, we either start the thread immediately or wait until the first request. |
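(A minimal sketch of the strategy described above, with assumed names; `BackgroundReporter` is hypothetical, not Coverband's real API. The save call and 300-second interval come from the thread snippet earlier in this thread.)

```ruby
# Hypothetical module illustrating "start now, or defer to first request".
module BackgroundReporter
  @started = false

  # Global per-process flag (one background thread per process, as noted
  # above); a Mutex would make this fully race-safe.
  def self.start
    return if @started
    @started = true
    Thread.new do
      loop do
        ::Coverband::Collectors::Base.instance.save
        sleep 300
      end
    end
  end

  # Heuristic from the discussion: look for a forking server in the stack.
  def self.forking_server?
    caller.any? { |line| line =~ /puma|unicorn/ }
  end
end

# At boot: start immediately for thin / sidekiq / delayed_job style
# processes. For forking servers, the Rack middleware instead calls
# BackgroundReporter.start on the first request inside the forked worker,
# after which it no-ops.
BackgroundReporter.start unless BackgroundReporter.forking_server?
```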
Nice, looks good so far, @kbaum, thanks. Some thoughts: … |
Agreed on all points. I will move it to a separate class under the integrations module. Re: testing, I was going to give that a shot now, starting with unit tests. It might be tough to do an integration test with threads and infinite loops involved, but we might be able to come up with something. |
We're trying out Coverband in production, and we're finding that the Redis calls are taking a majority of the time of some of our web requests when looking at New Relic.

Digging into the redis_store, it seems that, for every file, we first get all of the existing values using `existing = redis.hgetall(key)`, then, after incrementing the line counts, we use `redis.mapped_hmset(key, values)` to update Redis. In our application this results in hundreds of gets and hundreds of sets to Redis in the middleware. I have a couple of ideas, and I'm curious what the devs here think:
Fork this call: https://github.com/danmayer/coverband/blob/master/lib/coverband/middleware.rb#L14. The web request doesn't need to wait for anything from this report generation, so why not let it run in another process?
Pipeline the calls instead of doing things one file at a time. In our case, that would mean 2 redis commands instead of ~300 on some requests.
i.e., here: https://github.com/danmayer/coverband/blob/master/lib/coverband/adapters/redis_store.rb#L26, we iterate through every file in the report, and for each we get the existing values, then increment and update here: https://github.com/danmayer/coverband/blob/master/lib/coverband/adapters/redis_store.rb#L53. It looks like Redis supports pipelining, https://redis.io/topics/pipelining, which would allow all these requests to be grouped together. I've tried this and it seems to work:
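(The snippet originally posted here isn't preserved in this transcript; below is a hedged sketch of the pipelining idea using redis-rb's block form. The `coverband.` key prefix and the shape of `report` are assumptions for illustration.)

```ruby
require 'redis'

redis = Redis.new
# `report` is assumed to be { "file/path.rb" => { line_number => hits } },
# mirroring the per-file hashes the store already writes with mapped_hmset.

# One round trip for all the reads; replies come back in command order.
existing = redis.pipelined do |pipeline|
  report.keys.each { |file| pipeline.hgetall("coverband.#{file}") }
end

# ...increment the counts in memory against `existing`, then one round
# trip for all the writes instead of one HMSET per file.
redis.pipelined do |pipeline|
  report.each do |file, values|
    pipeline.mapped_hmset("coverband.#{file}", values)
  end
end
```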
(Extra credit) I'm not sure if Redis supports this, but perhaps the update could be done with one command? Could you tell Redis: "here are all the keys (files) and their values (arrays of line-number counts); if you find existing values for those keys, increment them by the counts given"?
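(On the "extra credit" question: Redis doesn't take the whole mapping in one command, but it does have HINCRBY, which increments a single hash field server-side, treating a missing field as 0, so the read-modify-write cycle can move into Redis entirely. Pipelined, it is still one round trip per report. A sketch under the same assumed key scheme and `report` shape as above:)

```ruby
redis.pipelined do |pipeline|
  report.each do |file, lines|
    lines.each do |line_number, hits|
      # Skip lines with no new hits; HINCRBY creates missing fields at 0.
      pipeline.hincrby("coverband.#{file}", line_number, hits) if hits && hits > 0
    end
  end
end
```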