Normalize memory cache values #119
Conversation
We only want to report to Coverband if a new line is covered.
Initializing the cache with what is in redis does not make sense. Suppose the file already has coverage in redis; on the first request the file has no coverage in this process, and we will then pass this request to the underlying store. This was resulting in the memory cache incorrectly caching coverage.
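A minimal sketch of the idea (class and method names here are illustrative, not the exact Coverband internals): hit counts are normalized to 0/1 before comparing against the in-process cache, so a request that only re-hits already-covered lines is never forwarded.

```ruby
class MemoryCacheStore
  def initialize(store)
    @store = store # the underlying store, e.g. a Redis-backed adapter
    @cache = {}    # file => last normalized coverage reported by THIS process
  end

  # files is a Coverage-style hash: { "path.rb" => [nil, 0, 3, ...] }
  def save_report(files)
    changed = files.each_with_object({}) do |(file, lines), out|
      # nil = non-executable line, 0 = never run, 1 = run at least once
      normalized = lines.map { |hits| hits && (hits > 0 ? 1 : 0) }
      next if @cache[file] == normalized # nothing newly covered in this file
      @cache[file] = normalized
      out[file] = normalized
    end
    @store.save_report(changed) unless changed.empty?
  end
end
```

Note that the cache starts empty rather than being seeded from redis, so the very first report from a process always passes through to the underlying store.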
I guess we need to decide on the goal: binary (was this line run or not) or an accurate count of how many times a line is hit. There have been requests and PRs asking for Coverband to support line usage counts vs. just hit or not hit... If you just send the peek_results directly every time the data is reported, the reported count would include not just the hits since the last report but all the past times the line was hit... This was actually an issue when I first moved to
Hopefully, that highlights the issue I see... but perhaps I am not following you correctly.
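A small illustration of the cumulative-count behavior described above, using Ruby's built-in Coverage module with made-up file names and counts (the actual diffing in Coverband's collector is more involved, but this is the gist):

```ruby
require 'coverage'
Coverage.start

# ...application code runs for a while...
first = Coverage.peek_result   # e.g. { "app.rb" => [1, 3, nil] }

# ...later requests hit the same lines again...
second = Coverage.peek_result  # e.g. { "app.rb" => [2, 6, nil] }

# peek_result counts are cumulative for the life of the process, so
# reporting them raw re-counts every past hit. To report per-interval
# counts, the collector has to diff against the previous snapshot:
delta = second.map do |file, lines|
  prev = first[file] || []
  [file, lines.each_with_index.map { |hits, i| hits && hits - (prev[i] || 0) }]
end.to_h
# => { "app.rb" => [1, 3, nil] } for the second interval
```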
Let me know your thoughts @kbaum
@danmayer - Excellent explanation of why the coverage collector calculates differences. In our case, we only care whether a line was hit or not. This approach has the potential for major performance improvements, since we only hit redis when new code paths have been covered. I think it makes sense to offer this as an option, since my bet is that others are also less concerned with tracking how many times a line was hit. I wonder what the use case is for tracking the exact number of times a line was hit? Is it worth the performance cost? Another thought: maybe this all becomes a moot point if we move toward reporting coverage to redis within a background thread?
Yeah, I think as we work on the background thread and drop some other complexity we should be able to remove most of the perf hit. We could make it an optional feature, but given that one of the goals is to simplify install / usage, let's hold off on making it optional unless there is a good perf or other reason.
In terms of why counts are valuable, I have used them to find hot spots in the code, e.g. middleware that is hit on every request but doesn't need to be. Same for before filters and around filters. Generally, finding code that is executed on nearly every request is a good place to start for optimization. The other thing I haven't done but have always considered useful would be to compare the most-hit production code against test coverage: if something is on the critical path in production, it could justify additional testing.
I agree that hit-or-not-hit is the most valuable signal, but the overall usage ratio, from very often to barely at all, can be helpful. Maintaining code that is hit once a month might at some point be more of a liability than an asset, depending on the use case.
This PR does two things:
First, the memory cache now filters out files that have no new coverage. Previously the memory cache was not filtering out files that were hitting the same lines over and over, since the hit count for each line kept being incremented and so never matched the cached value.
Second, this PR removes the difference calculation in the coverage collector. With the memory cache enabled, I'm not sure why we would need anything calculating the difference in coverage; I think we can just send the latest coverage from Ruby's ::Coverage.peek_result to the underlying store (see the sketch below). If we go with this approach, perhaps we should always have the memory cache enabled?
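To make that concrete, a hedged sketch of the flow being proposed (the method name is illustrative, and MemoryCacheStore refers to the normalizing cache sketched earlier in this thread):

```ruby
# Every report cycle just grabs the latest cumulative snapshot and lets
# the normalizing memory cache decide what actually reaches the store.
def report_coverage(cache_store)
  cache_store.save_report(::Coverage.peek_result)
end

# Usage (redis_store stands in for whatever Redis-backed adapter is in use):
#   cache_store = MemoryCacheStore.new(redis_store)
#   report_coverage(cache_store)
```

With this shape, repeated hits on already-covered lines never touch redis, and the collector itself no longer needs to track previous snapshots.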