Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

binary ops with a subset of group by keys #1244

Merged
merged 1 commit into from
Jan 28, 2021

Conversation

brharrington
Copy link
Contributor

This change add support for binary operations where both
sides are grouped and the keys for one side is a subset
of the other. Before the keys would have to match exactly.
When one side has a subset, the matching for entries will
be performed using the smaller set of keys.

For example, suppose that dimensions a and b each have
two values:

Expr: q1,(,a,b,),:by, q2,(,a,),:by, :add

Result Set:
      q1:a1b1 + q2:a1
      q1:a1b2 + q2:a1
      q1:a2b1 + q2:a2
      q1:a2b2 + q2:a2

This change add support for binary operations where both
sides are grouped and the keys for one side is a subset
of the other. Before the keys would have to match exactly.
When one side has a subset, the matching for entries will
be performed using the smaller set of keys.

For example, suppose that dimensions `a` and `b` each have
two values:

```
Expr: q1,(,a,b,),:by, q2,(,a,),:by, :add

Result Set:
      q1:a1b1 + q2:a1
      q1:a1b2 + q2:a1
      q1:a2b1 + q2:a2
      q1:a2b2 + q2:a2
```
@brharrington brharrington added this to the 1.7.0 milestone Jan 28, 2021
@brharrington brharrington requested a review from jfz January 28, 2021 13:23
@jfz
Copy link
Contributor

jfz commented Jan 28, 2021

Exact match sounds reasonable, sub set match result can be confusing for users, what kind use cases benefit from this?

@brharrington
Copy link
Contributor Author

An example use-case is computing the percentage of different types of failures for a set of clusters. The numerator is the number of failures by cluster and status. To get a percentage relative to a given cluster, the lines for a given cluster need to be divided by the total number of requests for that cluster.

name,ipc.server.call,:eq,statistic,count,:eq,:and,ipc.result,failure,:eq,:and,(,nf.cluster,ipc.status,),:by,
name,ipc.server.call,:eq,statistic,count,:eq,:and,(,nf.cluster,),:by,
:div,100,:mul

I'm not sure it creates more confusion. I have had some power users kind of assume it would work in the past and were surprised it did not. However, there is a bit of selection bias there, so may not generalize well to users that are less familiar with Atlas. We already support it where one side is grouped and the other isn't:

name,ipc.server.call,:eq,statistic,count,:eq,:and,ipc.result,failure,:eq,:and,(,ipc.status,),:by,
name,ipc.server.call,:eq,statistic,count,:eq,:and,:sum,
:div,100,:mul

This change makes it a bit more regular as that can be thought of as one side having an empty set for the grouping keys. We are no longer special casing that one example of a subset.

@brharrington brharrington merged commit d8688bd into Netflix:master Jan 28, 2021
@brharrington brharrington deleted the groupby-subset branch January 28, 2021 19:24
brharrington added a commit to brharrington/atlas that referenced this pull request Apr 15, 2022
In the past this wasn't permitted because binary operations
had to have the same grouping on both sides. In Netflix#1244 that
was changed so that one side could have a subset of the other.
brharrington added a commit to brharrington/atlas that referenced this pull request Apr 15, 2022
In the past this wasn't permitted because binary operations
had to have the same grouping on both sides. In Netflix#1244 that
was changed so that one side could have a subset of the other.
brharrington added a commit that referenced this pull request Apr 15, 2022
In the past this wasn't permitted because binary operations
had to have the same grouping on both sides. In #1244 that
was changed so that one side could have a subset of the other.
manolama pushed a commit to manolama/atlas that referenced this pull request May 22, 2024
In the past this wasn't permitted because binary operations
had to have the same grouping on both sides. In Netflix#1244 that
was changed so that one side could have a subset of the other.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants