Usage Monitoring: Extended Input Usage Tracking #5905

Open
erikwrede opened this issue Nov 12, 2024 · 0 comments

Currently, input usage tracking for GraphQL operations in Hive is disabled by default. When enabled, each input variation generates a unique hash for the operation, leading to multiple entries for what is essentially the same operation with different inputs. This approach considerably impacts the efficiency and usability of Hive's analytics, as each input variation is treated as a distinct operation.

I'd like to propose a different solution to this: tracking inputs (excluding arguments) separately.

Consider a typical GraphQL schema used for filtering and mutations, with operations that accept various dynamic inputs. When such operations take input objects, the generated operation hashes vary whenever the inputs vary:

Sample Schema
type Operation {
  id: ID!
  name: String
  requestCount: Int
}

input StringFilter {
  eq: String
  like: String
}

input OperationFilter {
  id: IDFilter
  name: StringFilter
  requestCount: IntFilter
}

type Query {
  operations(first: Int, after: String, filter: OperationFilter): OperationConnection
}

Now let's have a look at a sample operation against this schema:

query HiveOperationInsightsPage($filter: OperationFilter) {
  operations(filter: $filter) {
    id
    name
    requestCount
  }
}

Called with these two inputs:

/*A:*/ {"variables": {"filter": {"name": {"like": "OperationInsightsPa"}}}}
/*B:*/ {"variables": {"filter": {"requestCount": {"gt": 520}}}}

Under the current system with input value tracking enabled, variations A and B are treated as separate operations. My proposal is to aggregate such input variations under a single operation entry while still tracking which inputs were used. This allows more cohesive monitoring and analysis without the redundancy and performance drawbacks of multiple operation entries.
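To illustrate the grouping I have in mind, here is a minimal sketch. It is not Hive's actual hashing or collection logic; `operation_hash` and `used_input_paths` are hypothetical helpers, and the dotted path format is just an assumption for the example. The operation identity is derived from the document text only, while the input fields present in the variables are collected separately.

```python
import hashlib
import json

OPERATION = """query HiveOperationInsightsPage($filter: OperationFilter) {
  operations(filter: $filter) {
    id
    name
    requestCount
  }
}"""


def operation_hash(document: str) -> str:
    """Hash only the operation text, so variables never affect the identity."""
    normalized = " ".join(document.split())  # collapse whitespace (assumed normalization)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def used_input_paths(value, prefix=""):
    """Yield the dotted path of every leaf field present in the variables."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from used_input_paths(child, path)
    elif isinstance(value, list):
        # List items share the path of the list field itself.
        for item in value:
            yield from used_input_paths(item, prefix)
    else:
        yield prefix


variables_a = json.loads('{"filter": {"name": {"like": "OperationInsightsPa"}}}')
variables_b = json.loads('{"filter": {"requestCount": {"gt": 520}}}')

# Both requests map to the same operation entry ...
shared_hash = operation_hash(OPERATION)

# ... but report different used input fields.
paths_a = sorted(set(used_input_paths(variables_a)))
paths_b = sorted(set(used_input_paths(variables_b)))
```

Here `paths_a` is `["filter.name.like"]` and `paths_b` is `["filter.requestCount.gt"]`, while `shared_hash` is identical for both requests. Only the field paths are recorded, never the values, so the number of distinct entries is bounded by the schema rather than by what clients send.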

All fields in the operation are static except for the inputs, which are expected to vary between requests. As a software engineer monitoring the performance of specific frontend queries against my schema, I'd like to see these two grouped as one in the Hive UI. I'd like an overview of which filters the operation uses, but I don't need to see the combinations or the performance of any single input, as the number of possible filter combinations is effectively unbounded.

I acknowledge that there might also be a desire to compare performance between different filters. That's what the current, less performant option enables, at the cost of "duplicating" operations. In many cases, though, I already know that a particular client view can use multiple filters, and I can implement extra monitoring myself if I suspect a specific filter causes more slowdown than others.

The above example applies not only to filters, but also to mutation inputs and other complex inputs. To address this, I'd like to propose the following:

  • Track used input fields separately from operations
  • Reference the associated operations in the table that tracks the input fields
  • On an operation detail page, collect all associated input usage records
  • Use the new table for breaking change detection and schema usage monitoring of inputs
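As a rough illustration of the points above, here is a sketch of what such a table could look like, using an in-memory SQLite database. This is not Hive's actual storage schema; the table names, column names, and helper functions are all hypothetical and only meant to make the proposal concrete.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE operations (
        hash TEXT PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- One row per (operation, input type, field), independent of input values.
    CREATE TABLE input_field_usage (
        operation_hash TEXT NOT NULL REFERENCES operations(hash),
        input_type     TEXT NOT NULL,
        field          TEXT NOT NULL,
        request_count  INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (operation_hash, input_type, field)
    );
    """
)


def record_usage(operation_hash, name, fields):
    """Register one request: the operation plus the input fields it used."""
    conn.execute(
        "INSERT OR IGNORE INTO operations (hash, name) VALUES (?, ?)",
        (operation_hash, name),
    )
    for input_type, field in fields:
        conn.execute(
            """
            INSERT INTO input_field_usage (operation_hash, input_type, field, request_count)
            VALUES (?, ?, ?, 1)
            ON CONFLICT (operation_hash, input_type, field)
            DO UPDATE SET request_count = request_count + 1
            """,
            (operation_hash, input_type, field),
        )


def input_usage_for(operation_hash):
    """Operation detail page: all input fields seen for this operation."""
    rows = conn.execute(
        "SELECT input_type, field, request_count FROM input_field_usage "
        "WHERE operation_hash = ? ORDER BY input_type, field",
        (operation_hash,),
    )
    return rows.fetchall()


def is_input_field_used(input_type, field):
    """Breaking change check: does any operation use this input field?"""
    row = conn.execute(
        "SELECT COUNT(*) FROM input_field_usage WHERE input_type = ? AND field = ?",
        (input_type, field),
    ).fetchone()
    return row[0] > 0


# Variation A twice, variation B once, all under the same operation hash.
fields_a = [("OperationFilter", "name"), ("StringFilter", "like")]
fields_b = [("OperationFilter", "requestCount"), ("IntFilter", "gt")]
record_usage("abc123", "HiveOperationInsightsPage", fields_a)
record_usage("abc123", "HiveOperationInsightsPage", fields_b)
record_usage("abc123", "HiveOperationInsightsPage", fields_a)
```

With this shape, `input_usage_for("abc123")` lists four input fields for a single operation entry, and `is_input_field_used("StringFilter", "eq")` returns `False`, which is the kind of signal breaking change detection would need before `StringFilter.eq` is removed.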

Happy to discuss further details or hear your feedback on this 😊
