-
Notifications
You must be signed in to change notification settings - Fork 514
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pre-compute safe dimension slices #1873
Conversation
Your preview environment pr-1873-bttf has been deployed with errors. Preview environment endpoint is available here |
@@ -179,7 +179,7 @@ export interface ExperimentSnapshotTraffic { | |||
dimension: { | |||
[dimension: string]: ExperimentSnapshotTrafficDimension[]; | |||
}; | |||
error?: "NO_ROWS_IN_UNIT_QUERY" | "TOO_MANY_ROWS"; | |||
error?: "NO_ROWS_IN_UNIT_QUERY" | "TOO_MANY_ROWS" | string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we be more specific?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just passing through the query error here, so not really. I could just have "QUERY_ERROR" and then you could look up the query error by reference in the front end, I guess?
...ront-end/components/Settings/EditDataSource/AutomaticDimension/UpdateAutomaticDimensions.tsx
Outdated
Show resolved
Hide resolved
@@ -140,7 +140,7 @@ export interface ExposureQuery { | |||
query: string; | |||
hasNameCol?: boolean; | |||
dimensions: string[]; | |||
dimensionsForTraffic?: string[]; | |||
automaticDimensionId?: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this link in the database? We already have a link the other direction since there's a data source and exposureQueryId in the automatic dimension model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, because I don't want to automatically just get the latest value from that collection (e.g. they re-run the query but it gives them garbage) and they choose not to save it.
packages/back-end/src/queryRunners/AutomaticDimensionQueryRunner.ts
Outdated
Show resolved
Hide resolved
).map((id) => ({ | ||
type: "experiment", | ||
id, | ||
})); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be some sanity check for when there are no dimensions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good call. It currently just generates broken SQL. I've updated it to throw an error and surface it to the user. I also now hide the entry point to this modal when there are no dimensions.
0837110
to
de2f0d5
Compare
import { DimensionMetadataInterface } from "../types/Integration"; | ||
import { queriesSchema } from "./QueryModel"; | ||
|
||
const dimensionMetadatachema = new mongoose.Schema({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably change the name of this model to be less abstract (metadata schema) to more specific (dimension value queries, etc)
dimensionValues: { name: string; percent: number }[]; | ||
} | ||
|
||
export interface DimensionMetadataInterface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same suggestion about naming (metadata too broad?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, renaming to dimensionSlices where important. I still keep it as dimensionMetadata
in the ExposureQueryInterface
because we may add additional metadata to that object.
* Add reliable dimensions query * Modal mostly working * Get save working * lint and such * merge origin main * Wire auto dimensions through to units and traffic query * remove dimensions for traffic * Fix SQL * Bryce comments * Move to kebab & add context * rename && remove ff * remove FF * Table version * ui/code tweaks * Add running language * remove console * Remove metadata field * Additional renaming * Hide entry with no dimensions * Move storing specified values directly to exposure query * TS * Small updates * Fix type issue * Rename && add lookback selector * Cleanup * push pixels * Add tracking * Blank out button for automatic dimensions * Ensure cast to string before string comparison --------- Co-authored-by: Bryce <[email protected]>
* initial, updating existing types and methods with new bucket stuff * moar types * simplify targeting ui, remove stickyBucketing var * add minBucketVersion, add reseed option for new phases * org setting for sticky bucketing, premium feature, organize settings page * lock SB switch based if !hasCommercialFeature * working on rollout recommendation logic * variation blocking ui, wip * tweak logic kinda * tweak * re-add disableStickyBucketing * ui wip * ui wip * ui wip * ui wip * ui wip * ui wip * ui wip * ui wip * Support ability to set role with SCIM request. (#1779) * Support ability to set role with SCIM request. * Modify cascading window logic for excluding in-progress conversions (#1921) * Modify cascading window logic for excluding in-progress conversions * Move fn * Test * Fix unnecessary activation fact metric data retrieval (and two minor FE bugs) (#1922) * Update conversion window logic for fact metrics * CLean up language * More proximate fix * Pre-compute safe dimension slices (#1873) * Add reliable dimensions query * Modal mostly working * Get save working * lint and such * merge origin main * Wire auto dimensions through to units and traffic query * remove dimensions for traffic * Fix SQL * Bryce comments * Move to kebab & add context * rename && remove ff * remove FF * Table version * ui/code tweaks * Add running language * remove console * Remove metadata field * Additional renaming * Hide entry with no dimensions * Move storing specified values directly to exposure query * TS * Small updates * Fix type issue * Rename && add lookback selector * Cleanup * push pixels * Add tracking * Blank out button for automatic dimensions * Ensure cast to string before string comparison --------- Co-authored-by: Bryce <[email protected]> * Add new rollback workflow (#1867) * Fix bug - metric capping cannot be updated on old metrics (#1930) * Add Random On-Screen Celebration Service (#1826) * Add on-screen confetti celebration service to app. Launch experiment consumes it and will fire 20% of the time. * Allow orgs to define role when creating and updating groups via SCIM (#1909) * Adds support for setting team/group global role when creating a team or updating a team via SCIM endpoints * SDK compatibility versioning (#1893) * initial: new sdkVersion field, sdkid/form UI * sdk-versioning package, getCurrentVersion method * getCapabilities * payload scrubbing. mostly feature complete * BE resolveJsonModule * update scrubbing logic to use pick * refactor json * isOutdated check * add all sdks, fill in versions/capabilities * tweaks * get minimal safe capabilities when multi-language, add tests * basic gating in SDK connection form * more retrofitting * more retrofitting * address feedback * address feedback * lint * use today's capabilities as a snapshot for default sdk version * small fixes * clean up scrubbing, add tests * fix sdkconnection post endpoint * fix model option version * api types * fix test * multi-stage scrubbing, update legacy connection logic, tests * update php for looseUnmarshalling * address feedback * log payload deltas instead of scrubbing until confident * use logger * log capabilities * use isEqual for better payload scrub comparison (#1935) * ui wip * resize release plan summary. probably redo this later... * fix hook deps * ui wip * block save when no changes, cleanup old code * more clarity around SB SDK support * refactor models based on discussions (no blockedVariations, new bool) * wip multi-stage modal * wip multi-stage modal * wip multi-stage modal * wip multi-stage modal * wip multi-stage modal * wip multi-stage modal * wip multi-stage modal * wip multi-stage modal * confirm check * add warnings while making changes * cleanup * cleanup * fine tune ui * remove blocked from variation (add later) * clean up modal ui * payload scrubbing * remove reseed from phases modals, fix any type * attempt saved group change detection * lint * move warning to release plan page, don't warn when using recommended, ui tweaks * change targeting language, ui tweaks * fix saved group comparison, fix release plan selector, adjust ui * wip SB onboarding / toggle * sticky bucketing onboarding and toggle gating + warnings * phases: looking for targeting? -> make changes * refactor * retool warnings * address some feedback * adjust settings SB toggle * fix saved group restrictive ANY calc * fix saved group restrictive ANY calc again * make changes to choices * other fix * tweaks * calculate risk level per option * phase and warning stuff * ui tweak, edit phases show/hide button logic * adjust warning language * move same-phase-everyone into default release plan options * add SB keys to payload scrubber * implement some suggestions * fix legacy webhook capabilities * fix help text * fix help text again * update types * ui tweaks * remove onboarding CTAs when sdk issues detected, point to docs * gate fallback attributes by org setting * UI tweaks for sticky bucket onboarding and Make Changes modal * minor ui tweaks * Sticky bucketing docs (#2017) * talk about new stuff in experiment-configuration, fix proxy docs (unrelated) * sb doc stub * sb doc stub * sb doc wip * change slug * sb doc wip * sb doc wip * sb doc wip * sdk docs * Language * de-emphasize fallbackAttribute in examples * Update sticky bucket docs page with fallback attribute example and helper image for SDK connections --------- Co-authored-by: Luke Sonnet <[email protected]> Co-authored-by: Jeremy Dorn <[email protected]> --------- Co-authored-by: Michael Knowlton <[email protected]> Co-authored-by: Luke Sonnet <[email protected]> Co-authored-by: Adnan Chowdhury <[email protected]> Co-authored-by: Jeremy Dorn <[email protected]> Co-authored-by: Luke Sonnet <[email protected]>
This PR introduces "Dimension Metadata", which uses existing Experiment Dimensions and creates stable bins that can be used to reliably only get a max of 21 (20 + "other") bins per experiment dimension. This helps deal with the problem where dimensions are high cardinality and are therefore unsuitable to pre-compute health queries or analysis queries for dimensions with every update.
This PR has a three key components:
DimensionMetadataModel
to store dimension results and query datadatasources
controllerExposureQuery
interface to store theid
of the latest "saved" automatic dimension as well as the computed results from the interface with the dimension metadata query runnerDimensionMetadataQueryRunner
that can manage running the query to pick the slices using historical data from an experiment exposure queryExperimentResultsQueryRunner
to use dimension metadata when needed (right now just in the units query and in the traffic query)SQLIntegration.ts
, two key changes:getDimensionMetadataQuery
that scans the last 30 days of your exposure query and gets the top 20 dimensions for each experiment dimensionallowedValues
) to reduce cardinalityid
along with the computed results to theExposureQuery
if they wish to use it laterTODO (maybe in fast follow):
Loom overview: https://www.loom.com/share/17bae04087e94f8299512d9b857b5f6b
Loom error states: https://www.loom.com/share/4382c49e44d34f9891ce81786523e097
Health query that uses automatic dimensions
Testing
To test you need the feature flag
health-tab
to be on for you, and to actually run the health queries with auto dimensions you need to have turned on the traffic query in your org settings.