Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better support for ?include_attributes and ?exclude_attributes API query param filters #4300

Merged
merged 72 commits into from
Sep 4, 2018

Conversation

Kami
Copy link
Member

@Kami Kami commented Aug 9, 2018

This pull request improves handling of the ?include_attributes and ?exclude_attributes API query param filters.

With those query parameters user can specify which attributes it wants to receive in the API response. Among other things, it also controls which fields we retrieve from the database.

Because of the way our API models and responses work, we convert database document to the API model before serializing that to JSON and returning it as a response. API model performs object validation and sets the default values for the fields which are not provided.

That means, for example, if user only requests name field it may get many other fields back, depending on the default values specified on the API model jsonschema (and those fields will use default values from the API model schema which is confusing and invalid).

The problem is related to the issue (deficiency in our API model handling) I've already mentioned a long time ago. We are abusing API model for two purposes - input and output. We really should have two models (one for input and one for output) and this way we would avoid issues like that.

Sadly that would require a lot more work so I'm going with the middle of the ground approach to begin with.

As per discussion with @enykeev, I decided to perform this filtering inside the router. This way it happens very last, just before serializing the response to JSON and returning it back to the user.

In the first attempt, I tried to do it inside the API model, but it really doesn't belong there. API model represents data which has been validated and it's complete and valid (based on the model jsonschema). API model with some fields missing could potentially be invalid, hence that data shouldn't be represented by the API model.

Current approach is not the most efficient, because we still convert each database document to the API model before filtering / processing the response, but it should be a start. Future optimization would involve skipping the API model layer all together when either ?exclude_attributes or ?include_attributes query param filter is provided.

That involves a lot more work and refactoring though (we would potentially need to touch and change each API controller because a lot of them override parent controller "_get_all" method and handle API model instantiation themselves).

TODO

  • Tests
    • Test for various edge cases (invalid field specified, field contains non-whitelisted characters, make sure user can't fish out secret parameters, etc.)
    • Tests for the following endpoints
      • /v1/actions
      • /v1/actions/views/overview
      • /v1/rules
      • /v1/rules/views
      • /v1/runnertypes
      • /v1/sensortypes
      • /v1/ruleenforcements
      • /v1/ruleenforcements/views
      • /v1/traces
      • /v1/triggertypes
      • /v1/triggerinstances
      • /v1/actionalias
      • /v1/executions
      • /v1/inquiries
      • /v1/packs
      • /v1/policytypes
      • /v1/policies
  • Add support for those filters to all "get_all" / list resource API endpoints

Kami added 6 commits August 9, 2018 17:03
__json__ method.

This method gets called just before the response back to the user and
serializing API object to JSON.
the router.

This way the filtering is performed at the very end, before returning
raw response to the user.

Filtering on the router makes more sense compared to filtering on the
API model because filtering the object might make it an invalid API
model (aka object won't pass API model validation anymore) and filtering
is purerly a property of the response and not the API model.
@Kami Kami added this to the 2.9.0 milestone Aug 9, 2018
@Kami
Copy link
Member Author

Kami commented Aug 9, 2018

Another thing to keep in mind - due to how we handle our resource models (actions, sensors, rules, triggers, etc.), fields which form a ref attribute (aka resource primary key / unique identifier) need to be always present in the response.

For resource models those fields are name and pack and for other models, that field is id. I personally think that's not an issue and is desirable since "get_all" / list response which doesn't contain item's primary key is of no use / makes no sense.

Kami added 6 commits August 10, 2018 09:44
All of those models now contain "ref" attribute which is already
included in the response so there is no need to have this feature
anymore.
attribute are provided.

If those attributes are provided, response will likely be partial and
fail validation.
@Kami
Copy link
Member Author

Kami commented Aug 10, 2018

@enykeev I'm still working on adding support for those query param filters to all the resource API endpoints and improved tests, but the core functionality is ready for review.

Main changes are pretty straight forward and include:

  1. New _process_response method which handles attribute exclusion before sending response back to the client - https://github.com/StackStorm/st2/pull/4300/files#diff-364c1d17f394113df1634d1423d1a3c4R578
  2. Skip response validation if response schema is available and those filters are provided - https://github.com/StackStorm/st2/pull/4300/files#diff-364c1d17f394113df1634d1423d1a3c4R540. It makes no sense to perform response validation when those filters are provided because response will be partial and will likely fail validation (this should also speed things up a bit).
  3. Removal of old "include_reference" related code which we don't need anymore (we haven't need it for a long time now) - https://github.com/StackStorm/st2/pull/4300/files#diff-364c1d17f394113df1634d1423d1a3c4R540

Kami added 4 commits August 10, 2018 16:54
key (reference) so they need to be always included in the response.
execution.runner.runner_parameters field from the database.

This is needed so we can correctly determine and mask secret parameters.
@Kami
Copy link
Member Author

Kami commented Aug 10, 2018

On a related note - it's actually a good idea that we do this parameter filtering at the very and don't skip the DB model and API model layer.

Secrets masking happens inside the DB and API layer which means skipping it would be a bad idea. Right now the function which handles filtering (including, exclusion) gets passed in an object which has already been masked so there is no way to "fish out" secret parameter.

I also plan to spend a good amount of time next week writing test cases for all the various edge cases to make sure this functionality doesn't negatively affect the security (notable secret parameters and related stuff).

@Kami
Copy link
Member Author

Kami commented Aug 14, 2018

On a related note - this pull request also exposed a lot of issues with the current handling of exclude_attribute and lack of good tests for some other API endpoints.

@Kami Kami changed the title [WIP] Better support for ?include_attributes and ?exclude_attributes API query param filters Better support for ?include_attributes and ?exclude_attributes API query param filters Aug 14, 2018
Kami added 12 commits August 14, 2018 18:20
get all response when ?include_attributes filter is provided.

Those responses don't make sense without model primary keys.
include_attributes filtering.

This controller doesn't directly return a database object, but a new one
based on the database object value so we need to take this into account.

Also update code to only retrieve fields we need.
@Kami Kami requested a review from enykeev August 16, 2018 10:27
@Kami Kami merged commit 65c1ac7 into master Sep 4, 2018
@Kami Kami deleted the include_attributes_filtering branch September 4, 2018 11:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants