Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/format validation errors #598

Merged
merged 13 commits into from
Oct 25, 2023
Merged

Conversation

mpoffald
Copy link
Contributor

@mpoffald mpoffald commented Oct 24, 2023

Closes # #312

This PR introduces functionality that takes Malli's validation errors and computes a human-readable string explaining the failure.

At a high level, it does this by using heuristics to select the most "useful" error from among the (typically many) errors, and then uses that error to compute some contextual information to add.

About the heuristics:

  1. Malli returns failures for every branch of a disjunction. This translates to a lot of the noise we've been seeing in the error messages. For example, if a where clause can be either a vector or a map, and the query contains a map that is invalid, Malli will return failures for not only the map issues, but also the fact that it failed all the vector cases. To a user, only the map-related failures are likely to feel relevant.
  2. Therefore, we look to the :in values on errors to help us choose, preferring longer ones. The :in vector contains the path into the specific portion of a value that failed validation. So if someone gives us a map that isn’t quite a valid ::where-map, errors specific to the contents of the map will have the longer :in values, which will be a more relevant error.
  3. Once we have chosen from the longest :ins, if there are multiple errors with the same :in, we search for any:or/:orn “parents” of those errors, and prefer those messages to the ones from children errors. I found that it was much simpler to write a nice :error/message on a disjunction than try to compute one on the fly from the children.
  4. From whichever error "wins", we traverse upwards toward the roots of the schema to get more contextual information to append to the error.

Note For this to work well, we should strive to have nice :error/messages on our ors. I added the ones we needed for the list in #312, plus a few others that occurred to me, but we may still want more. And when we make schema changes, we'll need to update these messages.

For this PR, I did not include tests for the specific error strings, since those are likely to be in flux and may prove to be a maintenance nightmare. I did add a repl file containing a bunch of incorrect queries, which I used to develop this feature. I would find this to be useful personally, but if it seems like it wouldn't be for everyone else, I can remove it.

Examples

As of this writing, here's what I get for the error messages from this list (plus a few others):

;;history errors
  (do
    (require '[fluree.db.test-utils :as test-utils])
    (require '[fluree.db.query.history :as history])
    (def conn        (test-utils/create-conn))
    (def ledger      @(fluree/create conn "historytest" {:defaultContext ["" {:ex "http://example.org/ns/"}]}))
    (def db1         @(test-utils/transact ledger [{:id   :ex/dan
                                                    :ex/x "foo-1"
                                                    :ex/y "bar-1"}
                                                   {:id   :ex/cat
                                                    :ex/x "foo-1"
                                                    :ex/y "bar-1"}
                                                   {:id   :ex/dog
                                                    :ex/x "foo-1"
                                                    :ex/y "bar-1"}])))  


  ;;empty request
  (ex-message   @(fluree/history ledger {}))
  ;;=>
  "Must supply a value for either \"history\" or \"commit-details\"; Provided: {}" 

  ;;invalid commit-details
  (ex-message   @(fluree/history ledger {:history [:ex/cat] :commit-details "I like cats" 
                                         :t {:at :latest}}))
  ;;=>
  "Error in value for \"commit-details\"; should be a boolean; Provided: \"I like cats\";  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-commit-details" 

  ;;missing subject
  (ex-message   @(fluree/history ledger {:history nil}))
  ;;=>
  "Must supply a value for either \"history\" or \"commit-details\"; Provided: {:history nil}" 

  ;;missing subject
  (ex-message   @(fluree/history ledger {:history []}))
  ;;=>
  "Must supply a value for either \"history\" or \"commit-details\"; Provided: {:history []}" 

  ;;invalid flake
  (ex-message   @(fluree/history ledger {:history [1 2] :t {:from 1}}))
  ;;=>
  "Error in value for \"history\"; Must provide a tuple of one more more iris; Provided: 1;  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-history" 

  ;;invalid t
  (ex-message   @(fluree/history ledger {:history [:ex/cat] :t {:from 2 :to 0}}))
  ;;=>
  "Error in value for \"t\"; \"from\" value must be less than or equal to \"to\" value,; Provided: {:from 2, :to 0};  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-t" 

  ;;invalid t
  (ex-message   @(fluree/history ledger {:history [:ex/cat] :t {:from -2 :to -1}}))
  ;;=>
  "Error in value for \"t\"; Value of \"from\" must be one of: the key latest, an integer > 0, or an iso-8601 datetime value; Provided: -2;  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-t" 

  ;;missing t values
  (ex-message @(fluree/history ledger {:history [:ex/cat] :t {}}))
  ;;=>
  "Error in value for \"t\"; Must provide: either \"from\" or \"to\", or the key \"at\" ; Provided: {};  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-t" 

  ;;invalid t values
  (ex-message @(fluree/history ledger {:history [:ex/cat] :t {:at 1 :from 1}}))
  ;;=>
  "Error in value for \"t\"; Must provide: either \"from\" or \"to\", or the key \"at\" ; Provided: {:at 1, :from 1};  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-t" 


;;fql

;;missing select
  (ex-message @(fluree/query {} '{:where  [[?s ?p ?o ]]}))
  ;;=>
  "Query: {:where [[?s ?p ?o]], :opts {:issuer nil}} does not have exactly one select clause. One of 'select', 'selectOne', 'select-one', 'selectDistinct', or 'select-distinct' is required in queries. See documentation here for more details: https://next.developers.flur.ee/docs/reference/errorcodes#query-missing-select; Provided: {:where [[?s ?p ?o]], :opts {:issuer nil}}" 

;;multiple selects
  (ex-message @(fluree/query {} '{:select [?s]
                                  :selectOne [?s ?p ]
                                  :where  [[?s ?p ?o ]]}))
  ;;=>
  "Query: {:select [?s], :selectOne [?s ?p], :where [[?s ?p ?o]], :opts {:issuer nil}} does not have exactly one select clause. One of 'select', 'selectOne', 'select-one', 'selectDistinct', or 'select-distinct' is required in queries. See documentation here for more details: https://next.developers.flur.ee/docs/reference/errorcodes#query-missing-select; Provided: {:select [?s], :selectOne [?s ?p], :where [[?s ?p ?o]], :opts {:issuer nil}}" 

;;invalid select var
  (ex-message @(fluree/query {} '{:select [+]
                                  :where  [[?s ?p ?o ]]}))
  ;;=>
  "Error in value for \"select\"; selector must be either a variable, wildcard symbol (`*`), iri, function application, or select map; Provided: +;  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-select" 

;;unknown key
  (ex-message @(fluree/query {} '{:select [?s]
                                  :where  [[?s ?p ?o ]]
                                  :foo [?o]}))
  ;;=>
  "Query contains an unknown key: 'foo'. See documentation here for more information on allowed query keys: https://next.developers.flur.ee/docs/reference/errorcodes#query-unknown-key; Provided: [?o]" 

;;extra k/v in where map
  (ex-message @(fluree/query {} '{:select ['?name '?email]
                                  :where  [['?s :type :ex/User]
                                           ['?s :schema/age '?age]
                                           ['?s :schema/name '?name]
                                           {:union  [[['?s :ex/email '?email]]
                                                     [['?s :schema/email '?email]]]
                                            :filter ["(> ?age 30)"]}]}))
  ;;=>
  "Error in value for \"where\"; Where map can only have 1 key/value pair; Provided: {:union [[[(quote ?s) :ex/email (quote ?email)]] [[(quote ?s) :schema/email (quote ?email)]]], :filter [\"(> ?age 30)\"]};  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-where" 

;;unrecognized op in where map
  (ex-message @(fluree/query {} {:select ['?name '?age]
                                 :where  [['?s :type :ex/User]
                                          ['?s :schema/age '?age]
                                          ['?s :schema/name '?name]
                                          {:foo "(> ?age 45)"}]}))
  ;;=>
  "Error in value for \"where\"; Unrecognized operation in where map, must be one of: filter, optional, union, bind; Provided: :foo;  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-where" 

;;invalid where
  (ex-message @(fluree/query {} '{:select [?s ?o]
                                  :where  ?s}))
  ;;=>
  "Error in value for \"where\"; Where must be a vector of clauses; Provided: ?s;  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-where" 

;;invalid where
  (ex-message @(fluree/query {} '{:select [?s ?o]
                                  :where  [?s ?p ?o]}))
  ;;=>
  "Error in value for \"where\"; where clauses must be valid tuples or maps; Provided: ?s;  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-where" 

;;invalid group-by
  (ex-message @(fluree/query {} '{:select   [?s]
                                  :where    [[?s ?p ?o]]
                                  :group-by {}}))
  ;;=>
  "Error in value for \"group-by\"; groupBy clause must be a variable or a vector of variables; Provided: {};  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-group-by" 

;;invalid order-by
  (ex-message @(fluree/query {} '{:select  ['?name '?favNums]
                                  :where   [['?s :schema/name '?name]
                                            ['?s :schema/age '?age]
                                            ['?s :ex/favNums '?favNums]]
                                  :orderBy [(foo  ?favNums)]}))
  ;;=>
  "Error in value for \"orderBy\"; Ordering must be a var or two-tuple formatted ['ASC' or 'DESC', var]; Provided: foo;  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-orderBy" 

;;invalid bind
  (ex-message @(fluree/query {} '{:select [?firstLetterOfName ?name ?canVote]
                                  :where  [[?s :schema/age ?age]
                                           [?s :schema/name ?name]
                                           {:bind [?canVote           (>= ?age 18)]}]}))
  ;;=>
  "Error in value for \"where\"; Invalid bind, must be a map with variable keys; Provided: [?canVote (>= ?age 18)];  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-where" 

;;invalid filter
  (ex-message @(fluree/query {} '{:select ['?name '?age]
                                  :where  [['?s :type :ex/User]
                                           ['?s :schema/age '?age]
                                           ['?s :schema/name '?name]
                                           {:filter "(> ?age 45)"}]}))
  ;;=>
  "Error in value for \"where\"; Filter must be a function call wrapped in a vector; Provided: \"(> ?age 45)\";  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-where" 

;;invalid filter
  (ex-message @(fluree/query {} '{:select ['?name '?age]
                                  :where  [['?s :type :ex/User]
                                           ['?s :schema/age '?age]
                                           ['?s :schema/name '?name]
                                           {:filter :foo}]}))
  ;;=>
  "Error in value for \"where\"; Filter must be a function call wrapped in a vector; Provided: :foo;  See documentation for details: https://next.developers.flur.ee/docs/reference/errorcodes#query-invalid-where" 

We attempt to choose the most useful errors by doing as follows:
- if there is a top-level failures from one of our `:fn` schemas, we just use that one
- Otherwise, we search through the errors:
  - we prefer messages with the longest `:in` values, which means the failure was in more sense more "specific"
  - if there's only one of those, we take it
  - otherwise, we take a chunk of errors which share the same `:in`, meaning the same portion of the value failed multiple branches of the schema. We arbitrarily choose the first
  - we see if those all share some common disjunction farther up the tree. Error messages on `:or`/`:orn`s can be written to express the requirements in one sentence, which we prefer over using the children of that disjunction
  - if not, we just pick one from the chunk

Other changes:
- use version of `-resolve-root-error` with two fixes:
  -  ensures `:error/fn`s have access to a correct value. This fixes a bug which was causing `nil` values in error messages
  - guard against `:invalid-schema` error when root error is due to an extra key not present in the schema
- add override that attempts to improve upon invalid-type errors
schema cannot be its own parent
We don't need to make this distinction now that we have special logic to differentiate between type and limit failures
@mpoffald mpoffald force-pushed the feature/format-validation-errors branch from b322250 to 48ac67a Compare October 24, 2023 19:29
@mpoffald mpoffald requested a review from a team October 24, 2023 19:38
Copy link
Contributor

@dpetran dpetran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Excited to see this come to fruition 🌳

@mpoffald mpoffald merged commit 4ff41ae into main Oct 25, 2023
5 checks passed
@mpoffald mpoffald deleted the feature/format-validation-errors branch October 25, 2023 00:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants