Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distributive schemas #1086

Merged
merged 20 commits into from
Aug 27, 2024
Merged

Distributive schemas #1086

merged 20 commits into from
Aug 27, 2024

Conversation

frenchy64
Copy link
Collaborator

@frenchy64 frenchy64 commented Aug 8, 2024

:merge distributes over :multi in the following manner:

(m/deref
 [:merge
  [:map [:x :int]]
  [:multi {:dispatch :y}
   [1 [:map [:y [:= 1]]]]
   [2 [:map [:y [:= 2]]]]]]
 {:registry registry})
; => [:multi {:dispatch :y}
;     [1 [:map [:x :int] [:y [:= 1]]]]
;     [2 [:map [:x :int] [:y [:= 2]]]]]

(m/deref
 [:merge
  [:multi {:dispatch :y}
   [1 [:map [:y [:= 1]]]]
   [2 [:map [:y [:= 2]]]]]
  [:map [:x :int]]]
 {:registry registry})
; => [:multi {:dispatch :y}
;     [1 [:map [:y [:= 1]] [:x :int]]]
;     [2 [:map [:y [:= 2]] [:x :int]]]]

Copy link
Member

@ikitommi ikitommi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the approach here. What is the reasoning for this working with :or and :orn too?

[:map [:x :int]]
[:or
[:map [:y [:= 1]]]
[:map [:y [:= 2]]]]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if there is non-map here? gets replaced? I would not expect it to happen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this should work only with :multi 🤔

Copy link
Collaborator Author

@frenchy64 frenchy64 Aug 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the approach here.

Seems powerful! I'm still grokking the implications and applicability myself...see below.

This is something Typed Clojure does to convert

(t/Merge '{:a t/Int} (t/U '{:b t/Int} '{:c t/Int})

to

(t/U '{:a t/Int :b t/Int} '{:a t/Int :c t/Int})

but such transformations are not so obviously applicable to Malli because of m/parse and error messages relying on named paths.

What is the reasoning for this working with :or and :orn too?

I think all 3 are kinds of disjunctions so it seemed natural to at least try it. I also tried :and which might also be useful in some (non-map) contexts.

:or + :map is not performant but I was also curious if there are other schemas analogous to :merge for which [<COMBINE> X [:or Y Z]] => [:or [<COMBINE> X Y] [<COMBINE> X Z]] makes sense.

what happens if there is non-map here? gets replaced? I would not expect it to happen.

:merge essentially wraps the distributed schema, so it inherits the semantics of merge in that case.

:merge does some weird things if you give it non-maps.

(m/deref
  [:merge
   int?
   [:map [:x :int]]]
  options)
;=> [:map [:x :int]]

(m/deref
  [:merge
   [:map [:x :int]]
   int?]
  options)
;=> int?

Perhaps it should error in these cases? FWIW Typed Clojure implements t/Merge via a protocol AssocableType which knows how to assoc a particular entry onto a map. Merging then happens in an analogous way to clojure.core/merge, but on the type level (pour entries from right-to-left one at a time).

My first impression of mu/merge is that its implementation is too concrete and could be extensible.

Anyway, because of :merge's behavior the distributive rule drops the schema on the left if only one of is a map:

(m/deref
  [:merge
   [:or
    int?
    boolean?]
   [:map [:x :int]]]
  options)
=> 
[:or [:map [:x :int]] [:map [:x :int]]]

(m/deref
  [:merge
   [:map [:x :int]]
   [:or
    int?
    boolean?]]
  options)
=>
[:or int? boolean?]

The idea behind this PR is to implement these distributive rules as abstractly as possible to open the door for other non-map applications, i.e.:

[<*> X [<+> Y Z]]
=>
[<+> [<*> X Y] [<*> X Z]]

[<*> [<+> Y Z] X]
=>
[<+> [<*> Y X] [<*> Z X]]

Going with that theme, here :merge is a kind of <*> schema that doesn't quite fit in the abstraction because of its own inconsistent semantics around non-maps.

maybe this should work only with :multi 🤔

For maps it seems like the most obvious winning combination. Still looking for other applications, but the way it's implemented we can always add others later.

We could also support wrapper schemas like :schema such as

[:merge
 X
 [:schema [:or Y Z]]]

using the same approach as m/-function-schema?.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but such transformations are not so obviously applicable to Malli because of m/parse and error messages relying on named paths.

You had a simplifying protocol suggestion for the generators earlier, if such a thing would be in the core of malli, this would be ok I guess. "here's my schema, transform it something else", e.g. user would not expect paths to be the same.

:merge does some weird things if you give it non-maps.

I was following the meta-merge functionality here:

(require '[meta-merge.core :as mm])

(mm/meta-merge 1 {:x 1})
; => {:x 1}

(mm/meta-merge {:x 1} 1)
; => 1

It would be a breaking change to start failing with. non-entry schemas.

My first impression of mu/merge is that its implementation is too concrete and could be extensible.

Indeed, it's concrete. But adding support for :multi is a great first step.

PR Looks good to me, one FIXME I se.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be a breaking change to start failing with. non-entry schemas.

Agreed. A non-starter.

I was following the meta-merge functionality here:

It seems like it's all working as intended then. Applying the distributive rule would only be surprising if you expected :merge to have different semantics in the first place.

@ikitommi
Copy link
Member

ikitommi commented Aug 13, 2024

About different registries in merge, e.g. distribute-registry-test I thought this as "last one wins" here (like meta-merge):

(def schema1
  (m/schema
   [:map ::x ::y]
   {:registry (mr/composite-registry
               (m/default-schemas)
               {::x :int
                ::y :int})}))

(def schema2
  (m/schema
   [:map ::x ::z]
   {:registry (mr/composite-registry
               (m/default-schemas)
               {::x :string
                ::z :string})}))

; looks odd
(mu/merge schema1 schema2)
;[:map
; [:user/x :string]
; [:user/y :user/y]
; [:user/z :user/z]]

(m/validate *1 {::x "1", ::y 1, ::z "1"})
; => true

Not sure what would be a better way to do this, ideas most welcome. maybe a separate issue as this already solves the :multi merge thing, which is big.

@frenchy64
Copy link
Collaborator Author

About different registries in merge, e.g. distribute-registry-test I thought this as "last one wins" here (like meta-merge):

Yes it works perfectly fine until you deserialize the schema at that point.

maybe a separate issue as this already solves the :multi merge thing, which is big.

#1088

@frenchy64
Copy link
Collaborator Author

Yes it works perfectly fine until you deserialize the schema at that point.

I take that back, this schema cannot be dereferenced because ::y is not in scope for some reason.

[:merge
 [:map
  {::y boolean?}
  [:y ::y]]
 [:map]]

This can be duplicated via the new distributive rule, also saying ::y is not in scope:

[:merge
 [:map
  {::y boolean?}
  [:y ::y]]
 [:multi {:dispatch :z}
  [1 [:map]]]]

@frenchy64
Copy link
Collaborator Author

I documented the order of operations for distributing merge. I pushed in the middle of some work and I to fix the tests. Otherwise it's feeling pretty good.

@frenchy64 frenchy64 marked this pull request as ready for review August 16, 2024 17:38
@frenchy64 frenchy64 changed the title prototype distributive schemas Distributive schemas Aug 16, 2024
@frenchy64 frenchy64 requested a review from ikitommi August 16, 2024 17:39
@frenchy64 frenchy64 self-assigned this Aug 16, 2024
@frenchy64
Copy link
Collaborator Author

frenchy64 commented Aug 16, 2024

I don't know which order to apply the left/right distributive rules makes sense or is the most intuitive. Currently it's right then left. Here's the difference.

;; right-distributive then left-distributive
[:merge
 [:multi {:dispatch :y}
  [1 [:map [:y [:= 1]]]]
  [2 [:map [:y [:= 2]]]]]
 [:multi {:dispatch :z}
  [3 [:map [:z [:= 3]]]]
  [4 [:map [:z [:= 4]]]]]]
=>
[:multi {:dispatch :y}
 [1 [:multi {:dispatch :z}
     [3 [:map [:y [:= 1]] [:z [:= 3]]]]
     [4 [:map [:y [:= 1]] [:z [:= 4]]]]]]
 [2 [:multi {:dispatch :z}
     [3 [:map [:y [:= 2]] [:z [:= 3]]]]
     [4 [:map [:y [:= 2]] [:z [:= 4]]]]]]]

;; left-distributive then right-distributive
[:merge
 [:multi {:dispatch :y}
  [1 [:map [:y [:= 1]]]]
  [2 [:map [:y [:= 2]]]]]
 [:multi {:dispatch :z}
  [3 [:map [:z [:= 3]]]]
  [4 [:map [:z [:= 4]]]]]]
=>
[:multi {:dispatch :z}
 [3 [:multi {:dispatch :y}
     [1 [:map [:y [:= 1]] [:z [:= 3]]]]
     [2 [:map [:y [:= 2]] [:z [:= 3]]]]]]
 [4 [:multi {:dispatch :y}
     [1 [:map [:y [:= 1]] [:z [:= 4]]]]
     [2 [:map [:y [:= 2]] [:z [:= 4]]]]]]]

I think it mainly affects m/parse:

;; right-distributive then left-distributive
(m/parse
  [:merge
   [:multi {:dispatch :y}
    [1 [:map [:y [:= 1]]]]
    [2 [:map [:y [:= 2]]]]]
   [:multi {:dispatch :z}
    [3 [:map [:z [:= 3]]]]
    [4 [:map [:z [:= 4]]]]]]
  {:y 1 :z 3}
  options)
=> [1 [3 {:y 1, :z 3}]]

;; left-distributive then right-distributive
(m/parse
  [:merge
   [:multi {:dispatch :y}
    [1 [:map [:y [:= 1]]]]
    [2 [:map [:y [:= 2]]]]]
   [:multi {:dispatch :z}
    [3 [:map [:z [:= 3]]]]
    [4 [:map [:z [:= 4]]]]]]
  {:y 1 :z 3}
  options)
=> [3 [1 {:y 1, :z 3}]]

Copy link
Member

@ikitommi ikitommi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this in really, really good. I'll fix the on exception. 🥇

(mapv (fn [c]
(when-not (and (vector? c)
(= 2 (count c)))
(throw (ex-info "TODO" {})))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO.

@ikitommi
Copy link
Member

I don't know which order to apply the left/right distributive rules makes sense or is the most intuitive. Currently it's right then left. Here's the difference.

The current makes more sense to me.

@ikitommi
Copy link
Member

Actually, the error was not needed, -children of :multi always returns size 3 vector of key properties value. The old code would have failed with merging with entry properties:

[:multi {:dispatch :y}
  [1 {:entry "properties"} [:map [:y [:= 1]]]]
  [2 [:map [:y [:= 2]]]]]

@ikitommi
Copy link
Member

be2c2be

@ikitommi
Copy link
Member

Run some sample tests with this, really happy with it, thanks!

@ikitommi ikitommi merged commit bd1a08e into metosin:master Aug 27, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants