Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] dependent types, restricted types or generalization of the range types #172

Open
cooldome opened this issue Sep 30, 2019 · 8 comments

Comments

@cooldome
Copy link
Member

Nim currently has the concept of the range type that is used actively to express
entities like positive number and strictly positive number. Unfortunately, it is
restricted only to ordinal and float types. I would like to generalize this idea to
cover more use cases and express existing range types in more generic terms.

Example of use cases

This proposal aims to be quite generic and cover existing range types and many more.

Random examples:

  1. Not empty sequence or sorted sequence as a type.
    Sorted seq sequence or non empty sequence is still a sequence hence distinct is not useful way to describe it. Functions working with any sequence should still work with sorted/non empty sequence.

  2. Object with two date fields and one date must always be greater than another one.
    Object contains 2 sequences but their lengths should always be equal.

The implementation proposal

The idea to add type new type bound operation =check that can be attached to any type that can introduce arbitrary restriction on actual values. Signature:

proc `=check`(x: MyCheckedType)

=check proc has one argument and no return type, but proc can raise exception or defect if value is not within range. =check can be attached to the generic types too:

proc `=check`[T](x: MyCheckedType[T])

Check for constant values should be done at compile time. For runtime values =check calls are injected. Transformation phase of Nim will inject =check calls in the right places.
In similar fashion what it is currently done for range types.

Type coercion rules

  1. Checked type is implicitly convertible to its underlying type.

  2. Underlying type is implicitly convertible to checked type with =check call injected. Const values trigger immediate =check evaluation.

  3. Two checked types with the same underlying non distinct type are implicitly convertible to each other with =check call injected. Const values trigger immediate =check evaluation.

  4. Checked type argument match is a better match then underlying type.

Examples

With this proposal range types can now be implemented completely in the user case, extra compiler support no longer required:

type range*[S: static[Slice]] = S.T

func `=check`[S](x: range[S]) =
  when nimvm or compileOption("boundChecks"):
    if x < S.a or x > S.b:
      sysFatal(RangeError, "...")

func low*[S](x: range[S]): range[S] = S.a
func high*[S](x: range[S]): range[S] = S.b

Natural type integer can be defined as:

type Natural = range[0..high(int)]
type PositiveFloat = range[0.0..high(float)]

This will work now too, while previously it didn't:

const myarray = [1, 2, 3]
type MyIndex = range[0..<len(myarray)] # ..< is not supported in existing nim devel

Above example with non empty sequence, can be implemented as:

type NonEmptySeq[T] = seq[T]

proc `=check`[T](x : NonEmptySeq[T]) =
  when nimvm or compileOption("boundChecks"):
    if x.len > 0:
      sysFatal(ValueError, "...")

Summary of the changes in the compiler:

  1. Ast nodes removed as no longer required: nkChckRangeF, nkChckRange64, nkChckRange

  2. transf phase will inject =check calls where necessary.
    For const expressions =check calls will be evaluated at compile time in semfold.

  3. tyRange type is removed along with a handful of code in the sigmatch, semtypes.
    Some of the code will be reused as =check injections needs to be done in the same places
    as current nkChckRange checks.

  4. Some magics like mInc, mDec, mAddI, mSubI, mSucc, mPred will need =check inject on its result value.
    Transf will do this work too.

  5. Range type is now implemented in the user space as in the example above, still part of system.nim.

  6. No new type kind is required to be introduced. tyAlias can be used to attach check operation to.
    Looks like tyAlias in the best candidate for this work, here is why. When type is introduced you don't
    know if it wil be checked or not:

      type NonEmptySeq[T] = seq[T]

    It will be tyAlias after sempass of type section. Only after the =check operation will be discovered
    you will know it is a checked type. You will have two options: leave tyAlias kind as is or change its kind to new tyChecked on =check occurrence. I don't know which one is better, let me know if you have strong opinion. I am leaning towards first option leave tyAlias as is since tyAlias also works well with type coercion rules specified above.

Further details

=check can be called explicitly by serialization libraries to trigger check of deserialized values to
match code expectation. Calling =check proc for types that don't have =check operation defined is simply noop.

For object types that consist of many fields, it is not uncommon to initialize object element by element:

var a: MyObject
a.field1 = ...
a.field3 = ...
a.field3 = ...

This code won't work if MyObject is a checked type. As =check will be called too early on line var a: MyObject. This problem intersects with RFC 49 (#49). The idea is to use existing pragma noInit to delay or remove object is Initialized check to first object usage or result return instead:

var a {.noInit.}: NonEmptySeq[int] # no check on this line

I see 2 ways of how noInit can be implemented: =check and is initialized checks are ignored completely or the checks are delayed to the object usage.
Assigning fields of the object, doesn't count as object usage.

Caveats

This proposal makes active use of static[T] types. static[T] bugs gets more apparent.
I managed to hit some of the issues during preparation of this proposal.

@zah
Copy link
Member

zah commented Sep 30, 2019

This is an interesting proposal, but I don't think it will work out with regular type aliases. Nim has a lot of internal code that tries to eliminate the intermediate named types in a long chain of type aliases. The spec defines that an alias of a type should be considered the same type in every possible sense.

What could work better is to model the checked types as distinct types, while introducing additional general conveniences for creating distinct types with implicit conversions. At some point in the past, Nim had an undocumented support for the following nicer syntax for distinct types:

type 
  ImmutableString = distinct string without `[]=`, `add`
  ID = distinct int with `>` 
  CheckedSeq = distinct seq with *

It's easy to add additional syntax sugar for listing invariants (the checks) and for controlling the implicitness of the conversions.

In general, this functionality is closely related to another idea of having "flow-dependent" types. As a simple example, consider code like this:

var res = getSomeOption()
if res.isSome:
  echo res.get

A type system that exploits flow-dependent types will be able to notice that the res variable is proven to hold a value in the branch of the code where res.get is called and thus no further type checks have to be inserted. To achieve this property, you need a type system that is be able to reason about known constraints and predicates. Ada Spark for example is able to do this by relying on a SMT solver such as Z3.

There are many possible approaches to design such a feature - from automated ones where the compiler is deriving its knowledge from the written code to more formalized ones where the programmer must use some kind of language for combining and managing the "proofs" for the properties of interest being expressed. One has to be very careful to strike the right balance between ergonomics (simplicity), efficiency and expressive power.

@mratsim
Copy link
Collaborator

mratsim commented Oct 1, 2019

Moving to RFC repo.

edit: but I can't, I don't have rights there :/

@cooldome
Copy link
Member Author

Hi Mratsim,
I can't transfer it either. RFCs repo is not in the list of for transfer.
Can I somehow give you the permission?

@narimiran narimiran transferred this issue from nim-lang/Nim Oct 10, 2019
@awr1
Copy link

awr1 commented Apr 5, 2020

IMO a compiler option to globally disable runtime dependent typechecks might be prudent.

For @zah's point about the type aliases: perhaps checkable types should have a pragma {.checkable.} to them, preventing them from being alias-able, as one can still make aliases to distinct types.

Alternatively, permit type aliases like we already do, but specify that check procedures must be applied to a "real" type, and not an alias to one.

@github-actions
Copy link

This RFC is stale because it has been open for 1095 days with no activity. Contribute a fix or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the Stale label May 26, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 2, 2023
@omentic
Copy link

omentic commented Jun 23, 2023

I'd like to see this reopened, ranges currently have problems this RFC would address.

@metagn
Copy link
Contributor

metagn commented Jun 23, 2023

To keep discussion fresh and up to date it might be better to remake as a new RFC in general

@omentic
Copy link

omentic commented Jun 23, 2023

Sounds good. This RFC was not actually about dependent types but a runtime-focused refinement types system, FWIW.

I suppose these would be somewhat of a dual to (old) concepts: concepts are compile-time predicates while these checked types are primarily runtime. I'm going to mull over if it would be worth it to unify these: perhaps as =check and =staticcheck hooks? Though a better approach may be based on concepts...

Rust has somewhat of an equivalent feature being played around with (rust-lang/rfcs#671), though I have my doubts it'll make it into core being such a runtime burden. I strongly suspect that static verifiers eg. Prusti will be pushed for instead.

I think that there's a really interesting possibility to leverage DrNim for such a feature. Proper dependent types are almost certainly not useful for a language like Nim (I don't think they're terribly useful outside of anything but theorem proving: though they are very cool) but a good system of refinement types could be leveraged to remove bounds checks at the cost of (dramatically) increased compile times.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants