Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

auto keyword for vars #851

Merged
merged 14 commits into from
Jan 10, 2022
361 changes: 361 additions & 0 deletions proposals/p0851.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,361 @@
# Variable type inference

<!--
Part of the Carbon Language project, under the Apache License v2.0 with LLVM
Exceptions. See /LICENSE for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

[Pull request](https://github.com/carbon-language/carbon-lang/pull/851)

<!-- toc -->

## Table of contents

- [Problem](#problem)
- [Background](#background)
- [Variable type inference in other languages](#variable-type-inference-in-other-languages)
- [Carbon equivalents](#carbon-equivalents)
- [Pattern matching](#pattern-matching)
- [Carbon equivalents](#carbon-equivalents-1)
- [Pattern matching on `if`](#pattern-matching-on-if)
- [Proposal](#proposal)
- [Open questions](#open-questions)
- [Inferring a variable type from literals](#inferring-a-variable-type-from-literals)
- [Rationale based on Carbon's goals](#rationale-based-on-carbons-goals)
- [Alternatives considered](#alternatives-considered)
- [Elide the type instead of using `auto`](#elide-the-type-instead-of-using-auto)
- [Use `_` instead of `auto`](#use-_-instead-of-auto)

<!-- tocstop -->

## Problem

Inferring variable types is a common, and desirable, use-case. In C++, the use
of `auto` for this purpose is prevalent. We desire this enough for Carbon that
we already make prevalent use in example code.

## Background

Although [#826: Function return type inference](p0826.md) introduced the `auto`
keyword for `fn`, the use type inference in `var` should be expected to be more
prevalent.

In C++, `auto` can be used in variables as in:

```cpp
auto x = DoSomething();
```

In Carbon, we're already using `auto` extensively in examples in a similar
fashion. However, it's notable that in C++ the use of `auto` replaces the actual
type, and is likely there as a matter of backwards compatibility.

### Variable type inference in other languages

[#618: var ordering](p0618.md) chose the ordering of var based on other
languages. Most of these also provide inferred variable types.

Where the `<identifier>: <type>` syntax matches, here are a few example inferred
types:

- [Kotlin](https://kotlinlang.org/docs/basic-syntax.html#variables):
`var x = 5`
- [Python](https://docs.python.org/3/tutorial/introduction.html): `x = 5`
- [Rust](https://doc.rust-lang.org/std/keyword.mut.html): `let mut x = 5;`
- [Swift](https://docs.swift.org/swift-book/LanguageGuide/TheBasics.html):
`var x = 5`

Ada appears to
[require](https://learn.adacore.com/courses/intro-to-ada/chapters/strongly_typed_language.html#what-is-a-type)
a variable type in declarations.

For different syntax, [Go](https://tour.golang.org/basics/14) switches from
`var x int = 5` to `x := 5`, using the `:=` to trigger type inference.

#### Carbon equivalents

In Carbon, we have generally discussed:

```carbon
var x: auto = 5;
```

However, given precedent from other languages, we could omit this:

```carbon
var x = 5;
```

### Pattern matching

As we consider variable type inference, we need to consider how pattern matching
would be affected.

[Swift's patterns](https://docs.swift.org/swift-book/ReferenceManual/Patterns.html)
support:

- Value-binding patterns, as in:

```swift
switch point {
case let (x, y):
...
}
```

- Expression patterns, as in:

```swift
switch point {
case (0, 0):
...
case (x, y):
...
}
```

- **Combining** the above, as in:

```swift
switch point {
case (x, let y):
...
}
```

#### Carbon equivalents

With Carbon, we have generally discussed:

- Value-binding patterns, as in:

```carbon
match (point) {
case (x: auto, y: auto):
...
}
```

- Expression patterns, as in:

```carbon
match (point) {
case (0, 0):
...
case (x, y):
...
}
```

- **Combining** the above, as in:

```carbon
match (point) {
case (x, y: auto):
...
}
```

However, it may be possible to mirror Swift's syntax more closely; for example:

- Value-binding patterns, as in:

```carbon
match (point) {
case let (x, y):
...
}
```

- Expression patterns, as in:

```carbon
match (point) {
case (0, 0):
...
case (x, y):
...
}
```

- **Combining** the above, as in:

```carbon
match (point) {
case (x, let y):
...
}
```

In the above, this presumes to allow `let` inside a tuple in order to indicate
the variable-binding for one member of a tuple.

### Pattern matching on `if`

In Carbon, it's been suggested that `if` could support testing pattern matching
when there's only one case, for example with `auto`:

```carbon
match (point) {
case (x, y: auto):
// Pattern match succeeded, `y` is initialized.
default:
// Pattern match failed.
}

=>

if ((x, y: auto) = point) {
// Pattern match succeeded, `y` is initialized.
} else {
// Pattern match failed.
}
```

A `let` approach may still look like:

```carbon
match (point) {
case (x, let y):
// Pattern match succeeded, `y` is initialized.
default:
// Pattern match failed.
}

=>

if ((x, let y) = point) {
// Pattern match succeeded, `y` is initialized.
} else {
// Pattern match failed.
}
```

Some caveats should be considered for pattern matching tests inside `if`:

- There is syntax overlap with C++, which allows code such as:

```carbon
jonmeow marked this conversation as resolved.
Show resolved Hide resolved
if (Derviced* derived = dynamic_cast<Derived*>(base)) {
jonmeow marked this conversation as resolved.
Show resolved Hide resolved
// dynamic_cast succeeded, `derived` is initialized.
} else {
// dynamic_cast failed.
}
```

- Syntax is similar to `match` itself; it may be worth considering whether the
feature is
[sufficiently distinct and valuable](/docs/project/principles/one_way.md) to
support.

## Proposal

Carbon should offer variable type inference using `auto` in place of the type,
as in:

```carbon
var x: auto = DoSomething();
```

At present, variable type inference will simply use the type on the right side.
In particular, this means that in the case of `var y: auto = 1`, the type of `y`
is `IntLiteral(1)` rather than `i64` or similar. This
[may change](#inferring-a-variable-type-from-literals), but is the simplest
answer for now.

## Open questions

### Inferring a variable type from literals

Using the type on the right side for `var y: auto = 1` currently results in a
constant `IntLiteral` value, whereas most languages would suggest a variable
integer value. This is something that will be considered as part of type
inference in general, because it also affects generics, templates, lambdas, and
return types.

## Rationale based on Carbon's goals

- [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write)
- Frequently code will have the type on the line, as in
`var x: auto = CreateGeneric[Type](args)`. Avoiding repetition of the
type will reduce the amount of code that must be read, and should allow
readers to comprehend more quickly.
- [Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code)
- The intent is to have `auto` work similarly to C++'s type inference, in
order to ease migration.

## Alternatives considered

### Elide the type instead of using `auto`

As discussed in background,
[other languages allow eliding the type](#variable-type-inference-in-other-languages).
Carbon could do similar, allowing:

```carbon
var y = 1;
```

Advantages:

- Forms a cross-language syntax consistency with languages that put the type
on the right side of the identifier.
- This is particularly strong with Kotlin and Swift, because they also use
`var`.
- Use of `auto` is currently unique to C++; Carbon will be extending its use.

Disadvantages:

- Type inference becomes the _lack_ of a type, rather than the presence of
something different.
- C++'s syntax legacy may have needed `auto` in order to provide type
inference, where Carbon does not.
Comment on lines +308 to +313
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be good to also capture that this creates the potential for ambiguity with expression patterns as well. There are possible ways of addressing that ambiguity (some of which you get into in the background section), but I feel like its still an important consideration in the tradeoff here. @zygoloid might even be able to suggest something here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a bullet for that, linking the background.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mid-air collision between your update and my suggestion. Here's what I'd written up (which looks similar enough to your new bullet):

Suggested change
Disadvantages:
- Type inference becomes the _lack_ of a type, rather than the presence of
something different.
- C++'s syntax legacy may have needed `auto` in order to provide type
inference, where Carbon does not.
Disadvantages:
- Type inference becomes the _lack_ of a type, rather than the presence of
something different.
- C++'s syntax legacy may have needed `auto` in order to provide type
inference, where Carbon does not.
- Removes syntactic distinction between binding of a new name and reference
to an existing name. Reintroducing this distinction would likely require
additional syntax, such as Swift's nested `let`.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting here -- replaced mine with @zygoloid's


We expect there will be long-term pushback over the cross-language
inconsistency, particularly as the `var <identifier>: auto` syntax form will be
unique to Carbon. However, there's a strong desire not to have the lack of a
type mean the type will be inferred.

### Use `_` instead of `auto`

We have discussed using `_` as a placeholder to indicate that an identifier
would be unused, as in:

```carbon
fn IgnoreArgs(_: i32) {
// Code that doesn't need the argument.
}
```

We could use `_` instead of `auto`, leading to:

```carbon
var x: _ = 1;
```

However, removing `auto` entirely would also require using it when inferring
jonmeow marked this conversation as resolved.
Show resolved Hide resolved
function return types. For example:

```carbon
fn InferReturnType() -> _ {
return 3;
}
```

Advantages:

- Reduces the number of keywords in Carbon.
- Less to type.
- The incremental convenience may ameliorate the decision to require a
keyword for type inference.

Disadvantages:

- There's a feeling that `_` means "discard", which doesn't match the
semantics of inferring a return type, since the type is not discarded.
- There may be some ambiguities in handling, such as `var c: (_, _) = (a, b)`.
- The reduction of typing is unlikely to address arguments that the keyword is
not technically required.

The sense of "discard" versus "infer" semantics is why we are using `auto`.