RFC: Allow type inference for const or static #3546

Status: Open (wants to merge 19 commits into base: master)

@Neo-Zhixing Neo-Zhixing commented Dec 21, 2023

Allow type inference for const or static when the RHS is known.

const PI = 3.1415; // inferred as f64
static MESSAGE = "Hello, World!"; // inferred as &'static str
const FN = std::string::String::default; // Inferred as the unnamable type of ZST closure associated with this item. Its type is reported by `type_name_of_val` as ::std::string::String::default

Pre-RFC discussion: #1349

cc #1623

Rendered

@ehuss added the T-lang label (relevant to the language team, which will review and decide on the RFC) on Dec 21, 2023

# Motivation
[motivation]: #motivation

Rust currently requires explicit type annotations for `const` and `static` items.

Member

It would be good to write a little bit about why Rust is like this currently.

Member

I agree. This should have a longer explanation of Rust's "rule" of "no inference in signatures", how this RFC breaks it, and why that is okay.

Author

Great suggestion! I actually don't know why the design was made that way. Maybe someone from the Rust team can help explain?

The "type is missing in const" error was emitted from the parser so my guess would be that it was just difficult to infer types when consts were implemented.

Contributor

It's like this currently because it was decided that all public API points should be "obviously semver stable" rather than "quick to type".

Author

Thanks for explaining this. I've incorporated it into the RFC.

from the initial value. For example:

```rs
const PI = 3.1415; // inferred as f64
```

Member

I would personally prefer that the actual type of numeric or float types be known rather than picking arbitrary defaults (e.g. i32 or f64). I'm not too keen on it in a local context either tbh but there it's mitigated by the compiler being able to infer the real type from the surrounding code most of the time.

const PI = 3.1415; // error
const PI = 3.1415_f64; // inferred as f64

Author

I would prefer more consistent behavior with let bindings. So let's make a simple vote here:
🎉: const PI = 3.1415; // error
🚀: const PI = 3.1415; // inferred as f64

Member

It's not a big issue for me either way. It's just that a few times I've only later realised a type has unexpectedly been made i32. But it's easily fixed and more of an annoyance than a problem per se. It doesn't help that an i32 is very rarely what I want.

Contributor

If it follows the normal rust inference default types then it's at least no worse than what let does.

Member

Well it's worse in the sense that let can use future code to infer the type, so mostly this is a non issue there unless there's a lot of generics involved.
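
A minimal sketch of the difference being described, assuming ordinary `let` inference on stable Rust: a later use can pin an integer literal's type, while a literal with no constraining use falls back to `i32`. The function names are made up for illustration.

```rs
use std::mem::size_of_val;

fn takes_u8(x: u8) -> u8 {
    x
}

/// The later call to `takes_u8` constrains `n` to `u8`.
pub fn size_when_constrained() -> usize {
    let n = 7;
    let _ = takes_u8(n);
    size_of_val(&n)
}

/// Nothing constrains `n`, so the integer fallback type `i32` is used.
pub fn size_when_defaulted() -> usize {
    let n = 7;
    size_of_val(&n)
}
```

A `const` or `static` item has no such "later use" inside its own initializer, which is why only the fallback would be available there.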

Contributor

Honestly I would most like if the const was of {float} type but people aren't ready for that conversation maybe

```rs
const PI = 3.1415; // inferred as f64
static MESSAGE = "Hello, World!"; // inferred as &'static str
const FN_PTR = std::string::String::default; // inferred as fn() -> String
```

Member

technically the const would have the function item's type instead of a function pointer type.
https://doc.rust-lang.org/reference/types/function-item.html
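
A small sketch of that distinction on stable Rust, with hypothetical helper names: each function has its own zero-sized function item type, and a function pointer only appears when a type annotation asks for the coercion.

```rs
use std::mem::size_of_val;

/// `String::default` and `String::new` share the signature `fn() -> String`
/// but have distinct function item types, so storing them side by side
/// requires coercing both to a function pointer via the annotated return type.
pub fn constructors() -> [fn() -> String; 2] {
    [String::default, String::new]
}

/// Without any annotation, the value keeps its function item type, which is zero-sized.
pub fn fn_item_size() -> usize {
    size_of_val(&String::default)
}
```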

Author (@Neo-Zhixing, Dec 22, 2023)

This is actually a good point. I guess there's not much point in coercing the function item's type into a function pointer type for const items. But should we coerce for static items? That way you could reassign the static item with functions of the same signature.

Contributor

I'm confused about why a person would want it as a static at all, so perhaps we shouldn't allow it at all in the first version of this.

@Lokathor maybe they just use the const as a quick switch between two cfg implementations?

Contributor

It makes sense as a const but little sense as a static

Member

When the type is a ZST, like function item types, const and static are essentially interchangeable at runtime. When the type is a function pointer, unless it is wrapped in something with interior mutability, a static is basically just a const with a stable address. IIRC LLVM will still constant-propagate the value to most uses, since it's marked read-only and LLVM therefore knows the value won't change.

Member

Doing implicit coercions into an unknown target type sounds very confusing. If the right-hand side is String::default, then the resulting type should be the actual type of the expression, the function item ZST.

@Aloso

Aloso commented Dec 22, 2023

Both of these drawbacks could be addressed using an allow-by-default clippy lint for const and static types.

Make that a rustc lint, and make it warn by default, unless

  • The type is unnameable
  • The item was generated by a macro
  • The item is private or not accessible outside the crate (i.e. not part of the public API)

Then I'd be very happy with this proposal.

One open question is to what extent type placeholders should be allowed, and if they should silence the lint:

const X: _ = 3.14;
const Y: [_; 4] = [1, 2, 3, 4];

@Aloso

Aloso commented Dec 22, 2023

I'd like to point out that all of the mentioned downsides also apply to function return types. The main difference is that functions support impl Trait as return type, but const/static items do not. Allowing this would make Rust more expressive:

const FOO: impl Fn() -> i32 = || {
    todo!();
};

However, not all `const` or `static` items are public, and explicit typing isn't always important for semver stability.
Requiring explicit typing for this reason seems a bit heavy-handed.

Both of these drawbacks could be addressed using an allow-by-default clippy lint for `const` and `static` types.

Member

I don't think an allow-by-default lint addresses concerns about it being potentially confusing. Having a lint can be useful, but it doesn't address the problem because almost everyone won't be using it (but may be using inference).

Contributor

Fair enough, happy to withdraw this suggestion and go with what you've suggested instead.

@chrysn

chrysn commented Dec 28, 2023

Many of the cases where this would be helpful would also be addressed by type_alias_impl_trait; in particular, it helps with dealing with unnameable types and macro / code generator output, without the drawbacks of loss of clarity and semver trouble (because the trait is guaranteed, and is also the only property usable in other crates).

This may not be an argument against this RFC, but at least warrants discussion in the alternatives.

@kpreid
Contributor

kpreid commented Dec 28, 2023

Many of the cases where this would be helpful would also be addressed by type_alias_impl_trait

That's a good thing to note as an alternative, but it won't help with array lengths, which are one of the ways the current rules frequently create pain (especially for include_bytes!()ed arrays where the length isn't even a property of the same source code file):

static ARRAY_OF_UNINTERESTING_LENGTH = [
    "foo",
    "bar",
    // ...
];

In the cases where a TAIT such as impl AsRef<[T]> could reasonably be used, you can also use a static &[T] slice, but not all cases are like that.
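
A minimal sketch of the two forms available today, with made-up item names: the array form carries its length in the type, while the slice form erases it.

```rs
/// Array form: the length is part of the type and must be kept in sync by hand.
pub static NAMES_ARRAY: [&str; 2] = ["foo", "bar"];

/// Slice form: the length is erased from the type, so adding entries needs no other change.
pub static NAMES_SLICE: &[&str] = &["foo", "bar"];

/// Works with either form, because an array reference coerces to a slice.
pub fn total_len(names: &[&str]) -> usize {
    names.iter().map(|name| name.len()).sum()
}
```

The slice form stops working as soon as a caller needs the length at the type level, which is the case the comment refers to.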

@JarredAllen

I feel like providing explicit types for values should be assumed to be the default, and use of the _ placeholder in types looks better to me:

// I prefer this
const FOO: _ = "hello";
// over this
const FOO = "hello";

The use of placeholders also I think works better in the event of a "partially-nameable" type, e.g.:

struct MyWrapper<T>(T);

// We can name `MyWrapper`, but not its generic argument.
static FOO: MyWrapper<_> = MyWrapper(|| todo!());

Also, for type inference around literals specifically, I'd prefer it not guess between options. I think this could lead to some confusion around things like this:

fn takes_u8(_: u8) {}
fn takes_u16(_: u16) {}

// Demonstrating type inference on local variables.
fn local() {
    let foo = 7;
    // Uncommenting either of these lines works, but uncommenting both results in a compile error.
    // takes_u8(foo);
    // takes_u16(foo);
    // If both lines are commented, `foo` is an i32; otherwise, the uncommented line changes its type.
}

// Demonstrating type inference on static variables.
fn local_static() {
    static foo: _ = 7; // Inferred as i32 since there are no uses.
    // Can uncommenting one of the lines below work?
    // I don't like uses of a static changing its type, but it also feels wrong to let inference change a literal's inferred type based on uses for locals, but not for statics.
    // takes_u8(foo);
    // takes_u16(foo);
}

I'd prefer if literals still have to be explicitly annotated (either on the static definition or in the literal itself).

Both of these drawbacks could be addressed using an allow-by-default clippy lint for const and static types.

Given that the main use-case IMO is for types that can't be named, I think making a deny-by-default lint in rustc for a placeholder in the place of anything nameable (except for macro-generated code) would also be a good idea. And once TAIT is stabilized, I think it should expand to also lint any type in a public API (but not array lengths), regardless of whether it's macro-generated or nameable, since that can be used instead.

@Neo-Zhixing
Author

Neo-Zhixing commented Jan 9, 2024

I do think that requiring at least a _ placeholder is a good idea. It nudges people to specify explicit types when they can by making const inference "opt-in".

Given the semver compatibility concerns, I also think that it's a good idea to require numerical types used for inference to have explicit typing.

I've updated the RFC to reflect this.

However, I don't think a deny-by-default lint is a good idea. Some types can just be too cumbersome to type, and I think we should leave this decision up to the users.

@tmccombs

However, I don't think a deny-by-default lint is a good idea. Some types can just be too cumbersome to type, and I think we should leave this decision up to the users.

in that case I think it would be totally reasonable to add an #[allow(...)] attribute to the item. And I think such a lint should (at least by default) only apply to public items.

@Neo-Zhixing
Author

Updated.

@tmccombs Ok yes I guess an #[allow(...)] would be fine here, but let's make it a deny-by-default clippy lint instead of a rustc lint. It's a relatively easy fix and I don't want this to intrude into the developer experience too much.

@Kolsky

Kolsky commented Jan 12, 2024

Adding prior art for const array length inference with a macro, regardless of whether this RFC ends up accepted or not: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=121942080c9bb1d9a7a83bfbf315e6c2
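
For readers who cannot open the playground, here is one way such a macro could look. This is an independent sketch, not the code behind the link; the macro name `static_array!` and its input syntax are assumptions made for illustration.

```rs
/// Declares a static array whose length is computed from the element list.
/// The element list is expanded twice: once to count, once as the value.
macro_rules! static_array {
    ($vis:vis static $name:ident: [$ty:ty] = [$($elem:expr),+ $(,)?];) => {
        $vis static $name: [$ty; [$($elem),+].len()] = [$($elem),+];
    };
}

static_array! {
    pub static ARR: [u8] = [12, 23, 34, 45];
}

/// The length is still available at the type level as `[u8; 4]`.
pub fn arr_copy() -> [u8; 4] {
    ARR
}
```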

Other points were already discussed, but just to summarize: globals should have a sandboxed inference context, where their type is fully known after all constraints in the const expr block have been applied; i.e. neither default types for literals nor implicit casts should be allowed. Otherwise it'd be confusing and inconsistent at best.

const PI = 3.1415926; should not compile, add a literal suffix.
const NEWS = String::new; should have function item type.

There was no mention of associated consts in trait impls; perhaps it's best to leave them as is for now. However, since their type is known from the trait definition, it's a future possibility.

@Lokathor
Contributor

the NEWS example being a fn actually would be an implicit conversion still

@Kolsky

Kolsky commented Jan 12, 2024

the NEWS example being a fn actually would be an implicit conversion still

Function item type is different from fn pointer type. It'd have an unnameable type of ZST closure associated with this item. Its type is reported by type_name_of_val as ::alloc::string::String::new, but it can't be written like this.
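
A short sketch for checking this on stable Rust. It uses `std::any::type_name` through a hypothetical helper `name_of`; the exact strings are diagnostic output and not guaranteed to be stable, so they are not reproduced here.

```rs
use std::any::type_name;

/// Returns the compiler's diagnostic name for the type of `value`.
pub fn name_of<T>(_value: &T) -> &'static str {
    type_name::<T>()
}

/// Name of the unnameable function item type of `String::new`.
pub fn fn_item_name() -> &'static str {
    name_of(&String::new)
}

/// Name of the function pointer type obtained by an explicit coercion.
pub fn fn_pointer_name() -> &'static str {
    let pointer: fn() -> String = String::new;
    name_of(&pointer)
}
```

The two names differ, which shows that the uncoerced value and the `fn() -> String` pointer really have different types.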

@Ixrec
Contributor

Ixrec commented Jan 13, 2024

There should be some discussion of RFC #2010, why that was closed, and what makes this RFC an improvement. At the moment it's not clear if anything's changed since then.

After re-skimming that old thread, I believe the main factual/technical issue with the idea is that, when some concrete data was gathered, it implied this change would be far less helpful than anyone expected. Apparently i32 consts are so rare that in most "trivial" cases where the const is a single numeric literal, we'd only be adding the option to e.g. rewrite const FOO: usize = 1; as const FOO = 1usize; but you'd still need to write the type on the literal, so it wouldn't even achieve the stated goal of writing fewer explicit types. And this was the only non-vague reason cited for "postponing" it.

@tmccombs

From what I understand, the main motivation for this is that it allows you to have consts and statics with unnameable types. Data on current usage of const and static isn't very useful for that, since you can't currently have consts or statics with those values.

Some additional points:

  • what can be done in a const expression has greatly expanded since 2017, which makes a complicated or unnameable type more likely
  • it is now possible to use impl types in function returns (although having impl type aliases could provide a similar solution for const and statics)
  • const and static aren't necessarily at the top level. It feels weird that the type can be elided on a let statement inside a function, but not on a const or static inside a function
  • that RFC was closed as postponed, not completely rejected

@Neo-Zhixing requested a review from Noratrieb (January 18, 2024 07:11)

@afetisov

afetisov commented Feb 1, 2024

Strong no on this one.

The motivation is very weak.

Sometimes the constant's value is complex, making the explicit type overly verbose.

Use a type alias. If your const type is too complex, it will make it painful to use in downstream code. You can infer a complex type, but you can't ignore it, and you must still bend your code to fit the specific type.

You also don't need to put that complex type into a const item. You can use a function, or maybe a const fn, which is in many ways easier to use (e.g. it can be generic).
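
A minimal sketch of the function-based alternative, with hypothetical names: the closure's type cannot be written, so the function exposes only the trait it implements.

```rs
/// The closure type is unnameable; callers only learn that it implements `Fn(i32) -> i32`.
pub fn doubler() -> impl Fn(i32) -> i32 {
    |x| x * 2
}

/// Downstream code can accept the value generically without naming its type.
pub fn apply_twice(f: impl Fn(i32) -> i32, x: i32) -> i32 {
    f(f(x))
}
```

Unlike a `static`, each call to `doubler()` produces a fresh value with no fixed address.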

In some cases, the type may be unnamable.

That should be fixed by making impl Trait properly work with consts. Nothing fundamental is blocking it, it just isn't done yet.

When creating macros, the precise type might not be known to the macro author.

But it should be known to the macro user, and the author can just ask to supply that type. It will also likely make the macro more readable.

Code generators may not have enough information to easily determine the type.

Same as above. Also nothing forces you to declare items as a const. Code generator can just directly inline the item. If the const item is part of the public API, then I definitely want the generated code to contain the type. If the code generator's author can't reasonably define the type, knowing full well how their generator works, how the hell am I supposed to deduce it from a barf of generated code?

This change aims to make Rust more expressive, concise and maintainable

There is literally no gain in expressivity, the same code is possible. It is more concise, yes, but conciseness on its own isn't a virtue. Not at the cost of readability. And it's definitely less maintainable, because you have less information to go with, and errors are easier to make.

That's a good thing to note as an alternative, but it won't help with array lengths, which are one of the ways the current rules frequently create pain (especially for include_bytes!()ed arrays where the length isn't even a property of the same source code file)

If you don't care about the length of the array, don't type it as an array. Use a slice. This is particularly relevant with include_bytes!, which can easily include many kilobytes of code, which would make it a terrible array (slow to copy, prone to stack overflow). Putting an include_bytes! array into a const is just asking for an unexpected stack overflow in your code.


The types of all literals must be fully constrained, which generally means numeric literals must either have a type suffix, or the type must specify their type

Numeric consts are the most common kinds of consts out there. If I can't omit the type on such consts (and to be clear, omitting it would just cause ambiguity and type/semantic errors), then that slashes a major reason to introduce this feature. It also means that I can't omit any type which contains a number, e.g. the element type of arrays.

When declaring a top-level item, the typing may not be entirely omitted.

Why? It doesn't serve any purpose. You still lose the explicit typing, and you still must remember to type redundant information. With lifetime elision, the point of explicit '_ is to signify that the function is lifetime-generic. With const types, there is no information supplied: of course a const item has some type!

static MESSAGE: _ = "Hello, World!";

There are some proposals to make the string syntax polymorphic. E.g. it could define a &str or String depending on the inferred type. This proposal would block any string literal polymorphism, so that must be called out explicitly.

const FN: _ = std::string::String::default;

String::default isn't a closure, it's a specific function. There is no way currently to specify the unique function type, but there is no reason why it shouldn't exist, given a reasonable syntax and compelling use cases.

Which also shows that the const type elision is harmfully ambiguous in this case. Did the author intend to use a unique function type? A closure type? An erased fn() -> String type? Not everyone even knows that functions and closures have unique types, which are different from function pointers!

You may say "what if I use the closure syntax". Well, a closure in a const item can't capture any context, so it's by definition coercible to a function pointer. So what should const F = || (); be? Is it a zero-sized closure? A function pointer? They will behave differently in code.
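
A sketch of what the two readings mean in today's Rust, assuming a 64-bit or 32-bit target where function pointers are pointer-sized: with the mandatory annotation the closure is coerced to a function pointer, while an unannotated local keeps the zero-sized closure type.

```rs
use std::mem::size_of_val;

/// The annotation is required today, and the non-capturing closure coerces to it.
pub const AS_POINTER: fn() -> i32 = || 7;

/// Size of the coerced function pointer.
pub fn pointer_size() -> usize {
    size_of_val(&AS_POINTER)
}

/// Size of the closure when nothing asks for a coercion.
pub fn closure_size() -> usize {
    let closure = || 7;
    size_of_val(&closure)
}
```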

static ARR: [u8; _] = [12, 23, 34, 45];

Ok, now I want to use that array in my downstream code. I need to know its exact length, because my API requires an array of specific size. Am I supposed to manually count the array size each time?

For the original author, writing the array size is trivial. Just write 0, and the compiler will complain that it doesn't match the actual size 4. Write it down, done. It's a trivial annoyance, and it doesn't matter much unless you change the lengths of your arrays frequently (in which case it's a semver break anyway, so it should be made explicit and done deliberately, rather than the compiler silently swallowing the breaking change).

You're trading a trivial ease-of-writing change into an issue for each one of your downstream consumers.


That said, I would be more open to type inference in const & static items which are strictly local to a function. Those are definitely not part of a public API, and any inference issues aren't harder to resolve than for a simple let binding. It's basically like a local variable, but which is known to be const at compile time, or which has a stable address.
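
For reference, a small sketch of function-local items as they must be written today. The function `next_id` is a made-up example; both items are invisible outside the function, yet both need a full annotation.

```rs
use std::sync::atomic::{AtomicU32, Ordering};

/// Hands out increasing identifiers from a counter private to this function.
pub fn next_id() -> u32 {
    // A local static: one shared instance with a stable address.
    static COUNTER: AtomicU32 = AtomicU32::new(0);
    // A local const: known at compile time.
    const STEP: u32 = 1;
    COUNTER.fetch_add(STEP, Ordering::Relaxed)
}
```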

However, given that the examples above present more cases for confusion than assumed by the RFC, I'm not so certain that the minor possible simplifications would pull the weight of a new feature.

I also don't think it's a good idea to introduce const type inference for private items (i.e. not part of the crate's public API). Yes, the semver issues don't exist, so there is less motivation to oppose it, but all the other issues are still present. It's a matter of protecting inexperienced users from themselves, for the same reason that we don't infer signatures of private functions, even though we could. Without a clearly defined small inference scope, type inference becomes a footgun, as demonstrated by languages such as Haskell and OCaml, where the best practice is to annotate all toplevel items, even though the compiler can infer their types.

@VorpalBlade

VorpalBlade commented Feb 1, 2024

The array length case would simplify for me. I have had a few cases where I have static private (to the module) arrays. These can grow or shrink pretty often as they represent "passes" or "checks" of the code that it iterates over.

A recent example is the code implementing an environment sanity check in my command line program. It has an array of (check name string, function pointer that implements check). There is no good reason to have to update the number as well.
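
A sketch of the kind of table described, with invented check names and stand-in bodies. The `2` in the type is the number that has to be updated by hand whenever a check is added or removed.

```rs
type Check = fn() -> bool;

// Stand-in checks; a real program would inspect the environment here.
fn has_home() -> bool {
    true
}

fn has_config() -> bool {
    false
}

/// Table of (check name, function pointer implementing the check).
static CHECKS: [(&str, Check); 2] = [("home", has_home), ("config", has_config)];

/// Runs every check and collects the names of those that failed.
pub fn failed_checks() -> Vec<&'static str> {
    let mut failed = Vec::new();
    for (name, check) in CHECKS.iter() {
        if !check() {
            failed.push(*name);
        }
    }
    failed
}
```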

Is it a fairly minor annoyance? Yes. Perhaps not worth the semver hazard that exists for public items. But I rarely make globally visible constants, most constants are module local in my programs.

So I can see both sides of the argument.

@tmccombs

tmccombs commented Feb 2, 2024

Use a type alias. If your const type is too complex, it will make it painful to use in downstream code. You can infer a complex type, but you can't ignore it, and you must still bend your code to fit the specific type.

For a public const, sure. But for a local const or static, that seems like overkill.

You also don't need to put that complex type into a const item. You can use a function, or maybe a const fn, which is in many ways easier to use (e.g. it can be generic).

That isn't the case for a static.

But it should be known to the macro user, and the author can just ask to supply that type. It will also likely make the macro more readable.

Not necessarily. For example, what if the type is the return type of calling a user-supplied generic function with a type that is generated by the macro? And even if it is known, in some cases it could significantly hurt the ergonomics of the macro to require the user to supply it.

Also nothing forces you to declare items as a const. Code generator can just directly inline the item.

Again, that is not the case for statics.

If the code generator's author can't reasonably define the type, knowing full well how their generator works, how the hell am I supposed to deduce it from a barf of generated code?

Often you don't really care that much about the actual type, just a trait that it implements.

There is literally no gain in expressivity, the same code is possible.

Not unless/until we get impl Trait type aliases that are usable on consts and statics, in the case of unnameable types.

It is more concise, yes, but conciseness on its own isn't a virtue. Not at the cost of readability. And it's definitely less maintainable

If your type is a combination of a bunch of combinators for an internal or local const or static, I don't think having a large complex type is very readable or maintainable.

If you don't care about the length of the array, don't type it as an array

What if you need to pass the array as an array, not a slice, to a function that is generic in the size of the array?
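
A minimal sketch of that situation using const generics, with hypothetical names: the callee takes the array by value and is generic over its length, so a slice would not fit.

```rs
static WEIGHTS: [u32; 3] = [1, 2, 3];

/// Generic over the array length `N`; a `&[u32]` slice cannot be passed here.
pub fn sum_all<const N: usize>(values: [u32; N]) -> u32 {
    values.iter().sum()
}

/// `N` is inferred as 3 from the static's declared type.
pub fn sum_weights() -> u32 {
    sum_all(WEIGHTS)
}
```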

Numeric consts are the most common kinds of consts out there.

Just because a use case isn't the most common use case doesn't mean it isn't important.

Ok, now I want to use that array in my downstream code. I need to know its exact length, because my API requires an array of specific size. Am I supposed to manually count the array size each time?

I would hope that the generated documentation would document the actual inferred type, not just the type as written in code.

All that said, if we get impl Trait type aliases in the near future, I'm not entirely convinced either way for top-level const/static. I do really want inference for local const/static though.

@Kolsky

Kolsky commented Feb 2, 2024

Use a type alias

There is exactly one place in the code where I want to declare, say, a static with the type OnceLock<Arc<RwLock<HashMap<String, SystemTime>>>, impl FnOnce() -> Arc<RwLock<HashMap<String, SystemTime>>>>. I only interact with it through a specific public API that manages this singleton. How exactly is a type alias supposed to help? For a bunch of singletons this would be a pain to write.

But it should be known to the macro user, and the author can just ask to supply that type.

A macro doesn't know any types whatsoever. It only works with the source text. Supplying the type could be an option in some cases, unless it creates too much of a burden on the macro author and/or user.

Numeric consts are the most common kinds of consts out there.

Given complex expressions on numeric consts, you'd have to add a literal suffix. Most numeric APIs reside in methods that are useful too often. Consts that use other consts are also a very common scenario. They do not require a literal, so only one const would have to be explicitly typed, the rest can benefit from it. I don't see why the workflow must be limited to plain const literals.

static ARR: [u8; _] = [12, 23, 34, 45];

Ok, now I want to use that array in my downstream code. I need to know its exact length

You can do the exact same thing you've proposed: add an anonymous const, then just write 0 in the explicit type, and the compiler will emit the correct value. Also, why would you care about the exact value if the maintainer doesn't?
Besides, it is already trivially possible to hide the array length behind the macro I've left in this thread.

There are some proposals to make the string syntax polymorphic.

Those suck. It's a non-goal to hide a performance footgun behind syntax sugar. Flaky type inference combined with a general misunderstanding of how it works would constantly lead to exactly that, creating a nightmare for code reviewers. We already do not insert any implicit clones for the same reason. And, as you notice, no matter how useful Copy is and how rarely it is the source of problems, it can still easily introduce stack overflows. Fstrings/custom prefixes would be a far better solution.

If you don't care about the length of the array, don't type it as an array. Use a slice.

Thank you for the advice, but please let me decide for myself.

That said, I would be more open to type inference in const & static items which are strictly local to a function.

Unfortunately statics can be used in several user-facing functions. You can't declare an item local to multiple functions.


I'd say it deserves experimentation. People can try it out, and that would give usage data, which would determine whether it's useful or not.

@FrankHB

FrankHB commented Mar 29, 2024

Use a type alias. If your const type is too complex, it will make it painful to use in downstream code. You can infer a complex type, but you can't ignore it, and you must still bend your code to fit the specific type.

For a public const, sure. But for a local const or static, that seems like overkill.

No. This has nothing to do with visibility or locality, but with the intent of the API author. Until the language supports a way to mark items as "unspecified" explicitly in the syntax, it is legitimate to encode the concrete type in the public code and to document for users that parts of it are implementation details and subject to change.

@caseyross

Here's my perspective as someone new to Rust:

I was confused by the compiler asking me to add a type for a const string literal, as it seemed like the type would be already known to the compiler.

I would support making type declarations optional for any const/static where there is exactly 1 possible type. I think this is an unimpeachable ergonomics improvement. However, as mentioned earlier in the thread, there is an argument against it if, through unrelated RFCs, there exists a possibility of making currently unambiguous types ambiguous in the future. (Though doing something like that sounds like a bad idea in the first place.)

I wouldn't support automatic type inference for numeric literals or other values where the literal value could fit multiple types. I think the user should always know what type is coming out of the const/static declaration, and the compiler should not be making any choices regarding that type.
