diff --git a/proposals/p3980.md b/proposals/p3980.md
new file mode 100644
index 0000000000000..bbc7c0eae7c72
--- /dev/null
+++ b/proposals/p3980.md
@@ -0,0 +1,397 @@
+# Singular `extern` declarations
+
+[Pull request](https://github.com/carbon-language/carbon-lang/pull/3980)
+
+## Table of contents
+
+- [Abstract](#abstract)
+- [Problem](#problem)
+- [Background](#background)
+- [Proposal](#proposal)
+- [Details](#details)
+    - [Type coherency](#type-coherency)
+    - [Using imported declarations](#using-imported-declarations)
+    - [`has_extern` modifier](#has_extern-modifier)
+    - [Versus current state](#versus-current-state)
+- [Rationale](#rationale)
+- [Alternatives considered](#alternatives-considered)
+    - [Allow multiple `extern` declarations, remove the import requirement, or both](#allow-multiple-extern-declarations-remove-the-import-requirement-or-both)
+    - [Total number of allowed declarations (`extern` and non-`extern`)](#total-number-of-allowed-declarations-extern-and-non-extern)
+        - [Do not restrict the number of forward declarations](#do-not-restrict-the-number-of-forward-declarations)
+        - [Allow up to two declarations total](#allow-up-to-two-declarations-total)
+        - [Allow up to four declarations total](#allow-up-to-four-declarations-total)
+    - [Don't require `has_extern`](#dont-require-has_extern)
+    - [Alternate names for `has_extern`](#alternate-names-for-has_extern)
+
+## Abstract
+
+Each entity is restricted to one, optional `extern` declaration. If used, it
+must be imported by the owning library. The owning library annotates the
+existence of an `extern` with the `has_extern` modifier.
+
+## Problem
+
+In the `extern` model from
+[#3762: Merging forward declarations](https://github.com/carbon-language/carbon-lang/pull/3762),
+multiple `extern` declarations are allowed.
+[#3763: Matching redeclarations](https://github.com/carbon-language/carbon-lang/pull/3763)
+further evolved the `extern` keyword.
+
+The prior `extern` model assumed that the `extern` and non-`extern` declarations
+of a class formed two different types, which could be merged.
+[As discussed on #packages-and-libraries](https://discord.com/channels/655572317891461132/1217182321933815820/1230990636073881693),
+this runs into an issue with code such as:
+
+```
+library "a";
+class C {}
+```
+
+```
+library "b";
+extern class C;
+extern fn F() -> C*;
+```
+
+```
+library "c";
+import library "a";
+extern fn F() -> C*;
+```
+
+Here, the return types of `F` differ: library "b" names the `extern` `C`, while
+library "c" names the `C` defined in library "a".
+
+This proposal aims to address the differing return types by unifying the type of
+`C` regardless of whether it's `extern`. Several approaches could achieve this;
+this proposal picks one that is efficient for the compiler to process.
+
+## Background
+
+Proposals:
+
+- [#3762: Merging forward declarations](https://github.com/carbon-language/carbon-lang/pull/3762)
+- [#3763: Matching redeclarations](https://github.com/carbon-language/carbon-lang/pull/3763)
+
+Discussions:
+
+- [#packages-and-libraries: `extern` type coherency](https://discord.com/channels/655572317891461132/1217182321933815820/1230990636073881693)
+- [#packages-and-libraries: When to allow/disallow redeclarations](https://discord.com/channels/655572317891461132/1217182321933815820/1236016051632865421)
+- [Open discussion 2024-05-09: Number of allowed redeclarations](https://docs.google.com/document/d/1s3mMCupmuSpWOFJGnvjoElcBIe2aoaysTIdyczvKX84/edit?resourcekey=0-G095Wc3sR6pW1hLJbGgE0g&tab=t.0#heading=h.bu7djkos4xo)
+
+## Proposal
+
+A given entity may have up to three declarations:
+
+- An optional `extern` declaration
+    - It must be in a separate library from the definition.
+    - The owning library's API file must import the `extern` declaration, and
+      must also contain a declaration.
+- An optional forward declaration
+    - This must come before the definition. The API file is considered to be
+      before the implementation file.
+- A required definition
+
+When there is an `extern` declaration, the first owning declaration must have
+the `has_extern` modifier.
+
+The consequential changes to the [problem example](#problem) are then:
+
+```
+library "a";
+
+// This proposal makes the import required.
+import library "b";
+
+// This proposal adds `has_extern`.
+has_extern class C {}
+```
+
+```
+library "b";
+extern class C;
+extern fn F() -> C*;
+```
+
+```
+library "c";
+import library "a";
+extern fn F() -> C*;
+```
+
+## Details
+
+### Type coherency
+
+In the [problem example](#problem), `C` will produce the same type regardless of
+whether `C` is the `extern` or non-`extern` declaration. This means that both
+function signatures have identical types.
+
+We do this by having the non-`extern` declaration import the `extern`
+declaration. Because one can see the other, the compiler can easily use the same
+type for both declarations. This lets compilation of other libraries, which may
+import either or both declarations, easily determine that the types are
+equivalent.
+
+### Using imported declarations
+
+Since `extern class C;` must be imported by the owning library, we now allow
+uses of the imported name prior to its declaration within the same file. This is
+a divergence from
+[#3762](https://github.com/carbon-language/carbon-lang/pull/3762). It means the
+following now works:
+
+```
+library "extern";
+
+extern class MyType;
+```
+
+```
+library "use_extern";
+import library "extern";
+
+// Uses the `extern` declaration.
+fn Foo(val: MyType*);
+
+has_extern class MyType {
+  fn Bar[addr self: Self*]() { Foo(self); }
+}
+```
+
+### `has_extern` modifier
+
+The `has_extern` modifier must be present on the first owning declaration, when
+there is an `extern` declaration. Because the first owning declaration must be
+in an API file, this will only be present in API files.
+
+The modifier allows libraries to restrict whether they have an `extern`
+declaration.
+It will also be used to support tool-based validation that the
+`extern` declaration is imported as required.
+
+### Versus current state
+
+The key changes in comparison to the design from
+[#3762](https://github.com/carbon-language/carbon-lang/pull/3762) are:
+
+- There may only be one `extern` declaration.
+- The owning library is required to import the `extern` declaration.
+- When there is an `extern` declaration, the first owning declaration (either
+  forward declaration or definition) must be marked as `has_extern`.
+    - If there are two owning declarations, the modifier is not repeated on
+      the second.
+- Imported declarations are now valid for use, even when the same entity is
+  declared later in the file.
+- The number of allowed forward declarations is reduced from two (one each in
+  API and implementation files) to one.
+
+Other parts remain, such as the design for when modifier keywords are allowed.
+
+## Rationale
+
+- [Language tools and ecosystem](/docs/project/goals.md#language-tools-and-ecosystem)
+    - `has_extern` supports compiler validation in finding the `extern`
+      declaration within imports.
+- [Software and language evolution](/docs/project/goals.md#software-and-language-evolution)
+    - Unifying the type of `extern` entities addresses a type coherency issue.
+    - `has_extern` supports developer control over whether `extern` should
+      exist, making it a library decision about whether `extern` use-cases
+      should be supported.
+- [Fast and scalable development](/docs/project/goals.md#fast-and-scalable-development)
+    - Requiring the `extern` declaration be imported by the owning library
+      should improve compiler performance.
+
+This proposal makes a trade-off with
+[Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code).
+The restriction of a unique `extern` declaration is expected to require
+additional work in migration, because C++ `extern` declarations will need to be
+consolidated.
+This is currently counter-balanced by the trade-offs involved,
+although it may result in a reevaluation of that aspect of this proposal.
+
+## Alternatives considered
+
+### Allow multiple `extern` declarations, remove the import requirement, or both
+
+We limit to one `extern` declaration. Continuing to allow multiple `extern`
+declarations (the previous state) is feasible. Similarly, we could stop
+requiring the non-`extern` declaration to import the `extern` declaration; this
+could be done with or without multiple `extern` declarations. For this set of
+alternatives, the issues which would arise are similar.
+
+In the compiler, we want to be able to determine that two types are equal
+through a unique identifier, such as a 32-bit integer. When one declaration sees
+another directly, as through an import, we identify the redeclaration by name,
+and reuse the unique identifier. This deduplication can occur once per
+declaration. Indirect imports can continue to use the unique identifier.
+
+We could instead support unifying declarations that did not see each other.
+However, this would require canonicalizing all types by name instead of by
+unique identifier. For example, consider:
+
+```
+package Other library "type";
+class MyType {
+  fn Print();
+}
+```
+
+```
+package Other library "use_type";
+import library "type";
+fn Make() -> MyType*;
+```
+
+```
+package Other library "extern" api;
+extern class MyType;
+```
+
+```
+package Other library "use_extern";
+import library "extern";
+fn Print(val: MyType*);
+```
+
+```
+library "merge";
+import Other library "use_type";
+import Other library "use_extern";
+Other.Print(Other.Make());
+```
+
+Here, the "merge" library doesn't see either declaration of `MyType` directly.
+However, `Print(Make())` requires that both declarations of `MyType` be
+determined as equivalent. This particular indirect use also means that the names
+will not have been added to name lookup, so there is no reason for the two
+declarations to be associated by name.
+
+In order to merge these declarations, we would need to identify that fully
+qualified names and other structural details are equivalent when the type is
+used (including non-explicit uses, such as interface lookup). We could achieve
+this, for example, by having a name lookup table for in-use types, managed per
+library. Each library would also need to validate that declarations were
+semantically equivalent, versus the current approach validating as part of the
+redeclaration. The cost of a per-library approach is expected to have a
+significant impact on the amount of work done as part of semantic analysis.
+
+We may end up wanting to do similar work in order to improve diagnostics for
+invalid cases where the `extern` is not correctly declared and imported.
+However, additional work on invalid code is less of a concern than additional
+work on fully valid code.
+
+In order to maintain a high-performance compiler, we are taking a restrictive
+approach that makes it simpler to associate type information.
+
+### Total number of allowed declarations (`extern` and non-`extern`)
+
+A few options were considered regarding the number of allowed declarations.
+
+We limit to two non-`extern` declarations: the optional forward declaration, and
+required definition. The need to provide interface implementations (for example,
+`impl MyType as Add`) is considered to constrain this choice.
+
+In this category, alternatives considered were:
+
+- Do not restrict the number of forward declarations
+- Allow up to two declarations total
+- Allow up to four declarations total
+
+Details for why each alternative was declined are below.
+
+#### Do not restrict the number of forward declarations
+
+We could leave the number of forward declarations unrestricted, allowing an
+arbitrary number, possibly also after the definition. This would be consistent
+with C++.
+
+One thing to consider here is modifier keyword behavior.
+If we require modifier keywords to match across all declarations, that could
+become a maintenance burden for developers. If we don't, it makes the meaning of
+a given forward declaration more ambiguous.
+
+This option is declined due to the lack of clear benefit.
+
+#### Allow up to two declarations total
+
+Under this option, we would only allow one forward declaration, treating the
+`extern` declaration as a forward declaration. This would mean two declarations
+overall, instead of three.
+
+For this, the main concern was interactions between file placement of the
+definition, and file placement of interface implementations. Interface
+implementations must generally be in API files in order to be seen by other
+libraries.
+
+If the definition is required to be in the API file in order to allow the
+interface implementations in the API file, the API file would need to import
+libraries required to construct the definition. That could create issues for
+separation of build dependencies, and could also make it more difficult to
+unravel some dependency cycles between libraries.
+
+If the definition was allowed to be in the implementation file even when there
+were interface implementations in the API file, the ambiguity of seeing an
+`extern` declaration and being unsure of whether this was the owning library
+could have negative consequences for evaluation of interface constraints.
+
+The purpose of allowing a forward declaration when there is an `extern`
+declaration is to make it clear for interface implementations that they exist in
+the owning library, while processing the API file.
+
+#### Allow up to four declarations total
+
+The four declarations would be:
+
+1. `extern` declaration
+2. Forward declaration in API file
+3. Forward declaration in implementation file
+4. Definition
+
+The number of forward declarations allowed is consistent with the current state
+from [#3762](https://github.com/carbon-language/carbon-lang/pull/3762).
+
+This would add clarity when defining in the implementation file by also
+allowing a forward declaration above the definition, even when that forward
+declaration is pulled from the API file.
+
+If we're allowing declarations from another file (including the `extern`
+declaration) to be used before an entity is declared in the same file, the
+motivating factor for allowing a repeat forward declaration in an implementation
+file is removed. Previously, that was required for an entity to be referenced
+prior to its definition.
+
+In discussion of this option, it was considered unclear why we would allow two
+forward declarations, but not allow even more. The more popular choice seemed to
+be not restricting, which was also declined.
+
+### Don't require `has_extern`
+
+Instead of requiring a `has_extern` modifier on the definition, we could infer
+from the presence of an `extern` declaration.
+
+We had declined allowing a definition to control whether `extern` was allowed in
+discussion of [#3762](https://github.com/carbon-language/carbon-lang/pull/3762),
+although this is not directly mentioned in the proposal. At the time, it was
+dropped because the owning library didn't need to include `extern` declarations,
+and so having the definition opt-in to allowing `extern` was viewed as low
+benefit. However, now that the owning library must import the `extern`
+declaration, there is a tighter association and so we reevaluated.
+
+The modifier offers a benefit for being able to verify the association between
+`extern` and `has_extern` declarations, and offers additional parity in
+modifiers. It also makes it easy for a tool to know if it's missing a
+declaration.
+
+The information that an `extern` declaration exists somewhere else is expected
+to offer long-term refactoring benefits.
+
+### Alternate names for `has_extern`
+
+We're not entirely happy with the `has_extern` name, although it is unambiguous.
+A few other names that might be worth considering are `is_extern` and
+`externed`. However, it's not clear either of these names is better. The name
+may change in the future if a better model is brought up.