# Singular `extern` declarations | ||
|
||
<!-- | ||
Part of the Carbon Language project, under the Apache License v2.0 with LLVM | ||
Exceptions. See /LICENSE for license information. | ||
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
--> | ||
|
||
[Pull request](https://github.com/carbon-language/carbon-lang/pull/3980) | ||
|
||
<!-- toc --> | ||
|
||
## Table of contents | ||
|
||
- [Abstract](#abstract)
- [Problem](#problem)
- [Background](#background)
- [Proposal](#proposal)
- [Details](#details)
    - [Type coherency](#type-coherency)
    - [Using imported declarations](#using-imported-declarations)
    - [`has_extern` modifier](#has_extern-modifier)
    - [Versus current state](#versus-current-state)
- [Rationale](#rationale)
- [Alternatives considered](#alternatives-considered)
    - [Allow multiple `extern` declarations, remove the import requirement, or both](#allow-multiple-extern-declarations-remove-the-import-requirement-or-both)
    - [Total number of allowed declarations (`extern` and non-`extern`)](#total-number-of-allowed-declarations-extern-and-non-extern)
        - [Do not restrict the number of forward declarations](#do-not-restrict-the-number-of-forward-declarations)
        - [Allow up to two declarations total](#allow-up-to-two-declarations-total)
        - [Allow up to four declarations total](#allow-up-to-four-declarations-total)
    - [Don't require `has_extern`](#dont-require-has_extern)
    - [Alternate names for `has_extern`](#alternate-names-for-has_extern)
|
||
<!-- tocstop --> | ||
|
||
## Abstract | ||
|
||
Each entity is restricted to one, optional `extern` declaration. If used, it | ||
must be imported by the owning library. The owning library annotates the | ||
existence of an `extern` with the `has_extern` modifier. | ||
|
||
## Problem | ||
|
||
In the `extern` model from | ||
[#3762: Merging forward declarations](https://github.com/carbon-language/carbon-lang/pull/3762), | ||
multiple `extern` declarations are allowed. | ||
[#3763: Matching redeclarations](https://github.com/carbon-language/carbon-lang/pull/3763) | ||
further evolved the `extern` keyword. | ||
|
||
The prior `extern` model assumed that the `extern` and non-`extern` declarations | ||
of a class formed two different types, which could be merged. | ||
[As discussed on #packages-and-libraries](https://discord.com/channels/655572317891461132/1217182321933815820/1230990636073881693), | ||
this runs into an issue with code such as: | ||
|
||
``` | ||
library "a"; | ||
class C {} | ||
``` | ||
|
||
``` | ||
library "b"; | ||
extern class C; | ||
extern fn F() -> C*; | ||
``` | ||
|
||
``` | ||
library "c"; | ||
import library "a"; | ||
extern fn F() -> C*; | ||
``` | ||
|
||
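Here, the return types of `F` differ: in library "b", `C` names the `extern`
declaration, while in library "c", `C` names the definition imported from
library "a". Under the prior model, these are two different types.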
|
||
This proposal aims to address the differing return types by unifying the type of | ||
`C` regardless of whether it's `extern`. This could be done under multiple | ||
different approaches, and this proposal aims for one which is efficient to | ||
process. | ||
|
||
## Background | ||
|
||
Proposals: | ||
|
||
- [#3762: Merging forward declarations](https://github.com/carbon-language/carbon-lang/pull/3762) | ||
- [#3763: Matching redeclarations](https://github.com/carbon-language/carbon-lang/pull/3763) | ||
|
||
Discussions: | ||
|
||
- [#packages-and-libraries: `extern` type coherency](https://discord.com/channels/655572317891461132/1217182321933815820/1230990636073881693) | ||
- [#packages-and-libraries: When to allow/disallow redeclarations](https://discord.com/channels/655572317891461132/1217182321933815820/1236016051632865421) | ||
- [Open discussion 2024-05-09: Number of allowed redeclarations](https://docs.google.com/document/d/1s3mMCupmuSpWOFJGnvjoElcBIe2aoaysTIdyczvKX84/edit?resourcekey=0-G095Wc3sR6pW1hLJbGgE0g&tab=t.0#heading=h.bu7djkos4xo) | ||
|
||
## Proposal | ||
|
||
A given entity may have up to three declarations: | ||
|
||
- An optional `extern` declaration
    - It must be in a separate library from the definition.
    - The owning library's API file must import the `extern` declaration, and
      must also contain a declaration.
- An optional forward declaration
    - This must come before the definition. The API file is considered to be
      before the implementation file.
- A required definition
|
||
When there is an `extern` declaration, the first owning declaration must have
the `has_extern` modifier.
|
||
The consequential changes to the [problem example](#problem) are then: | ||
|
||
``` | ||
library "a"; | ||
// This proposal makes the import required. | ||
import library "b"; | ||
// This proposal adds `has_extern`. | ||
has_extern class C {} | ||
``` | ||
|
||
``` | ||
library "b"; | ||
extern class C; | ||
extern fn F() -> C*; | ||
``` | ||
|
||
``` | ||
library "c"; | ||
import library "a"; | ||
extern fn F() -> C*; | ||
``` | ||
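
The declaration rules in this section can be sketched as a small checker. This is only an illustrative model in Python, not the Carbon toolchain's implementation: `Decl`, `check_entity`, and the error strings are hypothetical names, and each declaration is reduced to its kind, library, file, and position. The `has_extern` modifier is left out of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decl:
    kind: str     # "extern", "forward", or "definition"
    library: str  # library that contains the declaration
    file: str     # "api" or "impl"
    order: int    # position of the declaration within its file


def check_entity(decls, owning_library, api_imports):
    """Return rule violations for one entity's declarations.

    api_imports is the set of libraries imported by the owning API file.
    """
    errors = []
    externs = [d for d in decls if d.kind == "extern"]
    forwards = [d for d in decls if d.kind == "forward"]
    definitions = [d for d in decls if d.kind == "definition"]

    if len(definitions) != 1:
        errors.append("exactly one definition is required")
    if len(externs) > 1:
        errors.append("at most one extern declaration is allowed")
    if len(forwards) > 1:
        errors.append("at most one forward declaration is allowed")

    owning_in_api = any(d.file == "api" for d in forwards + definitions)
    for e in externs:
        if e.library == owning_library:
            errors.append("extern must be in a separate library")
        elif e.library not in api_imports:
            errors.append("owning API file must import the extern")
        if not owning_in_api:
            errors.append("owning API file must contain a declaration")

    # The API file is considered to come before the implementation file.
    def position(d):
        return (0 if d.file == "api" else 1, d.order)

    if any(position(f) >= position(d) for f in forwards for d in definitions):
        errors.append("forward declaration must come before the definition")
    return errors


# The example above: `extern class C;` in "b", the definition in "a".
decls = [Decl("extern", "b", "api", 0), Decl("definition", "a", "api", 0)]
print(check_entity(decls, owning_library="a", api_imports={"b"}))  # prints []
```

Dropping `"b"` from `api_imports` in the call above reports the missing import, which corresponds to the `import library "b";` line this proposal adds to library "a".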
|
||
## Details | ||
|
||
### Type coherency | ||
|
||
In the [problem](#problem) example, `C` will produce the same type regardless of
whether it refers to the `extern` or non-`extern` declaration. This means that
both function signatures have identical types.
|
||
We do this by having the non-`extern` declaration import the `extern`
declaration. Because one can see the other, the compiler can use the same type
for both declarations. Other libraries, which may import either or both
declarations, can then easily determine that the types are equivalent.
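
As a rough illustration of why the import makes this cheap, the sketch below models type identity as an integer ID that a redeclaration reuses when the earlier declaration is visible through an import. It is a simplified Python model, not the toolchain's data structures: `TypeIds` and `compile_library` are hypothetical names, and treating every visible name as exported is a simplification.

```python
import itertools


class TypeIds:
    """Hands out integer type IDs and reuses them for visible redeclarations."""

    def __init__(self):
        self._counter = itertools.count()

    def compile_library(self, declared, imports=()):
        """Return the name -> type ID map exported by one library.

        declared lists class names declared in the library; imports holds the
        exported maps of directly imported libraries.
        """
        visible = {}
        for exported in imports:
            visible.update(exported)
        for name in declared:
            # A redeclaration of an imported name reuses the imported ID.
            if name not in visible:
                visible[name] = next(self._counter)
        return visible


ids = TypeIds()
lib_b = ids.compile_library(["C"])                   # extern class C;
lib_a = ids.compile_library(["C"], imports=[lib_b])  # has_extern class C {}
lib_c = ids.compile_library([], imports=[lib_a])     # import library "a";

# F's return type `C*` is modeled as ("pointer", type ID of C).
f_in_b = ("pointer", lib_b["C"])
f_in_c = ("pointer", lib_c["C"])
print(f_in_b == f_in_c)  # prints True
```

Because library "a" sees the `extern` declaration from library "b", both declarations of `C` share one ID, and the two signatures of `F` compare equal with a plain ID comparison.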
|
||
### Using imported declarations | ||
|
||
Since `extern class C;` must be imported by the owning library, we now allow | ||
uses of the imported name prior to its declaration within the same file. This is | ||
a divergence from | ||
[#3762](https://github.com/carbon-language/carbon-lang/pull/3762). It means the | ||
following now works: | ||
|
||
``` | ||
library "extern"; | ||
extern class MyType; | ||
``` | ||
|
||
``` | ||
library "use_extern"; | ||
import library "extern";
// Uses the `extern` declaration. | ||
fn Foo(val: MyType*); | ||
has_extern class MyType { | ||
    fn Bar[addr self: Self*]() { Foo(self); }
} | ||
``` | ||
|
||
### `has_extern` modifier | ||
|
||
The `has_extern` modifier must be present on the first owning declaration, when | ||
there is an `extern` declaration. Because the first owning declaration must be | ||
in an API file, this will only be present in API files. | ||
|
||
The modifier allows libraries to restrict whether they have an `extern` | ||
declaration. It will also be used to support tool-based validation that the | ||
`extern` declaration is imported as required. | ||
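
A tool-side check of this pairing could look roughly like the following. This is a hedged sketch in Python: `check_has_extern`, its parameters, and the diagnostic wording are all assumptions made for illustration, and the set of imported `extern` names is assumed to have been collected already.

```python
def check_has_extern(name, marked_has_extern, imported_externs):
    """Return a diagnostic string, or None when the declaration is consistent.

    marked_has_extern: whether the first owning declaration uses `has_extern`.
    imported_externs: names with an `extern` declaration visible through the
    owning API file's imports.
    """
    extern_imported = name in imported_externs
    if marked_has_extern and not extern_imported:
        return f"{name}: marked has_extern, but no extern declaration is imported"
    if extern_imported and not marked_has_extern:
        return f"{name}: extern declaration is imported, but has_extern is missing"
    return None


print(check_has_extern("C", True, {"C"}))  # prints None
```

The first diagnostic is the case where the modifier tells a tool that an import is missing; the second is the case where a library has not opted in to having an `extern` declaration.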
|
||
### Versus current state | ||
|
||
The key changes in comparison to the design from
[#3762](https://github.com/carbon-language/carbon-lang/pull/3762) are:
|
||
- There may only be one `extern` declaration.
- The owning library is required to import the `extern` declaration.
- When there is an `extern` declaration, the first owning declaration (either
  forward declaration or definition) must be marked as `has_extern`.
    - If there are two owning declarations, `has_extern` is not on the second.
- Imported declarations are now valid for use, even when the same entity is
  declared later in the file.
- The number of allowed forward declarations is reduced from two (one each in
  API and implementation files) to one.
|
||
Other parts remain, such as the design for when modifier keywords are allowed. | ||
|
||
## Rationale | ||
|
||
- [Language tools and ecosystem](/docs/project/goals.md#language-tools-and-ecosystem)
    - `has_extern` supports compiler validation in finding the `extern`
      declaration within imports.
- [Software and language evolution](/docs/project/goals.md#software-and-language-evolution)
    - Unifying the type of `extern` entities addresses a type coherency issue.
    - `has_extern` supports developer control about whether `extern` should
      exist, making it a library decision about whether `extern` use-cases
      should be supported.
- [Fast and scalable development](/docs/project/goals.md#fast-and-scalable-development)
    - Requiring the `extern` declaration be imported by the owning library
      should improve compiler performance.
|
||
This proposal makes a trade-off with | ||
[Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code). | ||
The restriction of a unique `extern` declaration is expected to require
additional work in migration, because C++ `extern` declarations will need to be
consolidated. This cost is currently judged to be outweighed by the benefits
listed above, although it may result in a reevaluation of that aspect of this
proposal.
|
||
## Alternatives considered | ||
|
||
### Allow multiple `extern` declarations, remove the import requirement, or both | ||
|
||
We limit to one `extern` declaration. Continuing to allow multiple `extern`
declarations (the previous state) is feasible. Similarly, we could drop the
requirement that the non-`extern` declaration import the `extern` declaration;
this could be done with or without multiple `extern` declarations. For this set
of alternatives, the issues which would arise are similar.
|
||
In the compiler, we want to be able to determine that two types are equal | ||
through a unique identifier, such as a 32-bit integer. When one declaration sees | ||
another directly, as through an import, we identify the redeclaration by name, | ||
and reuse the unique identifier. This deduplication can occur once per | ||
declaration. Indirect imports can continue to use the unique identifier. | ||
|
||
We could instead support unifying declarations that did not see each other. | ||
However, this would require canonicalizing all types by name instead of by | ||
unique identifier. For example, consider: | ||
|
||
``` | ||
package Other library "type"; | ||
class MyType { | ||
    fn Print();
}
``` | ||
|
||
``` | ||
package Other library "use_type"; | ||
fn Make() -> MyType*; | ||
``` | ||
|
||
``` | ||
package Other library "extern";
extern class MyType; | ||
``` | ||
|
||
``` | ||
package Other library "use_extern"; | ||
fn Print(val: MyType*); | ||
``` | ||
|
||
``` | ||
library "merge"; | ||
import Other library "use_type"; | ||
import Other library "use_extern"; | ||
Other.Print(Other.Make()); | ||
``` | ||
|
||
Here, the "merge" library doesn't see either declaration of `MyType` directly. | ||
However, `Print(Make())` requires that both declarations of `MyType` be | ||
determined as equivalent. This particular indirect use also means that the names | ||
will not have been added to name lookup, so there is no reason for the two | ||
declarations to be associated by name. | ||
|
||
In order to merge these declarations, we would need to identify that fully
qualified names and other structural details are equivalent when the type is | ||
used (including non-explicit uses, such as interface lookup). We could achieve | ||
this, for example, by having a name lookup table for in-use types, managed per | ||
library. Each library would also need to validate that declarations were | ||
semantically equivalent, versus the current approach validating as part of the | ||
redeclaration. The cost of a per-library approach is expected to have a | ||
significant impact on the amount of work done as part of semantic analysis. | ||
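
To make the difference concrete, the sketch below contrasts comparison by unique identifier with a per-library table that canonicalizes by qualified name. It is an illustrative Python model only: `TypeInfo`, `NameCanonicalizer`, and the ID values are hypothetical, and the structural validation that each library would also have to perform is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeInfo:
    type_id: int         # unique identifier assigned where the type is declared
    qualified_name: str  # fully qualified name of the type


# Library "type" and library "extern" never see each other, so each
# declaration of `MyType` receives its own identifier.
defined = TypeInfo(0, "Other.MyType")
externed = TypeInfo(1, "Other.MyType")


def same_type_by_id(a, b):
    """Single integer comparison; needs declarations to share an ID."""
    return a.type_id == b.type_id


class NameCanonicalizer:
    """Per-library table mapping qualified names to one canonical ID."""

    def __init__(self):
        self._canonical = {}
        self.lookups = 0  # counts the extra table lookups

    def canonical_id(self, info):
        self.lookups += 1
        # The first ID seen for a name becomes canonical for this library.
        return self._canonical.setdefault(info.qualified_name, info.type_id)


def same_type_by_name(canon, a, b):
    return canon.canonical_id(a) == canon.canonical_id(b)


canon = NameCanonicalizer()
print(same_type_by_id(defined, externed))           # prints False
print(same_type_by_name(canon, defined, externed))  # prints True
```

In this model every type use in the "merge" library goes through a table lookup, and the table has to be rebuilt for each library that uses the types, whereas the proposed approach keeps the single integer comparison.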
|
||
We may end up wanting to do similar work in order to improve diagnostics for | ||
invalid cases where the `extern` is not correctly declared and imported. | ||
However, additional work on invalid code is less of a concern than additional | ||
work on fully valid code. | ||
|
||
In order to maintain a high-performance compiler, we are taking a restrictive | ||
approach that makes it simpler to associate type information. | ||
|
||
### Total number of allowed declarations (`extern` and non-`extern`) | ||
|
||
A few options were considered regarding the number of allowed declarations. | ||
|
||
We limit to two non-`extern` declarations: the optional forward declaration, and | ||
required definition. The need to provide interface implementations (for example, | ||
`impl MyType as Add`) is considered to constrain this choice. | ||
|
||
In this category, alternatives considered were: | ||
|
||
- Do not restrict the number of forward declarations
- Allow up to two declarations total | ||
- Allow up to four declarations total | ||
|
||
Details for why each alternative was declined are below. | ||
|
||
#### Do not restrict the number of forward declarations | ||
|
||
We could leave the number of forward declarations unrestricted, allowing an
arbitrary number -- possibly also after the definition. This would be consistent
with C++.
|
||
One thing to consider here is modifier keyword behavior. If we require modifier | ||
keywords to match across all declarations, that could become a maintenance | ||
burden for developers. If we don't, it makes the meaning of a given forward | ||
declaration more ambiguous. | ||
|
||
This option is declined due to the lack of clear benefit. | ||
|
||
#### Allow up to two declarations total | ||
|
||
Under this option, we would only allow one forward declaration, treating the | ||
`extern` declaration as a forward declaration. This would mean two declarations | ||
overall, instead of three. | ||
|
||
For this, the main concern was interactions between file placement of the | ||
definition, and file placement of interface implementations. Interface | ||
implementations must generally be in API files in order to be seen by other | ||
libraries. | ||
|
||
If the definition is required to be in the API file in order to allow the | ||
interface implementations in the API file, the API file would need to import | ||
libraries required to construct the definition. That could create issues for | ||
separation of build dependencies, and could also make it more difficult to | ||
unravel some dependency cycles between libraries. | ||
|
||
If the definition was allowed to be in the implementation file even when there | ||
were interface implementations in the API file, the ambiguity of seeing an | ||
`extern` declaration and being unsure of whether this was the owning library | ||
could have negative consequences for evaluation of interface constraints. | ||
|
||
The purpose of allowing a forward declaration when there is an `extern` | ||
declaration is to make it clear for interface implementations that they exist in | ||
the owning library, while processing the API file. | ||
|
||
#### Allow up to four declarations total | ||
|
||
The four declarations would be: | ||
|
||
1. `extern` declaration | ||
2. Forward declaration in API file | ||
3. Forward declaration in implementation file | ||
4. Definition | ||
|
||
The number of forward declarations allowed is consistent with the current state | ||
from [#3762](https://github.com/carbon-language/carbon-lang/pull/3762). | ||
|
||
For clarity when defining in the implementation file, this would allow a forward
declaration to also be placed above the definition -- even when a forward
declaration is already provided by the API file.
|
||
If we're allowing declarations from another file (including the `extern` | ||
declaration) to be used before an entity is declared in the same file, the | ||
motivating factor for allowing a repeat forward declaration in an implementation | ||
file is removed. Previously, that was required for an entity to be referenced | ||
prior to its definition. | ||
|
||
In discussion of this option, it was considered unclear why we would allow two | ||
forward declarations, but not allow even more. The more popular choice seemed to | ||
be not restricting, which was also declined. | ||
|
||
### Don't require `has_extern` | ||
|
||
Instead of requiring a `has_extern` modifier on the definition, we could infer | ||
from the presence of an `extern` declaration. | ||
|
||
We had declined allowing a definition to control whether `extern` was allowed in | ||
discussion of [#3762](https://github.com/carbon-language/carbon-lang/pull/3762), | ||
although this is not directly mentioned in the proposal. At the time, it was | ||
dropped because the owning library didn't need to include `extern` declarations, | ||
and so having the definition opt in to allowing `extern` was viewed as low
benefit. However, now that the owning library must import the `extern` | ||
declaration, there is a tighter association and so we reevaluated. | ||
|
||
The modifier offers a benefit for being able to verify the association between | ||
`extern` and `has_extern` declarations, and offers additional parity in | ||
modifiers. It also makes it easy for a tool to know if it's missing a | ||
declaration. | ||
|
||
The information that an `extern` declaration exists somewhere else is expected | ||
to offer long-term refactoring benefits. | ||
|
||
### Alternate names for `has_extern` | ||
|
||
We're not entirely happy with the `has_extern` name, although it is unambiguous.
A few other names that might be worth considering are `is_extern` and
`externed`. However, it's not clear that either of these names is better. The
name may change in the future if a better model is brought up.