Filling out template with PR 3980
jonmeow committed May 23, 2024
1 parent 31060ca commit 8c2f391
Showing 1 changed file with 397 additions and 0 deletions.
397 changes: 397 additions & 0 deletions proposals/p3980.md
@@ -0,0 +1,397 @@
# Singular `extern` declarations

<!--
Part of the Carbon Language project, under the Apache License v2.0 with LLVM
Exceptions. See /LICENSE for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

[Pull request](https://github.com/carbon-language/carbon-lang/pull/3980)

<!-- toc -->

## Table of contents

- [Abstract](#abstract)
- [Problem](#problem)
- [Background](#background)
- [Proposal](#proposal)
- [Details](#details)
    - [Type coherency](#type-coherency)
    - [Using imported declarations](#using-imported-declarations)
    - [`has_extern` modifier](#has_extern-modifier)
    - [Versus current state](#versus-current-state)
- [Rationale](#rationale)
- [Alternatives considered](#alternatives-considered)
    - [Allow multiple `extern` declarations, remove the import requirement, or both](#allow-multiple-extern-declarations-remove-the-import-requirement-or-both)
    - [Total number of allowed declarations (`extern` and non-`extern`)](#total-number-of-allowed-declarations-extern-and-non-extern)
        - [Do not restrict the number of forward declarations](#do-not-restrict-the-number-of-forward-declarations)
        - [Allow up to two declarations total](#allow-up-to-two-declarations-total)
        - [Allow up to four declarations total](#allow-up-to-four-declarations-total)
    - [Don't require `has_extern`](#dont-require-has_extern)
    - [Alternate names for `has_extern`](#alternate-names-for-has_extern)

<!-- tocstop -->

## Abstract

Each entity is restricted to one optional `extern` declaration. If used, it
must be imported by the owning library. The owning library annotates the
existence of an `extern` declaration with the `has_extern` modifier.

## Problem

In the `extern` model from
[#3762: Merging forward declarations](https://github.com/carbon-language/carbon-lang/pull/3762),
multiple `extern` declarations are allowed.
[#3763: Matching redeclarations](https://github.com/carbon-language/carbon-lang/pull/3763)
further evolved the `extern` keyword.

The prior `extern` model assumed that the `extern` and non-`extern` declarations
of a class formed two different types, which could be merged.
[As discussed on #packages-and-libraries](https://discord.com/channels/655572317891461132/1217182321933815820/1230990636073881693),
this runs into an issue with code such as:

```
library "a";
class C {}
```

```
library "b";
extern class C;
extern fn F() -> C*;
```

```
library "c";
import library "a";
extern fn F() -> C*;
```

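Here, the return types of `F` differ: in library "b", `C` names the `extern`
declaration, while in library "c" it names the definition imported from library
"a". Under the prior model, these are two different types.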

This proposal aims to address the differing return types by unifying the type of
`C` regardless of whether it's `extern`. Several approaches could achieve this;
this proposal chooses one that is efficient for the compiler to process.

## Background

Proposals:

- [#3762: Merging forward declarations](https://github.com/carbon-language/carbon-lang/pull/3762)
- [#3763: Matching redeclarations](https://github.com/carbon-language/carbon-lang/pull/3763)

Discussions:

- [#packages-and-libraries: `extern` type coherency](https://discord.com/channels/655572317891461132/1217182321933815820/1230990636073881693)
- [#packages-and-libraries: When to allow/disallow redeclarations](https://discord.com/channels/655572317891461132/1217182321933815820/1236016051632865421)
- [Open discussion 2024-05-09: Number of allowed redeclarations](https://docs.google.com/document/d/1s3mMCupmuSpWOFJGnvjoElcBIe2aoaysTIdyczvKX84/edit?resourcekey=0-G095Wc3sR6pW1hLJbGgE0g&tab=t.0#heading=h.bu7djkos4xo)

## Proposal

A given entity may have up to three declarations:

- An optional `extern` declaration
    - It must be in a separate library from the definition.
    - The owning library's API file must import the `extern` declaration, and
      must also contain a declaration.
- An optional forward declaration
    - This must come before the definition. The API file is considered to be
      before the implementation file.
- A required definition

When there is an `extern` declaration, the first owning declaration must have
the `has_extern` modifier.

The consequential changes to the [problem example](#problem) are then:

```
library "a";
// This proposal makes the import required.
import library "b";
// This proposal adds `has_extern`.
has_extern class C {}
```

```
library "b";
extern class C;
extern fn F() -> C*;
```

```
library "c";
import library "a";
extern fn F() -> C*;
```
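
As a further sketch of the rules above, suppose the owning library also wanted a
forward declaration. The variant of library "a" below is hypothetical and not
part of the original example; it assumes the forward declaration and the
definition share one API file. Only the first owning declaration would carry
`has_extern`:

```
library "a";
import library "b";
// First owning declaration: the forward declaration carries `has_extern`.
has_extern class C;
// Second owning declaration: the definition does not repeat the modifier.
class C {}
```

Together with `extern class C;` in library "b", this reaches the maximum of
three declarations for `C`.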

## Details

### Type coherency

In the [problem](#problem) example, `C` will now produce the same type
regardless of whether it refers to the `extern` or non-`extern` declaration.
This means that both declarations of `F` have identical types.

We do this by having the library with the non-`extern` declaration import the
`extern` declaration. Because one can see the other, the compiler can use the
same type for both declarations. Other libraries, which may import either or
both declarations, can then easily determine that the types are equivalent.

### Using imported declarations

Since `extern class C;` must be imported by the owning library, we now allow
uses of the imported name prior to its declaration within the same file. This is
a divergence from
[#3762](https://github.com/carbon-language/carbon-lang/pull/3762). It means the
following now works:

```
library "extern";
extern class MyType;
```

```
library "use_extern";
import library "extern";
// Uses the `extern` declaration.
fn Foo(val: MyType*);
has_extern class MyType {
  fn Bar[addr self: Self*]() { Foo(self); }
}
```

### `has_extern` modifier

The `has_extern` modifier must be present on the first owning declaration, when
there is an `extern` declaration. Because the first owning declaration must be
in an API file, this will only be present in API files.

The modifier lets the owning library control whether an `extern` declaration may
exist. It will also be used to support tool-based validation that the `extern`
declaration is imported as required.
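
As an illustration of what that validation could catch, here are two mismatches
that the rules above would presumably reject. The library and class names are
hypothetical, and this proposal does not specify the diagnostics:

```
library "widget_extern";
extern class Widget;
```

```
library "widget";
import library "widget_extern";
// Expected to be rejected: an `extern` declaration of `Widget` is imported,
// but the first owning declaration lacks `has_extern`.
class Widget {}
```

```
library "gadget";
// Expected to be rejected: `has_extern` promises an `extern` declaration,
// but no import provides one.
has_extern class Gadget {}
```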

### Versus current state

The key changes in comparison to the design from
[#3762](https://github.com/carbon-language/carbon-lang/pull/3762) are:

- There may only be one `extern` declaration.
- The owning library is required to import the `extern` declaration.
- When there is an `extern` declaration, the first owning declaration (either
  forward declaration or definition) must be marked as `has_extern`.
    - If there are two owning declarations, the modifier is not repeated on
      the second.
- Imported declarations are now valid for use, even when the same entity is
  declared later in the file.
- The number of allowed forward declarations is reduced from two (one each in
  API and implementation files) to one.

Other parts remain, such as the design for when modifier keywords are allowed.
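
To illustrate the last two changes, here is a minimal sketch using a
hypothetical library "shapes" with no `extern` declaration. The `impl library`
line is an assumption about how the implementation file is introduced; the names
`Circle` and `Describe` are likewise illustrative only.

```
library "shapes";
// The single allowed forward declaration, in the API file.
class Circle;
```

```
impl library "shapes";
// Uses the forward declaration from the API file. A repeated forward
// declaration in this file is no longer needed or allowed.
fn Describe(c: Circle*);
class Circle {}
```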

## Rationale

- [Language tools and ecosystem](/docs/project/goals.md#language-tools-and-ecosystem)
    - `has_extern` supports compiler validation in finding the `extern`
      declaration within imports.
- [Software and language evolution](/docs/project/goals.md#software-and-language-evolution)
    - Unifying the type of `extern` entities addresses a type coherency issue.
    - `has_extern` supports developer control over whether `extern` should
      exist, making it a library decision whether `extern` use-cases should
      be supported.
- [Fast and scalable development](/docs/project/goals.md#fast-and-scalable-development)
    - Requiring the `extern` declaration be imported by the owning library
      should improve compiler performance.

This proposal makes a trade-off with
[Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code).
The restriction to a unique `extern` declaration is expected to require
additional work in migration, because C++ `extern` declarations will need to be
consolidated. This cost is currently accepted in exchange for the benefits
above, although it may result in a reevaluation of that aspect of this proposal.

## Alternatives considered

### Allow multiple `extern` declarations, remove the import requirement, or both

This proposal limits each entity to one `extern` declaration. Continuing to
allow multiple `extern` declarations (the previous state) is feasible.
Similarly, we could drop the requirement that the owning library import the
`extern` declaration; this could be done with or without multiple `extern`
declarations. The issues that would arise are similar across this set of
alternatives.

In the compiler, we want to be able to determine that two types are equal
through a unique identifier, such as a 32-bit integer. When one declaration sees
another directly, as through an import, we identify the redeclaration by name,
and reuse the unique identifier. This deduplication can occur once per
declaration. Indirect imports can continue to use the unique identifier.

We could instead support unifying declarations that did not see each other.
However, this would require canonicalizing all types by name instead of by
unique identifier. For example, consider:

```
package Other library "type";
class MyType {
  fn Print();
}
```

```
package Other library "use_type";
import library "type";
fn Make() -> MyType*;
```

```
package Other library "extern";
extern class MyType;
```

```
package Other library "use_extern";
import library "extern";
fn Print(val: MyType*);
```

```
library "merge";
import Other library "use_type";
import Other library "use_extern";
Other.Print(Other.Make());
```

Here, the "merge" library doesn't see either declaration of `MyType` directly.
However, `Other.Print(Other.Make())` requires that both declarations of `MyType`
be determined as equivalent. This particular indirect use also means that the
names will not have been added to name lookup, so there is no reason for the two
declarations to be associated by name.

In order to merge these declarations, we would need to identify that fully
qualified names and other structural details are equivalent when the type is
used (including non-explicit uses, such as interface lookup). We could achieve
this, for example, by having a name lookup table for in-use types, managed per
library. Each library would also need to validate that declarations were
semantically equivalent, whereas the current approach validates as part of the
redeclaration. The cost of a per-library approach is expected to have a
significant impact on the amount of work done as part of semantic analysis.

We may end up wanting to do similar work in order to improve diagnostics for
invalid cases where the `extern` is not correctly declared and imported.
However, additional work on invalid code is less of a concern than additional
work on fully valid code.

In order to maintain a high-performance compiler, we are taking a restrictive
approach that makes it simpler to associate type information.

### Total number of allowed declarations (`extern` and non-`extern`)

A few options were considered regarding the number of allowed declarations.

This proposal allows two non-`extern` declarations: the optional forward
declaration and the required definition. The need to provide interface
implementations (for example, `impl MyType as Add`) is considered to constrain
this choice.

In this category, alternatives considered were:

- Do not restrict the number of forward declarations
- Allow up to two declarations total
- Allow up to four declarations total

Details for why each alternative was declined are below.

#### Do not restrict the number of forward declarations

We could leave the number of forward declarations unrestricted, allowing an
arbitrary number, possibly also after the definition. This would be consistent
with C++.

One thing to consider here is modifier keyword behavior. If we require modifier
keywords to match across all declarations, that could become a maintenance
burden for developers. If we don't, it makes the meaning of a given forward
declaration more ambiguous.

This option is declined due to the lack of clear benefit.

#### Allow up to two declarations total

Under this option, we would only allow one forward declaration, treating the
`extern` declaration as a forward declaration. This would mean two declarations
overall, instead of three.

For this, the main concern was interactions between file placement of the
definition, and file placement of interface implementations. Interface
implementations must generally be in API files in order to be seen by other
libraries.

If the definition is required to be in the API file in order to allow the
interface implementations in the API file, the API file would need to import
libraries required to construct the definition. That could create issues for
separation of build dependencies, and could also make it more difficult to
unravel some dependency cycles between libraries.

If the definition was allowed to be in the implementation file even when there
were interface implementations in the API file, the ambiguity of seeing an
`extern` declaration and being unsure of whether this was the owning library
could have negative consequences for evaluation of interface constraints.

The purpose of allowing a forward declaration when there is an `extern`
declaration is to make it clear for interface implementations that they exist in
the owning library, while processing the API file.

#### Allow up to four declarations total

The four declarations would be:

1. `extern` declaration
2. Forward declaration in API file
3. Forward declaration in implementation file
4. Definition

The number of forward declarations allowed is consistent with the current state
from [#3762](https://github.com/carbon-language/carbon-lang/pull/3762).

This would add clarity when the definition is in the implementation file: a
forward declaration could be placed above it, even though one is already
provided by the API file.

If we're allowing declarations from another file (including the `extern`
declaration) to be used before an entity is declared in the same file, the
motivating factor for allowing a repeat forward declaration in an implementation
file is removed. Previously, that was required for an entity to be referenced
prior to its definition.

In discussion of this option, it was considered unclear why we would allow two
forward declarations, but not allow even more. The more popular choice seemed to
be not restricting, which was also declined.

### Don't require `has_extern`

Instead of requiring a `has_extern` modifier on the definition, we could infer
from the presence of an `extern` declaration.

We had declined allowing a definition to control whether `extern` was allowed in
discussion of [#3762](https://github.com/carbon-language/carbon-lang/pull/3762),
although this is not directly mentioned in the proposal. At the time, it was
dropped because the owning library didn't need to include `extern` declarations,
and so having the definition opt-in to allowing `extern` was viewed as low
benefit. However, now that the owning library must import the `extern`
declaration, there is a tighter association and so we reevaluated.

The modifier offers a benefit for being able to verify the association between
`extern` and `has_extern` declarations, and offers additional parity in
modifiers. It also makes it easy for a tool to know if it's missing a
declaration.

The information that an `extern` declaration exists somewhere else is expected
to offer long-term refactoring benefits.

### Alternate names for `has_extern`

We're not entirely happy with the `has_extern` name, although it is unambiguous.
A few other names that might be worth considering are `is_extern` and
`externed`. However, it's not clear that either of these names is better. The
name may change in the future if a better model is brought up.
