- Problem
- Background
- Proposal
- Details
- Caveats
- Rationale based on Carbon's goals
- Alternatives considered
Control flow is documented in the language overview. for loops are common in C++, and Carbon should consider providing some form of them.
There are two forms of for loops in C++:
- Semisemi (semicolon, semicolon): for (int i = 0; i < list.size(); ++i)
- Range-based: for (auto x : list)
Semisemi for loops have been around for a long time, and are in C. Range-based for loops were added in C++11.
For example, here is a basic semisemi:
for (int i = 0; i < list.size(); ++i) {
  printf("List at %d: %s\n", i, list[i].name);
}
An equivalent semisemi using iterators and the comma operator may look like:
int i = 0;
for (auto it = list.begin(); it != list.end(); ++it, ++i) {
  printf("List at %d: %s\n", i, it->name);
}
Range-based syntax can be simpler, but it becomes more awkward when the loop needs more than one piece of information, such as the index as well as the element:
int i = 0;
for (const auto& x : list) {
  printf("List at %d: %s\n", i, x.name);
  ++i;
}
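To make the comparison concrete, the following is a minimal, self-contained sketch of the semisemi and range-based forms producing the same result. The `Item` struct and both function names are hypothetical; the snippets above only show that elements have a `name` member. The lines are collected into a vector instead of being printed so the two forms can be compared directly.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical element type: the snippets above only rely on a `name` member.
struct Item {
  std::string name;
};

// Semisemi form: the index is the loop variable itself.
std::vector<std::string> DescribeWithSemisemi(const std::vector<Item>& list) {
  std::vector<std::string> lines;
  for (std::size_t i = 0; i < list.size(); ++i) {
    lines.push_back("List at " + std::to_string(i) + ": " + list[i].name);
  }
  return lines;
}

// Range-based form: the element is the loop variable, so the index has to
// be declared outside the loop and incremented by hand.
std::vector<std::string> DescribeWithRangeBased(const std::vector<Item>& list) {
  std::vector<std::string> lines;
  std::size_t i = 0;
  for (const Item& x : list) {
    lines.push_back("List at " + std::to_string(i) + ": " + x.name);
    ++i;
  }
  return lines;
}
```

Calling either function on the same list yields the same lines; the difference is only in where the index lives and who is responsible for advancing it.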
Java provides equivalent syntax to C++. Although Java doesn't have a comma operator, it does provide for comma-separated statements in the first and third sections of semisemi for loops.
Both TypeScript and JavaScript offer three kinds of for loops:
- Semisemi, mirroring C++.
- for (x of list), mirroring range-based for loops.
- for (x in list), returning indices.
For example, here is an in loop:
for (i in list) {
  console.log('List at ' + i + ': ' + list[i].name);
}
Python, Swift, and Rust all support only range-based for loops, using for x in list syntax.
Go uses for as its primary looping construct. It has:
- Semisemi, mirroring C++.
- for i < list.size() condition-only loops, mirroring C++ while loops.
- for { infinite loops.
Carbon should adopt C++-style range-based for loop syntax. Semisemi for loops should be addressed through a different mechanism.
Related keywords are:
- for
- continue: continues with the next loop iteration.
- break: breaks out of the loop.
For loop syntax looks like: for ( var type variable : expression ) { statements }
Similar to the if/else proposal, the braces are optional and must be paired ({ ... }) if present. When there are no braces, only one statement is allowed.
continue will continue with the next loop iteration directly, skipping any other statements in the loop body.
break exits the loop immediately.
All of this is consistent with C and C++ behavior.
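Since this behavior matches C++, it can be illustrated with a short C++ sketch. The function name and the input values used below are made up for illustration; only the `continue` and `break` semantics come from the text above.

```cpp
#include <vector>

// Collects values from `input`, skipping negative values with `continue`
// and stopping at the first zero with `break`.
std::vector<int> TakePositivesUntilZero(const std::vector<int>& input) {
  std::vector<int> kept;
  for (int value : input) {
    if (value < 0) {
      continue;  // Skip the rest of the body and move to the next element.
    }
    if (value == 0) {
      break;  // Exit the loop; later elements are never visited.
    }
    kept.push_back(value);
  }
  return kept;
}
```

For an input of 3, -1, 4, 0, 5 this keeps 3 and 4: the -1 is skipped by `continue`, and the 0 ends the loop before 5 is reached.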
The syntax for inputs is not being defined in this proposal. However, we can still establish critical things to support:
- Interoperable C++ objects that work with C++'s range-based for loops, such as containers with iterators.
- Carbon arrays and other containers.
- Range literals. These are not proposed, but for an example seen in other languages, 0..2 may indicate the set of integers [0, 2).
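As a rough illustration of what "works with C++'s range-based for loops" requires, here is a sketch of a half-open integer range in C++. `IntRange` and `Collect` are hypothetical names introduced only for this example; they are not part of this proposal or of any Carbon API. The sketch assumes nothing beyond the standard requirements of a range-based `for`: `begin()`, `end()`, and an iterator with `!=`, unary `*`, and prefix `++`.

```cpp
#include <vector>

// Hypothetical half-open integer range [begin, end), standing in for what a
// range literal such as 0..2 might denote.
class IntRange {
 public:
  // Minimal iterator: range-based for only needs !=, unary *, and prefix ++.
  class Iterator {
   public:
    explicit Iterator(int value) : value_(value) {}
    int operator*() const { return value_; }
    Iterator& operator++() {
      ++value_;
      return *this;
    }
    bool operator!=(const Iterator& other) const {
      return value_ != other.value_;
    }

   private:
    int value_;
  };

  // A reversed range is clamped to empty so that iteration always terminates.
  IntRange(int begin, int end)
      : begin_(begin), end_(end < begin ? begin : end) {}

  Iterator begin() const { return Iterator(begin_); }
  Iterator end() const { return Iterator(end_); }

 private:
  int begin_;
  int end_;
};

// Gathers the visited values so the iteration order can be inspected.
std::vector<int> Collect(const IntRange& range) {
  std::vector<int> values;
  for (int x : range) {
    values.push_back(x);
  }
  return values;
}
```

With this sketch, `Collect(IntRange(0, 2))` visits 0 and then 1, matching the [0, 2) reading of the 0..2 example above.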
%token FOR
statement:
FOR "(" pattern ":" expression ")" statement
| /* preexisting statements elided */
;
The continue and break statements are intended to be added as part of the while proposal.
This baseline syntax is based on C++, following the migration sub-goal Familiarity for experienced C++ developers with a gentle learning curve. To the extent that this proposal anchors on a particular approach, it aims to anchor on C++'s existing syntax, consistent with that sub-goal.
Alternatives will generally reflect breaking consistency with C++ syntax. While most proposals may consider alternatives in more depth, this proposal suggests a threshold of only accepting alternatives that diverge from C++ syntax if they are clearly better; the priority in this proposal is to avoid debate and produce a trivial proposal. Where an alternative would trigger debate, it should be examined by an advocate in a separate proposal.
Carbon will not provide semisemi support. This decision is contingent on Carbon developing a better alternative loop structure, which neither while nor for syntax currently provides. If Carbon doesn't evolve a better solution, semisemi support will be added later.
For details, see the alternative.
Range literals are important to the ergonomics of range-based for loops, and should be added. However, they should be examined separately as part of limiting the scope of this proposal.
Several languages have the concept of providing an index with the object in a range-based for loop:
- Python does for i, item in enumerate(items), with a global function.
- Go does for i, item := range items, with a keyword.
- Swift does for (i, item) in items.enumerated(), having removed an enumerate() global function.
- Rust does for (i, item) in items.enumerate().
An equivalent pattern for Carbon should be examined separately as part of limiting the scope of this proposal.
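For comparison, C++ has no built-in equivalent, but the pattern can be approximated on top of range-based `for`. The sketch below is illustrative only: `Enumerated` and `Describe` are hypothetical names, the wrapper is restricted to `std::vector` to stay short, and nothing here is a proposed Carbon design.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: pairs each element of a vector with its index so a
// range-based for loop can see both. The wrapped vector must outlive the
// wrapper, because only a pointer to it is stored.
template <typename T>
class Enumerated {
 public:
  class Iterator {
   public:
    Iterator(const std::vector<T>* items, std::size_t index)
        : items_(items), index_(index) {}
    // Yields (index, reference to element).
    std::pair<std::size_t, const T&> operator*() const {
      return {index_, (*items_)[index_]};
    }
    Iterator& operator++() {
      ++index_;
      return *this;
    }
    bool operator!=(const Iterator& other) const {
      return index_ != other.index_;
    }

   private:
    const std::vector<T>* items_;
    std::size_t index_;
  };

  explicit Enumerated(const std::vector<T>& items) : items_(&items) {}
  Iterator begin() const { return Iterator(items_, 0); }
  Iterator end() const { return Iterator(items_, items_->size()); }

 private:
  const std::vector<T>* items_;
};

// Builds one line per element, using the index supplied by the wrapper
// instead of a hand-maintained counter.
std::vector<std::string> Describe(const std::vector<std::string>& names) {
  std::vector<std::string> lines;
  for (const auto& entry : Enumerated<std::string>(names)) {
    lines.push_back("List at " + std::to_string(entry.first) + ": " +
                    entry.second);
  }
  return lines;
}
```

The amount of boilerplate needed here is part of why a dedicated pattern is worth examining separately for Carbon.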
Relevant goals are:
- 3. Code that is easy to read, understand, and write:
  - Range-based for loops are easy to read and very helpful.
  - Semisemi for syntax is complex and can be error prone for cases where range-based loops work. Avoiding it, even by providing equivalent syntax with a different loop structure, should discourage its use and direct engineers towards better options. The alternative syntax should also be easier to understand than semisemi syntax, otherwise we should just keep semisemi syntax.
- 7. Interoperability with and migration from existing C++ code:
  - Keeping syntax close to C++ will make it easier for developers to transition.
Both alternatives from the if/else proposal apply to for as well: we could remove parentheses, require braces, or both. The conclusions there are mirrored here in order to avoid a divergence in syntax.
Additional alternatives follow.
We could include semisemi for loops for greater consistency with C++.
This is in part important because switching from a semisemi for loop to a while loop is not always straightforward due to how for evaluates the third section of the semisemi. The inter-loop evaluation of the third section is important given how it interacts with continue. In particular, consider the loops:
for (int i = 0; i < 3; ++i) {
  if (i == 1) continue;
  printf("%d\n", i);
}

int j = 0;
while (j < 3) {
  if (j == 1) continue;
  printf("%d\n", j);
  ++j;
}

int k = 0;
while (k < 3) {
  ++k;
  if (k == 1) continue;
  printf("%d\n", k);
}

int l = 0;
while (l < 3) {
  if (l == 1) {
    ++l;
    continue;
  }
  printf("%d\n", l);
  ++l;
}
To explain the differences between these loops:
- The first loop will print 0 and 2.
- The second loop will print 0, then loop infinitely because the increment is never reached.
- The third loop will print 2 and 3, never 0, because the increment happens too early.
- Only the fourth loop is equivalent to the first loop, and it duplicates the increment.
There is no easy place to put the increment in a while loop.
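The behavior of the four loops above can be checked with a small sketch that records the values each loop would print. The function names are hypothetical, and the `max_steps` cap and `hit_cap` flag are additions made only so that the second loop terminates when run; they are not part of the loops being compared.

```cpp
#include <vector>

// First loop: `continue` still runs `++i` before the condition is re-tested.
std::vector<int> ForLoop() {
  std::vector<int> printed;
  for (int i = 0; i < 3; ++i) {
    if (i == 1) continue;
    printed.push_back(i);
  }
  return printed;
}

// Second loop: the increment sits at the end, so `continue` skips it and the
// loop spins on j == 1. `max_steps` is a hypothetical safety cap.
std::vector<int> WhileIncrementLast(int max_steps, bool* hit_cap) {
  std::vector<int> printed;
  int j = 0;
  int steps = 0;
  *hit_cap = false;
  while (j < 3) {
    if (++steps > max_steps) {
      *hit_cap = true;  // Would otherwise loop forever.
      break;
    }
    if (j == 1) continue;
    printed.push_back(j);
    ++j;
  }
  return printed;
}

// Third loop: incrementing first shifts every observed value by one.
std::vector<int> WhileIncrementFirst() {
  std::vector<int> printed;
  int k = 0;
  while (k < 3) {
    ++k;
    if (k == 1) continue;
    printed.push_back(k);
  }
  return printed;
}

// Fourth loop: matches the for loop, at the cost of a duplicated increment.
std::vector<int> WhileDuplicatedIncrement() {
  std::vector<int> printed;
  int l = 0;
  while (l < 3) {
    if (l == 1) {
      ++l;
      continue;
    }
    printed.push_back(l);
    ++l;
  }
  return printed;
}
```

Only `ForLoop` and `WhileDuplicatedIncrement` record the same values (0 and 2). `WhileIncrementLast` records 0 and then hits the cap, and `WhileIncrementFirst` records 2 and 3.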
Advantages:
- We need a plan for migrating both developers and code from C++ semisemi for loops, and providing them in Carbon is the easiest solution.
  - Semisemis remain common in C++ code.
- Semisemis are much more flexible than range-based for loops.
  - while loops do not offer a sufficient alternative.
Disadvantages:
- Semisemi loops can be error prone, such as for (int i = 0; i < 3; --i).
  - Syntax such as for (int x : range(0, 3)) leaves less room for developer mistakes.
  - Removing semisemi syntax will likely improve understandability of Carbon code, a language goal.
- If we add semisemi loops, it would be very difficult to get rid of them.
  - Code using them should be expected to accumulate quickly, from both migrated code and developers familiar with C++ idioms.
If we want to remove semisemi for loops, we should avoid adding them in the first place. We do need to ensure that developers are happy with the replacement, although that should be achievable through providing strong range support, including range literals.
A story for migrating developers and code is still required. For developers, it would be ideal if we could have a compiler error that detects semisemi loops and advises the preferred Carbon constructs. For both developers and code, we need a suitable loop syntax that is easy to use in cases that remain hard to write in while or range-based for loops. This will depend on a separate proposal, but there's at least presently interest in this direction.
Range-based for loops could write in instead of : as the separator, such as:
for (x in list) {
  ...
}
An argument for switching now, instead of using C++ as a baseline, would be that var syntax has been discussed as using a :, and avoiding : in range-based for loops may reduce syntax ambiguity risks. However, the current var proposal does not use a :, and so this risk is only a potential future concern: it's too early to require further evaluation.
Because the benefits of this alternative are debatable and would diverge from C++, adopting in would run contrary to using C++ as a baseline. Any divergence should be justified and reviewed as a separate proposal.
C++ allows for (auto [x, y] : range_of_pairs), which is not explicitly part of the syntax here. Carbon is likely to support this through tuples, so adding special for syntax for this would likely be redundant.
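For reference, the C++ form mentioned above relies on structured bindings, which require C++17. The following sketch shows it in a complete function; `FormatPairs` and the pair element types are chosen only for illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Formats each (name, count) pair as "name=count", using a structured
// binding to name both halves of the pair in the loop header.
std::vector<std::string> FormatPairs(
    const std::vector<std::pair<std::string, int>>& range_of_pairs) {
  std::vector<std::string> lines;
  for (const auto& [x, y] : range_of_pairs) {
    lines.push_back(x + "=" + std::to_string(y));
  }
  return lines;
}
```

The binding is a general declaration feature in C++ and not something specific to `for`, which parallels the expectation above that Carbon would get the same effect from tuple patterns instead of dedicated loop syntax.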