Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check safe initialization of static objects #16970

Merged
merged 113 commits into from
Jun 16, 2023

Conversation

liufengyun
Copy link
Contributor

@liufengyun liufengyun commented Feb 20, 2023

The problem is illustrated by the example below:

class Foo(val opposite: Foo)
case object A extends Foo(B)     // A -> B
case object B extends Foo(A)     // B -> A

The check aims to be simple for programmers to understand, expressive, fast, and sound.

The check is centered around two design ideas: (1) initialization-time irrelevance; (2) partial ordering.

The check enforces the principle of initialization-time irrelevance, which means that the time when a static object is initialized should not change program semantics. For that purpose, it enforces the following rule:

The initialization of a static object should not directly or indirectly read or write mutable state owned by another static object.

This principle not only puts the initialization of static objects on a solid foundation but also avoids whole-program analysis.

Partial ordering means that the initialization dependencies of static objects form a directed-acyclic graph (DAG). No cycles with length bigger than 1 allowed --- which might lead to deadlocks in the presence of concurrency and strong coupling & subtle contracts between objects.

Related Issues:

#16152
#9176
#11262
scala/bug#9312
scala/bug#9115
scala/bug#9261
scala/bug#5366
scala/bug#9360

library/src/scala/annotation/init.scala Outdated Show resolved Hide resolved
library/src/scala/annotation/init.scala Outdated Show resolved Hide resolved
library/src/scala/annotation/init.scala Outdated Show resolved Hide resolved
library/src/scala/annotation/init.scala Outdated Show resolved Hide resolved
library/src/scala/annotation/init.scala Outdated Show resolved Hide resolved
// init param fields
klass.paramGetters.foreach { acc =>
val value = paramsMap(acc.name.toTermName)
if acc.is(Flags.Mutable) then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen the pattern of this if statement before. Perhaps it could be factored out. Or perhaps it's too messy to make that general enough.

parent match
case tree @ Block(stats, NewExpr(tref, New(tpt), ctor, argss)) => // can happen
evalExprs(stats, thisV, klass)
val args = evalArgs(argss.flatten, thisV, klass)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider a recursive call to initParent on the NewExpr.

compiler/src/dotty/tools/dotc/transform/init/Objects.scala Outdated Show resolved Hide resolved
compiler/src/dotty/tools/dotc/transform/init/Objects.scala Outdated Show resolved Hide resolved
compiler/src/dotty/tools/dotc/transform/init/Objects.scala Outdated Show resolved Hide resolved
// ----------------------------- abstract domain -----------------------------

sealed abstract class Value:
def show(using Context): String
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we begin to define the abstract domain and abstract values as listed in Section 4.4 in paper

valsMap: mutable.Map[Symbol, Value],
varsMap: mutable.Map[Symbol, Heap.Addr],
outersMap: mutable.Map[ClassSymbol, Value])
extends Value:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ref is a parent class for both ObjectRef and OfClass, which correspond to Object and Instance locations as listed in Section 4.4. Both ObjectRef and OfClass need to store the values of immutable fields, as stored in valsMap, and the abstract heap address for mutable fields, as stored in varsMap, and the pointers and values for outer classes, as listed in outersMap.


/** A reference to a static object */
case class ObjectRef(klass: ClassSymbol)
extends Ref(valsMap = mutable.Map.empty, varsMap = mutable.Map.empty, outersMap = mutable.Map.empty):
Copy link
Contributor

@EnzeXing EnzeXing May 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This corresponds with Object(O) as defined in Section 4.4. Note that the maps are mutable, and will be filled as we speculate the initialization of the object.

/**
* Represents values that are instances of the specified class.
*
* Note that the 2nd parameter block does not take part in the definition of equality.

This comment was marked as resolved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I have case class Foo(a: Int)(b: Int), then Foo(1)(2) == Foo(1)(3).

case class OfClass private (
klass: ClassSymbol, outer: Value, ctor: Symbol, args: List[Value], env: Env.Data)(
valsMap: mutable.Map[Symbol, Value], varsMap: mutable.Map[Symbol, Heap.Addr], outersMap: mutable.Map[ClassSymbol, Value])
extends Ref(valsMap, varsMap, outersMap):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This corresponds to Instance as defined in Section 4.4. Again, the maps are mutable. We also store its local environment, which stores the information of the local method it resides in. We do not need to store local environment for ObjectRef since they refer to global objects.

def show(using Context) = refs.map(_.show).mkString("[", ",", "]")

/** A cold alias which should not be used during initialization. */
case object Cold extends Value:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This corresponds to the abstract value Cold as in Section 4.4.

case object Cold extends Value:
def show(using Context) = "Cold"

val Bottom = RefSet(ListSet.empty)

This comment was marked as resolved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use it as a convenience so that we have fewer cases in the method call/select/assign. The convenience should work as if we define a separate Bottom --- otherwise, it's broken.


/** The address for mutable local variables . */
private case class LocalVarAddr(regions: Regions.Data, sym: Symbol, owner: ClassSymbol) extends Addr

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Abstract heap addresses can refer to mutable fields or local variables. A heap address also stores other necessary information, such as its owner and the region it belongs to. The value corresponds to the address is the join of all mutations to this address.

@EnzeXing
Copy link
Contributor

Comparison between the language design in paper and the actual implementation in checker: in paper, we explicitly list an abstract heap that maps a reference to an object or instance to the map that stores its field values. The checker directly stores the field value maps in ObjectRef and OfClass without using an explicit map. I believe Fengyun said that the paper design is convenient in such way that the references to objects and instances are immutable, as the explicit heap and maps handle the mutable parts, while in the checker, every reference contains mutable parts. Also it seems that in paper, only each instance stores the region it belongs to, while in code implementation, each mutable field stores the region in belongs to. Does it mean that we can define different regions during initialization of an object? This seems to be a more flexible design.

@EnzeXing
Copy link
Contributor

EnzeXing commented Jun 1, 2023

Syntax for the data structure abstraction we use in Objects.scala:

ve ::= ObjectRef(class)
| OfClass(class, vs[outer], ctor, args, env)
| OfArray(object[owner], regions)
| Fun(..., env)
| Cold // abstract values in domain
vs ::= Set(ve) // set of abstract values

refMap = ( ObjectRef | OfClass ) -> ( valsMap, varsMap, outersMap ) // refMap stores field informations of an object or instance
valsMap = valsym -> vs // maps immutable fields to their values
varsMap = valsym -> addr // each mutable field has an abstract address
outersMap = class -> vs // maps outer objects to their values

arrayMap = OfArray -> addr // an array has one address that stores the join value of every element

heap = addr -M> vs

env = (valsMap, Option[env]) // stores local variables in the residing method, and possibly outer environments

addr ::= localVarAddr(regions, valsym, owner)
| fieldVarAddr(regions, valsym, owner) // independent of OfClass/ObjectRef
| arrayAddr(regions, owner) // independent of array element type

regions ::= List(sourcePosition)

Note: -> refers to extensible maps whose entries cannot be modified; -M> refers to extensible whose entries can be modified

@liufengyun liufengyun changed the title WIP - Check safe initialization of static objects Check safe initialization of static objects Jun 12, 2023
@liufengyun liufengyun marked this pull request as ready for review June 12, 2023 19:31
@liufengyun
Copy link
Contributor Author

@nicolasstucki We introduced new annotations to the standard library, marked with @experimental. Could you have a look and see if there are any concerns with introducing the experimental annotations?

@liufengyun liufengyun requested a review from nicolasstucki June 13, 2023 18:17
Bottom

case value: (Bottom.type | ObjectRef | OfClass | Cold.type) =>
// The outer can be a bottom value for top-level classes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If klass is a top-level class, what would klass.owner be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The owner will be the enclosing package --- the root of the name hierarchy is the _root_ package.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is the case, would klass.owner.isClass return true on line 817, or it will try the find the enclosing method of the owner? I'm asking this because I'm trying to refine the type for thisV in resolveEnv.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Packages are represented as classes in the compiler --- therefore it will return true for top-level classes. It would be nice to add a comment here.

Copy link
Contributor

@nicolasstucki nicolasstucki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @experimental annotation LGTM

@liufengyun liufengyun merged commit 5caa6aa into scala:main Jun 16, 2023
@Kordyjan Kordyjan added this to the 3.4.0 milestone Aug 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants