Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Capture the kse3 issue in test cases and close it #21260

Merged
merged 2 commits into from
Aug 22, 2024

Conversation

dwijnand
Copy link
Member

The underlying type of an opaque type is only visible to anything within
the scope that contains that opaque type. So, for instance, a Worm
opaque type in a Worms object, anything within the Worms object sees the
underlying type. So Worms.Worm is an opaque abstract type with bounds,
while Worms.this.Worm is an opaque type with an underlying type. But
you can also reference Worms.Worm while being inside of the Worms
object.

The change I made to opaque types allowed for member selection to see
the underlying type when in a scope that can see that, which makes it
consistent with how TypeComparer allows those two types to be seen as
equivalent, when in the right scope.

In kse3, it seems, the fact that this wasn't done was utilised by using
an "external" reference to the opaque type, which avoided the underlying
type's method being selected, and the extension method being selected
instead.

While my change broke kse3, I believe the change is good, as it brings
consistency. And it is possible to fix the kse3 code, by using the
"universal function call syntax" (to borrow from Rust nomenclature) for
calling the extension method. Alternatively, the extension methods can
be defined where they really don't see the underlying type, and then the
companion object can be made to include the extension methods (to keep
them in implicit scope).

Closes #21239

The underlying type of an opaque type is only visible to anything within
the scope that contains that opaque type.  So, for instance, a Worm
opaque type in a Worms object, anything within the Worms object sees the
underlying type.  So Worms.Worm is an opaque abstract type with bounds,
while Worms.this.Worm is an opaque type with an underlying type.  But
you can also reference Worms.Worm while being inside of the Worms
object.

The change I made to opaque types allowed for member selection to see
the underlying type when in a scope that can see that, which makes it
consistent with how TypeComparer allows those two types to be seen as
equivalent, when in the right scope.

In kse3, it seems, the fact that this wasn't done was utilised by using
an "external" reference to the opaque type, which avoided the underlying
type's method being selected, and the extension method being selected
instead.

While my change broke kse3, I believe the change is good, as it brings
consistency.  And it is possible to fix the kse3 code, by using the
"universal function call syntax" (to borrow from Rust nomenclature) for
calling the extension method.  Alternatively, the extension methods can
be defined where they really don't see the underlying type, and then the
companion object can be made to include the extension methods (to keep
them in implicit scope).
@dwijnand dwijnand changed the title Capture the kse3 issue in test cases Capture the kse3 issue in test cases and close it Jul 24, 2024
@SethTisue
Copy link
Member

SethTisue commented Jul 25, 2024

🪱

Dunno about the implementation, but Dale walked me through the design/spec considerations, and I do agree that @Ichoran's code here was essentially exploiting a bug, so it's reasonable to ask him to change it. (Hi Rex!)

@SethTisue
Copy link
Member

SethTisue commented Jul 25, 2024

Dale found this key piece of history: 90b580b — the argument about transitivity is one of the basic constraints here.

@dwijnand dwijnand marked this pull request as ready for review July 25, 2024 16:41
@dwijnand dwijnand requested a review from SethTisue July 25, 2024 16:41
@Ichoran
Copy link

Ichoran commented Jul 25, 2024

The reason I did it the way that I did is because the ergonomics are a lot nicer. Also, I couldn't get Or to work at all without it, but Or was the first thing I wrote with opaque/union types, so maybe that was my fault.

If this wasn't intended--that my.full.path.Opaque looks different than Opaque when in the my.full.path namespace--I certainly can do it the other way. Hopefully not many other people have used the same trick. But it really does muck up the ergonomics for the library writer to have to use universal function call syntax. (In kse3 I have explicitly made internal ergonomics a non-goal.)

Last time I checked, adding more companion objects (or export statements) resulted in more object initializer clutter in the bytecode (even in cases where object initialization is conceptually a no-op). If there's a workaround for that so the bytecode is equally clean either way, then I'd prefer to wrap things more deeply.

@Ichoran
Copy link

Ichoran commented Jul 26, 2024

So, I checked the bytecode and the main issue is that with inline methods--and this is the most critical use case because boxing is already a perfectly valid newtype strategy--the inlining doesn't actually elide all the prep for doing method calls; it still pushes the companion object field on the stack.

From the test set, we have this approach with a companion object to hide things, where we can add inlining or not (I also added an export statement so usage is identical between the two workarounds):

package lib4:
  object Worms:
    opaque type Worm[V] = AtomicReference[AnyRef]
    object Worm extends WormOps:
      inline def create[V](v: V): Worm[V] = new AtomicReference[AnyRef](v.asInstanceOf[AnyRef])
      extension [V](worm: Worm[V])
        inline def wormAsAtomic: AtomicReference[AnyRef] = worm

  import Worms.Worm
  export Worms.Worm
  trait WormOps:
    extension [V](worm: Worm[V])
      inline def get: V  = worm.wormAsAtomic.get().asInstanceOf[V]
      def get2: V = get
      inline def get3: V = get

package test4:
  import lib4._
  object Test:
    def mcuse(worm: Worm[String]): String = worm.get2
    def inuse(worm: Worm[String]): String = worm.get3

and do the same for the universal function call syntax approach:

package lib7:
  opaque type Worm[V] = AtomicReference[AnyRef]
  object Worm:
    inline def create[V](v: V): Worm[V] = new AtomicReference[AnyRef](v.asInstanceOf[AnyRef])
    extension [V](worm: Worm[V])
      inline def get: V  = worm.get().asInstanceOf[V]
      def get2: V = Worm.get(worm)
      inline def get3: V = Worm.get(worm)

package test7:
  import lib7._
  object Test:
    def mcuse(worm: Worm[String]): String = worm.get2
    def inuse(worm: Worm[String]): String = worm.get3

The method call bytecode (mcuse) is identical in the two cases:

Companion Approach                                           Universal Function Approach
###########################################################  ################################################################
 0: getstatic     #37  // Field ...lib4/Worms$Worm$.MODULE$   0: getstatic     #37  // Field ...lib7/...package$Worm$.MODULE$
 3: aload_1                                                   3: aload_1
 4: invokevirtual #41  // Method ...lib4/Worms$Worm$.get2     4: invokevirtual #41  // Method ...lib7/...package$Worm$.get2
 7: checkcast     #43  // class java/lang/String              7: checkcast     #43   // class java/lang/String
10: areturn                                                  10: areturn

Great!

But the inlined versions have far more module field loads in the companion object case:

Companion Approach                                           Universal Function Approach
###########################################################  ###########################################################
 0: getstatic     #50  // Field ...lib4/Worms$.MODULE$        0: getstatic     #50  // Field ...lib7/...package$.MODULE$
 3: astore_2                                                  3: astore_2
 4: getstatic     #37  // Field ...lib4/Worms$Worm$.MODULE$   4: aload_1
 7: astore_3                                                  5: astore_3
 8: getstatic     #50  // Field ...lib4/Worms$.MODULE$        6: aload_2
11: astore        4                                           7: astore        4
13: aload_1                                                   9: aload_3
14: astore        5                                          10: astore        5
16: aload         5                                          12: aload         5
18: invokevirtual #55  // Method ...AtomicReference.get      14: invokevirtual #55  // Method AtomicReference.get
21: checkcast     #43  // class java/lang/String             17: checkcast     #43  // class java/lang/String
24: areturn                                                  20: areturn

because they both reference irrelevant objects, but in the latter case it's all the same field each time.

Now I haven't tried chasing the code through the JIT compiler to see if it ever has trouble with the left version and not the right. Neither of them looks as good as one would hope: really you'd like to see just the invokevirtual and checkcast. But because it's not obviously free, I'm a little wary betting code whose advantage would be speed on the difference.

Anyway, the workarounds work! But at least under 3.5.0-RC3, they have a modest bytecode difference so it's not all just the same thing.

@Ichoran
Copy link

Ichoran commented Jul 29, 2024

@dwijnand - I guess we're supposing that nobody else is using this for math?

Because my test suite in kse.maths caught a very non-obvious bug resulting from the change.

opaque type Frac = Long
object Frac{
  ...
  extension (f: kse.maths.Frac)
    ...
    def +(g: Frac): kse.maths.Frac = ...
    def -(g: Frac): kse.maths.Frac =
      f + Frac.wrap(((-g.numerator).toLong << 32) | (g.unwrap & 0xFFFFFFFFL))
}

This simply works: Long + Long is fine. Except I wanted Frac.+ to add two Fracs.

Here, you don't get a compile-time error like with get.

Certainly if there is a time to change it, earlier (i.e. now) is better than later. But if opaques have been used with this to alter the behavior of the same methods, things will just silently break (unless there's a good test suite, which fortunately I have been pretty diligent about).

@dwijnand
Copy link
Member Author

dwijnand commented Aug 7, 2024

the inlining doesn't actually elide all the prep for doing method calls; it still pushes the companion object field on the stack.

Looks like @soronpo independently discovered this too, so we can track this issue separately in his #21334.

@dwijnand
Copy link
Member Author

dwijnand commented Aug 7, 2024

Certainly if there is a time to change it, earlier (i.e. now) is better than later. But if opaques have been used with this to alter the behavior of the same methods, things will just silently break (unless there's a good test suite, which fortunately I have been pretty diligent about).

Thanks for catching and highlighting this. I've gone and reverted my Typer change for this, from both 3.5.1 and 3.3.4 (#21340 and #21341), so we only allow this change to come into play as of the 3.6 minor release.

I've also added a -3.6-migration warning, so anyone else can look to use that.

Copy link
Member

@SethTisue SethTisue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dale and I talked this through once already before, and then we went through it together a second time just now, and it LGTM, the change is worthwhile, the bytecode cost might get improved later, and the migration warning has us covered on the behavior change.

@dwijnand dwijnand merged commit afcb0ad into scala:main Aug 22, 2024
28 checks passed
@dwijnand dwijnand deleted the opaque-types-are-visible branch August 22, 2024 15:45
@dwijnand dwijnand added this to the 3.6.0 milestone Aug 22, 2024
@Ichoran
Copy link

Ichoran commented Aug 22, 2024

All right. I'm pretty sure I have sufficiently extensive tests to fix all my code, which I will do, but I bet people are going to be bitten by this. I hope we don't lose any spacecraft. (Mostly in jest, but one key use for opaque types is numerics, and this will silently switch from using a + defined on your opaque type to the + used on the underlying primitive, which will give type-safe but numerically wrong results.)

@SethTisue
Copy link
Member

I'm pretty sure I have sufficiently extensive tests to fix all my code,

If I understand correctly, you shouldn't need to rely on the tests — the new migration warning should tell you exactly where you need to change your code.

@dwijnand
Copy link
Member Author

Indeed, and you enable that with

import scala.language.`3.6-migration`

or on the command line.

@Ichoran
Copy link

Ichoran commented Aug 22, 2024

As soon as it hits nightly (I assume it isn't there yet?) I'll check it out and see if it catches everything.

@SethTisue
Copy link
Member

SethTisue commented Aug 22, 2024

(I assume it isn't there yet?)

not yet. try tomorrow.

my method of checking is to scala-cli -S 3.nightly, note the SHA in the version number (currently 3.6.0-RC1-bin-20240821-a2c53a1-NIGHTLY), and then in this repo do git log a2c53a1 to see what it includes.

@Ichoran
Copy link

Ichoran commented Oct 23, 2024

I've converted all my code, but in so doing, it seems like the new behavior is extremely prone to error. The old behavior with full path typing as a workaround was already error-prone enough: you had to remember to use it. This is even worse.

Is it possible to have a permanent compiler warn option that yells at you when there is ambiguity between the underlying type and the newtype? I would never want to be without this. It's just too easy to write something, think you're using it, and actually be using the method on the underlying type.

Since you can always either call the method explicitly or set the type explicitly to the underlying type, there's never a need for it to be ambiguous.

While it's true that there are styles that avoid the issue (e.g. deeper nesting), I don't think it's good to leave something this error-prone where it can be used at all.

Even if you're careful the first time, you're not protected: if you wrap something that gains a new method, your code will all silently fall over to that next time you recompile. That's a pretty annoying failure mode, especially since one of the use cases of opaque types can be to avoid stuff like that (e.g. when Java added lines to String).

So, anyway, this is a warning I would want on forever and always, not just for migration.

@Ichoran
Copy link

Ichoran commented Oct 23, 2024

@SethTisue - Should I open an enhancement request for this?

@dwijnand
Copy link
Member Author

@SethTisue - Should I open an enhancement request for this?

Yes, please. I'm happy to move that warning under something permanent.

But, yeah, that's the problem with extension methods vs class methods.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Regression in ichoran/kse3 for opaque types
3 participants