Skip to content

Adding a useless assignment statement alters program behavior. #21649

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
jrlemieux opened this issue Sep 25, 2024 · 10 comments · May be fixed by #23002
Open

Adding a useless assignment statement alters program behavior. #21649

jrlemieux opened this issue Sep 25, 2024 · 10 comments · May be fixed by #23002

Comments

@jrlemieux
Copy link

jrlemieux commented Sep 25, 2024

Compiler version

3.5.0, 3.5.1

Minimized code

This is a follow-up on an issue reported by Tomassino on the Scala users.scala-lang.org forum.

I created this small demo Github project to showcase this issue: https://github.com/jrlemieux/scala-weird

The two classes Test1 and Test2 inherit common code that is calling their virtual partial function outerPf_2.

The only difference between the two classes is the commenting of the useless statement "val something = p".

class Test1() extends TestTrait:

  val outerPf_2: PartialFunction[Any, Any] = p =>
    val something = p
    innerPf(p) match
      case Some(_) => println("Something")

class Test2() extends TestTrait:

  val outerPf_2: PartialFunction[Any, Any] = p =>
    // val something = p
    innerPf(p) match
      case Some(_) => println("Something")

Expectation

It was expected that Test1.outerPf_2 would behave exactly as Test2.outerPf_2, but it doesn't.

The compiler generates very different code for the isDefinedAt method of the outerPf_2 partial functions.

For Test1:

  public final boolean isDefinedAt(java.lang.Object);
    Code:
       0: aload_1
       1: astore_2
       2: iconst_1
       3: ireturn

For Test2:

  public final boolean isDefinedAt(java.lang.Object);
    Code:
       0: aload_0
       1: getfield      #24                 // Field $outer:Lcom/playzone/some-common-name;
       4: invokevirtual #33                 // Method com/playzone/some-common-name.innerPf:()Lscala/PartialFunction;
       7: aload_1
       8: invokeinterface #39,  2           // InterfaceMethod scala/PartialFunction.apply:(Ljava/lang/Object;)Ljava/lang/Object;
      13: checkcast     #41                 // class scala/Option
      16: astore_2
      17: aload_2
      18: instanceof    #43                 // class scala/Some
      21: ifeq          26
      24: iconst_1
      25: ireturn
      26: iconst_0
      27: ireturn

The difference might be due to the fact that one PF is inlining code. But whether the compiler optimizes with inlining or not should not affect the behavior of the code. Adding an extra useless assignment statement should not affect the code's behavior.

@jrlemieux jrlemieux added itype:bug stat:needs triage Every issue needs to have an "area" and "itype" label labels Sep 25, 2024
@som-snytt
Copy link
Contributor

I don't have the phone where I can just ask it what forum topic was I reading where I linked to a scala 2 ticket about partial functions, but that ticket has a discussion about the equivalence of the function shape x => x match and the pattern matching anonymous function.

The spec may need to iron out details, but the assumption that adding a statement doesn't matter isn't correct.

Maybe it is worth a lint.

@jrlemieux
Copy link
Author

If your comment "Maybe it is worth a lint." means running scalafix, I just did, with no warnings. Thanks.

@jrlemieux
Copy link
Author

And would "the assumption that adding a useless statement doesn't matter is correct" be true ?

@dwijnand
Copy link
Member

So different source code emits different bytecode. How is the behaviour observably different at runtime? Is there a test case (as basic as a main with asserts) that captures an unexpected difference in behaviour between the two?

@som-snytt
Copy link
Contributor

By "worth a lint", I meant that someone should add code to lint (internal lint in scalac or external in scalafix):

scala> val f: PartialFunction[String, Boolean] = s => s.toUpperCase match { case "TRUE" => true }
val f: PartialFunction[String, Boolean] = <function1>

scala> val g: PartialFunction[String, Boolean] = s => { val x = s.toUpperCase; x match { case "TRUE" => true }}
val g: PartialFunction[String, Boolean] = <function1>

scala> f.isDefinedAt("no")
val res2: Boolean = false

scala> g.isDefinedAt("no")
val res3: Boolean = true

The old scala 2 ticket:
scala/bug#4940
which is open for a related bug in scala 2.

For the g case, I propose the nomenclature, "impartial function".

@Tomassino-ibm
Copy link

Hi all, I would like to add a couple of examples showing why this is a bug, in my opinion. Consider the following code (all code uses scala2 syntax, since I tested it on scala 2.13, 3.3 and 3.5):

object Test01 extends App {
  val innerPf: PartialFunction[Any, Option[Unit]] = { case x: String => Some(()) }
  val outerPf_1: PartialFunction[Any, Any]        = { p =>
    innerPf(p) match { case Some(_) => println("Something") }
  }
  val outerPf_2: PartialFunction[Any, Any]        = { p =>
    val something = p
    innerPf(p) match { case Some(_) => println("Something") }
  }

  println(s"innerPf.isDefinedAt(10): ${innerPf.isDefinedAt(10)}")
  println(s"innerPf.isDefinedAt(\"abc\"): ${innerPf.isDefinedAt("abc")}")

  try {
    println(s"outerPf_1.isDefinedAt(10): ${outerPf_1.isDefinedAt(10)}")
  } catch {
    case _: MatchError => println("outerPf_1.isDefinedAt(10): MatchError")
  }
  try {
    println(s"outerPf_1.isDefinedAt(\"abc\"): ${outerPf_1.isDefinedAt("abc")}")
  } catch {
    case _: MatchError => println("outerPf_1.isDefinedAt(\"abc\"): MatchError")
  }

  try {
    println(s"outerPf_2.isDefinedAt(10): ${outerPf_2.isDefinedAt(10)}")
  } catch {
    case _: MatchError => println("outerPf_2.isDefinedAt(10): MatchError")
  }
  try {
    println(s"outerPf_2.isDefinedAt(\"abc\"): ${outerPf_2.isDefinedAt("abc")}")
  } catch {
    case _: MatchError => println("outerPf_2.isDefinedAt(\"abc\"): MatchError")
  }

The output is:

innerPf.isDefinedAt(10): false
innerPf.isDefinedAt("abc"): true
outerPf_1.isDefinedAt(10): MatchError
outerPf_1.isDefinedAt("abc"): true
outerPf_2.isDefinedAt(10): true
outerPf_2.isDefinedAt("abc"): true

So, calling isDefinedAt throws an exception, while I would assume that the function should never do that.

Here is another example:

object Test02 extends App {
  val innerPf: PartialFunction[Any, Option[Unit]] = { x =>
    println("Why am I called?")
    Some(())
  }
  val outerPf_1: PartialFunction[Any, Any]        = { p =>
    innerPf(p) match { case Some(_) => println("Something") }
  }
  val outerPf_2: PartialFunction[Any, Any]        = { p =>
    val something = p

    innerPf(p) match { case Some(_) => println("Something") }
  }

  println(s"innerPf.isDefinedAt(10): ${innerPf.isDefinedAt(10)}")
  println(s"innerPf.isDefinedAt(\"abc\"): ${innerPf.isDefinedAt("abc")}")

  try {
    println(s"outerPf_1.isDefinedAt(10): ${outerPf_1.isDefinedAt(10)}")
  } catch {
    case _: MatchError => println("outerPf_1.isDefinedAt(10): MatchError")
  }
  try {
    println(s"outerPf_1.isDefinedAt(\"abc\"): ${outerPf_1.isDefinedAt("abc")}")
  } catch {
    case _: MatchError => println("outerPf_1.isDefinedAt(\"abc\"): MatchError")
  }

  try {
    println(s"outerPf_2.isDefinedAt(10): ${outerPf_2.isDefinedAt(10)}")
  } catch {
    case _: MatchError => println("outerPf_2.isDefinedAt(10): MatchError")
  }
  try {
    println(s"outerPf_2.isDefinedAt(\"abc\"): ${outerPf_2.isDefinedAt("abc")}")
  } catch {
    case _: MatchError => println("outerPf_2.isDefinedAt(\"abc\"): MatchError")
  }
}

The output is:

innerPf.isDefinedAt(10): true
innerPf.isDefinedAt("abc"): true
Why am I called?
outerPf_1.isDefinedAt(10): true
Why am I called?
outerPf_1.isDefinedAt("abc"): true
outerPf_2.isDefinedAt(10): true
outerPf_2.isDefinedAt("abc"): true

So, in this case, isDefinedAt even calls innerPf, when called on outerPf_1, which is definitely unexpected.

As I said above, I tested code with scala 2.13, 3.3 and 3.5 and the behavior is exactly the same. In my opinion this is a bug, since isDefinedAt has a wrong behavior (it calls a function, the exception is most probably because of this).

@odersky
Copy link
Contributor

odersky commented Sep 27, 2024

I agree that there's something missing here (quite a lot, actually). First, I could not find any point in the spec where it is defined what kind of lambdas can implement partial functions. If I missed it, please point it out. A spec would be the first thing to look at for a decision what should be done. Worse, I did not find any treatment of single abstract method types in the Scala spec. It seems PartialFunctions are treated as a special case of this.

Now if we look at the definition of PartialFunction in the standard library, then the examples show PartialFunction exclusively implemented with case blocks, not with other lambdas.

val sample = 1 to 10
def isEven(n: Int) = n % 2 == 0
val eveningNews: PartialFunction[Int, String] = {
  case x if isEven(x) => s"$x is even"
}

// The method collect is described as "filter + map"
// because it uses a PartialFunction to select elements
// to which the function is applied.
val evenNumbers = sample.collect(eveningNews)

val oddlyEnough: PartialFunction[Int, String] = {
  case x if !isEven(x) => s"$x is odd"
}

// The method orElse allows chaining another PartialFunction
// to handle input outside the declared domain.
val numbers = sample.map(eveningNews orElse oddlyEnough)

// same as
val numbers = sample.map(n => eveningNews.applyOrElse(n, oddlyEnough))

val half: PartialFunction[Int, Int] = {
  case x if isEven(x) => x / 2
}

I believe that was the original intention. And, specifically an arbitrary lambda would not be allowed as the implementation of a PartialFunction. But there was one exception: A function such as

x => x match
  case pat1 => rhs1
  ...
  case patN => rhsN

(with the x repeated as parameter and match selector) would be allowed as a PartialFunction, because by the time the compiler sees this the case block has already been expanded to this form. Also, it allows to add an annotation to the parameter of a partial function, which is sometimes needed. This was as far as things went originally.

But with the arrival of SAM types in 2.12 we now do allow arbitrary lambdas as implementations and simply set the isDefinedAt method to be always true for these. This is IMO a bit dubious. Maybe it's better to be told immediately that a function cannot have a meaningful isDefinedAt method?

And, to make matters worse the original test also got diluted. It seems instead of only treating x => x match { cases } as a partial function with a non-trivial isDefinedAt, we now also allow x => e match { cases } as such, for arbitrary expressions e. Lacking a spec, it's hard to be sure, but I believe this was a bug.

Now, what to do?

  1. We could treat x => e match { cases } as a regular lambda without a special isDefinedAt method. That would match the original intention. But that would be a change in runtime behavior, which is very hard to push through.
  2. We could do (1) and disallow regular lambdas for partial functions (except those of the form x => x match { cases }). I think that would be the choice that leads to the least suprises. But it might also lead to a lot of breakage. Not sure how common this pattern is in Scala codebases.

@odersky odersky added itype:bug stat:needs spec and removed itype:question stat:needs triage Every issue needs to have an "area" and "itype" label labels Sep 27, 2024
@som-snytt
Copy link
Contributor

@Tomassino-ibm

So, calling isDefinedAt throws an exception, while I would assume that the function should never do that.

That assumption does not hold. Synthetic isDefinedAt is not the same as "reachability" that the compiler warns about. An arbitrary extractor could throw.

scala> val f: PartialFunction[String, Boolean] = { case "TRUE" if ??? => true }
val f: PartialFunction[String, Boolean] = <function1>

scala> f.isDefinedAt("TRUE")
scala.NotImplementedError: an implementation is missing
  at scala.Predef$.$qmark$qmark$qmark(Predef.scala:344)
  at rs$line$1$$anon$1.isDefinedAt(rs$line$1:1)
  at rs$line$1$$anon$1.isDefinedAt(rs$line$1:1)
  ... 30 elided

@Tomassino-ibm
Copy link

Tomassino-ibm commented Sep 27, 2024

I would like to give my perspective, that of a simple user of scala, not of a language expert. What seems odd to me is that x => x match { cases } is treated specially. To me that is a total function and, since total functions can be seen as a special case of partial functions, it can be of type PartialFunction[A, B] with a trivial isDefinedAt which always returns true. If that was true, then, for example, it would have made sense to check the internal match for completeness and issue a compiler warning otherwise (as usual). In the end, if I'm not wrong, every total function can already be converted to a partial function with isDefinedAt always returning true, so it's just that syntax that's special. The same can be said for x => e match { cases } for which the current behavior is even stranger to me.

I'm not sure why this choice was made in the first place, there are probably valid reasons I ignore. It's also obvious that it's not possible to simply change the specification of the language with the risk of breaking code that is valid today and I'm not familiar with the process of how language specification evolves, so I won't try to give any suggestion on what to do 🙂

P.S.: @som-snytt Thanks for pointing out that isDefinedAt can actually throw, depending on how the partial function is implemented, I didn't thought about that 👍

@som-snytt
Copy link
Contributor

For reference, the example deemed "kind of a feature" on the linked scala 2 ticket (scala/bug#4940):

scala> import util.*

scala> List("1", "two", "3").collect(x => Try(x.toInt) match { case Success(i) => i })
val res0: List[Int] = List(1, 3)

compares favorably to

scala> import PartialFunction.condOpt

scala> List("1", "two", "3").flatMap(x => condOpt(Try(x.toInt)) { case Success(i) => i })
val res1: List[Int] = List(1, 3)

The scala 2 ticket (from 2011) says it is a spec issue and includes the current example:

scala> val g: PartialFunction[Int, Int] = (x: Int) => {(); x match { case 2 => 3 }}
1 warning found
-- [E129] Potential Issue Warning: -------------------------------------------------------------------------------------
1 |val g: PartialFunction[Int, Int] = (x: Int) => {(); x match { case 2 => 3 }}
  |                                                ^^
  |                                                A pure expression does nothing in statement position
  |
  | longer explanation available when compiling with `-explain`
val g: PartialFunction[Int, Int] = <function1>

which did not compile in 2011 and did not compile with dotty in 2020.

The example from that ticket deemed it "weird", perhaps because one might expect "magic" to transform the throwing expression into something like the explicit try:

scala> List("1", "two", "3").collect(_.toInt match { case 1 | 3 => 42 })
java.lang.NumberFormatException: For input string: "two"

similar to the expectation in this thread that "PF means it won't throw".

My old comment was

I think my sense that 8.5 already justifies the working behavior is:

scala> { case 42 => 0 }: Comparable[Int]
val res7: Comparable[Int] = $Lambda$2315/0x00000008019ea440@37e64e37

scala> ((i: Int) => i match { case 42 => 0 }): Comparable[Int]
val res8: Comparable[Int] = $Lambda$2316/0x000000080183e840@6d5de79a

scala> ((i: Int) => i match { case 42 => 0 }): PartialFunction[Int, Int]
val res9: PartialFunction[Int,Int] = <function1>

which is as diagnosed above, SAM makes it natural.

(The Scala 2 bug was that it took (i: Int) => i match as i => i match, obviously wrong.)

On further inspection, I see that we didn't entirely omit to think of updating the spec, but we didn't think of the expanded case of an arbitrary selector expression; I said I meant to give it some thought, but it was the holiday July 4 weekend in the States, which is no excuse:

scala/scala#8172 (comment)

sjrd did review:

Hum, can I ask why?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants