@felixmulder

DECONSTRUCTING DOTTY

A NEXT GENERATION SCALA COMPILER

Felix Mulder

RECONSTRUCTING SCALA

DOTTY: THE TIME TRAVELLING SUPER COMPILER

Felix Mulder

THOUGHTS ON SCALA 3

HOW DOTTY SHOULD RESHAPE DEV EXPERIENCE

Felix Mulder

Slides on: felixmulder.com

ABOUT ME

</bragging>

Who in here knows Scala?

Who uses it professionally?

ALL (GOOD) STORIES HAVE A BEGINNING

  • Beautiful syntax
  • Less boilerplate
  • FP without the FUD
  • but then again...

<SURPRISES>

trait Super {
  def x: String
  println(s"initialized with: $x")
}

class Tester extends Super {
  val x = "wat"
}
"initializd with: null"

DON'T WORRY SCALA, I STILL LOVE YOU

WE CAN WORK AROUND THIS

Newbie "Solutions"

  • Compromised Immutability
  • Lazification hell
  • Refactor, maybe dump the trait
public interface Super {
    default public void $init$() {
        println("initialized with: " + x());
    }

    public String x();
}

public class Tester implements Super {
    private final String x;
    public Tester() {
        Super.super.$init$();
        x = "wat";
    }
    public String x() { return this.x; }
}

TYPE INFERENCE

TYPE INFERENCE

def merp[A,B](a: A, f: A => B): B = f(a)

merp(1, x => x * 2) // error: missing parameter type

TYPE INFERENCE

def merp[A,B](a: A)(f: A => B): B = f(a)

merp(1)(x => x * 2) // 2
trait ListOf[+A] {

  def foldLeft1[B](z: B, f: (B, A) => B) = ???
  foldLeft1(List.empty, (b, a) => b)           // 1
  foldLeft1(List.empty, (b, a) => a :: b)      // 2


  def foldLeft2[B](z: B)(f: (B, A) => B) = ???
  foldLeft2(List.empty) { (b, a) => b }        // 3
  foldLeft2(List.empty) { (b, a) => a :: b }   // 4

}
scalac compiles what?
trait ListOf[+A] {

  def foldLeft1[B](z: B, f: (B, A) => B) = ???
  foldLeft1(List.empty, (b, a) => b)           // 1
  foldLeft1(List.empty, (b, a) => a :: b)      // 2


  def foldLeft2[B](z: B)(f: (B, A) => B) = ???
  foldLeft2(List.empty) { (b, a) => b }        // 3
  foldLeft2(List.empty) { (b, a) => a :: b }   // 4

}
scalac compiles what?
trait ListOf[+A] {

  def foldLeft1[B](z: B, f: (B, A) => B) = ???
  foldLeft1(List.empty, (b, a) => b)           // 1
  foldLeft1(List.empty, (b, a) => a :: b)      // 2


  def foldLeft2[B](z: B)(f: (B, A) => B) = ???
  foldLeft2(List.empty) { (b, a) => b }        // 3
  foldLeft2(List.empty) { (b, a) => a :: b }   // 4

}
Dotty compiles what?
trait ListOf[+A] {

  def foldLeft1[B](z: B, f: (B, A) => B) = ???
  foldLeft1(List.empty, (b, a) => b)           // 1
  foldLeft1(List.empty, (b, a) => a :: b)      // 2


  def foldLeft2[B](z: B)(f: (B, A) => B) = ???
  foldLeft2(List.empty) { (b, a) => b }        // 3
  foldLeft2(List.empty) { (b, a) => a :: b }   // 4

}
Dotty compiles.

INSTANTIATE AS LATE AS POSSIBLE

BUT NOT TOO LATE

trait ListOf[+A] {

  def foldLeft1[B](z: B, f: (B, A) => B) = ???
  foldLeft1(List.empty, (b, a) => b)           // 1
  foldLeft1(List.empty, (b, a) => a :: b)      // 2


  def foldLeft2[B](z: B)(f: (B, A) => B) = ???
  foldLeft2(List.empty) { (b, a) => b }        // 3
  foldLeft2(List.empty) { (b, a) => a :: b }   // 4

}

</SURPRISES>

SLOW COMPILE TIMES

Scala provides:
  • higher-order types
  • generic methods
  • generic classes
  • multiple inheritance
  • pattern matching
  • lazy evaluation
  • garbage collection

Compilers crash course

  • Tokenize Source
  • Build Trees (AST)
  • Typecheck
  • Simplify
  • Bytecode!
1 + 2
  (+)
  / \
(1) (2)
Apply(Select(Lit(1), $plus), List(Lit(2)))
Phase 1
Phase 2
Phase 3

javac pipeline

  1. Parse
  2. Enter
  3. Annotate
  4. Attribue
  5. Flow
  6. Desugar
  7. Generate

dotty pipeline

source code

But, but, but compilation speed?

You said Dotty would be faster!

Fused Phases

Phase 1
Phase 2
Phase 3

LANGUAGE IMPROVEMENTS

  • Trait Parameters
  • No more procedure syntax
  • No old style macros, embrace the meta!
  • Union Types
  • Intersection Types
  • Implicit functions

Procedure Syntax

def foo {}
// error!

def foo = {}
// res: Unit
But there's a flag for that: -language:Scala2

Intersection and Union Types

A & B

the greatest lower bound: a supertype of all subtypes of both A and B

A | B

the least upper bound: a subtype of all supertypes of both A and B
trait A
trait B
trait C extends B with A // ==> A & B is a supertype of C
class A(val i: Int)
class B(val i: Int)

def p1(implicit ev: A <:< (A | B)) = ???

def p2(implicit ev: B <:< (A | B)) = ???

def p3(implicit ev: (A | B) <:< AnyRef) = ???

def p4(implicit ev: (A | B) <:< AnyVal) = ??? // does not compile

Supercharged Any

final class SuperHipsterInternalAPI {
  def foo(a: Any) = ??? // NO!
  def foo(a: TypeA | TypeB) = ??? // YES!
}
Precise Types Bring Performance - Dmitry Petrashko

Trait Parameters instead of Early Initializers

trait Super {
  def x: String
  println(s"initialized with $x")
}
class Tester extends Super {
  val x = "wat"
}

new Tester
// prints: "initialized with null"
trait Super(x: String) {
  println(x)
}
class Foo extends Super("wat")

new Foo
// prints: "initialized with wat."

YOUR AVERAGE BOILERPLATED LIBRARY

def storeUser(u: User): DBIO[User] = withCtx { implicit transact =>
  // much boilerplate
}
def storeUser(u: User): DBIO[User] = {
  // impossibru in scalac
}

ENTER IMPLICIT FUNCTIONS

implicit Context => R
trait ImplicitFunction1[-T,+R] {
  def apply(i: T): R
}

NEW MORE COOL API

// In the api somwehere:
type DBIO[T] = implicit Transaction => T

// In user code:
def storeUser(u: User): DBIO[User] = {
  // ctx available!
}
def ctx: Transaction[Context] = implicitly[Context]
Implicit Functions - Martin Odersky
Tagless Final Interpreters in Dotty - Olivier Blanvillain

</SYNTAX DIFFERENCES>

REPL

&

ERROR MESSAGES

Let's Be Mainstream! - Evan Czaplicki

SCALADOC

"Documentation cannot - and so need not - say everything. Its purpose is to help the next programmer build an accurate theory about the system." - Peter Naur

BUILD TOOLS

LANGUAGE SERVER

CROSS BUILDING LIBRARIES

TASTY

  • Pickling format
  • Typed Trees
  • Efficiently stored in bytecode
  • Interop between binary incompatible compilers

Developer Usability

Get involved today!

REPL

Issues

ERROR MESSAGES

Issue #1589

IDEs & TOOLING

Issues

Build Tools

OPTIMIZED SCALA CODE

Dotty Optimizer

  • Dmitry Petrashko, @DarkDimius
  • Call graph
  • Whole program optimization
  • User defined rewrite rules

Implementing a linter

@rewrites object rules {
  def toIsEmpty[T](xs: List[T]) =
    Rewrite(from = xs.length == 0,
            to = xs.isEmpty)

  def customFancyWarning(x: Int | Double | ... | Numeric[_]) =
    Warn(pattern = x / 0,
         msg = "division by zero, you fool!")
}

Example

public double average(int[] data) {
    int sum = 0;
    for(int i = 0; i < data.length; i++) {
        sum += data[i];
    }
    return sum * 1.0d / data.length;
}
def average(xs: Array[Int]) =
  xs.reduce(_ + _) * 1.0 / xs.size
Java Scala
45 msec 872 msec

Java

public double average(int[] data) {
    int sum = 0;
    for(int i = 0; i < data.length; i++) {
        sum += data[i];
    }
    return sum * 1.0d / data.length;
}
  • Range check
  • Addition
  • Index increment

Scala

def average(xs: Array[Int]) =
  xs.reduce(_ + _) * 1.0 / x.size
def reduce(op: Function2[Obj, Obj, Obj]): Obj = {
  var first = true
  var acc: Obj = null
  this.foreach { e =>
    if (first) {
      acc = e
      first = false
    } else acc = op.apply(acc, e)
  }
  acc
}
def foreach(f: Funtion1[Obj, Obj]) {
  var i = 0
  val len = length
  while (i < len) {
    f.apply(this(i))
    i += 1
  }
}

Specialize all the things!

trait T[A1, A2, ..., An]
how many combinations?
10 ^ n
trait T[A1, A2, ..., An] {
    def foo[B1, B2, ..., Bm] = ???
}
and now?
10 ^ (n + m)

Existing solutions for Scala 2

trait Function3[-T1, -T2, -T3, +R]

Both: how many specializations are needed?

Dotty Linker's approach

Don't ask the user what to specialize
- specialize on what's actually being used.

Auto-Specialization in Dotty Linker

  • Takes your program & libraries, analyzes how you use them
  • Sees what instantiations are needed
  • Specializes them
  • No need to annotate types
Types and terms are treated the same for specialization:
def foo[T](x: Seq[T], id: Int) = x(id)
  • Type specialization removes boxing
  • Term specialization removes virtual dispatch

STATUS

Thank you!