Felix Mulder

Getting into Scaladoc Development

Over the past months I’ve been working on the new Scaladoc - and it’s coming along nicely. Together with @heathercmiller and @vladureche, we’ve overhauled the overall look of Scaladoc and added a bunch of useful features - like global member search!

There are a lot more things we want done - and we’d love help! This post serves as an intro to developing Scaladoc. If that piques your interest - read on.

Scaladoc for 2.12.x

The new major version of Scala is scheduled for later this year, and currently its doc tool generates sites like this one:

Scala Collections

The new search functionality allows you to look not only for so-called “top-level entities” (classes, traits, objects and packages) but also for members like methods and even vals and vars. The new search is not at all taxing; even on a big library like the collections, the results are shown almost instantaneously. Nevertheless, we’ve got a progress bar showing you that something’s actually happening:

Scala Collections

How does Scaladoc work?

The scaladoc tool is quite easy to understand - there are basically three things that happen once you’ve specified what you want documented:

  1. Scaladoc compiles the sources using the first step of the compiler (the frontend). This will fill a tree structure known as Universe with all the information about the sources (classes, members, variables etc). See: DocFactory.scala

  2. Copy all needed assets to the output location (i.e. scripts and CSS files)

  3. Traverse the Universe tree and for each top-level entity create an HTML-page using the Entity class. See: Entity.scala

The bulk of this is done by the HtmlFactory class in HtmlFactory.scala.

How do I generate the docs?


$ git clone git@github.com:scala/scala.git && cd scala
$ ant docs # "ant docs.lib" if you just want the standard library

Where to begin

The markup of the entity page is defined in Entity.scala. If you’re looking to change the HTML, that’s where you will want to look first.

If you want to add some static asset, remember that HtmlFactory needs to copy it to the destination. So be sure to add your new resource to libResources in HtmlFactory.scala.

All the current static assets are located in the lib directory. That’s where you’d want to put new things (it is also where index.js etc live!)

How can I see my changes? You have two options. If you changed the markup, you’ll need to regenerate the docs:

$ ant docs.clean && ant docs

When the changes are to scripts or style sheets I use a small Makefile:

    cp src/scaladoc/scala/tools/nsc/doc/html/resource/lib/{index.css,index.js} build/scaladoc/library/lib/

Regenerating the docs all the time is a pain - if you just want to generate them for a single file, do this:

$ ant quick.bin
$ build/quick/bin/scaladoc path/to/source/File.scala

Where to really begin

There’s still a lot of things to do before this can be considered release ready. Here’s a laundry list of things we need to do:

  • General
    • Async all the things! We should be able to parallelize the generation of the doc pages
    • CSS cleanup! There are a lot of things not necessary anymore, these could simply be deleted
  • Mobile, the docs look decent on mobile - but there’s still a ton to do!
    • Landscape kills some of our layout
    • There is no good package overview (list on the right) - ideas welcome!
  • Desktop
    • Keyboard navigation on entity pages (i.e. between members)
    • Better keyboard navigation on search results
    • Minor layout issues
    • Showing existence of companion class/trait/object (maybe "You won’t believe who Companion object Boolean went home with tonight!")

The full list of ideas and things to do can be found on scala-dev. Comment on the issue or reach out to one of us on the scala/contributors gitter if you’re interested in knowing more!

Stacking Futures and Eithers in Scala using Monads

In a pet project I’m designing a simple CRUD application for an advert platform using akka-http and slick as the backend. Using Akka for the backend is an interesting experiment, but my initial impression is that this is going to be great.

The API for this application is extremely simple.

User makes API request -> API returns a JSON response

Always returning a type that can be converted to JSON would be fine, but if an error occurs far down the stack, we’d like it propagated to the API user. Therefore, it makes sense to use something like Scala’s Either[+A,+B] for the Response type.

import scala.concurrent.Future
sealed trait ErrorResponse // converts to JSON

type Response[JsonResponse]  = Either[ErrorResponse, JsonResponse]
type ResponseF[JsonResponse] = Future[Response[JsonResponse]]

All of the interaction with the database occurs asynchronously. As such, these functions return a ResponseF[JsonResponse]. If you’re not familiar with Scala’s Either, below is an example of combining the use of Either with that of Futures.

val x: Either[String, Int] = Right(1)
val y: Either[String, Int] = Left("OH NO!")
val z: Either[String, Int] = Left("Later OH NO!")

for (a <- x.right; b <- y.right; c <- z.right) yield a + b + c
// result: Left("OH NO!")

val futX = Future.successful(x)
val futY = Future.successful(y)
val futZ = Future.successful(z)

for {
  eitherA <- futX
  eitherB <- futY
} yield for {
  a <- eitherA.right
  b <- eitherB.right
} yield a + b
// result: Left("OH NO!")

We can’t actually flatMap over Either; we first have to choose a projection, i.e. whether we want the left or the right value. This poses an issue, since it is likely that we’d want to perform a series of calls, each dependent on the previous one’s output. The resulting code would be riddled with chained for-comprehensions.

The solution would be to flatMap over these futures and get straight to the right value inside the Either. This would allow us to write very simple code:

def signup(socialNetworkToken: String): ResponseF[AuthToken] =
  for {
    user     <- userInfoFromSocialNetwork(socialNetworkToken) // ResponseF[User]
    userInDb <- UserService.insertOrUpdate(user)              // ResponseF[User]
    token    <- getToken(userInDb)                            // ResponseF[Token]
  } yield token

  // Unfortunately, this happens:
  // error   : type mismatch;
  // found   : scala.util.Right[Error,User]
  // required: User

The trouble is that the value bound to user is not of type User but rather Response[User]. We’re mixing monads, and that’s bad. Enter scalaz.

Redefining the responses to:

import scalaz.{EitherT, \/}
import scala.concurrent.Future

type Response[JsonResponse]  = ErrorResponse \/ JsonResponse
type ResponseF[JsonResponse] = EitherT[Future, ErrorResponse, JsonResponse]

The Either is replaced with scalaz’s counterpart, \/[+A,+B]. We also define our ResponseF as the monad transformer EitherT. Its flatMap method reaches inside the future and binds through the right of the \/ disjunction. We can now write the code that we originally wanted to write:

def signup(socialNetworkToken: String): ResponseF[AuthToken] =
  for {
    user     <- userInfoFromSocialNetwork(socialNetworkToken) // ResponseF[User]
    userInDb <- UserService.insertOrUpdate(user)              // ResponseF[User]
    token    <- getToken(userInDb)                            // ResponseF[Token]
  } yield token

If an error were to occur in the first line of the for-comprehension, the following statements would be ignored and the error would propagate as expected.
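To see why the propagation works, here is a stripped-down sketch of what EitherT does under the hood. MiniEitherT is a made-up name for illustration - it is not scalaz’s implementation, just the core idea: flatMap reaches inside the Future and binds through the Right, short-circuiting on the first Left.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Illustrative only - use scalaz's EitherT in real code.
final case class MiniEitherT[L, R](run: Future[Either[L, R]]) {
  def flatMap[R2](f: R => MiniEitherT[L, R2]): MiniEitherT[L, R2] =
    MiniEitherT(run.flatMap {
      case Right(r) => f(r).run                                // keep going
      case Left(l)  => Future.successful(Left(l): Either[L, R2]) // short-circuit
    })

  def map[R2](f: R => R2): MiniEitherT[L, R2] =
    MiniEitherT(run.map {
      case Right(r) => Right(f(r)): Either[L, R2]
      case Left(l)  => Left(l): Either[L, R2]
    })
}

val ok   = MiniEitherT[String, Int](Future.successful(Right(1)))
val boom = MiniEitherT[String, Int](Future.successful(Left("OH NO!")))

val res = for {
  a <- ok
  b <- boom // fails here; the rest of the comprehension is skipped
  c <- ok
} yield a + b + c

println(Await.result(res.run, 1.second)) // Left("OH NO!")
```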

So how do we create an EitherT from a future? Depending upon your need for error handling it can be as simple as:

def insertOrUpdate(user: User): ResponseF[User] = EitherT {
  val query = db.run(insertOrUpdateQuery(user)).map(_.right)

  query recover {
    case _ => UserUpdateError.left
  }
}

Monads are not always simple to understand, but used correctly they allow for elegant code. In this case I would argue that the hour spent understanding EitherT will pay for itself throughout this project.

Check out Andy’s scala-errorhandling repository for a full implementation of the error handling - including collection support.

JSON conversion for supertypes using Spray-Json

When using JSON as the standard response for a project it can become a burden to keep defining implicit vals when working with spray-json.

In a current project we’re basically always responding to an erroneous request by issuing the following response structure:

    {
        "error": "<name of error>",
        "reason": "<string explaining error>"
    }

When using spray-json, you basically do this to be able to convert case classes to JSON strings:

object JsonModels extends DefaultJsonProtocol {
    implicit val userFormatter = jsonFormat2(User)
}

// which allows you to do this
// (user being an instance of the two-field case class User):
user.toJson

This is fine when each class has a different apply function, but it gets tedious when all of your objects are subclasses of an error trait that forces two fields on the implementing class.

sealed trait Error {
  def error: String
  def reason: String
}

case object UserUpdateError extends Error {
  override def error  = "user_update_error"
  override def reason = "couldn't insert or update supplied user"
}

// ... more case objects extending Error

Instead of defining an implicit val for each subclass of Error, you can define an object that extends JsonFormat[A] and implement the write and read functions yourself.

sealed trait Error { self =>
  def toJson: JsValue = ErrorFormat.write(self)

  def error: String
  def reason: String
}

object ErrorFormat extends JsonFormat[Error] {
  def write(err: Error) = JsObject(
    "error"  -> JsString(err.error),
    "reason" -> JsString(err.reason)
  )

  def read(value: JsValue) =
    deserializationError("not necessary since we never convert from JSON")
}

As such, each type that extends Error will have .toJson available - but won’t need to define it.

Implementing malloc

This semester at LTH is all about operating systems and writing efficient C programs. In the operating systems course we mainly discuss Linux, but emphasize problems applicable to any OS design.

In the OS course project, we implemented the standard C memory functions malloc(), calloc() and realloc().

The full implementation is available here and here

The naive approach

In Unix-like operating systems there are two functions that are useful for manipulating a program’s data segment - sbrk() and brk().

These functions move the end of the process’s data segment, also known as the program break. A simplistic approach to allocate memory would be to let malloc() increase the data segment for each call.

We also know that at some point the user’s probably going to want to call free() on the allocated memory. As such, we need to know how many bytes we can return to the OS. To solve this, we’ll allocate a bit more memory than the user has requested, and prepend a small struct to each chunk of memory we allocate.

typedef struct block_t block_t;
struct block_t {
        int     prepad; /* padding added to align block */
        size_t  size;
        block_t *prev;
        block_t *next;
        char    data[]; /* flexible array member pointing to data */
};

static block_t *available;

As you can see, I’ve also declared a static pointer, available. We will later use it to keep track of free blocks within our data segment.

void *malloc(size_t size)
{
        if (size < 1)
                return NULL;

        /* find an existing block which satisfies our needs */
        block_t *block = find_available(size);

        if (block == NULL) {
                /*
                 * sbrk() returns the current program break, and increases it by
                 * the argument - here PADDING is the maximum amount of bytes
                 * that we'll need in order to align the allocated segment and
                 * hold the block_t header
                 */
                char *start = sbrk(size + PADDING);

                /* sbrk will return -1 if unsuccessful */
                if (start == (void *)-1)
                        return NULL;

                /* bytes needed to round start up to an aligned address */
                int offset = (sizeof(double) - (uintptr_t)start % sizeof(double))
                             % sizeof(double);

                block = (void *)(start + offset);

                block->prepad = offset;
                block->size = size;
                block->prev = block;
                block->next = block;
        }

        return block->data;
}

Once malloc() is implemented, we can easily use it to implement calloc() and realloc().

void *calloc(size_t nmemb, size_t size)
{
        /* guard against overflow in nmemb * size */
        if (size != 0 && nmemb > SIZE_MAX / size)
                return NULL;

        size_t total = nmemb * size;
        char   *alloc = malloc(total);

        /* if malloc fails, calloc should fail */
        if (alloc == NULL)
                return NULL;

        memset(alloc, 0, total);
        return alloc;
}

void *realloc(void *ptr, size_t size)
{
        /* realloc can be called with NULL, if so it should work as malloc */
        if (ptr == NULL)
                return malloc(size);

        /* get the block_t containing information about the allocated memory */
        block_t *old_alloc = (void *)((char *)ptr - sizeof(block_t));
        char    *new_alloc = malloc(size);

        if (new_alloc == NULL)
                return NULL;

        for (size_t i = 0; i < old_alloc->size && i < size; i++)
                new_alloc[i] = old_alloc->data[i];

        free(ptr); /* free old memory */
        return new_alloc;
}

You should now be able to use the functions described above (if you’ve redefined free()). The program, however, won’t be able to return memory to the system.

Let’s fix this by implementing free(). By convention, calls to sbrk() and brk() from outside malloc() must still be allowed. This poses a problem for us: if someone outside of malloc() expands the data segment, we won’t know about it. As such, we can only safely shrink the data segment when freeing a block that ends exactly on the program break.

What do we do with data that isn’t on the end? We put it in our linked list available.

void free(void *ptr)
{
        if (ptr == NULL)
                return;

        block_t *block = (void *)((char *)ptr - sizeof(block_t));

        /* a call to sbrk() with 0 as param yields the program break */
        char *end = sbrk(0);

        if ((char *)ptr + block->size == end)
                sbrk(-(intptr_t)(block->size + PADDING));
        else
                insert_avail(block); /* insert into available */
}
But wait, there’s more. As mentioned, we can’t decrease the program break unless we’re freeing the last block before it. This is where available comes in: it keeps a list of free blocks. If we find a block big enough, we can reuse it instead of allocating new memory. Keep in mind that the block found might be bigger than what we need; if so, it should be split in two, and the piece not being used should be inserted into the available list. Conversely, when we insert into available, we merge adjacent blocks.
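The splitting step can be sketched as follows. split_block is a hypothetical helper name (the actual find_available/insert_avail bookkeeping is left out); it assumes the block_t layout from above:

```c
#include <stddef.h>
#include <stdlib.h>

/* same header layout as the allocator above */
typedef struct block_t block_t;
struct block_t {
        int     prepad;
        size_t  size;
        block_t *prev;
        block_t *next;
        char    data[];
};

/*
 * Sketch of splitting an oversized free block: keep `size` bytes in
 * `block` and carve the remainder into a new block right after the kept
 * data. Only worth doing when the leftover can hold a header plus at
 * least one byte; the caller inserts `rest` into the available list.
 */
static block_t *split_block(block_t *block, size_t size)
{
        if (block->size < size + sizeof(block_t) + 1)
                return NULL; /* too small to split; use it as-is */

        block_t *rest = (void *)(block->data + size);
        rest->prepad  = 0;
        rest->size    = block->size - size - sizeof(block_t);
        block->size   = size;

        return rest;
}
```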

The realistic implementation

The above implementation works, and is pretty memory efficient - i.e. it uses only as much memory as it has to, and returns the rest to the OS. It is, however, not very fast. Increasing the data segment involves the kernel, so getting the memory is actually one of the slowest steps in the above implementation. Because of this, we want to allocate a large chunk at once, and then portion it out until it’s all used up.

Another shortcoming is the linked list of available blocks. The time to traverse this list grows linearly with the number of free blocks, and searching through the list to find adjacent blocks to merge is an O(n^2) operation.

The Buddy System

Most of these shortcomings can be addressed by allocating a big chunk and using a different structure to save available blocks in. One possible solution is using the buddy system. The buddy system is very intuitive. This is what we’re going to do:

  1. Allocate a big chunk of memory whose size is a power of two (e.g. 1ULL << 32)
  2. When malloc() is called, we figure out the nearest power of two (rounding up) - we’ll call this “the order”
  3. We grab the first free block of the nearest order and split it in halves
  4. If the halves are of the needed order, we stop. Otherwise, we continue splitting the left half into smaller pieces.
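Step 2 - rounding a request up to the nearest order - can be sketched in isolation (order_of is a name picked for illustration; the allocator below inlines this loop and also accounts for the header):

```c
#include <stddef.h>

/* smallest order k such that (1 << k) >= size, i.e. the nearest
 * power of two, rounding up */
static size_t order_of(size_t size)
{
        size_t order = 0;
        while (((size_t)1 << order) < size)
                order += 1;
        return order;
}
```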

This means that each block has a “buddy.” This buddy can be found by doing a simple XOR operation:

/* finding the first block's buddy (the first block is at offset 0) */
block_t *buddy = (void *)(pool + (1 << block->order));

/* finding the buddy of any block */
block_t *buddy = (void *)(pool + (((char *)block - pool) ^ (1 << block->order)));

In the buddy system, merging available blocks takes at worst O(log n) time.
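The XOR trick is easy to sanity-check in isolation using pool-relative offsets (buddy_offset is a hypothetical helper name):

```c
#include <stddef.h>

/* the buddy of the block at pool-relative offset `off`, of the given
 * order, is found by flipping the bit that splits the pair in half */
static size_t buddy_offset(size_t off, size_t order)
{
        return off ^ ((size_t)1 << order);
}
```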

So let’s implement malloc(). We change the block structure to look like this:

#define MAX_ORDER (26)
#define FREE      (0)
#define RESERVED  (1)

typedef struct block_t block_t;
struct block_t {
        unsigned status : 1; /* RESERVED/FREE */
        size_t order;
        block_t *succ;
        char data[];
};

static block_t *blocks[MAX_ORDER + 1];
static char *pool;

This way we can know if the buddy is available for merge. In each slot of the array we’ll save available blocks of the same order.

void *malloc(size_t size)
{
        /* initialize pool */
        if (pool == NULL) {
                char *start = sbrk((1 << MAX_ORDER) + PADDING);

                if (start == (void *)-1)
                        return NULL;

                /* thankfully, we'll only need to align the first block */
                int offset = (sizeof(double) - (uintptr_t)start % sizeof(double))
                             % sizeof(double);

                block_t *init = blocks[MAX_ORDER] = (void *)(start + offset);
                pool = (void *)blocks[MAX_ORDER];

                init->status = FREE;
                init->order  = MAX_ORDER;
                init->succ   = NULL;
        }

        /* calculate the closest order */
        size_t order = 2;
        while ((1 << order) < (size + sizeof(block_t)))
                order += 1;

        /* take out the first available block */
        if (blocks[order] != NULL) {
                block_t *block = blocks[order];
                blocks[order] = block->succ;
                block->status = RESERVED;

                return block->data;
        } else {
                /* split blocks and return first available of correct order */
                return split(order);
        }
}
We don’t even need to change the implementation of calloc() and realloc(). The only thing we need to change is free().

void free(void *ptr)
{
        if (ptr == NULL)
                return;

        block_t *block = (void *)((char *)ptr - sizeof(block_t));
        size_t  offset = ((char *)block - pool) ^ (1 << block->order);
        block_t *buddy = (void *)(pool + offset);

        /* set current block to free */
        block->status = FREE;

        /* merge with buddies until a buddy is found not to be free */
        while (block->order < MAX_ORDER &&
               buddy->status == FREE    &&
               buddy->order == block->order) {
                remove_block(buddy); /* remove from blocks */

                /* some arithmetic to make sure block is the leftmost of the pair */
                uintptr_t mask = ~((uintptr_t)1 << block->order);
                block = (void *)(pool + (((uintptr_t)((char *)block - pool)) & mask));
                block->order += 1;

                /* find the new buddy! */
                offset = ((char *)block - pool) ^ (1 << block->order);
                buddy  = (void *)(pool + offset);
        }

        insert_block(block); /* put the (possibly merged) block back into blocks */
}
We could have compared the pointers to see which of the two blocks is leftmost. The mask trick is, however, more efficient.


Implementing malloc() is a fun exercise and gives insight into the difficulties of memory allocation. The buddy system is not foolproof: it does not solve the problem of internal fragmentation. If you’re interested, you should have a look at slab allocation. There are also several optimizations that could be made to the examples above, but for the sake of brevity and clarity they have been left out.