Thomas Denney

JIT Compilation on an ARM Cortex M0

For my bachelor's thesis last year I implemented a JIT compiler for the BBC micro:bit, a microcontroller that is being used around the world to teach programming in schools. My compiler took bytecode for a virtual machine designed by my supervisor, compiled it to ARM Thumb bytecode, and executed the generated bytecode. The main challenge here is getting the JIT compiler to store the original virtual machine bytecode, auxiliary data structures, compiled bytecode, and program data in 16 KB of RAM. Rather than replicate my thesis in full here I'm just going to run through the basic steps involved in JIT compiling and executing functions on a Cortex M0; I assume familiarity with C++, ARM assembly, and the ARM calling convention.

At a minimum, a JIT code generator needs to perform three basic steps:

  1. Generate a sequence of instructions encoded in the processor's bytecode;
  2. Jump to that sequence of instructions, allowing the processor to execute them; and
  3. Return from the generated code to the environment.

The simplest way to get this working is to generate a sequence of instructions terminated by bx lr, which corresponds to a return in ARM assembly, and then to call that sequence of instructions as if it is a function:

typedef void (*f_void)(void);

static const uint16_t instrs[] = { 0x4770 };
asm("DSB");
asm("ISB");
f_void fp = (f_void)((unsigned)instrs | 0x1);
fp();

In this basic example our generated code is a compile time constant, but this example adapts to dynamically allocated arrays too. 0x4770 is the hexadecimal encoding of bx lr, the ARM instruction that branches to the address in the link register, which will be the return address.

Note that the Cortex M0 executes Thumb bytecode, which is a compact version of the full ARM 32-bit instruction set; most instructions are represented in 2 bytes. The most significant 9 bits of this instruction encode that this is a branch and exchange instruction and the subsequent 4 bits encode the register that contains the address to jump to (the link register is register r14). The last three bits should be zero.
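As a minimal sketch of that bit layout, the encoding can be computed instead of hard-coded. The helper name encode_bx is hypothetical and is not taken from microjit; it only assumes the 9-bit prefix, 4-bit register, and 3 zero bits described above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of microjit): encode the Thumb instruction
// `bx Rm`. Layout, most significant bit first:
//   010001110 (9 bits) | Rm (4 bits) | 000 (3 bits)
std::uint16_t encode_bx(unsigned rm)
{
    // Rm must name one of r0..r15.
    assert(rm < 16);
    return static_cast<std::uint16_t>(0x4700u | (rm << 3));
}
```

With the link register (r14) as the argument, encode_bx(14) yields 0x4770, the constant used in the example above.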

Next, two inline assembly instructions are issued. DSB is a data synchronisation barrier, and ensures that no instruction later in program order executes until after this instruction completes, i.e. the instruction ensures that all memory operations complete. Next, ISB is an instruction synchronisation barrier, and flushes the fetch pipeline of the processor. In this case, where our JIT'd code is actually in a static array, these instructions are not necessary. However, when generating instructions it is necessary to ensure that (a) our generated instructions are fully written to memory, and (b) the fetch pipeline doesn't attempt to fetch the old value. These instructions must be issued across all ARM architectures (not just the Cortex M0) when running JIT'd code.

Memory protection typically prevents memory being both writeable and executable, so in JITs on other OSes it would generally be necessary to prevent writing and only permit executing the code. No such protection exists on the Cortex M0, so only DSB and ISB must be executed.

Next, we cast the pointer to the instruction sequence to a function pointer so we can call the code. Our array will be 2-byte aligned (which is required for Thumb instructions), but when branching the LSB of the address we jump to must be 1 to indicate that the instruction we are jumping to is encoded in Thumb bytecode rather than full 4-byte ARM bytecode.
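The address arithmetic can be isolated in a small helper. This is only a sketch: thumb_entry is a hypothetical name, and it uses std::uintptr_t rather than unsigned so that the same code also compiles on hosts where pointers are wider than 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: given the address of a Thumb instruction sequence,
// return the address to branch to, i.e. the same address with its least
// significant bit set to mark the target as Thumb code.
std::uintptr_t thumb_entry(const std::uint16_t* code)
{
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(code);
    // Thumb instructions must be 2-byte aligned, so the LSB starts out clear.
    assert((addr & 0x1u) == 0);
    return addr | 0x1u;
}
```

On the Cortex M0 the result would then be cast to a function pointer type such as f_void and called, as in the example above.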

Supposing we have access to a memory allocator, we can adapt the above to dynamically generate the same function at runtime:

uint16_t* instrs = (uint16_t*)calloc(1, sizeof(uint16_t));

instrs[0] = 0x4770;
asm("DSB");
asm("ISB");
f_void fp = (f_void)((unsigned)instrs | 0x1);
fp();

free(instrs);

This function isn't particularly interesting: it has no observable side-effects and doesn't return anything. As a toy example, the following function JIT-compiles a function that sums between 0 and 4 integer values. The first four integer arguments are always passed in registers under ARM calling convention, and the result of an integer function is returned in r0.

typedef int (*f_4_int_to_int)(int, int, int, int);

int jit_int_sum(unsigned int n)
{
    std::vector<uint16_t> instrs;
    if (n == 0) {
        // MOV   Rd  Value
        // 00100 000 00000000
        // 2   0     0   0
        instrs.push_back(0x2000);
    } else {
        for (unsigned k = 1; k != n; ++k) {
            // Rd := Rn + Rm
            // Add     Rm  Rn  Rd
            // 0001100 000 xxx 000
            // 1   8    0    0
            instrs.push_back(0x1800 | (k << 3));
        }
    }
    instrs.push_back(0x4770); // Return

    asm("DSB");
    asm("ISB");
    auto fp = (f_4_int_to_int)((unsigned)instrs.data() | 0x1);
    return fp(1, 2, 3, 4);
}

Note that this time I've used a C++ vector to simplify the memory management. In this example we either clear the return register or sum up to three remaining registers into the first register. For simplicity, the function pointer is always cast to the same type and called with the same number of arguments, but the return value of the function will differ depending on the value of n.

Encoding instructions manually in hexadecimal (or binary) isn't a particularly good use of your time. In my project I implemented a library for all Thumb instructions, and eventually I also implemented a decoder to help with debugging. Helpfully the decoder (printFunction) can take a pointer to an arbitrary ARM function, allowing runtime bytecode -> assembly conversion.
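To give a flavour of what such an encoder might look like, here is a small sketch covering only the three instructions used so far. The function names are hypothetical and do not reflect microjit's actual API; build_int_sum produces the same instruction sequence as jit_int_sum above, but does not execute it, so it can be checked on any machine.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical encoder helpers; the names are illustrative only.

// mov Rd, #imm8:  00100 | Rd (3 bits) | imm8 (8 bits)
std::uint16_t encode_mov_imm(unsigned rd, unsigned imm8)
{
    assert(rd < 8 && imm8 < 256);
    return static_cast<std::uint16_t>(0x2000u | (rd << 8) | imm8);
}

// add Rd, Rn, Rm:  0001100 | Rm (3 bits) | Rn (3 bits) | Rd (3 bits)
std::uint16_t encode_add_reg(unsigned rd, unsigned rn, unsigned rm)
{
    assert(rd < 8 && rn < 8 && rm < 8);
    return static_cast<std::uint16_t>(0x1800u | (rm << 6) | (rn << 3) | rd);
}

// bx lr
const std::uint16_t kReturn = 0x4770;

// Build (but do not run) the code that sums the first n argument registers
// into r0. Only r0..r3 carry arguments, so n is limited to 4.
std::vector<std::uint16_t> build_int_sum(unsigned n)
{
    assert(n <= 4);
    std::vector<std::uint16_t> instrs;
    if (n == 0) {
        instrs.push_back(encode_mov_imm(0, 0)); // r0 := 0
    } else {
        for (unsigned k = 1; k != n; ++k) {
            instrs.push_back(encode_add_reg(0, k, 0)); // r0 := rk + r0
        }
    }
    instrs.push_back(kReturn);
    return instrs;
}
```

For n = 4 this yields 0x1808, 0x1810, 0x1818, 0x4770, matching the hand-encoded constants in jit_int_sum.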

The next challenge is dealing with branches. In a traditional compiler this can be delegated to a linker, but in a JIT, where all code must be generated at runtime, it is necessary to deal with them yourself. The conditional branch is encoded as the four bits 1101, followed by a 4-bit condition and an 8-bit immediate value.

The condition is a 4-bit value and is used in conjunction with the condition flags of the CPU to determine if the branch is taken. I specify the values of these conditions in my instruction encoder here. The immediate value is a signed (two's complement) 8-bit integer. If the condition holds, the value is sign extended to a 32-bit value, doubled, and added to the current program counter, so that the branch supports offsets from -256 bytes to +254 bytes. However, it is important to note that the program counter, at the time the branch instruction begins, is the address of the branch + 4 bytes. Therefore if the conditional branch needs to skip forward by 2 instructions, then the offset should be 0. If the branch needs to skip back 10 instructions, then the offset should be -12. The same encoding scheme is necessary for unconditional branches and bl/blx instructions (used for function calls), although these both admit 11-bit offsets.
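The offset arithmetic is easy to get wrong, so here is a sketch of it in code. The helper encode_bcond is a hypothetical name, not microjit's; it takes the byte address of the branch and of its target, and assumes the layout of 1101, a 4-bit condition, and a signed 8-bit immediate counted in 2-byte units.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: encode a conditional branch located at branch_addr
// that jumps to target_addr when the condition holds.
// Layout: 1101 | cond (4 bits) | imm8 (signed, in units of 2 bytes)
std::uint16_t encode_bcond(unsigned cond, std::uint32_t branch_addr,
                           std::uint32_t target_addr)
{
    // Condition values 14 and 15 do not encode conditional branches.
    assert(cond < 14);
    // The program counter reads as the address of the branch + 4 bytes.
    const std::int64_t byte_offset =
        static_cast<std::int64_t>(target_addr) -
        (static_cast<std::int64_t>(branch_addr) + 4);
    assert(byte_offset % 2 == 0);
    assert(byte_offset >= -256 && byte_offset <= 254);
    const std::int64_t imm = byte_offset / 2;
    return static_cast<std::uint16_t>(
        0xD000u | (cond << 8) | (static_cast<std::uint32_t>(imm) & 0xFFu));
}
```

For a branch at address 0x100, skipping forward by 2 instructions (target 0x104) gives an immediate of 0, and skipping back 10 instructions (target 0xEC) gives -12, which is stored as 0xF4. Condition values 0 and 1 are eq and ne respectively.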

The examples above are able to freely use registers because they conform to the basics of ARM calling convention; dealing with function calls and access to other registers only requires that you follow this convention in generated code.

The full code of microjit is on GitHub. I've also uploaded a barebones example showing the examples from this post executing.

Pilot E95s

The Pilot E95s is a pocket fountain pen released by Pilot to celebrate their 95th anniversary. The pen sports a classic mid-century design based on the Pilot Elite, and it makes for a great combination of vintage design with modern manufacturing. I own the ivory and burgundy pen, but it also comes in black (as all great products should). Both colour schemes feature a large 14-carat gold nib, which has been superb to write with.

Japanese nibs tend to come up narrower than American or European nibs and the Pilot E95s is no exception. I have the medium nib, but it comes up no thicker than fine Lamy nibs, as shown below.

J. Herbin Rouge Opéra, Pilot Iroshizuku Take-Sumi, and De Atramentis Dunkelblau inks on Tomoe River paper (the nicest paper I've ever written on, but sadly difficult to get hold of in the UK).

The pen is compatible with Pilot's CON-20 and CON-40 converters. I'm not certain that there is enough space to use a cartridge. To the pen's detriment the converter is almost entirely covered by the screw thread for the body, making it impossible to see how much ink remains in the converter. The converter itself contains a ball that will rattle when the converter is empty, which can be irritating whilst writing.

The E95s is indisputably a premium fountain pen, and it's an affordable introduction to gold nibs. However, I found that it didn't feel like a premium fountain pen: it feels too lightweight. This is entirely down to its resin construction, which makes it a lot lighter than other pens in its price category. At 17g, its weight is comparable to the TWSBI Go or Lamy Safari, although each of these is larger. I'm currently using a Lamy 2000 as my primary pen (25g) alongside a Pilot Vanishing Point (30g), so I've found that I notice the E95s's weight every time I write with it. Regardless of its weight, I could replace either of my daily drivers with the E95s if need be: its nib is incredibly smooth and it writes well on a variety of surfaces.

Whilst I researched this review I attempted to find out a little about the history of the pen, but instead I found a myriad of inconsistencies! Firstly, the name of the pen is different in the United States and Japan. In the US it is marketed as E95s (although the case of the “s” varies between suppliers and Pilot's websites), whilst it is called the Elite 95s in Japan. The name also affects the design, with the Japanese version inscribed with “Elite” rather than “E”, as in the American version. I think the “Elite” inscription looks better.

Most distributors state that the E95s is based on the Pilot Elite, which was originally released in either 1962 or 1968. However, neither date is correct! Whilst it is certainly true that the Pilot Elite was sold in the 1960s, and embodies the design of that decade, Pilot Japan's website (Google Translate) states that the first Elite was released in 1954, and that the E95s is actually based on a later second edition released in 1974. This is consistent with the appearance of Elites that I've found on eBay, and this excellent post on the fountain pens subreddit compares the E95s with an older Elite. Noticeably, the nib of the E95s is much larger.

Pilot E95s boxed; the Vanishing Point comes in the same box

As well as its history, the pricing and availability of the pen are sadly also inconsistent. In Japan, the pen retails for ¥10,000 (around $90 at the time of writing), but in the United States it retails at $170, although The Goulet Pen Company sells it at $136. I definitely prefer both the Lamy 2000 and Pilot Vanishing Point, which are priced similarly, but neither features a gold nib.

Older Pilot Elites, in varying conditions, are also available on eBay for prices that dip below $30, so this is a more affordable option. However, as a great modern introduction to gold nibs, the E95s serves nicely.

Oxford through the pinhole

Last summer I bought a Pinhole Pro “lens” for my camera, but I hadn't experimented with it until I went for a walk round central Oxford this morning. The lens I have has a focal length of 58mm, so many of the images below are out of focus. I think it produces an interesting ethereal effect, and achieves a lot of Instagram-esque effects in camera.

I had to touch up each of the following a little; before I set out I didn't realise that there was a small amount of dirt inside the lens, which meant that all the images were initially marked with the same pattern.

Tourists in Radcliffe Square. Even in bright sunlight (Oxford is seeing unusually nice weather for February) I had to expose the image for 4 seconds at ISO 100. Any motion thus appears blurred, so a tripod was a necessity.
Looking east on Broad Street. This was a 1 second exposure at ISO 400. Balliol College, left, has a nice glowing effect, which was achieved entirely in camera.
Brasenose Lane from Radcliffe Square. 1 second/ISO 400.
Bridge of Sighs from the Old Bodleian. 0.8 seconds/ISO 400.
Sports fields and Merton College from Christ Church Meadow. 0.5 seconds/ISO 400.
The Cherwell, Christ Church Meadow. 0.5 seconds/ISO 400.
The bridge that joins The Cherwell and The Thames. 2.0 seconds/ISO 100.

TWSBI Go

The TWSBI Go, a new demonstrator pen from the Taiwanese manufacturer, was released in August 2018 and I purchased mine shortly after its release. I’ve been using TWSBI pens on and off for the last couple of years (I wrote all my finals with a Diamond 580 last year), but until this week I hadn’t spent much time with the Go.

The body of the Go is an all-plastic construction, and this makes the pen very lightweight. All the plastic is translucent, and the body is available in either sapphire, pictured, or "smoke". By far the most interesting aspect of the pen is its filling mechanism. Like other TWSBI pens it doesn’t require a converter or cartridges, and instead includes a spring-based vacuum filling system. All my other pens use screw-based systems, so this was a novelty for me. I’m not a fan of this system because I nearly knocked an ink bottle over whilst releasing the spring, so it pays to be careful.

Unlike other TWSBI pens, the cap doesn’t feature a clip. Given that the pen is marketed as a pocket pen, I think this is a disadvantage. Instead, TWSBI replaced the clip with a hook for a lanyard. I haven’t seen this on a pen before, but it doesn’t strike me as very useful. The cap is made of plastic, and it is only a little wider than the body. This means that it posts tightly, but I’ve avoided doing so out of fear that the plastic may crack.

Overall, for the price, the pen writes well and offers a comparable experience to the TWSBI Eco, although it is a little smaller. As far as I can tell they both use the same steel nib, although the markings are clearer on the Eco. Both sections are a traditional hourglass shape, but the triangulated shape of the Eco is easier to hold.

Left-to-right: The TWSBI Go, Eco, Diamond 580

At $19 it is around $10 cheaper than the Eco, previously their most affordable offering. If you want a great demonstrator I’d recommend getting the Eco over the Go, because you’ll get a slightly bigger pen that rests more comfortably in the hand. Alternatively, if you’d like a pen that your friends will mistake for a vape, the Go is a great choice.

Namisu Nova

Namisu is a company that designs fountain pens in Fife, Scotland. They've been operating since 2013, but I only discovered their work earlier this year. As far as I'm aware there are relatively few British pen manufacturers, with most expertise now found in Germany and Japan, so I was keen to try out their work. Predominantly focused on minimalist design, their pens are devoid of any of their own branding. They compensate for this with distinctive shapes and high-quality materials.

The Nova is one of their earliest designs, and has been available for three years. I have the aluminium model in red, although it is also available in brass, titanium, and other colours. Of these, I think the titanium looks the best and I imagine they don't differ in their ergonomics. Namisu currently also ship the similar Orion and Ixion pens, along with a few rollerballs, although these are more expensive than the Nova.

Each of their pens ships with nibs produced by Peter Bock in Germany. The choice of Bock nibs is a pragmatic one, I think, as they are both a reliable and relatively inexpensive choice. My Nova has a fine steel nib, although titanium and broader nibs are also available. I initially struggled to get ink flowing through the pen, and experimented with a number of inks and notepads, but continued to find that the nib scratched the surface of the paper, often without flow. However, after flexing the nib a little I eventually managed to get ink flowing comfortably from the pen without issue.

The Nova ships in an attractive box and with a standard international converter

Although it looks very nice, I didn't find the Nova terrifically comfortable to write with, especially for longer periods. The section (the part you grip with your fingers) is wider than any other pen I own, and its surface is completely flat, rather than the gentle curve or moulded grip seen on other pens. Those with small hands may find the pen uncomfortable to maneuver, but I can certainly see the pen being more comfortable for those with larger fingers. The flat surface was a bigger issue, especially considering that the pen seemed to initially require significant pressure to get ink flowing, with my fingers left with pen-shaped indentations after an hour or two of use. The only other significant downside of the design is that it tended to roll a lot on my desk.

The Nova pictured with J. Herbin Bleu Myosotis ink

Overall, I wouldn't recommend the Nova over other pens in its price category — TWSBI certainly make significantly nicer pens for a similar price — but from an aesthetic perspective it is certainly worth it. There are very few pens with all-metal bodies available at that price, and after about six months of ownership I haven't noticed a single scratch anywhere on the body; this pen is durable. I haven't yet experimented with posting the pen, but I wouldn't recommend it as the cap screws on and off. Prior experience suggests that the screw thread is likely to eventually damage the body.

The Nova is available at Namisu's online store for £45.

The KOSMOS Pen

Aluminium KOSMOS ink in Night Sky

In general I am not a fan of ballpoint pens, but the magnetic cap design of the KOSMOS pen intrigued me. Rather than pressing a button or removing a cap, the KOSMOS pen by Stilform, a design studio based in Munich, allows you to pull back on the cap, which is then held in place magnetically. It is a simple but effective idea.

Stilform clearly care about design, and the KOSMOS pen is certainly the best designed ballpoint that I’ve used. That design does come at a price; my aluminium model cost €50, but a titanium bodied model is also available at €100. The aluminium models come in five colours, three of which are intentionally similar to Apple’s MacBook lineup.

Aside from the magnetic cap, one of the pen’s best features is barely mentioned in Stilform’s marketing: the pen rolls very little. I suspect they’ve weighted the body on one side, but I couldn’t tell where after playing with it for a few minutes. I was a little disappointed to find that it didn’t always roll to show the logo, which is engraved near the top of the pen. I think the engraving could be a little deeper, but the rest of the pen’s design seems to call for minimalism.

I struggled to find good lighting to show the logo’s shallow engraving

Before I received the pen I wasn’t sure which part of the pen actually moves when you shift the cap. The cap screws onto the black segment that initially separates the cap from the body, and both of these move when the cap slides; the refill always stays in place. I’d like the cap to screw in a little tighter, although only because I fiddled with it repeatedly this week! When I did unscrew the cap I found that the spring inside it would often come loose and fall out; I think a groove near the end of the cap could solve this problem.

The pen, with the cap covering the refill cartridge

The pen is loud. Not to write with, but with the snap as you shift the cap in and out of place. By way of comparison, it is much louder than closing the lid of my AirPods case and certainly louder than any other retractable pen that I’ve used before.

The included Stilform cartridge let the rest of the pen down. I initially found that I had to apply a lot of pressure (a lot more than I would apply with other ballpoints) to get a consistent stroke. Thankfully the cartridge is a standard size (it is the same ISO G2 cartridge that Parker include in their ballpoints), so it can be easily replaced. Otherwise, the pen is a very comfortable weight and I didn’t have any issues writing with it for a few hours.

On the top line I applied extra pressure whilst on the second I allowed the ballpoint to roll across the paper.

As much as I like the elegant design of the KOSMOS pen, I’ll continue using fountain pens for regular writing. At the moment I’m carrying it in my backpack for situations where I have to write on paper that ink would bleed on.

Later in 2018 Stilform will release the KOSMOS Ink, a fountain pen with a removable magnetic cap. I’ve pre-ordered the pen, and I look forward to seeing their ideas applied to other stationery.

Learn to Code 2018

Yours truly presenting the final session

Over the course of the last term I presented five introductory Python sessions as part of the Oxford University Computer Society's Learn to Code series. This year we saw the highest demand ever for the course, with over 1,100 people “interested” on Facebook and more than 200 attending the sessions — we packed out both lecture theatres in the Department of Computer Science!

All of the session materials from the course are available here. Note that this is my personal fork of the course materials with the version that I presented (it is quite likely that a future committee will present a different version next year). A playlist of all the screen recordings is available on YouTube. This was the first time I'd ever done any video editing, but in the end it turned out to be more of an exercise in audio editing.

Editing the final session in FCP

The first video was edited entirely in iMovie, but from the second session onwards all the video was edited in Final Cut Pro. I imagine that iMovie would have been sufficient for all the videos, but I fancied trying out Final Cut Pro.

We used a Focusrite Scarlett Solo interface along with the cheapest wireless mic I could find on Amazon to record the audio. I later had to redub a few of the sessions due to recording issues in the lecture theatre, but most of the audio was recorded live. The sessions recorded in the lecture theatre sound a lot better and, as a friend commented, avoid the “teenager making Minecraft videos” vibe. I struggled to get audio levels right until the very end of the course, so many of the early videos are far too quiet. By the end of the course I was also a lot more lenient in what I allowed through the editing process; some of the early sessions don't have a single utterance of “erm”, “OK”, or “so” but by the end I gave up and tried to keep the editing process straightforward.

It would have been nice to record video for the sessions in addition to the screen captures, but a lack of preparation on my part killed this plan. Besides, each session required a total of around ten hours a week to prepare the resources, rehearse, execute the session, and then edit the video, and I don't think I could have spared much more time out of an already packed term.

I am indebted to the dozen-or-so Oxford CS students who dedicated their Thursday evenings to running the course with me this year, in particular to Sauyon Lee, who presented sessions in a second lecture theatre. The student helpers managed everything from registration to helping out students with exercises, which allowed me to focus on just the presentation during the sessions.

Although Learn to Code was undoubtedly my greatest success as president of Oxford's Computer Society, there are many ways in which the course could be improved, and I leave this advice to my successor next year:

  • Don't be afraid to go slowly. In the first session I discovered that around half the audience had done some programming before, and sped up the session to cater for them. This was a mistake — I should have continued with my plan to aim the content at complete beginners
  • Encourage students to do exercises as much as possible, as early as possible. A small number of students went away and did nearly every exercise in the notes, and they definitely made the most progress over the course
  • Provide other materials that students can look at after the course or between sessions
  • Video editing is really hard! If the next committee plans to film the 2019 sessions then I hope they can find somebody that can edit the video content, because it certainly consumed far too much of my time
  • Focus on getting syntax right before introducing algorithms. My experience this year and last year was that once students had mastered the basics of Python syntax they were far more able to solve problems than when algorithms and syntax were introduced concurrently. We definitely introduced the binary search algorithm too early this year, for example

Overall, teaching Learn to Code this year was an absolute pleasure. Even after relatively high demand for the course last year I was shocked to see so much demand this year, and I loved seeing so many students returning each week.

At the beginning of term I met with David Malan, the originator of Harvard's CS50 course, and after hearing of such high demand at Harvard for a programming course I was delighted we could replicate similar demand on this side of the pond. I wish the very best to my successors teaching our course, and I've no doubt its popularity will continue to grow.

Play Time 3.1

Play Time, my music statistics app, now supports the iPhone X and has lots of fixes for iOS 11. Here's what's new in this release:

  • Support for new iPhone sizes and iOS 11
  • Use the predominant colour of an album's artwork when artwork is disabled
  • Sorting by Album Artist
  • Sorting options when breaking down by album, artist, or genre
  • Removed option to display play count - it is now displayed based on context
  • Explicit songs play time

Fixed:

  • Crashing bug when collecting library data
  • Resolved bugs related to row selection
  • Ensured the typeface is consistent across the app
  • Removed old style app rating

You can download Play Time for $0.99 on the App Store.

Scala's Privates

As well as the usual public, protected, and private access control modifiers, which have the same rules as their Java siblings, Scala also supports the private[this] modifier, which prevents access to a property from other instances. For example, the following code would not compile:

class A {
  private[this] var x = 0

  def inc(a: A) = a.x = a.x + 1
}

Whereas this would be fine:

class B {
  private var x = 0

  def inc(a: B) = a.x = a.x + 1
}

Whilst the access control rules are applied at compile time, they lead to slightly different code generation. In the case of a property marked private, a private field, a getter method, and a setter method are added to the class. All code that accesses the field (aside from its initialisation) will go through either the getter or setter. On the other hand, a private[this] property only adds the field to the class, and all access to the field is direct, rather than through accessor methods (after all, the field can only be accessed by the instance itself).

The behaviour of the Scala compiler can be verified using the javap tool, part of the JDK that prints Java bytecode in a human readable form. I compiled the following two Scala classes:

class A { private var x: Int = 37; def add() = x = x + 1 }
class B { private[this] var x: Int = 37; def add() = x = x + 1 }

The output of javap -p -c A, the class that doesn't use private[this], is:

Compiled from "Classes.scala"
public class A {
  private int x;

  private int x();
    Code:
       0: aload_0
       1: getfield      #13                 // Field x:I
       4: ireturn

  private void x_$eq(int);
    Code:
       0: aload_0
       1: iload_1
       2: putfield      #13                 // Field x:I
       5: return

  public void add();
    Code:
       0: aload_0
       1: aload_0
       2: invokespecial #22                 // Method x:()I
       5: iconst_1
       6: iadd
       7: invokespecial #24                 // Method x_$eq:(I)V
      10: return

  public A();
    Code:
       0: aload_0
       1: invokespecial #27                 // Method java/lang/Object."<init>":()V
       4: aload_0
       5: bipush        37
       7: putfield      #13                 // Field x:I
      10: return
}

The particulars of the JVM's instructions aren't important for our discussion, but we can see that Scala methods x() and x_=(int) are generated here, and we can see that they are used in the add method. Meanwhile, the output of javap -c -p B, the class using private[this], is:

Compiled from "Classes.scala"
public class B {
  private int x;

  public void add();
    Code:
       0: aload_0
       1: aload_0
       2: getfield      #14                 // Field x:I
       5: iconst_1
       6: iadd
       7: putfield      #14                 // Field x:I
      10: return

  public B();
    Code:
       0: aload_0
       1: invokespecial #19                 // Method java/lang/Object."<init>":()V
       4: aload_0
       5: bipush        37
       7: putfield      #14                 // Field x:I
      10: return
}

Here the add method directly accesses the field via the getfield and putfield instructions, rather than generating accessor methods. The astute reader may wonder whether Scala's use of accessor methods for all private properties is slower than using private[this] or writing similar code in Java. Should you aim to use private[this] whenever you can to give yourself a performance boost?

The short answer is no. The JVM will eventually inline the getter and setter where they are used so that each access is executing the same native code as the private[this] version. In my tests, over 10 million invocations of the add method, the class that used private was around 1% slower than the one using private[this], but over 100 million invocations it was around 0.3% slower.

By passing the flags -J-XX:+PrintCompilation -J-XX:+UnlockDiagnosticVMOptions -J-XX:+PrintInlining to the scala command you can view when a method is inlined by the JVM (these are Java flags, hence the need to prepend them with -J). I found on my machine that it generally took fewer than 20,000 invocations of the add method for the x_= method to get inlined inside it, leading to identical execution of both versions. The number of invocations of the method isn't a fantastic metric to go by, as other factors (CPU load, timing, etc) also play a part in when a method gets inlined, and some methods cannot be inlined because they are too large. For more complex methods, the cost of method invocation is much smaller than the cost of actually executing the method.

private[this] should therefore be used when you need it for the specific access control for which it was designed. There is no need to hand optimise your code to use private[this]; the JVM will do it for you!

More articles for Microsoft Faculty Connection

After attending the Microsoft Build conference in Seattle in May, I wrote a bunch of blog posts for the Microsoft Faculty Connection blog:

Yeah, I quite like functional programming :).