Highlights from recent sessions
With the next compiler hacking meeting due to take place in a couple of days it's time for a look back at some results from our last couple of sessions.
The front end
The front end (i.e. the parser and type checker) saw a number of enhancements.
Succinct functor syntax
Syntax tweaks are always popular, if often contentious. However, reaching agreement is significantly easier when adding syntax is a simple matter of extending an existing correspondence between two parts of the language. For example, it was clear which syntax to use when adding support for lazy patterns: since patterns generally mirror the syntax for the values they match, patterns for destructing lazy values should use the same
lazy keyword as the expressions which construct them.
A second correspondence in OCaml's syntax relates modules and values. Module names and variables are both bound with
=; module signatures and types are both ascribed with
:; module fields and record fields are both projected with
.. The syntax for functors and functions is also similar, but the latter offers a number of shortcuts not available in the module language; you can write
fun x y z -> e
instead of the more prolix equivalent:
fun x -> fun y -> fun z -> e
but multi-argument functors must be written out in full:
functor (X : R) -> functor (Y : S) -> functor (Z : T) -> M
functor (X : R) (Y : S) (Z : T) -> M
The patch also adds support for a corresponding abbreviation at the module type level. Defining the type of a multi-argument functor currently involves writing a rather clunky sequence of
module type F = functor (X : R) -> functor (Y : S) -> functor (Z : T) -> U
With Thomas's patch all but the first occurrence of
module type F = functor (X : R) (Y : S) (Z : T) -> U
Since Thomas's patch has been merged into trunk, you can try out the new syntax using the 4.02.0 beta, which is available as a compiler switch in the OPAM repository:
opam switch 4.02.0+trunk
The next step is to find out whether the verbose syntax was a symptom or a cause of the infrequency of higher-order functors in OCaml code. Will we see a surge in the popularity of higher-order modules as the syntax becomes more accommodating?
match mdn with None when h >= 0 && h <= 23 -> Some h | Some false when h > 0 && h <= 11 -> Some h | Some false when h = 12 -> Some 0 | Some true when h > 0 && h <= 11 -> Some (h + 12) | Some true when h = 12 -> Some 12 | Some _ -> None | None -> None
Although this is fairly clear, it could be made even clearer if we had a less powerful language for expressing the tests involving
h. Since the whole OCaml language is available in the
when guard of a case, the reader has to examine the code carefully before concluding that the tests are all simple range checks. Perhaps worse, using guards inhibits the useful checks that the OCaml compiler performs to determine whether patterns are exhaustive or redundant. David's patch makes it possible to rewrite the tests without guards, making the simple nature of the tests on
h clear at a glance (and making it possible once again to check exhaustiveness and redundancy):
match mdn, h with None , 0..23 | Some false, 1..11 -> Some h | Some false, 12 -> Some 0 | Some true , 1..11 -> Some (h + 12) | Some true , 12 -> Some 12 | _ -> None
The work on range patterns led to a robust exchange of views about which other types should be supported -- should we support any enumerable type (e.g. variants with nullary constructors)? or perhaps even any ordered type (e.g. floats or strings)? For the moment, there seems to be a much clearer consensus in favour of supporting integer types than there is for generalising range patterns any further.
Since the compiler hacking group only meets for an evening every couple of months or so, most of the projects we work on are designed so that it's possible to implement them in a few hours. Leo's proposal for extensible variants is a notable exception, predating both the compiler hacking group and OCaml Labs itself.
Extensible variants generalise exceptions: with Leo's patch the exception type
exn becomes a particular instance of a class of types that can be defined by the user rather than a special builtin provided by the compiler:
(* Define an extensible variant type *) type exn = .. (* Extend the type with a constructor *) type exn += Not_found (* Extend the type with another constructor *) type exn += Invalid_argument of string
Even better, extensible variants come with all the power of regular variant types: they can take type parameters, and even support GADT definitions:
(* Define a parameterised extensible variant type *) type 'a error = .. (* Extend the type with a constructor *) type 'a error = Error of 'a (* Extend the type with a GADT constructor *) type 'a error : IntError : int -> int error
Extensible variants are a significant addition to the language, and there's more to them than these simple examples show. A forthcoming post from Leo will describe the new feature in more detail. In the meantime, since they've been merged into the 4.02 release candidate, you can try them out with OPAM:
opam switch 4.02.0+trunk
Lazy record fields
Not everything we work on makes is destined to make it upstream. A few years ago, Alain Frisch described an OCaml extension in use at Lexifi for marking record fields lazy, making it possible to delay the evaluation of initializing expressions without writing the
lazy keyword every time a record is constructed. Alain's post was received enthusiastically, and lazy record fields seemed like an obvious candidate for inclusion upstream, so in April's meeting Thomas put together a patch implementing the design. Although the OCaml team decided not to merge the patch, it led to an enlightening discussion with comments from several core developers, including Alain, who described subsequent, less positive, experience with the feature at Lexifi, and Xavier, who explained the rationale underlying the current design.
The back end
The OCaml back end (i.e. the code generation portion of the compiler) also saw a proposed enhancement.
Constant arithmetic optimization
Stephen submitted a patch improving the generated code for functions that perform constant arithmetic on integers.
In OCaml, integers and characters are represented as shifted immediate values, with the least significant bit set to distinguish them from pointers. This makes some arithmetic operations a little more expensive. For example, consider a function that
int_of_digits that builds an integer from three character digits:
int_of_digits '3' '4' '5' => 345
We might define
int_of_digits as follows:
let int_of_digits a b c = 100 * (Char.code a - Char.code '0') + 10 * (Char.code b - Char.code '0') + 1 * (Char.code c - Char.code '0')
-dcmm flag to ocamlopt shows the results of compiling the function to the C-- intermediate language.
ocamlopt -dcmm int_of_digits.ml
The generated code has the following form (reformatted for readability):
200 * ((a - 96) >> 1) + 20 * ((b - 96) >> 1) + 2 * ((c - 96) >> 1) + 1
The right shifts convert the tagged representation into native integers, and the final
+ 1 converts the result back to a tagged integer.
Stephen's patch floats the arithmetic operations that involve constant operands outwards, eliminating most of the tag-munging code in favour of a final correcting addition:
(a * 100) + (b * 10) + c - 10766
Although these changes are not yet merged, you can easily try them out, thanks to Anil's script that makes compiler pull requests available as OPAM switches:
opam switch 4.03.0+pr17
Standard library and beyond
Our compiler hacking group defines "compiler" rather broadly. As a result people often work on improving the standard library and tools as well as the compiler proper. For example, in recent sessions, David added a small patch to expose the is_inet6_addr function, and Philippe proposed a patch that eliminates unnecessary bounds checking in the buffer module. The last session also saw Raphaël and Simon push a number of patches for integrating merlin with the acme editor to OPAM, improving OCaml support in Plan 9.
The compiler hacking group is open to anyone with an interest in contributing to the OCaml compiler. If you're local to Cambridge, you're welcome to join us at the next session!