This post is part of Sketch.sh development series
Alright folks, in the first post, I introduced briefly about js_of_ocaml being used for executing the code in the browser. I call this part of the code base Sketch engine.
#Overview about OCaml's REPL
Official distribution of OCaml comes with a built-in REPL (OCaml folks usually call them Toplevel). you can execute it by typing ocaml
in your shell. Though it works, the user interface is very primitive so utop was born to address the problem. utop has an excellent user interface with context aware code completion and better typing experience. utop is the best REPL I've ever used.
You might wondering what is the relation between
rtop
andutop
?
rtop
is anutop
version that supports ReasonML syntax. We'll explore about it closely in this post as well.
$ ocaml
OCaml version 4.02.0+dev12-2014-07-30
# 1 + 1;;
- : int = 2
Toplevel provides an interactive interface over the module Toploop. It works by compiling the input code into bytecode, executing and getting the result back.
#js_of_ocaml
js_of_ocaml is a compiler for compiling OCaml bytecode to Javascript. Bytecode backend is relatively stable so maintaining js_of_ocaml
for newer compiler releases doesn't require much effort.
For compiling from OCaml bytecode to Javascript with js_of_ocaml, you need to link the package js_of_ocaml
and the ppx js_of_ocaml-ppx
with ocamlc.
ocamlfind ocamlc -package js_of_ocaml -package js_of_ocaml-ppx \
-linkpkg -o my_app.byte my_app.ml
By calling js_of_ocaml
on the output bytecode, you get a my_app.js
file ready to be embed in your website.
js_of_ocaml my_app.byte
#Goals of this post
As Sketch's codebase changes rapidly, it would be hard to document the code and explain about each module. Instead, I'm going to walk you through building a naive version of Sketch engine. I hope that you can easily understand Sketch's codebase after this.
There are several code examples in this post, they are located here.
I'm using esy as the packages manager so you need to have Node.js available and install esy:
npm install --global esy
Clone the code:
git clone https://github.com/thangngoc89/blog
cd content/sketch/examples
esy install
esy build
You can run esy build
for building all the examples. In each section, details instruction will be given.
#First version of Sketch engine
So getting back to our Sketch engine, Toploop
can be compiled to Javascript by default and that's great because it's the heart of our Sketch engine. In the early version of Sketch engine, I used JsooTop
module which is an abstraction over Toploop
module providing by js_of_ocaml-toplevel
package.
Our first version of the engine is simple, you expose a function that take the source code and return the result:
open Js_of_ocaml;
let execute: Js.t(Js.js_string) => Js.t(Js.js_string) = code => {
let code = Js.to_string(code);
let buffer = Buffer.create(100);
let formatter = Format.formatter_of_buffer(buffer);
JsooTop.execute(true, formatter, code);
let result = Buffer.contents(buffer)
Js.string(result);
};
Because js_of_ocaml
doesn't use Javascript types for representing OCaml types. You need to convert from Js.t(Js.js_string)
to string
:
let code = Js.to_string(code);
Next, we need a formatter for storing the evaluated result of the code:
let buffer = Buffer.create(100);
let formatter = Format.formatter_of_buffer(buffer);
Notice how we need to pass a buffer to the formatter? Buffers are extensible strings, you create it with an estimated initial size. Don't worry about the size that much, it will grow to fit the dataset.
The next step would be executing the code with JsooTop.execute
function:
JsooTop.execute(true, formatter, code);
Buffers are mutable so we can get the result back as a string like this:
let result = Buffer.contents(buffer)
Finally, convert from string
to Js.t(Js.js_string)
and return it. That's pretty simple right? We get ourselves a working engine in 8 lines of code. Asides from the main execute
function, you need to have these lines as well:
JsooTop.initialize();
let () = Js.export_all(
[%js {
val execute = execute
}]
);
The first line is for initialization of the toplevel, it does several things like populating the toplevel environment, setting up look-up paths,... but we don't need to worry about it right now.
The next line, we are telling js_of_ocaml
that "Hey, we want the execute
function to be callable in Javascript side" and js_of_ocaml
exports it as window.execute
if you embed the script or const { execute } = require("./engine.bc.js")
if you're requiring it from Node.js.
You can build our first version of the engine with this command:
esy dune build one/engine.bc.js --profile release
The generated file is located in _build/default/one/engine.bc.js
, let's try that in Node REPL:
❯ node
> const { execute } = require("./_build/default/one/engine.bc.js")
undefined
> execute("1 + 1;;")
'- : int = 2\n'
Great! So we got ourselves a usable engine that you can embed in your website or calling it from Node.js. But this naive version of the engine has some limitations:
- Errors and logs (stdout, stderr in OCaml) is printed directly to Javascript console, we want to catch them and return them as result of
execute
function for displaying in the UI.
> execute(`print_endline "Hello world from OCaml";;`);
Hello world from OCaml
'- : unit = ()\n'
> execute(`syntax error`);
File "", line 1, characters 12-12:
Error: Syntax error
''
- It executes all code at once. We want it to return the result for each individual statement (they are called phrases in OCaml's terminology).
> execute(`let add = (+);; add 1 2;; add 3 4;;`);
'val add : int -> int -> int = <fun>\n- : int = 3\n- : int = 7\n'
- It doesn't support ReasonML syntax yet.
#Turn stdout
and stderr
into values
stdout
and stderr
are out_channel
. They are flushed to Javscript console by default but we can flush them into a buffer instead.
let stderr_buffer = Buffer.create(100);
let stdout_buffer = Buffer.create(100);
Sys_js.set_channel_flusher(stdout, Buffer.add_string(stdout_buffer));
Sys_js.set_channel_flusher(stderr, Buffer.add_string(stderr_buffer));
Add these lines to the end of execute
function:
let result = Buffer.contents(buffer);
let stderr_result = Buffer.contents(stderr_buffer);
let stdout_result = Buffer.contents(stdout_buffer);
Buffer.clear(stderr_buffer);
Buffer.clear(stdout_buffer);
[%js {
val result = Js.string(result);
val stderr = Js.string(stderr_result);
val stdout = Js.string(stdout_result)
}]
Here we bind the content of execution result, stderr and stdout into variables and clear the buffer after that. We also return an Javascript object with this shape:
{
result: string,
stderr: string,
stdout: string
}
You can build this example with:
esy dune build two/engine.bc.js --profile release
Test it:
❯ node
> const { execute } = require("./_build/default/two/engine.bc.js")
undefined
> execute("1 + 1;;")
{ result: '- : int = 2\n', stderr: '', stdout: '' }
> execute(`print_endline "Hello world from OCaml";;`);
{ result: '- : unit = ()\n',
stderr: '',
stdout: 'Hello world from OCaml\n' }
> execute(`syntax error`);
{ result: '',
stderr: 'File "", line 1, characters 12-12:\nError: Syntax error\n',
stdout: '' }
That looks more promising! We get stderr
and stdout
as return values. In Sketch UI, This information will be display inline like this
#Support ReasonML syntax
As I mentioned above, rtop
is utop
with ReasonML syntax support. Adding ReasonML syntax, we need to install reason
package first:
esy add @esy-ocaml/reason
Add these lines before JsooTop.initialize();
in engine.re
:
let reasonSyntax = () => {
open Reason_toolchain.From_current;
let wrap = (f, g, fmt, x) => g(fmt, f(x));
Toploop.parse_toplevel_phrase :=
Reason_util.correctly_catch_parse_errors(x =>
Reason_toolchain.To_current.copy_toplevel_phrase(
Reason_toolchain.RE.toplevel_phrase(x),
)
);
Toploop.parse_use_file :=
Reason_util.correctly_catch_parse_errors(x =>
List.map(
Reason_toolchain.To_current.copy_toplevel_phrase,
Reason_toolchain.RE.use_file(x),
)
);
Toploop.print_out_value :=
wrap(copy_out_value, Reason_oprint.print_out_value);
Toploop.print_out_type := wrap(copy_out_type, Reason_oprint.print_out_type);
Toploop.print_out_class_type :=
wrap(copy_out_class_type, Reason_oprint.print_out_class_type);
Toploop.print_out_module_type :=
wrap(copy_out_module_type, Reason_oprint.print_out_module_type);
Toploop.print_out_type_extension :=
wrap(copy_out_type_extension, Reason_oprint.print_out_type_extension);
Toploop.print_out_sig_item :=
wrap(copy_out_sig_item, Reason_oprint.print_out_sig_item);
Toploop.print_out_signature :=
wrap(List.map(copy_out_sig_item), Reason_oprint.print_out_signature);
Toploop.print_out_phrase :=
wrap(copy_out_phrase, Reason_oprint.print_out_phrase);
};
What's going here? Toploop
module contains several functions for parsing and printing values: parse_toplevel_phrase
, parse_use_file
, print_out_*
,... These functions are mutable so we can swap out the original OCaml parser and printer with ReasonML ones. I didn't figure this out myself, the ReasonML team did.
We also need to copy reason_utils.ml
from rtop's source code to our project. I chose to vendor this file instead of installing it because rtop
depends on utop
, and that's a heavy dependencies with lots of primitives that js_of_ocaml
can't compile to Javscript.
You can build this example with:
esy dune build three/engine.bc.js --profile release
Let's test it:
> const { execute } = require("./_build/default/three/engine.bc.js")
undefined
> execute(`print_endline("Hello from " ++ "ReasonML");`);
{ result: '- : unit = ()\n',
stderr: '',
stdout: 'Hello from ReasonML\n' }
You can notice the ++
operator for string concatenation as well as semicolon as expression seperator indicating ReasonML syntax. If you try the OCaml syntax again, you'll get syntax error:
> execute(`print_endline ("Hello from " ^ "OCaml") ;;`);
{ result: '',
stderr: 'File "", line 1, characters 31-41:\nError: Syntax error\n',
stdout: '' }
This is understandable since we replaced OCaml parser and printer with Reason ones. What if we want to support both OCaml and ReasonML syntax at the same time just like [https://sketch.sh]? A naive solution would be keeping 2 seperate version of the engine but engine.bc.js
weights about 5.0MB. That's huge for a client-side Javascript file.
#Wrapping up
Whoops. That's a lot of stuff going on. In this post, I showed you how to build the simplest version of our engine, turn stdout
and stderr
into first class values and add ReasonML syntax support.
I want to thank you Matthias Kern and Bryan Phelps for giving me early feedbacks about this post.
#What's next?
In the next post, I'll explain about supporting OCaml and ReasonML at the same time as well as getting result for each invidiual phrase