Creating Domain Specific Languages with OMeta

I've been researching different ways of constructing the Forge Architect DSL. There are tons of different tools and algorithms for lexing, tokenising, and evaluating context-free languages.

I found OMeta early on, and after reading Alessandro Warth's PhD dissertation, it appeared that OMeta was well suited to prototyping new domain specific languages.

OMeta is based on Parsing Expression Grammars (PEGs), extended to work around some of the limitations of the original PEG formalism. For example, it supports left recursion through a seed parsing and memoization trick that I don't yet fully grok. (It's all in the PhD dissertation.)
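To make the seed idea a bit more concrete, here is a minimal sketch in plain JavaScript. It is not OMeta's actual implementation: it only handles direct left recursion in a single rule, and the grammar (`expr = expr "-" num | num`) and every name in it (`parseExpr`, `exprBody`, `memo`) are my own assumptions for illustration. The rule first plants a failing "seed" in the memo table, so the recursive call fails fast instead of looping forever, and then re-runs the rule body to grow the seed until the match stops getting longer.

```javascript
// Simplified seed-growing sketch for ONE directly left-recursive rule:
//   expr = expr "-" num | num
// Results are { value, next } objects, or null on failure.
function parseExpr(input) {
  const memo = new Map(); // start position -> latest result for expr

  // num = digit+
  function num(pos) {
    let end = pos;
    while (end < input.length && input[end] >= "0" && input[end] <= "9") end++;
    return end > pos ? { value: Number(input.slice(pos, end)), next: end } : null;
  }

  // The body of expr, tried as an ordered choice like a PEG.
  function exprBody(pos) {
    const left = expr(pos); // left-recursive call, answered from the memo table
    if (left && input[left.next] === "-") {
      const right = num(left.next + 1);
      if (right) return { value: left.value - right.value, next: right.next };
    }
    return num(pos);
  }

  function expr(pos) {
    if (memo.has(pos)) return memo.get(pos);
    memo.set(pos, null); // plant a failing seed so the recursion bottoms out
    let best = null;
    for (;;) {
      const attempt = exprBody(pos);
      // Stop growing once the body fails or no longer consumes more input.
      if (!attempt || (best && attempt.next <= best.next)) break;
      best = attempt;
      memo.set(pos, best); // grow the seed and try again
    }
    return best;
  }

  return expr(0);
}

console.log(parseExpr("10-3-2")); // { value: 5, next: 6 }
```

The result is 5, i.e. (10 - 3) - 2, so the left-recursive rule yields left-associative subtraction, which is the whole point of supporting left recursion.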

OMeta is implemented on top of host languages. Implementations currently exist for JavaScript, Squeak, Python, Ruby, C#, Scheme, and Common Lisp, at varying levels of maintenance.

My plan now is to create an OMeta implementation hosted on Elixir, so that Elixir can interpret the Architect as an external DSL.

But before I can start, I need to improve my understanding of OMeta. I found a blog post by Jeff Moser, who worked on OMeta# for C# 6 years ago. He published a small compiler for a toy FizzBuzz language that compiles the language into C# and executes it.

Now I'm not familiar with C#, but I wanted to try rewriting this in JavaScript, so that it would run inside OMetaJS.

So here's our mission. We have a new language that looks like this:

for every number from 1 to 100  
  if the number is a multiple of 3 and it is a multiple of 5 then
    print "FizzBuzz"
  else if it is a multiple of 3 then
    print "Fizz"
  else if it is a multiple of 5 then
    print "Buzz"
  else
    print the number

We need to compile this into JavaScript and execute it.
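As a reference for the intended behaviour, here is a hand-written JavaScript equivalent of the program above. It is only a sketch of what the toy language means, not what the compiler below produces; the function name `fizzBuzzLines` is hypothetical, and I am assuming that `it` always refers back to the most recently mentioned variable (`the number`).

```javascript
// Hand-written equivalent of the toy FizzBuzz program (for reference only).
function fizzBuzzLines(low, high) {
  const lines = [];
  // "for every number from low to high"
  for (let number = low; number <= high; number++) {
    // "if the number is a multiple of 3 and it is a multiple of 5 then"
    if (number % 3 === 0 && number % 5 === 0) {
      lines.push("FizzBuzz");
    } else if (number % 3 === 0) {
      lines.push("Fizz");
    } else if (number % 5 === 0) {
      lines.push("Buzz");
    } else {
      // "print the number"
      lines.push(String(number));
    }
  }
  return lines;
}

console.log(fizzBuzzLines(1, 100).join("\n"));
```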

Here is my solution (gist):

// our FizzBuzz language
var code =  
"for every number from 1 to 100\n\
    if the number is a multiple of 3 and it is a multiple of 5 then\n\
        print \"FizzBuzz\"\n\
    else if it is a multiple of 3 then\n\
        print \"Fizz\"\n\
    else if it is a multiple of 5 then\n\
        print \"Buzz\"\n\
    else\n\
        print the number\n\
";

ometa FizzBuzz {  
    // number is overridden to parse an optionally signed run of digit characters and return it as an integer
    number = spaces ('+' | '-' | empty):prefix digit+:ds -> (
        parseInt(
            (prefix.length > 0) ? 
            prefix + ds.join('') : 
            ds.join('')
        )
    ),
    // quotedString matches strings inside quotes
    quotedString = spaces '"' (~'"' anything)*:string '"' -> ( 
      string.length == 0 ?
      "" :
      string.join("") 
    ),
    // variables can be prefixed with `the`, we need to track it as `_it` in the state table so it can be referenced again
    variableName = 
        ("the" | empty) spaces 
        firstAndRest('letter', 'letterOrDigit'):x 
        !(self.set("_it", x.join(""))) 
        -> (x.join("")),
    // expressions are either an andExpression, multipleExpression, numberExpression or a quotedString
    // all expressions are translated into functions
    expression = andExpression 
               | multipleExpression 
               | numberExpression 
               | quotedString:qs -> (function () { return qs; }),
    // and expressions are left recursive allowing nested and expressions, and they evaluate into a function returning a boolean
    andExpression = andExpression:l "and" booleanExpression:r -> (
                        function () { 
                            return !!l() && !!r();
                        }
                    ) 
                  | booleanExpression,
    // a boolean expression is just a boolean function
    booleanExpression = expression:e -> (function () { 
        var object = e();
        if (typeof object == "boolean") {
            return object;
        } else if (typeof object == "number") {
            return object != 0;
        } else {
            return (String(object).length > 0) && object !== null && object !== undefined;
        }
    }),
    // number expressions are functions that return an integer
    // this is where `_it` can be resolved from the previously assigned `the`
    numberExpression = number:n -> (function () { 
                           return parseInt(n); 
                       })
                     | "it" -> (function () { 
                           return parseInt(
                               self.get(
                                   self.get("_it")
                               )
                           ); 
                       })
                     | variableName:vn -> (function () { 
                           return parseInt(self.get(vn)); 
                       }),
    // `is a multiple of` is a primitive infix operator
    multipleExpression = numberExpression:left "is a multiple of" numberExpression:right -> (
         function () {
             return (left() % right()) == 0;
         }
    ),
    // statements represent top expressions
    // we have `print`, `if then else` and `for every`
    statement = "print" expression:e -> (function () { console.log(e()); })
              | "if" andExpression:condition "then" statement:first ("else" statement | empty):second -> (
                    function () {
                        if (condition()) {
                            first();
                        } else if (typeof second == "function") {
                            second();
                        }
                    }
                )
              | "for every" variableName:vn "from" number:low "to" number:high statement:s -> (
                    function () {
                        for (var i = low; i <= high; i++) {
                            self.set(vn, i);
                            s();
                        }
                    }
                ),
    // a block is zero or more statements
    block = statement*:ss -> (
        function () {
            ss.forEach(function (statement) {
                statement();
            });
        }
    ),
    // our program is just one block!
    program = block
}

FizzBuzz.initialize = function() {  
    // our global state table
    this.vars = {};
    this.set = function(k, v){
        this.vars[k] = v;
        return this;
    };
    this.get = function(k) {
        return this.vars[k];
    };
};

// compiles our language into JavaScript with the top level program rule
var result = FizzBuzz.matchAll(  
    code,
    'program'
);

// execute the code!
result();  

For the sake of brevity, this code executes the FizzBuzz language directly; it doesn't create an intermediate abstract syntax tree. Notice how it essentially converts every expression and statement into a function, and the function returned by the top-level rule is what finally gets executed.
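The "everything becomes a function" idea can be sketched without OMeta at all. In the snippet below, each helper plays the role of one semantic action in the grammar and returns a zero-argument function; nothing runs until the outermost function is called. The helper names (`num`, `str`, `variable`, `multipleOf`, `and`, `print`, `ifThenElse`, `forEvery`) are hypothetical and not part of OMetaJS, and output is collected in an array instead of going straight to `console.log` so that it is easy to inspect.

```javascript
// Sketch: compile by composing closures instead of building an AST.
const vars = {};   // state table, like FizzBuzz.vars in the grammar
const output = []; // collected "print" output

const num = (n) => () => n;
const str = (s) => () => s;
const variable = (name) => () => vars[name];
const multipleOf = (left, right) => () => left() % right() === 0;
const and = (l, r) => () => Boolean(l()) && Boolean(r());
const print = (expr) => () => { output.push(String(expr())); };
const ifThenElse = (cond, first, second) => () => {
  if (cond()) first();
  else if (typeof second === "function") second();
};
const forEvery = (name, low, high, body) => () => {
  for (let i = low; i <= high; i++) {
    vars[name] = i; // rebind the loop variable before each iteration
    body();
  }
};

// What a parser would assemble for the FizzBuzz source (1 to 15 here).
const n = variable("number");
const program = forEvery("number", 1, 15,
  ifThenElse(and(multipleOf(n, num(3)), multipleOf(n, num(5))),
    print(str("FizzBuzz")),
    ifThenElse(multipleOf(n, num(3)),
      print(str("Fizz")),
      ifThenElse(multipleOf(n, num(5)),
        print(str("Buzz")),
        print(n)))));

program(); // nothing has executed until this call
console.log(output.join("\n"));
```

The trade-off is that the closures are opaque: with an intermediate AST you could inspect, optimise, or pretty-print the program before running it, whereas here the only thing you can do with the result is call it.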

Copy the above code into the Source field of the online OMeta interpreter: http://www.tinlizzie.org/ometa-js/ Alternatively, you can use https://github.com/alexwarth/ometa-js

Hit run, and you get this in your console:

1  
2  
Fizz  
4  
Buzz  
Fizz  
7  
8  
Fizz  
Buzz  
11  
Fizz  
13  
14  
FizzBuzz  
16  
17  
Fizz  
19  
Buzz  
Fizz  
22  
23  
Fizz  
Buzz  
26  
Fizz  
28  
29  
FizzBuzz  
31  
32  
Fizz  
34  
Buzz  
Fizz  
37  
38  
Fizz  
Buzz  
41  
Fizz  
43  
44  
FizzBuzz  
46  
47  
Fizz  
49  
Buzz  
Fizz  
52  
53  
Fizz  
Buzz  
56  
Fizz  
58  
59  
FizzBuzz  
61  
62  
Fizz  
64  
Buzz  
Fizz  
67  
68  
Fizz  
Buzz  
71  
Fizz  
73  
74  
FizzBuzz  
76  
77  
Fizz  
79  
Buzz  
Fizz  
82  
83  
Fizz  
Buzz  
86  
Fizz  
88  
89  
FizzBuzz  
91  
92  
Fizz  
94  
Buzz  
Fizz  
97  
98  
Fizz  
Buzz  

Later I created a simple Markdown compiler written in OMetaJS: https://gist.github.com/CMCDragonkai/4bebe4156fcc5fdd76b0 It was derived from the original here: http://joshondesign.com/2013/03/05/ometa1

I also discovered some interesting OMetaJS idiosyncrasies when it comes to string matching: https://gist.github.com/CMCDragonkai/963bf8066ade0253bb78

OMetaJS is no longer being maintained, but there are 3 active forks that I am going to investigate:

  1. https://github.com/xixixao/meta-coffee
  2. https://github.com/Page-/ometa-js
  3. https://github.com/veged/ometa-js

Now for more OMeta research...