Racc (Ruby yACC) is a LALR(1) parser generator for Ruby.

—- http://i.loveruby.net/en/projects/racc/

Racc is always my first option to implement a small DSL parser in Ruby. It’s simple and clean. The followings are some useful code snippets and tips for implementing a parser by Racc.

Result var option

Not sure why there are 2 options. It is easy to have an attribute to store result. I always choose no_result_var option. Hence all the following code examples have this option enabled.

Example:

class YourModule::Parser
  options no_result_var

...

Recursive rules

Common case: use Array to collect result

names
: name names               { [val[0]] + val[1] }
| name                     { [val[0]] }
;

name
: ID
| DEFINE
| DOMAIN
...

Token inlined recursive rule example:

symbols
: SYMBOL symbols           { [val[0]] + val[1] }
| SYMBOL                   { [val[0]] }
;

Optional recursive: use Hash to collect result

problem_primaries
: problem_primary problem_primaries                  { val[0].merge(val[1]) }
| problem_primary                                    { val[0] }
;

problem_primary
: objects                                            { { objects: val[0] } }
| init                                               { { init: val[0] } }
| goal                                               { { goal: val[0] } }
;

Parse with context

---- inner ----

def parse(str, domains)
  @domains = domains
  @tokens = []
  str = "" if str.nil?
  scanner = StringScanner.new(str + ' ')
...
Parser.new.parse(pddl, self.domains)

Token analyzer

2 parts. Part 1: parse method:

token TOKEN1 TOKEN2
...
def parse(str)
  @tokens = []
  str = "" if str.nil?
  scanner = StringScanner.new(str + ' ')
  until scanner.eos?
    case
    when scanner.scan(/\s+/)
    # ignore space
    when m = scanner.scan(/token1/)
      @tokens.push [:TOKEN1, m]
    when m = scanner.scan(/token2/)
      @tokens.push [:TOKEN2, m]
    ...
    else
      raise "unexpected characters: #{scanner.peek(5).inspect}"
    end
  end
  @tokens.push [false, false]
  do_parse
end

Part 2: next_token method:

def next_token
  @tokens.shift
end

Error handling

Simple inhencement for better parsing error when source file has grammar error.

def on_error(t, val, vstack)
  trace = vstack.each_with_index.map{|l, i| "#{' ' * i}#{l}"}
  raise ParseError,
        "\nparse error on value #{val.inspect}\n#{trace.join("\n")}"
end

References

  1. Most of code snippets are from Longjing’s PDDL Parser.
  2. Racc document: http://i.loveruby.net/en/projects/racc/doc/.