Which tool to use to parse programming languages in Python?

I really like pyPEG. Its error reporting isn’t very friendly, but it can add source code locations to the AST.

pyPEG doesn’t have a separate lexer, which would make parsing Python itself hard (I think CPython recognises indent and dedent in the lexer), but I’ve used pyPEG to build a parser for subset of C# with surprisingly little work.

An example adapted from fdik.org/pyPEG/: A simple language like this:

function fak(n) {
    if (n==0) { // 0! is 1 by definition
        return 1;
    } else {
        return n * fak(n - 1);
    };
}

A pyPEG parser for that language:

def comment():          return [re.compile(r"//.*"),
                                re.compile("/\*.*?\*/", re.S)]
def literal():          return re.compile(r'\d*\.\d*|\d+|".*?"')
def symbol():           return re.compile(r"\w+")
def operator():         return re.compile(r"\+|\-|\*|\/|\=\=")
def operation():        return symbol, operator, [literal, functioncall]
def expression():       return [literal, operation, functioncall]
def expressionlist():   return expression, -1, (",", expression)
def returnstatement():  return keyword("return"), expression
def ifstatement():      return (keyword("if"), "(", expression, ")", block,
                                keyword("else"), block)
def statement():        return [ifstatement, returnstatement], ";"
def block():            return "{", -2, statement, "}"
def parameterlist():    return "(", symbol, -1, (",", symbol), ")"
def functioncall():     return symbol, "(", expressionlist, ")"
def function():         return keyword("function"), symbol, parameterlist, block
def simpleLanguage():   return function

Leave a Comment

Hata!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)