2011-09-26

C++ source code generation with inline Python

While struggling with boilerplate code needed for Concrete's built-in Python object implementations and other repetitive structures that are involved in gluing dynamic and static worlds together, I decided to go ahead and generate some of the C++ source code files from templates (not to be confused with C++ template types). One alternative I considered was to use a full C++ parser for building an abstract syntax tree and to generate augmented C++ sources from that, but the options didn't seem too straightforward: clang 2.8 (included in Ubuntu) can't parse Concrete and the information I found about its AST output support wasn't encouraging. GCC-XML is based on a GCC version which is too old, and doesn't promise full C++ template support.

I use standard C preprocessor macros for some of the object boilerplate, such as declaring copy constructors. Those macro invocations double as metadata, which is parsed by the C++ source template processor. The template processor (implemented in Python) reads all standard and template source files, uses regular expressions to parse the interesting bits into convenient data structures, and converts the template files into standard C++. The template files contain Python code snippets enclosed in {{{ and }}}, which access the parsed data and output C++ code.

Example

A standard C++ header declares an object type with inheritance information (and the properties implied by the "default" macro flavor):

    class StringObject: public Object {
            CONCRETE_OBJECT_DEFAULT_DECL(StringObject, Object)
            // ...
    };

The template processor gathers that information:

    Objects = set()

    @parse(r"\s*CONCRETE_OBJECT_.*_DECL\((.+), (.+)\)")
    def ParseObject(filename, name, parent):
            Objects.add(Object(name, parent))


A C++ source template generates code using that information:

    void InstantiateAllObjectsJustForTheFunOfIt()
    {
            {{{ for o in Objects:
                    lowername = o.name.lower()
                    echo("{o.name} {lowername}_inst;") }}}
    }

The generated C++ source would look like:

    void InstantiateAllObjectsJustForTheFunOfIt()
    {
            Object object_inst;
            StringObject stringobject_inst;

            // ...
    }
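
The expansion step can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the real processor: `expand` and `SNIPPET` are hypothetical names, `echo` is assumed to format its argument with the snippet's variables, the first statement is assumed to start on the `{{{` line, and echoed text is assumed to contain no literal braces.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for the parsed object records.
Object = namedtuple("Object", "name parent")

# Non-greedy match of everything between {{{ and }}}, across lines.
SNIPPET = re.compile(r"\{\{\{(.*?)\}\}\}", re.DOTALL)

def expand(template, data):
    """Replace each {{{ ... }}} snippet with the text it echoes."""

    def run(match):
        lines = []
        namespace = dict(data)

        def echo(text):
            # Fill {placeholders} from the snippet's current variables.
            lines.append(text.format(**namespace))

        namespace["echo"] = echo
        # Snippet variables end up in `namespace`, where echo can see them.
        exec(match.group(1).strip(), namespace)

        # Indent continuation lines like the line that holds the snippet.
        line_start = template.rfind("\n", 0, match.start()) + 1
        indent = re.match(r"[ \t]*", template[line_start:]).group()
        return ("\n" + indent).join(lines)

    return SNIPPET.sub(run, template)
```

Passing the template above together with `{"Objects": [...]}` as `data` would produce one declaration line per object. A list is used here instead of a set so that the output order is predictable.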


Complete real-world examples:

It would be cool to replace the C macros and the regex parsing with more inline Python which populates the metadata structures directly. That would make the template processor generic and the applications self-contained.

There's more

    for i in seq:
            if pred(i):
                    func(i)

and

    [func(i) for i in seq if pred(i)]

work but

    for i in seq if pred(i):
            func(i)

doesn't, and Python 3 doesn't accept

    for i in seq: if pred(i): func(i)

as a one-liner, which would be really handy in templates. But thanks to a regular expression, you can write

    {{{ for i in seq if pred(i): func(i) }}}

in a Concrete template file.
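
A rewrite along these lines could be done with something like the sketch below. The pattern and the `rewrite` function are assumptions made for illustration, not the expression Concrete actually uses, and it only handles the simple single-line form (a predicate containing a colon, or a sequence containing " if ", would confuse it).

```python
import re

# Hypothetical pattern for "for VAR in SEQ if PRED: BODY" on one line.
FOR_IF = re.compile(r"^(\s*)for (.+?) in (.+?) if (.+?): *(.+)$")

def rewrite(snippet):
    """Expand the filtered one-liner into nested for/if statements."""
    match = FOR_IF.match(snippet)
    if not match:
        return snippet  # Leave ordinary Python untouched.
    indent, var, seq, pred, body = match.groups()
    return (
        f"{indent}for {var} in {seq}:\n"
        f"{indent}    if {pred}:\n"
        f"{indent}        {body}"
    )
```

The rewritten text is ordinary Python, so it can be handed to `exec` like any other snippet.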