As it turns out, code generation is the way to go!
I start out with an object-oriented representation of the grammar of the target language, in this case T-SQL. A (custom-written) lexer turns the input stream into a sequence of lexemes or tokens, which the parser engine feeds into the grammar’s rule definitions.
The result of this action is an XML document whose structure is equivalent to the grammatical definition of the respective SQL statement.
This is where code generation magic sets in: The grammar definition is also used to generate a C# object model, mapping each rule into a C# class. Each class implements an interface requiring a SetXml() method, which reads the contents of the instantiated object from an XmlNode structure, and thus populates its child objects recursively.
The application logic (which can be seen as sort of a “parser client”) only needs to instantiate the grammar object and the parser object, feed the parser the file, and retrieve a typed object containing an OO representation of the parsed file.
The next version of dbscript will be built on these mechanisms.