Parsing SQL into XML (2)

I recently starting re-writing my SQL parser for dbscript and reported about the first results.

Things always turn out more complicated than they initially appear, and one thing I realized was that a simple .sql file is not merely SQL, but also contains some “meta-magic” such as TSQL’s GO statement, which is not a valid SQL command, but really a batch separator for Query Analyzer or Management Studio.

Anyway, there is some kind of progress: I described all SQL statements that the AdventureWorks .sql scripts use, and parsing is pretty quick.

The sample statement of my previous post

CREATE TABLE [Production].[TransactionHistoryArchive](
    [TransactionID] [int] NOT NULL,
    [ProductID] [int] NOT NULL,
    [ReferenceOrderID] [int] NOT NULL,
    [ReferenceOrderLineID] [int] NOT NULL ,
    [TransactionDate] [datetime] NOT NULL ,
    [TransactionType] [nchar](1) COLLATE Latin1_General_CI_AS NOT NULL,
    [Quantity] [int] NOT NULL,
    [ActualCost] [money] NOT NULL,
    [ModifiedDate] [datetime] NOT NULL ,
    CONSTRAINT [PK_TransactionHistoryArchive_TransactionID] PRIMARY KEY CLUSTERED
    (
        [TransactionID] ASC
    ) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF,
        IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON)
    ON [PRIMARY]
) ON [PRIMARY]

is now parsed into an XML representation which looks like this:

<?xml version="1.0" encoding="Windows-1252"?>
<TSQL>
  <cmdCreateTable>
    <CREATE>CREATE</CREATE>
    <TABLE>TABLE</TABLE>
    <id>
      <id2s>
        <id2>[Production]</id2>
        <x2E>.</x2E>
        <id1>[TransactionHistoryArchive]</id1>
      </id2s>
    </id>
    <x28>(</x28>
    <columns>
      <column>
        <id>[TransactionID]</id>
        <datatype>
          <id>
            <id1s>
              <id1>[int]</id1>
            </id1s>
          </id>
        </datatype>
        <attr>
          <notnull>
            <NOT>NOT</NOT>
            <NULL>NULL</NULL>
          </notnull>
        </attr>
      </column>
      <x2C>,</x2C>
....
      <x2C>,</x2C>
      <column>
        <id>[ModifiedDate]</id>
        <datatype>
          <id>
            <id1s>
              <id1>[datetime]</id1>
            </id1s>
          </id>
        </datatype>
        <attr>
          <notnull>
            <NOT>NOT</NOT>
            <NULL>NULL</NULL>
          </notnull>
        </attr>
      </column>
      <x2C>,</x2C>
      <constraint>
        <CONSTRAINT>CONSTRAINT</CONSTRAINT>
        <constraintid>[PK_TransactionHistoryArchive_TransactionID]</constraintid>
        <type>
          <pk>
            <PRIMARY>PRIMARY</PRIMARY>
            <KEY>KEY</KEY>
            <clustered>
              <CLUSTERED>CLUSTERED</CLUSTERED>
            </clustered>
            <indexcolumns>
              <x28>(</x28>
              <indexcolumns>
                <indexcolumn>
                  <colasc>
                    <columnid>[TransactionID]</columnid>
                    <ASC>ASC</ASC>
                  </colasc>
                </indexcolumn>
              </indexcolumns>
              <x29>)</x29>
            </indexcolumns>
            <with>
              <WITH>WITH</WITH>
              <x28>(</x28>
              <params>
                <param>
                  <key>PAD_INDEX</key>
                  <x3D>=</x3D>
                  <value>OFF</value>
                </param>
                <x2C>,</x2C>
....
                <x2C>,</x2C>
                <param>
                  <key>ALLOW_PAGE_LOCKS</key>
                  <x3D>=</x3D>
                  <value>ON</value>
                </param>
              </params>
              <x29>)</x29>
            </with>
            <onfilegroup>
              <ON>ON</ON>
              <filegroupid>[PRIMARY]</filegroupid>
            </onfilegroup>
          </pk>
        </type>
      </constraint>
    </columns>
    <x29>)</x29>
    <onfilegroup>
      <ON>ON</ON>
      <filegroupid>[PRIMARY]</filegroupid>
    </onfilegroup>
  </cmdCreateTable>
</TSQL>

The complete result file is here.

So as I manage to complete parse and represent the contents of a SQL statement, the question remains how to evaluate the result:

  • Analyze internal representation (messy)
  • Analyze XML representation (complicated)
  • Trigger events for each parsed entity and their attributes (complicated)

The problem is always that the evaluating routine seems to be de-coupled from the parser.

The first solution is too much dependent on internal representation. Changing a little aspect of the parsing mechanism or the grammar would result in breaking code.

The other two solutions are less dependent, but still a change in the grammar needs to be reflected in the evaluator, without any immediate error reporting if the evaluator does not match the parser.

I read about Xtext the other day, a DSL parser for Eclipse, which seems to include a code generator that generates Java classes to match the grammar. An interesting solution, since a change in the grammar and subsequent code generation would immediatly break compilation of the evaluator.

To be continued… 😉

2 Responses to Parsing SQL into XML (2)

  1. […] SQL into XML and C# Objects I recently wrote about my attempts to parse SQL statements and extract schema information from them. (dbscript already has a SQL parser, but it needs to be […]

  2. supriya says:

    Can you provide me some info to generate xml for the given sql script. I am looking for C# code.

    Thanks

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: