On Converting Data

I had to analyze SQL Server Database Projects (available from SQL Server Data Tools for Visual Studio), and these projects offer a menu item “Create Snapshot” which creates a snapshot file with the extension .dacpac.

It turns out that a .dacpac is a zipped XML file (plus some other files) containing a structured representation of the database objects defined in the project. However Visual Studio does not provide a way to display them (double-clicking the file will only display binary data).

So I thought about how to best display the contents of a .dacpac? Two methods came to my mind.

First, inspired by my work on wpxslgui, create an XSLT style sheet which transforms the contents of the XML file to some legible text, for example CREATE TABLE and similar TSQL statements.

Intuitively I called this approach Symbolic Transformation.

Symbolic Transformation (e.g. XSLT)
Representation 1 => Representation 2

On the other hand, a Logical Transformation contains one module parsing the information contained in “Representation 1” into some kind of model, and another module creating the “Representation 2” of that model. The two module can be  implementations of the Interpreter pattern and the Builder or Factory patterns, respectively.

Logical Transformation (simple)
Interpreter Builder
Representation 1 => Model => Representation 2

If we take Representation 1 and Representation 2 as two separate interfaces to the same business model, and want to support two-way operations, we can extend the last table like this:

Logical Transformation (extended)
Converter Converter Interpr.
Repr. 1 => Model 1 => Business Model => Model 2 => Repr. 2
<= <= <= <=

Why do we need to have Model 1 and Model 2, as they seem to make the whole thing even more complex?

Let’s have a look at a simple CREATE TABLE statement and some of their representations:

  • a SQL parser (think: ANTLR) uses the representation given by the SQL parser
  • the SQL Server catalog views sys.tables, sys.columns, etc. are a different representation
  • a .dacpac archive is another representation

To keep our code simple, our Model X class structure should be as close to the representation as 1) possible 2) necessary (thinking about proxy classes generated by xsd.exe, ANTLR, and ORMs).

Thus, a common data model (named Business Model in the table) is required, as well as 2-way conversion between the Business Model and each of the other models.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: