New SQL Parser in dbscript

November 30, 2009

I wrote about my plans to add a new SQL parser engine into dbscript a couple of months ago. Now the time has come to actually implement it for T-SQL (MS SQL Server; Oracle and PostgreSQL will follow in future versions), and I found that I did not foresee all the consequences of my initial intent. The overall architecture remained the same though.

The grammar definition allows to define attributes on non-terminals. Using the SQL Server versions as attributes (2005, 2008), I can mark commands or clauses as their support or feature has been introduced in a specific version, and store the database version as property of the uploaded Project Version.

The parser skips the parts of the uploaded file that it cannot parse, and writes these parts to the upload log. Up to now, the parser would simply fail if it considered the SQL file somehow invalid.

The (C#) object representation of a parsed SQL command has a boolean flag IsHandled for each non-terminal. The code processing the object representation needs to mark every non-terminal object as being handled (i.e. translated into schema information stored in the database). Objects that have not been flagged will also be listed in the upload log.

Next, the new parser allowed me to rewrite the dependency analysis. Until now, dbscript only analyzed view dependencies to order the CREATE VIEW statements. Dependency analysis has now been extended to all database objects.

One more major issue that has been solved is parsing and interpreting EXECUTE statements. Thus an EXEC sp_addextendedproperty is interpreted as adding a description to a database object.

This obviously led to changes in the XML generation of database schemas. Each object now lists its descriptions and the dependencies on and references from other database objects.

The XSL stylesheets which translate a project version XML into markup or HTML have been revised to reflect the additional information in the generated XML.

All taken together, the new parser adds:

  • better feedback on which parts of the uploaded SQL file have been processed
  • dependency analysis
  • object and column descriptions

See the following links documenting AdventureWorks OLTP 2008 (version 2005) for the effects of the new functionality:

Single HTML file documentation

MediaWiki documentation

ScrewTurn Wiki documentation

In both wikis, compare the (old-style) “wikibot” section with the new section “automatically generated” to see the changes.

The next version of dbscript with the mythical version number “1.0″ will be released soon ;)


dbscript Videos

November 21, 2009

I created a couple of introductory videos describing dbscript covering topics previously handled on this blog or in the online help:

The videos have been created using CamStudio (screen recorder) and VirtualDub (avi editor) and ffmpeg (avi to flv converter). FlowPlayer is embedded by a Joomla plug-in.

The videos can be watched here. dbscript is available for download here.


dbscript New Version 0.99

October 20, 2009

The latest version 0.99 of dbscript has been released today providing new functionality and a couple of fixes.

Data diagrams looked a bit distorted if the data model contained circular foreign key constraints. I sketched the problem in my article on cycle detection, and the data diagram now excludes circular foreign keys in the calculation of the tables’ positions.

Comparison results can be restricted to “scopes”, such as new objects only, dropped objects only, etc. This makes it easier to generate schema migration scripts without dropping objects, for example.

Documentation Generators provide a preview to the generated content, and the generated XML now contains the project and project version identifiers to enable linking and referencing in the generator’s output.

Scripting a table in the object’s Generate/Create page now includes all constraints and indexes. (The project version script always included child objects). The same applies to object comparisons of tables, so that changes to indexes etc are easily identifiable.

New Functions

Besides generating .png data diagrams, dbscript now has the capability to generate data diagrams for Dia, an open-source diagrammer. The layout routine is the same as for png’s, but the output is Dia’s native XML format. Generating for Dia means that developers can freely layout and edit the diagram according to their needs, and export it to other formats. I described this feature earlier, and included samples.

Schema comparison is one basic feature of dbscript, and the new version compares multiple versions in one operation. After defining which schema versions to compare, you get a comparison matrix showing the number of differences between any two versions.

If the selected versions are versions of the same schema at different points of time, the comparison timeline shows each object ever changing in any of the versions, along with an indicator of the change.

Within a project, you can define Branches (as known from version control systems) and assign project versions to a branch. This alone would not be too overwhelming, but branches are a precondition of the update notification system, which I will describe in a future post.

The latest version of dbscript is available for download here.

Please leave comments and feedback.


Multi-Version Comparison – Timeline

October 8, 2009

Results of multi-version comparisons can be displayed as a matrix, where each element displays the number of differences between the versions indicated by row and column. The main diagonal remains empty.

Revised matrix, showing versions in chronological order (instead of reversed order):

If you have available the set of all comparisons, you can also pick out the most relevant (in my opinion): order the versions by date, use only the comparisons between two (chronologically) adjacent versions, and display a timeline of database object changes:

The timeline has a row for each database objects that has ever changed in any of the schemas (= project versions). Symbols in columns indicate whether an object was present in a project version (box), or has changed (create, alter, drop) between two versions.

The symbols will be hyperlinked to existing pages displaying the original information in the database (object in project version or object in comparison).


Multi-Version Comparison

October 5, 2009

If you want to compare multiple versions of a database schema, for example to get an overview of which table or view was created or dropped in which version, you need to compare each version with every other version, resulting in n*(n-1) or n*(n-1)/2 comparisons, depending on what your requirements are.

In my first steps toward multi-version comparison I created a list editor to assign all the project versions (schema versions) to be compared, and display the result in a matrix containing the number of differences.

This screenshot displays the comparison matrix of dbscript versions 0.91 through 0.98.

In case you wonder why the matrix is not symmetrical: CREATE and DROP have a different number of differences. Whereas a dropped table is a difference of 1, a created table usually also has a primary key, constraints, indexes, which add to the number of differences.

The goal of this new functionality is to also include a display similar to the two-version comparison result, but covering all versions.

It’s not very clear to me yet, so your ideas are welcome ;)


Creating Dia Data Diagrams from Database Schema

September 27, 2009

As I stated in earlier posts, you can use dbscript to generate PNG Data Diagrams (MS SQL Server, Oracle, PostgreSQL).

The generated PNG files are intended to give you an overview of the data model. The great drawback is that the information is graphical only, as it contains no information on the original data.

Upcoming version 0.99 will also include the capability to generate Dia files representing a database schema. Dia is an open-source diagram creation program which runs on Windows and Linux.

The advantages of generating in Dia format are that it allows users / developers to modify the generated diagrams, and that general Dia functionality can be used as stated on their website:

[It] can export diagrams to a number of formats, including EPS, SVG, XFIG, WMF and PNG, and can print diagrams (including ones that span multiple pages).

There is one minor issue in generating and opening Dia files: since Dia automatically sizes Table shapes according to their contents, it is not possible to predict what the table shape size will turn out to be (see FAQ). Thus when you open a generated Dia file for the first time, the foreign key connectors will look disconnected from the tables, even though they are not (in the data).

You need to manually enforce connector layout following these steps:

  • Select All (Ctrl-A)
  • Move selected objects (Cursor-Left, then Cursor-Right)

which will layout the foreign key connectors as expected.

Visit the gallery to view PNG images generated by Dia from diagrams created by dbscript.

Download Dia files with database models of

You need to install Dia to view the contents of these files.


Version 0.98 of dbscript Released

September 9, 2009

The latest version 0.98 of dbscript supports PostgreSQL databases in its documentation generation capabilities.

After importing the database dictionary (via direct connection using ADO.Net and Npgsql) can document a PostgreSQL database in all currently supported documentation format:

MediaWiki

Data Diagram (PNG)

HTML

ScrewTurn wiki

Integration support for PostgreSQL had some consequences: More and more functionality is handled separately for each database engine.

Database import was obviously the first one, since the data access classes (SqlConnection, SqlCommand) in .Net are different for every database library. Same goes for the database dictionary, which is best retrieve from the native system catalogs.

For import and upload, data access classes have been introduced to distinguish the different object types and their properties of each database engine. I mention work on the data access classes in a series of articles already.

In version 0.98, XML generation and object script generation are implemented separately. This results in XSL style sheets being now related to certain a database engine.

For Oracle, XML and object script generation have been updated, and the XSL style sheets have been adjusted to Oracle-specific objects and properties. The results were documented earlier.

The latest version of dbscript is available for download here.

Please leave comments and feedback.


Generating Database Documentation for ScrewTurn wikis

August 29, 2009

Updated XSL style sheets for dbscript documentation generators creating ScrewTurn content.

See here for sample output:

MS SQL AdventureWorks

Oracle Demo Schema

PostgreSQL OpenNMS

ScrewTurn also allows external Page Providers (instead of stored static pages)

MS SQL AdventureWorks (by Page Provider)


Documenting Oracle Databases

August 26, 2009

I described the capabilities of dbscript to generate MediaWiki content and a single HTML page documenting an Oracle database in a previous post.

During development of PostgreSQL support (MediaWiki, HTML) it became clear that an XSL style sheet (among other things) needed to become specific to a database engine.

I therefore updated the XSL style sheets for Oracle support, and these are the results:


Creating HTML documentation of PostgreSQL databases

August 10, 2009

dbscript ships with a couple of XSLT style sheets which transform an XML representation of a database schema into MediaWiki, HTML, or, if you create them on your own, any format you wish.

After writing the previous post on PostgreSQL support in dbscript, I fixed the XSLT for HTML generation, created a Documentation Generator in dbscript, and this is the resulting HTML documentation of the OpenNMS data model.