New SQL Parser in dbscript

November 30, 2009

I wrote about my plans to add a new SQL parser engine into dbscript a couple of months ago. Now the time has come to actually implement it for T-SQL (MS SQL Server; Oracle and PostgreSQL will follow in future versions), and I found that I did not foresee all the consequences of my initial intent. The overall architecture remained the same though.

The grammar definition allows attributes to be defined on non-terminals. Using the SQL Server versions (2005, 2008) as attributes, I can mark commands or clauses with the version in which they were introduced, and store the database version as a property of the uploaded Project Version.
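To illustrate the idea of version attributes, here is a minimal Python sketch. It is not dbscript's grammar format: the `NonTerminal` class, the `min_version` attribute, and the three sample entries are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NonTerminal:
    """A grammar symbol tagged with the first server version that supports it."""
    name: str
    min_version: int = 2005  # hypothetical attribute, defaults to the oldest version


# Hypothetical grammar excerpt: MERGE was introduced with SQL Server 2008.
GRAMMAR = [
    NonTerminal("CreateTable"),
    NonTerminal("OutputClause", min_version=2005),
    NonTerminal("MergeStatement", min_version=2008),
]


def unsupported(used_names, target_version):
    """Return the used non-terminals that are newer than the target version."""
    by_name = {nt.name: nt for nt in GRAMMAR}
    return sorted(n for n in used_names if by_name[n].min_version > target_version)


print(unsupported({"CreateTable", "MergeStatement"}, 2005))
```

With a check like this, a script uploaded into a project version marked as 2005 could be flagged if it uses 2008-only syntax.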

The parser skips the parts of the uploaded file that it cannot parse, and writes these parts to the upload log. Up to now, the parser would simply fail if it considered the SQL file somehow invalid.
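The skip-and-log behaviour can be sketched as follows. This is only an illustration under simplifying assumptions: the script is split into batches on `GO` lines, and `parse_statement` is a stand-in that accepts nothing but CREATE TABLE and CREATE VIEW. The real parser works from a grammar and is not shown here.

```python
import re


def parse_statement(text):
    """Stand-in parser: accepts only CREATE TABLE / CREATE VIEW statements."""
    m = re.match(r"\s*CREATE\s+(TABLE|VIEW)\s+(\S+)", text, re.IGNORECASE)
    if m is None:
        raise ValueError("unsupported statement")
    return (m.group(1).upper(), m.group(2))


def parse_script(script):
    """Parse each batch on its own; log what cannot be parsed instead of failing."""
    parsed, log = [], []
    batches = re.split(r"^[ \t]*GO[ \t]*$", script, flags=re.IGNORECASE | re.MULTILINE)
    for batch in batches:
        if not batch.strip():
            continue
        try:
            parsed.append(parse_statement(batch))
        except ValueError as exc:
            log.append(f"skipped: {batch.strip()!r} ({exc})")
    return parsed, log


script = "CREATE TABLE t (id int)\nGO\nFROBNICATE x\nGO\nCREATE VIEW v AS SELECT 1\n"
parsed, log = parse_script(script)
print(parsed)
print(log)
```

The point of the design is that one unrecognized statement costs only its own batch, and the log tells the user exactly which text was ignored.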

The (C#) object representation of a parsed SQL command has a boolean flag IsHandled for each non-terminal. The code processing the object representation needs to mark every non-terminal object as being handled (i.e. translated into schema information stored in the database). Objects that have not been flagged will also be listed in the upload log.
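The actual object model is written in C#. The following Python sketch only mirrors the idea of the flag; the `Node` class and the node kinds are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One non-terminal in the parsed command tree."""
    kind: str
    children: list = field(default_factory=list)
    is_handled: bool = False  # mirrors the IsHandled flag


def unhandled(node):
    """Collect the kinds of all nodes that no processing code has flagged."""
    found = [] if node.is_handled else [node.kind]
    for child in node.children:
        found.extend(unhandled(child))
    return found


column = Node("ColumnDefinition")
filegroup = Node("OnFilegroupClause")
tree = Node("CreateTable", [column, filegroup])

# The processing code flags what it has translated into schema information.
tree.is_handled = True
column.is_handled = True

print(unhandled(tree))  # these entries would go to the upload log
```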

Next, the new parser allowed me to rewrite the dependency analysis. Until now, dbscript only analyzed view dependencies to order the CREATE VIEW statements. Dependency analysis has now been extended to all database objects.
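Ordering objects by their dependencies is a topological sort. A minimal sketch using Python's standard `graphlib` module (Python 3.9 or later) could look like this. The object names are invented, and this is not necessarily the algorithm dbscript uses.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each object lists the objects it depends on.
dependencies = {
    "Customers": set(),
    "Orders": {"Customers"},
    "vOrders": {"Orders", "Customers"},
    "vTopOrders": {"vOrders"},
}

# static_order() yields every object after all of its dependencies.
creation_order = list(TopologicalSorter(dependencies).static_order())
print(creation_order)
```

`TopologicalSorter` raises `graphlib.CycleError` if the dependencies are circular, so cycles have to be detected and handled separately.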

One more major issue that has been solved is parsing and interpreting EXECUTE statements. Thus an EXEC sp_addextendedproperty is interpreted as adding a description to a database object.
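As a rough illustration of what interpreting such a statement involves, the Python sketch below extracts the description from a call that uses named string parameters. It is a regular-expression shortcut, not a grammar-based parser, and it ignores positional arguments. The function name `read_description` and the sample statement are made up for the example.

```python
import re

# Matches @param = N'value' pairs; '' inside a literal is an escaped quote.
PARAM = re.compile(r"@(\w+)\s*=\s*N?'((?:[^']|'')*)'")
CALL = re.compile(r"\s*EXEC(UTE)?\s+(sys\.)?sp_addextendedproperty\b", re.IGNORECASE)


def read_description(statement):
    """Return (object path, description) for an MS_Description call, else None."""
    if not CALL.match(statement):
        return None
    args = {k.lower(): v.replace("''", "'") for k, v in PARAM.findall(statement)}
    if args.get("name") != "MS_Description":
        return None
    path = [args[k] for k in ("level0name", "level1name", "level2name") if k in args]
    return ".".join(path), args.get("value")


statement = (
    "EXEC sys.sp_addextendedproperty @name = N'MS_Description', "
    "@value = N'Customer''s e-mail address', "
    "@level0type = N'SCHEMA', @level0name = N'Sales', "
    "@level1type = N'TABLE', @level1name = N'Customer', "
    "@level2type = N'COLUMN', @level2name = N'Email'"
)
print(read_description(statement))
```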

This obviously led to changes in the XML generation of database schemas. Each object now lists its descriptions and the dependencies on and references from other database objects.

The XSL stylesheets which translate a project version XML into markup or HTML have been revised to reflect the additional information in the generated XML.

Taken together, the new parser adds:

  • better feedback on which parts of the uploaded SQL file have been processed
  • dependency analysis
  • object and column descriptions

See the following links documenting AdventureWorks OLTP 2008 (version 2005) for the effects of the new functionality:

Single HTML file documentation

MediaWiki documentation

ScrewTurn Wiki documentation

In both wikis, compare the (old-style) “wikibot” section with the new section “automatically generated” to see the changes.

The next version of dbscript with the mythical version number “1.0” will be released soon ;)


dbscript Videos

November 21, 2009

I created a couple of introductory videos describing dbscript covering topics previously handled on this blog or in the online help:

The videos were created using CamStudio (screen recorder), VirtualDub (AVI editor), and ffmpeg (AVI-to-FLV converter). FlowPlayer is embedded by a Joomla plug-in.

The videos can be watched here. dbscript is available for download here.


dbscript New Version 0.99

October 20, 2009

The latest version 0.99 of dbscript has been released today providing new functionality and a couple of fixes.

Data diagrams looked a bit distorted if the data model contained circular foreign key constraints. I sketched the problem in my article on cycle detection, and the data diagram now excludes circular foreign keys in the calculation of the tables’ positions.
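One way to find the foreign keys that close a cycle is a depth-first search that records back edges. The Python sketch below shows that technique on invented table names. It is an assumption about how such an exclusion could work, not the routine dbscript uses, and it removes only the one edge that closes each cycle.

```python
def split_cyclic_edges(edges):
    """Split (child, parent) foreign keys into layout edges and cycle-closing edges."""
    graph = {}
    for child, parent in edges:
        graph.setdefault(child, []).append(parent)
        graph.setdefault(parent, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {table: WHITE for table in graph}
    back = []

    def visit(table):
        state[table] = GREY
        for parent in graph[table]:
            if state[parent] == GREY:      # parent is on the current path: a cycle
                back.append((table, parent))
            elif state[parent] == WHITE:
                visit(parent)
        state[table] = BLACK

    for table in graph:
        if state[table] == WHITE:
            visit(table)

    kept = [edge for edge in edges if edge not in back]
    return kept, back


edges = [("Order", "Customer"), ("Employee", "Department"), ("Department", "Employee")]
kept, excluded = split_cyclic_edges(edges)
print(kept)
print(excluded)
```

A self-referencing foreign key, such as an employee table pointing to its own manager column, is reported as a cycle-closing edge as well.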

Comparison results can be restricted to “scopes”, such as new objects only, dropped objects only, etc. This makes it easier to generate schema migration scripts without dropping objects, for example.
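The effect of a scope can be pictured with a small Python sketch. The schema dictionaries, the scope names, and both functions are hypothetical, and the object definitions are reduced to plain strings.

```python
def compare(old, new):
    """Classify every object as new, dropped, or changed between two schemas."""
    diffs = []
    for name in sorted(old.keys() | new.keys()):
        if name not in old:
            diffs.append((name, "new"))
        elif name not in new:
            diffs.append((name, "dropped"))
        elif old[name] != new[name]:
            diffs.append((name, "changed"))
    return diffs


def restrict(diffs, scopes):
    """Keep only the differences whose kind is in the requested scopes."""
    return [d for d in diffs if d[1] in scopes]


old = {"Customer": "id int", "Legacy": "x int", "Order": "id int"}
new = {"Customer": "id int, name varchar(50)", "Order": "id int", "Invoice": "id int"}

diffs = compare(old, new)
# A migration script that must not drop anything uses only these scopes:
print(restrict(diffs, {"new", "changed"}))
```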

Documentation Generators provide a preview of the generated content, and the generated XML now contains the project and project version identifiers to enable linking and referencing in the generator’s output.

Scripting a table in the object’s Generate/Create page now includes all constraints and indexes. (The project version script always included child objects.) The same applies to object comparisons of tables, so that changes to indexes etc. are easily identifiable.

New Functions

Besides generating PNG data diagrams, dbscript can now generate data diagrams for Dia, an open-source diagrammer. The layout routine is the same as for PNGs, but the output is Dia’s native XML format. Generating for Dia means that developers can freely lay out and edit the diagram according to their needs, and export it to other formats. I described this feature earlier, and included samples.

Schema comparison is one basic feature of dbscript, and the new version compares multiple versions in one operation. After defining which schema versions to compare, you get a comparison matrix showing the number of differences between any two versions.
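Conceptually, the matrix holds one difference count per pair of versions. A minimal Python sketch with invented version names and definitions reduced to strings might look like this; it is not dbscript's comparison code.

```python
from itertools import combinations


def difference_count(a, b):
    """Count objects that are missing on one side or defined differently."""
    names = a.keys() | b.keys()
    return sum(1 for n in names if a.get(n) != b.get(n))


def comparison_matrix(versions):
    """Return a symmetric {(version, version): count} matrix."""
    matrix = {(v, v): 0 for v in versions}
    for x, y in combinations(versions, 2):
        count = difference_count(versions[x], versions[y])
        matrix[(x, y)] = matrix[(y, x)] = count
    return matrix


versions = {
    "v1": {"Customer": "A", "Order": "B"},
    "v2": {"Customer": "A2", "Order": "B"},
    "v3": {"Customer": "A2", "Order": "B", "Invoice": "C"},
}
matrix = comparison_matrix(versions)
print(matrix[("v1", "v3")])
```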

If the selected versions are versions of the same schema at different points of time, the comparison timeline shows each object ever changing in any of the versions, along with an indicator of the change.

Within a project, you can define Branches (as known from version control systems) and assign project versions to a branch. This alone would not be too overwhelming, but branches are a precondition of the update notification system, which I will describe in a future post.

The latest version of dbscript is available for download here.

Please leave comments and feedback.


Generating Wiki Documentation from Entity Framework edmx File

October 11, 2009

After introducing the XML format of Entity Framework’s edmx files, let’s use that knowledge to create a small XSLT style sheet which displays the mappings of tables and entities in a Wiki-style table (which can be used in MediaWiki and SharePoint wikis).

In the XSLT root, we need to declare all namespaces used by the edmx to access nodes and attributes:

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:edmx="http://schemas.microsoft.com/ado/2007/06/edmx"
    xmlns:store="http://schemas.microsoft.com/ado/2007/12/edm/EntityStoreSchemaGenerator"
    xmlns:ssdl="http://schemas.microsoft.com/ado/2006/04/edm/ssdl"
    xmlns:cs="urn:schemas-microsoft-com:windows:storage:mapping:CS"
    xmlns:edm="http://schemas.microsoft.com/ado/2006/04/edm"
    xmlns:a="http://schemas.microsoft.com/ado/2006/04/codegeneration"
    xml:space="default" >
<xsl:output method="html" omit-xml-declaration="yes"  />

<!-- input file is C:\path\to\Model.edmx -->

This XSLT does not start with the mappings section, but with the tables and views inside the Schema definition, and then looks up their Mappings definition:

<xsl:template match="/">
  <xsl:apply-templates
    select="edmx:Edmx/edmx:Runtime/edmx:StorageModels/ssdl:Schema" />
</xsl:template>
<xsl:template match="ssdl:Schema">
<html>
  <body>
    <table width="100%">
      <xsl:apply-templates select="ssdl:EntityType" >
        <xsl:with-param name="namespace" select="@Namespace" />
      </xsl:apply-templates>
    </table>
  </body>
</html>
</xsl:template>

This code creates a table row for each database table and its class:

<xsl:template match="ssdl:EntityType" >
  <xsl:param name="namespace"></xsl:param>
      <xsl:variable name="table" select="@Name" ></xsl:variable>
      <xsl:variable name="map"
        select="/edmx:Edmx/edmx:Runtime/edmx:Mappings/cs:Mapping/
          cs:EntityContainerMapping/
          cs:EntitySetMapping[cs:EntityTypeMapping/
                            cs:MappingFragment/@StoreEntitySet=$table]" />
      <xsl:variable name="s" select="$map/*/@TypeName" />
      <xsl:variable name="p"
          select="concat('IsTypeOf(',
            substring($namespace, 1, string-length($namespace) - 5))" />
      <xsl:variable name="class"
          select="substring($s, string-length($p) + 1,
            string-length($s) - string-length($p) - 1)">
      </xsl:variable>
  <tr valign="top">
    <td >
      [[<xsl:value-of select="@Name"/>]]
    </td>
    <td>
      <xsl:value-of select="$class" />
    </td>
  </tr>
</xsl:template>
</xsl:stylesheet>

The [[ ]] notation creates a wiki hyperlink that allows developers to document tables and entities, and link to other documentation.
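The substring arithmetic that computes the class name is easier to follow with concrete values. The Python sketch below replays it, assuming a storage namespace that ends in ".Store" and a TypeName of the form IsTypeOf(Model.Class). Both sample values are hypothetical, and `xpath_substring` merely imitates XPath's 1-based `substring()` for positive integer arguments.

```python
def xpath_substring(s, start, length):
    """Imitate XPath substring(): 1-based start, fixed length."""
    return s[start - 1:start - 1 + length]


def entity_class(namespace, type_name):
    # Dropping the last 5 characters ("Store") keeps the trailing dot.
    prefix = "IsTypeOf(" + xpath_substring(namespace, 1, len(namespace) - 5)
    # Skip the prefix, and stop one character early to drop the closing ")".
    return xpath_substring(type_name, len(prefix) + 1,
                           len(type_name) - len(prefix) - 1)


namespace = "AdventureWorksModel.Store"               # hypothetical @Namespace
type_name = "IsTypeOf(AdventureWorksModel.Customer)"  # hypothetical @TypeName
print(entity_class(namespace, type_name))
```

If the TypeName does not carry the IsTypeOf( ) wrapper, or the conceptual model uses a different namespace than the storage model, the arithmetic cuts at the wrong positions and the style sheet would need adjusting.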


Creating Dia Data Diagrams from Database Schema

September 27, 2009

As I stated in earlier posts, you can use dbscript to generate PNG Data Diagrams (MS SQL Server, Oracle, PostgreSQL).

The generated PNG files are intended to give you an overview of the data model. Their main drawback is that the information is graphical only: the image carries no information about the underlying schema objects.

Upcoming version 0.99 will also include the capability to generate Dia files representing a database schema. Dia is an open-source diagram creation program which runs on Windows and Linux.

The advantages of generating in Dia format are that it allows users / developers to modify the generated diagrams, and that general Dia functionality can be used as stated on their website:

[It] can export diagrams to a number of formats, including EPS, SVG, XFIG, WMF and PNG, and can print diagrams (including ones that span multiple pages).

There is one minor issue in generating and opening Dia files: since Dia automatically sizes Table shapes according to their contents, it is not possible to predict what the table shape size will turn out to be (see FAQ). Thus when you open a generated Dia file for the first time, the foreign key connectors will look disconnected from the tables, even though they are not (in the data).

To force Dia to redraw the connectors, follow these steps:

  • Select All (Ctrl-A)
  • Move selected objects (Cursor-Left, then Cursor-Right)

This lays out the foreign key connectors as expected.

Visit the gallery to view PNG images generated by Dia from diagrams created by dbscript.

Download Dia files with database models of

You need to install Dia to view the contents of these files.


Version 0.98 of dbscript Released

September 9, 2009

The latest version 0.98 of dbscript supports PostgreSQL databases in its documentation generation capabilities.

After importing the database dictionary (via direct connection using ADO.NET and Npgsql), dbscript can document a PostgreSQL database in all currently supported documentation formats:

MediaWiki

Data Diagram (PNG)

HTML

ScrewTurn wiki

Integration support for PostgreSQL had some consequences: More and more functionality is handled separately for each database engine.

Database import was obviously the first one, since the data access classes (SqlConnection, SqlCommand) in .NET are different for every database library. The same goes for the database dictionary, which is best retrieved from the native system catalogs.

For import and upload, data access classes have been introduced to distinguish the object types of each database engine and their properties. I have already described the work on the data access classes in a series of articles.

In version 0.98, XML generation and object script generation are implemented separately for each database engine. As a result, each XSL style sheet is now specific to a particular database engine.

For Oracle, XML and object script generation have been updated, and the XSL style sheets have been adjusted to Oracle-specific objects and properties. The results were documented earlier.

The latest version of dbscript is available for download here.

Please leave comments and feedback.


Generating Database Documentation for ScrewTurn wikis

August 29, 2009

I updated the XSL style sheets for the dbscript documentation generators that create ScrewTurn content.

See here for sample output:

MS SQL AdventureWorks

Oracle Demo Schema

PostgreSQL OpenNMS

ScrewTurn also allows external Page Providers (instead of stored static pages):

MS SQL AdventureWorks (by Page Provider)


Documenting Oracle Databases

August 26, 2009

I described the capabilities of dbscript to generate MediaWiki content and a single HTML page documenting an Oracle database in a previous post.

During development of PostgreSQL support (MediaWiki, HTML) it became clear that an XSL style sheet (among other things) needed to become specific to a database engine.

I therefore updated the XSL style sheets for Oracle support, and these are the results:


Retrieving Table and Column descriptions in SQL Server

August 19, 2009

SQL Server stores column descriptions as so-called Extended Properties, using the extended property named ‘MS_Description’.

Even though the user interface in Enterprise Manager or Management Studio does not support setting descriptions of tables and other database objects, this is possible using the sp_addextendedproperty and sp_updateextendedproperty stored procedures.

The descriptions added by the developer can be retrieved by the following SQL statements (SQL Server 2005 or higher).

To retrieve the descriptions of all tables:

SELECT sys.objects.name AS TableName, ep.name AS PropertyName,
       ep.value AS Description
FROM sys.objects
CROSS APPLY fn_listextendedproperty(default,
                                    'SCHEMA', schema_name(schema_id),
                                    'TABLE', name, null, null) ep
WHERE sys.objects.name NOT IN ('sysdiagrams')
ORDER BY sys.objects.name

To retrieve the descriptions of all table columns:

SELECT sys.objects.name AS TableName, sys.columns.name AS ColumnName,
       ep.name AS PropertyName, ep.value AS Description
FROM sys.objects
INNER JOIN sys.columns ON sys.objects.object_id = sys.columns.object_id
CROSS APPLY fn_listextendedproperty(default,
                  'SCHEMA', schema_name(schema_id),
                  'TABLE', sys.objects.name, 'COLUMN', sys.columns.name) ep
ORDER BY sys.objects.name, sys.columns.column_id

In SQL Server 2000, you need to call the function ::fn_listextendedproperty, and the CROSS APPLY operation is not supported.


Creating HTML documentation of PostgreSQL databases

August 10, 2009

dbscript ships with a couple of XSLT style sheets which transform an XML representation of a database schema into MediaWiki, HTML, or, if you create them on your own, any format you wish.

After writing the previous post on PostgreSQL support in dbscript, I fixed the XSLT for HTML generation, created a Documentation Generator in dbscript, and this is the resulting HTML documentation of the OpenNMS data model.