Introducing oraddlscript

October 25, 2009

A recent question on StackOverflow inspired me to write a utility called oraddlscript:

How to generate scripts for Oracle schema objects, such as tables, procedures, etc.

The answers directed me to the DBMS_METADATA package (9i documentation, 10g documentation). The function DBMS_METADATA.GET_DDL is the functional equivalent of the MSSQL SMO library, which prompted me to adapt my command-line utility SMOscript to Oracle databases. Voilà, oraddlscript.

oraddlscript 0.14.3584.16268 (c) by devio.at 2009

    list and script databases and database objects.

    usage: oraddlscript [options] [command]

    options: (leading '-' or '/')

    -s server       TNS name or host name or host:port
    -svc service    service name (if host name is provided)
    -o owner        owner name

    -u username     username (default: integrated authentication)
    -p password     password (if -u is missing, password for sa)

    -f filename     output to file
    -F directory    output to directory

    -A              current ANSI codepage
    -O              ASCII
    -T              Unicode
    -U              UTF8

    commands:

    l               list objects
    s               script object/s (using dbms_meta.get*ddl)
    -xml            use dbms_meta.get*xml

    list object owners on server (implied by -s)
    list objects in database (implied by -s -o)
    script all objects (implied by -s -o -F/-f)

The command-line arguments are consistent with SMOscript, except for -d (database) which has been replaced by -o (owner name).

The list of objects is retrieved by querying DBA_OBJECTS, ALL_OBJECTS and USER_OBJECTS depending on which of the catalog views is accessible by the user specified by -u.

The package also contains a function GET_XML which is used to retrieve the XML representation of a database object.

The functions of oraddlscript are:

  • list usernames of object owners
  • list objects of specific owner
  • generate CREATE scripts of objects owned by specific user
  • generate XML files representing objects owned by specific user
  • generate one file per object or single file for all objects

Of course, logging and batch operations work just as previously described for SMOscript.

oraddlscript is available for download here.


Generating Table of Contents from WordPress Export

October 23, 2009

WordPress implements an Export function which allows bloggers to download the contents of their blog as a single XML file.

In a previous post I described an XSLT file to convert the exported XML into a single HTML file. A click on an article’s title displayed the whole article using a little JavaScript and CSS.

This time, I wanted to create an HTML table of contents displaying all the blog’s categories. Upon selection of a category, only the posting titles of that category should be displayed. The titles link to the original blog articles.

The new XSLT is available for download here.

Applied to this blog, the generated Table of Contents is here.


Structure and Analysis of edmx Files (3)

October 5, 2009

The Entity Designer of Visual Studio 2008 stores its data in an XML file with the extension .edmx. In the first part of this series I covered the Storage Model, the second part dealt with the Conceptual Model.

This part covers the Mappings section with regard to C# mappings. The Mapping section uses the xmlns:cs namespace

  • xmlns:cs=”urn:schemas-microsoft-com:windows:storage:mapping:CS”

For tables and views, the mapping section defines the mappings from database object to C# class (EntitySets and EntityTypes), along with their scalar properties:

    <edmx:Mappings>
      <cs:Mapping Space="C-S" >
        <cs:EntityContainerMapping
            StorageEntityContainer="MyNamespaceEntitiesStoreContainer"
            CdmEntityContainer="MyEntities">
          <cs:EntitySetMapping Name="TableFooSet">
            <cs:EntityTypeMapping TypeName="IsTypeOf(My.Namespace.Entities.TableFoo)">
              <cs:MappingFragment StoreEntitySet="TABLEFOO">
                <cs:ScalarProperty Name="ID" ColumnName="ID" />
                <cs:ScalarProperty Name="Name" ColumnName="NAME" />
              </cs:MappingFragment>
            </cs:EntityTypeMapping>
          </cs:EntitySetMapping> 

          <cs:AssociationSetMapping Name="FK_TABLEFOO_TABLEBAR"
            TypeName="My.Namespace.Entities.FK_TABLEFOO_TABLEBAR"
            StoreEntitySet="TABLEFOO">
            <cs:EndProperty Name="TABLEFOO">
              <cs:ScalarProperty Name="ID" ColumnName="ID" />
            </cs:EndProperty>
            <cs:EndProperty Name="TableBar">
              <cs:ScalarProperty Name="ID" ColumnName="BAR_ID" />
            </cs:EndProperty>
            <cs:Condition ColumnName="BAR_ID" IsNull="false" />
          </cs:AssociationSetMapping>

          <cs:FunctionImportMapping FunctionImportName="SPBarFromFoo"
            FunctionName="My.Namespace.Entities.Store.SP_BAR_FROM_FOO" />

End of Mapping section, end of Runtime declarations

        </cs:EntityContainerMapping>
      </cs:Mapping>
    </edmx:Mappings>
  </edmx:Runtime>

The rest of the edmx file contains the edmx:Designer section with layout information (shapes, connectors).


Parsing SQL into XML and C# Objects

June 29, 2009

I recently wrote about my attempts to parse SQL statements and extract schema information from them. (dbscript already has a SQL parser, but it needs to be replaced to support different SQL dialects)

As it turns out, code generation is the way to go!

I start out with an object-oriented representation of the grammar of the target language, in this case T-SQL. A (custom-written) lexer turns the input stream into a sequence of lexemes or tokens, which the parser engine feeds into the grammar’s rule definitions.

The result of this action is an XML document whose structure is equivalent to the grammatical definition of the respective SQL statement.

SQL Parser Data Model

This is where code generation magic sets in: The grammar definition is also used to generate a C# object model, mapping each rule into a C# class. Each class implements an interface requiring a SetXml() method, which reads the contents of the instantiated object from an XmlNode structure, and thus populates its child objects recursively.

The application logic (which can be seen as sort of a “parser client”) only needs to instantiate the grammar object and the parser object, feed the parser the file, and retrieve a typed object containing an OO representation of the parsed file.

The next version of dbscript will be built on these mechanisms.


Parsing SQL into XML (2)

June 25, 2009

I recently starting re-writing my SQL parser for dbscript and reported about the first results.

Things always turn out more complicated than they initially appear, and one thing I realized was that a simple .sql file is not merely SQL, but also contains some “meta-magic” such as TSQL’s GO statement, which is not a valid SQL command, but really a batch separator for Query Analyzer or Management Studio.

Anyway, there is some kind of progress: I described all SQL statements that the AdventureWorks .sql scripts use, and parsing is pretty quick.

The sample statement of my previous post

CREATE TABLE [Production].[TransactionHistoryArchive](
    [TransactionID] [int] NOT NULL,
    [ProductID] [int] NOT NULL,
    [ReferenceOrderID] [int] NOT NULL,
    [ReferenceOrderLineID] [int] NOT NULL ,
    [TransactionDate] [datetime] NOT NULL ,
    [TransactionType] [nchar](1) COLLATE Latin1_General_CI_AS NOT NULL,
    [Quantity] [int] NOT NULL,
    [ActualCost] [money] NOT NULL,
    [ModifiedDate] [datetime] NOT NULL ,
    CONSTRAINT [PK_TransactionHistoryArchive_TransactionID] PRIMARY KEY CLUSTERED
    (
        [TransactionID] ASC
    ) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF,
        IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON)
    ON [PRIMARY]
) ON [PRIMARY]

is now parsed into an XML representation which looks like this:

<?xml version="1.0" encoding="Windows-1252"?>
<TSQL>
  <cmdCreateTable>
    <CREATE>CREATE</CREATE>
    <TABLE>TABLE</TABLE>
    <id>
      <id2s>
        <id2>[Production]</id2>
        <x2E>.</x2E>
        <id1>[TransactionHistoryArchive]</id1>
      </id2s>
    </id>
    <x28>(</x28>
    <columns>
      <column>
        <id>[TransactionID]</id>
        <datatype>
          <id>
            <id1s>
              <id1>[int]</id1>
            </id1s>
          </id>
        </datatype>
        <attr>
          <notnull>
            <NOT>NOT</NOT>
            <NULL>NULL</NULL>
          </notnull>
        </attr>
      </column>
      <x2C>,</x2C>
....
      <x2C>,</x2C>
      <column>
        <id>[ModifiedDate]</id>
        <datatype>
          <id>
            <id1s>
              <id1>[datetime]</id1>
            </id1s>
          </id>
        </datatype>
        <attr>
          <notnull>
            <NOT>NOT</NOT>
            <NULL>NULL</NULL>
          </notnull>
        </attr>
      </column>
      <x2C>,</x2C>
      <constraint>
        <CONSTRAINT>CONSTRAINT</CONSTRAINT>
        <constraintid>[PK_TransactionHistoryArchive_TransactionID]</constraintid>
        <type>
          <pk>
            <PRIMARY>PRIMARY</PRIMARY>
            <KEY>KEY</KEY>
            <clustered>
              <CLUSTERED>CLUSTERED</CLUSTERED>
            </clustered>
            <indexcolumns>
              <x28>(</x28>
              <indexcolumns>
                <indexcolumn>
                  <colasc>
                    <columnid>[TransactionID]</columnid>
                    <ASC>ASC</ASC>
                  </colasc>
                </indexcolumn>
              </indexcolumns>
              <x29>)</x29>
            </indexcolumns>
            <with>
              <WITH>WITH</WITH>
              <x28>(</x28>
              <params>
                <param>
                  <key>PAD_INDEX</key>
                  <x3D>=</x3D>
                  <value>OFF</value>
                </param>
                <x2C>,</x2C>
....
                <x2C>,</x2C>
                <param>
                  <key>ALLOW_PAGE_LOCKS</key>
                  <x3D>=</x3D>
                  <value>ON</value>
                </param>
              </params>
              <x29>)</x29>
            </with>
            <onfilegroup>
              <ON>ON</ON>
              <filegroupid>[PRIMARY]</filegroupid>
            </onfilegroup>
          </pk>
        </type>
      </constraint>
    </columns>
    <x29>)</x29>
    <onfilegroup>
      <ON>ON</ON>
      <filegroupid>[PRIMARY]</filegroupid>
    </onfilegroup>
  </cmdCreateTable>
</TSQL>

The complete result file is here.

So as I manage to complete parse and represent the contents of a SQL statement, the question remains how to evaluate the result:

  • Analyze internal representation (messy)
  • Analyze XML representation (complicated)
  • Trigger events for each parsed entity and their attributes (complicated)

The problem is always that the evaluating routine seems to be de-coupled from the parser.

The first solution is too much dependent on internal representation. Changing a little aspect of the parsing mechanism or the grammar would result in breaking code.

The other two solutions are less dependent, but still a change in the grammar needs to be reflected in the evaluator, without any immediate error reporting if the evaluator does not match the parser.

I read about Xtext the other day, a DSL parser for Eclipse, which seems to include a code generator that generates Java classes to match the grammar. An interesting solution, since a change in the grammar and subsequent code generation would immediatly break compilation of the evaluator.

To be continued… ;)


Converting WordPress Export XML to HTML

June 18, 2009

WordPress implements an Export function which allows bloggers to download the contents of their blog as a single XML file.

Based on my previous work in dbscript to generate an HTML documentation of a database schema using XML and XSL I modified the “Single HTML” XSL of dbscript to create an XSL to transform the WordPress Export XML into an HTML page.

This HTML page lists all article titles with dates, categories and link to the original URL in the blog.

Clicking the article’s title will expand the list to display the contents of the selected article.

To display paragraphs properly, I needed to replace newlines with <br /> elements, as described on this page implementing various replace operations in XSL.

The WordPress HTML XSL file is available for download here.


Parsing SQL into XML

May 11, 2009

dbscript has a built-in SQL script parser so that users can upload SQL scripts and analyze and document their database schemas.

While the parser is good enough to both extract all the relevant information from a SQL script and also skip over unsupported commands or simply plain text, I felt that the current approach would not scale well if I wanted to support more commands or different SQL dialects.

I started to play with a couple of C# classes, re-using the original dbscript T-SQL tokenizer, and soon had a small class library to define terminals, rules, and a parser class that kept everything together.

Additionally, the parser result can be output as XML.

Some parts of the parsing process turned out to be rather tough, such as the Repeater (or Loop) construct (as I said, it was playing rather than planning ahead), but I guess I now have a good framework to start implementing a T-SQL parser.

A table definition taken out of MS AdventureWorks

CREATE TABLE [Production].[TransactionHistoryArchive](
 [TransactionID] [int] NOT NULL,
 [ProductID] [int] NOT NULL,
 [ReferenceOrderID] [int] NOT NULL,
 [ReferenceOrderLineID] [int] NOT NULL ,
 [TransactionDate] [datetime] NOT NULL ,
 [TransactionType] [nchar](1) COLLATE Latin1_General_CI_AS NOT NULL,
 [Quantity] [int] NOT NULL,
 [ActualCost] [money] NOT NULL,
 [ModifiedDate] [datetime] NOT NULL ,
 CONSTRAINT [PK_TransactionHistoryArchive_TransactionID] PRIMARY KEY CLUSTERED
(
 [TransactionID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF,
  ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

is parsed and translated into XML like this:

<TSQL><cmdCREATETABLE><CREATE>CREATE</CREATE><TABLE>TABLE</TABLE>
<id13a><id2s><id2>[Production]</id2><x2E>.</x2E>
<id1>[TransactionHistoryArchive]</id1></id2s></id13a>
<columnsr><column><colid><id1>[TransactionID]</id1></colid>
<datatype><id1s><id1>[int]</id1></id1s></datatype>
<attr><notnull><NOT>NOT</NOT><NULL>NULL</NULL></notnull><rep-end>,</rep-end></attr>
</column>
...
</columnsr><x29>)</x29>
<onfg><ON>ON</ON><filegroup><id1>[PRIMARY]</id1></filegroup></onfg>
</cmdCREATETABLE></TSQL>

See the whole generated XML file here.

A “GO” command is transformed into this XML document:

<TSQL><cmdGO><GO>GO</GO></cmdGO></TSQL>

This is a work-in-progress. The naming of the symbols is not final, and the parser code needs to be cleaned up and separated into parsing and TSQL functionality.


Generating XML from Database Schema and Table Data

December 23, 2008

As described in my previous posts on dbscript, version 0.92 supports generation of XML files based on schema information imported from an SQL Server database or uploaded via SQL files containing CREATE statements for all supported object types.

I provided two examples of how these XML files can be used by XSL transformations integrated into dbscript to generate wiki content.

In this post, I will introduce the capability to generate a Stored Procedure which in turn generates an XML representation of data stored in a database table.

Currently, the XML script generator supports 3 modes, called:

  • Field Elements
  • Column Name Elements
  • Column Name Attributes

to reflect the various ways data can be mapped onto XML.

The root element for all XML data files is named <data>, its child elements <table> have the attributes name and schema. Each record within a table is marked by the element <record>, its attributes being the names of the defined primary key columns.

All examples are constructed from tables of the dbscript database.

Field Elements

Field Elements mode have <field name=”column name”> child elements, with the record field values as node texts:

<table name="ProjectVersionType" schema="dbo">
  <record OID="1">
    <field name="OID" datatype="int">1</field>
    <field name="ID" datatype="nvarchar">Major</field>
    <field name="DispSeq" datatype="int">100</field>
  </record>
</table>

Column Name Elements

The child elements of the <record> element are named according to the column names of the table:

<table name="ObjectType" schema="dbo">
  <record OID="1" >
    <OID datatype="int">1</OID>
    <ID datatype="nvarchar">Table</ID>
    <ProgID datatype="varchar">Table</ProgID>
    <SysobjectType datatype="varchar">U</SysobjectType>
    <Seq2000 datatype="int">100</Seq2000>
    <Seq2005 datatype="int">100</Seq2005>
    <WikiTitleFormat datatype="nvarchar">_(table)</WikiTitleFormat>
  </record>
</table>

Column Name Attributes

The <record> element contains attributes according to the column names of the table:

<table name="ScriptType" schema="dbo">
  <record
    OID="1"
    ID="T-SQL Value Script"
    HasSCols="1"
    ProgID="TSqlValues"
  >
  </record>
</table>

The latest version of dbscript (0.92) is available for download here.


Wiki Documentation from MSSQL Database Schema

December 8, 2008

The upcoming version 0.92 of my database tool dbscript provides the functionality to generate XML documents on single objects as well as on the whole database schema. Using XSLT, this XML document can then be transformed into any XML-based or text-based representation of the schema information.

In a recent post, I demoed the function on a table out of dbscript’s data model. The result of an XSLT applied to the XML representing AdventureWorks’ Person.Contact can be viewed here.

New functionality allows an XSL to be applied on the XML result of the database schema (called ProjectVersion in dbscript), and the result is an overview of the whole database catalog in MediaWiki syntax. This list shows all objects in AdventureWorks (MSSQL 2005) and is a direct and unmodified output of dbscript.

Since you are free to add and edit XSL style sheets in dbscript, users can adapt the XSL to reflect any kind of syntax their wikis use.


Handling XSD dateTime in SQL Server 2005

November 27, 2008

A client is running an application which exchanges XML documents with other installations of the same application. The application’s documentation also includes XSD schema definitions.

What an opportunity for me to get to study SQL Server 2005’s XML capabilities.

I open the .xsd files in a text editor, and invoke CREATE XML SCHEMA COLLECTION statements for each xsd file.

Next, I create a table which contains the XML data originally stored as ntext, and an untyped XML column. Additionally, each XSD gets a separate typed XML column:

CREATE TABLE [dbo].[XmlTable](
    [OID] [int] IDENTITY(1,1) NOT NULL,
    [Data] [ntext] NULL,
    [XmlData] [xml] NULL,
    [XsdData1] [xml] (CONTENT [dbo].[Xsd1] NULL
    ...
)

I import data from the production system into the Data column, and copy the data as untyped XML using

UPDATE XmlTable SET XmlData = Data

If every record contains valid XML data, the operation should complete successfully, with XmlData now containing the (untyped) XML documents. Remember that the Data column starts with an XML tag identifying the XSD. We can now start to copy the data to the typed (XSD-based) XML columns:

UPDATE XmlTable SET XsdData1 = XmlData WHERE Data LIKE '<Xsd1%'

which works fine unless you have an xsd:dateTime value in a format that SQL Server cannot process, generating an error message

XML Validation: Invalid simple type value

As this MS feedback page explains, SQL Server 2005 cannot handle datetime values with timezones. An xsd:dateTime value therefore has to end with a ‘Z’.

To make the XML data match SQL Server’s interpretation of the XSD schema information, we need to replace each xsd:dateTime from yyyy-mm-ddThh:mm:ss.sssssss+hh:mm (day, time, second fractions, timezone offset) to yyyy-mm-ddThh:mm:ssZ.

So, first find all XSD columns of type xsd:dateTime (Michael’s post was very inspiring):

SELECT sc.name XSD, sel.name Element
FROM sys.xml_schema_component_placements scp
INNER JOIN sys.xml_schema_types sct
    ON scp.placed_xml_component_id=sct.xml_component_id
INNER JOIN sys.xml_schema_elements sel
    ON scp.xml_component_id = sel.xml_component_id
INNER JOIN sys.xml_schema_collections sc
    ON sel.xml_collection_id = sc.xml_collection_id
WHERE sct.name = 'dateTime' AND sct.xml_namespace_id = 1
ORDER BY 1, 2

The resultset contains the XML Schema Collection and Element names.

Next we take any one of the retrieved elements and check their values in the imported XmlData column using XQuery syntax:

SELECT XmlData.value('(//TimeStamp)[1]', 'nvarchar(50)')
FROM XmlTable

The pattern of date and time requires 19 character.

Using XMLDML, we can now update the timestamp value by taking the first 19 characters and appending ‘Z’ using UPDATE .modify. The result is calculated using XQuery functions.

UPDATE XmlTable
SET XmlData.modify(
    'replace value of (//TimeStamp/text())[1]
     with concat(substring((//TimeStamp)[1], 1, 19), "Z")')

After this step is repeated for each dateTime XML element, the XML content can be inserted into the typed XML columns, as stated above:

UPDATE XmlTable SET XsdData1 = XmlData WHERE Data LIKE '<Xsd1%'