New SQL Parser in dbscript

November 30, 2009

I wrote about my plans to add a new SQL parser engine into dbscript a couple of months ago. Now the time has come to actually implement it for T-SQL (MS SQL Server; Oracle and PostgreSQL will follow in future versions), and I found that I did not foresee all the consequences of my initial intent. The overall architecture remained the same though.

The grammar definition allows to define attributes on non-terminals. Using the SQL Server versions as attributes (2005, 2008), I can mark commands or clauses as their support or feature has been introduced in a specific version, and store the database version as property of the uploaded Project Version.

The parser skips the parts of the uploaded file that it cannot parse, and writes these parts to the upload log. Up to now, the parser would simply fail if it considered the SQL file somehow invalid.

The (C#) object representation of a parsed SQL command has a boolean flag IsHandled for each non-terminal. The code processing the object representation needs to mark every non-terminal object as being handled (i.e. translated into schema information stored in the database). Objects that have not been flagged will also be listed in the upload log.

Next, the new parser allowed me to rewrite the dependency analysis. Until now, dbscript only analyzed view dependencies to order the CREATE VIEW statements. Dependency analysis has now been extended to all database objects.

One more major issue that has been solved is parsing and interpreting EXECUTE statements. Thus an EXEC sp_addextendedproperty is interpreted as adding a description to a database object.

This obviously led to changes in the XML generation of database schemas. Each object now lists its descriptions and the dependencies on and references from other database objects.

The XSL stylesheets which translate a project version XML into markup or HTML have been revised to reflect the additional information in the generated XML.

All taken together, the new parser adds:

  • better feedback on which parts of the uploaded SQL file have been processed
  • dependency analysis
  • object and column descriptions

See the following links documenting AdventureWorks OLTP 2008 (version 2005) for the effects of the new functionality:

Single HTML file documentation

MediaWiki documentation

ScrewTurn Wiki documentation

In both wikis, compare the (old-style) “wikibot” section with the new section “automatically generated” to see the changes.

The next version of dbscript with the mythical version number “1.0″ will be released soon ;)


Checking MS SQL Server Stored Procedures

November 26, 2009

When you create stored procedures, MS SQL Server does not really compile them and check them against the system catalog, as for example Oracle does.

This makes it nearly impossible for TSQL developers to make sure their code executes without faults in production environments.

My approach to check stored procedures is to execute them with all parameters set to NULL inside a transaction that will be rolled back after stored procedure execution.

BEGIN TRANSACTION
EXEC (@storedproc)
ROLLBACK TRANSACTION

The simplest strategy is to iterate through all procedures, and execute them using this code inside a TRY … CATCH block. If an error is raised, the CATCH block logs the error.

However, when I ran the code in an actual development environment with lots of SPs (around 120), two problems occurred that needed to be dealt with:

  • If any SQL statement run in Management Studio creates more than 100 result sets, you get the error message

The query has exceeded the maximum number of result sets that can be displayed in the results grid. Only the first 100 result sets are displayed in the grid.

  • Therefore, we stored the results of analysed procedures in a temporary table, which needs to be created only once if it does not yet exist
  • If you execute several stored procedures in a row, and each one probably has some imperfect transaction handling code, you need to make sure to rollback all open transactions.
/*
DROP TABLE #p
DELETE FROM #p
*/

SET ANSI_WARNINGS OFF

IF OBJECT_ID('tempdb..#p') IS NULL
  CREATE TABLE #p (
    ProcedureName sysname COLLATE Latin1_General_CI_AS,
    ErrorNumber int,
    ErrorSeverity int,
    ErrorState int,
    ErrorProcedure sysname COLLATE Latin1_General_CI_AS NULL,
    ErrorLine int,
    ErrorMessage NVARCHAR(2048) COLLATE Latin1_General_CI_AS
  )

DECLARE @proc sysname, @params VARCHAR(1000), @exec VARCHAR(1000)

DECLARE c CURSOR for
SELECT	p.name, SUBSTRING(REPLICATE(', null', COUNT(pa.name)), 3, 1000)
FROM sys.procedures p
LEFT OUTER JOIN sys.parameters pa  ON  p.object_id = pa.object_id
LEFT OUTER JOIN #p ON p.name = #p.ProcedureName
WHERE	p.name LIKE '%stored-procedure-name-filter%'
AND	#p.ProcedureName IS null
GROUP BY p.name
ORDER BY p.name

OPEN c
FETCH c INTO @proc, @params

WHILE @@FETCH_STATUS = 0 BEGIN

  PRINT @proc

  WHILE @@TRANCOUNT > 0
    ROLLBACK

  INSERT INTO #p (ProcedureName) VALUES (@proc)

  BEGIN TRANSACTION

  BEGIN TRY
    SET @exec = 'EXEC ' + @proc + ' ' + ISNULL(@params, '')
    EXEC (@exec)
    ROLLBACK TRANSACTION
  END TRY
  BEGIN CATCH
    WHILE @@TRANCOUNT > 0
      ROLLBACK

    UPDATE #p
    SET ErrorNumber = ERROR_NUMBER(),
      ErrorSeverity = ERROR_SEVERITY(),
      ErrorState = ERROR_STATE(),
      ErrorProcedure = ERROR_PROCEDURE(),
      ErrorLine = ERROR_LINE(),
      ErrorMessage = ERROR_MESSAGE()
    WHERE ProcedureName = @proc
  END CATCH

  FETCH c INTO @proc, @params
END

CLOSE c
DEALLOCATE c

SELECT * FROM #p
WHERE ErrorNumber IS NOT NULL
AND ErrorNumber <> 515

Some final comments:

  • I left the DROP and DELETE statements in TSQL comments to make it easier to clear or delete the temp table.
  • The collation declarations are necessary for the JOIN condition if your working db and your temp db have different collations.
  • If the script exits with the 100-result-sets error, rerun it until it only shows the check result for all stored procedures.
  • ErrorNumber 515 is the error indicating a NULL value has been inserted into a not-nullable column. Since the script creates this error condition, we need not list it.
  • Temp table name #p is used here for brevity. You should use a name which is never used as temp table in your TSQL code.

Entity Designer opens edmx file only once – another workaround

November 26, 2009

Why do people complain? Because sometimes, after they complained, things get fixed, and they were right complaining. Other times, if complaints don’t help, they know they still know better ;)

Gabriel provides an analysis and a workaround to the problem that Entity Designer opens an edmx file only once. His solution is for 64bit XP though, and differs from the 32bit Windows solution in the registry path:

Open regedit and navigate to

HKEY_LOCAL_MACHINE
  \SOFTWARE
    \Microsoft
      \VisualStudio
        \9.0
          \Editors
            \{c99aea30-8e36-4515-b76f-496f5a48a6aa}

add a string value named LinkedEditorGuid

and set its value to

{FA3CD31E-987B-443A-9B81-186104E8DAC1}.

Applying this registry setting on my machine resulted in Entity Designer and XML Editor opening alternately.


Constants in MS SQL Server

November 23, 2009

Even though TSQL does not allow constant declarations per se, there are several ways to handle constant values in SQL Server databases.

The simplest one is to document special values (apart from keeping a list on a piece of paper which drowns in the sediments on your desk) is to list them in the Description property of a column, although this is not what databases are made for.

Next is a check constraint on columns which only allow specified values and cause INSERT/UPDATE statements to fail if the new value does not match the constraint.

My favorite method is to have a table of constant values with a numeric identifier, a programmatic identifier for use in a programming language, and a natural-language string to be displayed to the user.

My application dbscript supports this method and generates code to link C# code to the values stored in the database. I described this mechanism earlier here and here.

As an example, take a simple table describing command execution states:

ID Name ProgID
10 Command Pending Pending
20 Command Running Running
30 Command Done Done

This table can be represented in C# as

public class ExecutionState
{
    public const int Pending = 10;
    public const int Running = 20;
    public const int Done = 30;
}

One disadvantage remains though: if you want to use the constant value in TSQL (for example, in a stored procedure), you still have to know the numeric value, or retrieve the numeric value by its programmatic identifier. In both cases, there is no way to check automatically that the values or identifiers actually exist.

Yet another method came to my mind recently: create a single-row view with the column names representing the constant identifiers and the fields containing the constant values:

CREATE VIEW V_Execution_State AS
SELECT 10 AS Pending, 20 AS Running, 30 AS Done

This way, constant identifiers can be checked by TSQL, and values easily assigned, like this:

DECLARE @state INT
SELECT @state = Pending FROM V_Execution_State

By the way, this is post # 200 ;)


Maximum Capacity Specifications for SQL Server

November 23, 2009

From MSDN:

Specification ↓ \ SQL Server Version → 7 / 2000 2005 2008
Maximum Capacity Specifications Article Article Article
Maximum Numbers of Processors Article Article
Maximum Amount of Physical Memory Article Article
Hardware and Software Requirements Article Article
http://msdn.microsoft.com/en-us/library/ms143506%28SQL.90%29.aspx

dbscript Videos

November 21, 2009

I created a couple of introductory videos describing dbscript covering topics previously handled on this blog or in the online help:

The videos have been created using CamStudio (screen recorder) and VirtualDub (avi editor) and ffmpeg (avi to flv converter). FlowPlayer is embedded by a Joomla plug-in.

The videos can be watched here. dbscript is available for download here.


SMOscript 0.15

November 21, 2009

Version 0.15 of SMOscript fixes a bug when scripting the CREATE statements for AD-based users.

The problem resulted from the assumption that the name of every SQL Server database object can be taken as the file name for the generated script, which is for AD users only true if there is a directory named as the domain of those users.

Version 0.15 now translates every character in an object name that is not a valid filename character into a dot.

The lastest version of SMOscript is available for download here.


ExpectOne and ExpectOneOrNull vs. First and FirstOrDefault

November 14, 2009

Linq provides the extension methods First() and FirstOrDefault(), which return the first object in a typed IEnumerable<T> collection, or default<T> if the collection is empty.

Sometimes you don’t only want to use the first element, but require the collection to have at most 1 element.

The methods ExpectOne and ExpectOneOrNull implement the requirement to have exactly 1 (or at most 1) element in an IEnumerable collection:

public static class LinqExtensions
{
    /*
     * ExpectOne: returns exactly one record. if 0 or >1 found then exception
     * ExpectOneOrNull: returns exactly one record or null. if >1 found then exception
     */

    public static TSource ExpectOne(this IEnumerable source)
    {
        TSource result = default(TSource);
        bool b = true;
        foreach (TSource o in source)
        {
            if (b)
                result = o;
            else
                throw new ExpectOneException("too many elements in result set. one element expected.");
            b = false;
        }
        if (b)
            throw new ExpectOneException("no element in result set. one element expected.");
        return result;
    }

    public static TSource ExpectOneOrNull(this IEnumerable source)
    {
        bool b = true;
        TSource result = default(TSource);
        foreach (TSource o in source)
        {
            if (b)
                result = o;
            else
                throw new ExpectOneException("too many elements in result set. one or no element expected.");
            b = false;
        }
        return result;
    }
}

A separate exception class makes it easier for the application code (and developers!) to handle cases where more than 1 record is returned, indicating an error in the query.

If you want to replace for calls to First(<condition>) by ExpectOne(), replace First(<condition>) by Where(<condition>).ExpectOne().

As ExpectOne() contains compiled code, you cannot use these methods in Compiled Queries.

This may seem as a disadvantage. In reality it makes program code simpler because compiled queries are now required to always return an IEnumerable<T>, and the calling code needs to specify the expected result set size of the query.

It may seem desirable to replace all occurrences of First() and FirstOrDefault() by ExpectOne() and ExpectOneOrNull(). There are cases where original Linq methods are still necessary and valid:

Any() to check whether a collection is not empty, i.e. elements exist that match criteria

First() and FirstOrDefault() to find the first element in a sorted collection (e.g. the greatest element less than the original value)

 


Solving Firebug’s “Unable to show styles”

November 8, 2009

I recently installed Firebug to debug some CSS problems, but the only CSS information I got was the message “Unable to show styles”. The link in the message to the FAQ was not very helpful, either.

Fortunately, this post by Ingo was, as it solved this problem.

The problem is caused by a couple of ancient files under the Mozilla Firefox directory belonging to an ancient version of DOM Inspector, which seem to have remained there since the earliest version 1.0x of Firefox, and are easily identifiable by the file date (in my case, the year 2005).

To solve the problem, uninstall Firebug, remove the offending files, restart Firefox and reinstall Firebug.

The files are:

components.ini
defaults.ini
chrome\inspector.jar
chrome\pipnss.jar
components\inspector.dll
components\inspector.xpt
components\inspector-cmdline.js
defaults\pref\inspector.js
extensions\Extensions.rdf
extensions\installed-extensions-processed.txt
res\platform-forms.css
res\builtin\platformHTMLBindings.xml
res\inspector\search-registry.rdf
res\inspector\viewer-registry.rdf

(list taken from above link, but as I said, you find the outdated files by their age)


Entity Designer opens edmx file only once, and other VS bugs

November 6, 2009

I found that Entity Designer in Visual Studio 2008 has some strange bugs.

→ One issue is that the installation of Visual Studio can open an .edmx file only once. Double-clicking on the edmx in the Solution Explorer after closing the file will simply expand and collapse the edmx node, but the file does not open. (Strangle, a fellow programmer never experienced this behavior on his machine).

But, I’m not alone, and somebody else already described a solution for the bug:

Right-click the edmx file, choose Open with… and select XML Editor to open the file. Close it, then the Entity Designer will open the next time (once).

→ Another issue is that VS freezes if you open the edmx in Entity Designer and navigate through it when the file is not checked out from TFS.

→ And then I wonder why, as Entity Designer implements auto-routing of foreign keys, it sometimes loses some connection points, and the connections/relations end somewhere in the diagram, instead of ending at the referencing entities, and the lines are criss-cross instead of horizontal or vertical.

→ And if you delete an entity from the diagram, the underlying object still remains in the edmx, and there is no way the Designer allows you to add the deleted entity. You either have to re-create it manually, which is cumbersome.

The solution I found was to rename the object in the database, execute Update Model so that Designer drops the object, rename the object back to its original name, and Update Model again.

→ And if you create a function mapping for a stored procedure, and define the return type as any of the entities, then the stored procedure needs to return a SELECT with the *mapped column names* instead of the original datamodel column names. In my opinion, this behavior breaks the concept of “mapping”.

→ And if you create a function mapping for a stored procedure with a SELECT that is not mappable to an entity, or returns no result or an integer result, you won’t find the mapped function in your C# code, because the generator does not create any code for it. The same is true for user defined SQL functions.

(Actually I found a solution for the missing stored procedure code generation, which I will post here soon)

Some of these issues can also be found on Connect. If you have a solution, please let me know.

Certain things will always remain a mystery to me ;)