Custom XSD Validation in Visual Studio 2013

We developed an XSD schema, and wanted to create valid .cshtml markup using Visual Studio.

Based on tips found on the internet, I managed to get this working on VS2010 (sorry, didn’t write about it 😉 ), but for VS2013, the solution seems to be different again.

Here’s how I managed to get custom XSD validation in VS2013:

  • Put a copy of your XSD in the directory
C:\Program Files[ (x86)]\Microsoft Visual Studio 12.0\Common7\Packages\schemas\html
  • Open regedit.exe and navigate to the key
HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\12.0_Config\Packages\{1B437D20-F8FE-11D2-A6AE-00104BCC7269}\Schemas

This reg key holds all schemas for the VS package “Visual Studio HTM Editor Package”.

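  • Below this key, add a subkey for your schema (the .reg file further down uses http://tempuri.org/my.xsd as its name), and inside it a string named “File”

The value of “File” is the relative path to the .xsd, in our case “html\my.xsd”.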

  • Add a string named “Friendly Name”

The value of “Friendly Name” is the name to be displayed in the validation settings.

  • Add a DWord named IsBrowseable

Not sure whether this is required, but HTML5 and XHTML5 also have it. Set its value to 1.

Alternatively, you can also create a .reg file, set the values according to your environment, and import using regedit:

Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\12.0_Config\Packages\{1B437D20-F8FE-11D2-A6AE-00104BCC7269}\Schemas\http://tempuri.org/my.xsd]
"File"="html\\my.xsd"
"Friendly Name"="My Schema"
"IsBrowseable"=dword:00000001

Start Visual Studio 2013, and navigate to Tools, Options, Text Editor, HTML (Web Forms). Or simply type “valid” in the search box of the Options dialog.

You will find your entry “My Schema” in the dropdown list “Target when no doctype found:”.

If you only want to edit files conforming to your schema, then select it here.

Otherwise, close the dialog, and right-click the toolbar (right below the menu bar) and check the item “HTML Source Editing”. Now the toolbar will also contain the Target Schema selection dropdown.

BUT: If you open a .cshtml by double-clicking, the source will be displayed, but the Target Schema dropdown is inactive!

The reason is that (whoever invented this whole thing) there is a difference between “HTML Editor” and “HTML (Web Forms) Editor”!

Right-click the .cshtml file, and select “Open With…”. In my installation, HTML Editor is the default for .cshtml, but you can set the Web Forms editor as default using the, uhm, “Set as Default” button.

So now, whichever way you open the .cshtml in the “HTML (Web Forms) Editor”, the Target Schema selection dropdown is active, and your custom schema can be selected for validation.

Unfortunately, Razor does not seem to be supported in the Web Forms Editor (highlighting etc.), so you need to choose between Razor-enabled unvalidated or Razor-disabled validated editing. Really?

Sources: SO, AngularJS, MSDN Social.

Calling xsd.exe in VS 2013 Build Event

While working on an XML project, I wanted to call xsd.exe on an .xsd file during the build process, and found this solution on SO, which works for VS 2010.

For VS 2013, the solution did not work anymore, especially on systems that had no prior version of VS installed, since xsd.exe hides in a different location.

A comment to the answer illustrated how to query the registry correctly on x64 systems.

So my modified pre-build event looks like this:

call "$(ProjectDir)GenerateFromVSPrompt.cmd"
  "$(ProjectDir)"
  "$([MSBuild]::GetRegistryValueFromView(
    'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v8.1A',
    'InstallationFolder', null, RegistryView.Registry64, RegistryView.Registry32)
    )bin\NETFX 4.5.1 Tools\xsd.exe"

all in 1 line.

If you use TFS as source control, you know that generated files need to be checked out before they can be overwritten.

I already wrote about TFS and code generation, and used the vcvarsall.bat then.

However, since we just need the path to tf.exe, and use the same VS version, we can just open a VS Command Prompt, run

where tf

and get the answer

C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\TF.exe

for VS 2013.

So our batch file GenerateFromVSPrompt.cmd looks like this:

set tf="C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\TF.exe"
%tf% checkout %1MyXsdClasses.cs
call %1XSDBuilder.cmd %1 %2
%tf% checkin /comment:"build event" /noprompt %1MyXsdClasses.cs
exit 0

In case tf.exe cannot check in the file because it did not change during code generation, it will return with exit code 1, which in turn will cause the build process to issue a build error and break. So we use exit 0 to clear the error condition.

Finally, my version of XSDBuilder.cmd is based on an SO answer, but stripped down to only what is necessary, since I only have 2 XSD files, AND they need to be processed together:

pushd %1
%2 MyXsd1.xsd MyXsd2.xsd /c /n:My.Project.Xsd
popd

and, as I write, I really should merge both .cmd files into one … 😉

The build event is now executed correctly from VS, the VS Command Prompt, and on the build server.

On Converting Data

I had to analyze SQL Server Database Projects (available from SQL Server Data Tools for Visual Studio), and these projects offer a menu item “Create Snapshot” which creates a snapshot file with the extension .dacpac.

It turns out that a .dacpac is a zipped XML file (plus some other files) containing a structured representation of the database objects defined in the project. However Visual Studio does not provide a way to display them (double-clicking the file will only display binary data).
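Since a .dacpac is a ZIP archive, any ZIP library can list and read its contents. The following is only a minimal Python sketch, not part of the tooling described here. The entry name model.xml is an assumption about where the object definitions live, and the demo archive is a made-up stand-in so that the sketch runs without a real snapshot.

```python
import io
import zipfile


def list_dacpac_entries(dacpac):
    """Return the names of all files stored in the .dacpac archive."""
    with zipfile.ZipFile(dacpac) as archive:
        return archive.namelist()


def read_dacpac_entry(dacpac, entry_name="model.xml"):
    """Return one archive entry as text (the default entry name is an assumption)."""
    with zipfile.ZipFile(dacpac) as archive:
        # utf-8-sig also accepts files that start with a byte order mark
        return archive.read(entry_name).decode("utf-8-sig")


def make_demo_dacpac():
    """Build a tiny in-memory stand-in archive (hypothetical content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("model.xml", "<DataSchemaModel />")
        archive.writestr("Origin.xml", "<DacOrigin />")
    return buffer


if __name__ == "__main__":
    demo = make_demo_dacpac()
    print(list_dacpac_entries(demo))
    print(read_dacpac_entry(demo))
```

For a real snapshot, pass the path of the .dacpac file instead of the in-memory buffer.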

So I thought about how best to display the contents of a .dacpac. Two methods came to my mind.

First, inspired by my work on wpxslgui, create an XSLT style sheet which transforms the contents of the XML file to some legible text, for example CREATE TABLE and similar TSQL statements.

Intuitively I called this approach Symbolic Transformation.

Symbolic Transformation (e.g. XSLT)
Representation 1 => Representation 2

On the other hand, a Logical Transformation contains one module parsing the information contained in “Representation 1” into some kind of model, and another module creating the “Representation 2” of that model. The two modules can be implementations of the Interpreter pattern and the Builder or Factory patterns, respectively.

Logical Transformation (simple)
                   Interpreter           Builder
Representation 1       =>        Model     =>      Representation 2

If we take Representation 1 and Representation 2 as two separate interfaces to the same business model, and want to support two-way operations, we can extend the last table like this:

Logical Transformation (extended)
         Interpr.          Converter                 Converter           Interpr.
         Builder                                                         Builder
Repr. 1     =>     Model 1     =>     Business Model     =>     Model 2     =>     Repr. 2
            <=                 <=                        <=                 <=

Why do we need to have Model 1 and Model 2, as they seem to make the whole thing even more complex?

Let’s have a look at a simple CREATE TABLE statement and some of its representations:

  • a SQL parser (think: ANTLR) produces its own representation, the parse tree defined by its grammar
  • the SQL Server catalog views sys.tables, sys.columns, etc. are a different representation
  • a .dacpac archive is another representation

To keep our code simple, our Model X class structure should be as close to the representation as 1) possible 2) necessary (thinking about proxy classes generated by xsd.exe, ANTLR, and ORMs).

Thus, a common data model (named Business Model in the table) is required, as well as 2-way conversion between the Business Model and each of the other models.
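To make the simple Logical Transformation more concrete, here is a minimal Python sketch under stated assumptions: the XML input format, the class names Table and Column, and the function names interpret and build_sql are all invented for this illustration and do not correspond to the real .dacpac format or to any existing library.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class Column:
    name: str
    sql_type: str


@dataclass
class Table:
    name: str
    columns: list = field(default_factory=list)


def interpret(xml_text):
    """Interpreter step: parse Representation 1 (hypothetical XML) into the model."""
    root = ET.fromstring(xml_text)
    table = Table(name=root.get("name"))
    for col in root.findall("column"):
        table.columns.append(Column(col.get("name"), col.get("type")))
    return table


def build_sql(table):
    """Builder step: create Representation 2 (T-SQL text) from the model."""
    cols = ",\n".join(f"    [{c.name}] {c.sql_type}" for c in table.columns)
    return f"CREATE TABLE [{table.name}] (\n{cols}\n)"


SAMPLE = (
    '<table name="Person">'
    '<column name="Id" type="int" />'
    '<column name="Name" type="nvarchar(50)" />'
    "</table>"
)

if __name__ == "__main__":
    print(build_sql(interpret(SAMPLE)))
```

The point of the sketch is that neither function knows about the other's text format; both only share the model, which is what makes it possible to add further representations later.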

Dealing with “Circular group reference” errors in xsd.exe

I continued to research the problem of XSD files for XSLT that cannot be processed by xsd.exe. In the case of xslt.xsd (contained in the Visual Studio 2010 installation under the Xml\Schemas directory), xsd.exe generates the error message

>xsd "C:\Program Files\Microsoft Visual Studio 10.0\Xml\Schemas\xslt.xsd" 
      /classes

Error: Error generating classes for schema 'C:\Program Files\Microsoft Visual Studio 10.0\Xml\Schemas\xslt.xsd'.
  - Group 'char-instructions' from targetNamespace='http://www.w3.org/1999/XSL/Transform' has invalid definition: Circular group reference.

Here is how I proceeded:

Create copy1.xsd

To work around this error, I created a copy of the original xslt.xsd (named here copy1), and located the offending XSD definitions.
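Locating such definitions by hand can be tedious in a large schema. As a rough aid, the following Python sketch lists the named groups that can reach themselves by following xs:group references and element type references. It is only an approximation under assumptions: I do not know how xsd.exe performs its own check, the sketch only handles unprefixed names, and the SAMPLE schema is a heavily simplified, hypothetical stand-in for xslt.xsd.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def find_circular_groups(xsd_text):
    """Return the names of top-level groups that can reach themselves."""
    root = ET.fromstring(xsd_text)
    # Collect top-level group and complexType definitions by (kind, name).
    defs = {}
    for kind in ("group", "complexType"):
        for node in root.findall(XS + kind):
            defs[(kind, node.get("name"))] = node

    def neighbours(key):
        """Definitions referenced directly inside the given definition."""
        found = set()
        for child in defs[key].iter():
            if child.tag == XS + "group" and child.get("ref"):
                found.add(("group", child.get("ref")))
            elif child.tag == XS + "element" and child.get("type"):
                found.add(("complexType", child.get("type")))
        # Ignore built-in types and anything not defined in this schema.
        return {k for k in found if k in defs}

    def reaches_itself(start):
        seen, stack = set(), list(neighbours(start))
        while stack:
            key = stack.pop()
            if key == start:
                return True
            if key not in seen:
                seen.add(key)
                stack.extend(neighbours(key))
        return False

    return sorted(
        name for (kind, name) in defs
        if kind == "group" and reaches_itself((kind, name))
    )


# Hypothetical, simplified schema: the cycle runs through the type "for-each".
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:group name="char-instructions">
    <xs:choice>
      <xs:element name="for-each" type="for-each" />
      <xs:element name="text" type="text" />
    </xs:choice>
  </xs:group>
  <xs:group name="instructions">
    <xs:choice>
      <xs:group ref="char-instructions" />
      <xs:element name="comment" type="comment" />
    </xs:choice>
  </xs:group>
  <xs:complexType name="for-each">
    <xs:group ref="instructions" minOccurs="0" maxOccurs="unbounded" />
  </xs:complexType>
  <xs:complexType name="comment" mixed="true" />
  <xs:complexType name="text" mixed="true" />
</xs:schema>"""

if __name__ == "__main__":
    print(find_circular_groups(SAMPLE))
```

To try it on a real file, read the .xsd as text and pass it to find_circular_groups; prefixed references (such as a ref with a namespace prefix) would need extra handling that this sketch leaves out.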

Replace circular references by reference to new (dummy) element

The two definitions that cause the circular reference error are the groups “char-instructions” and “instructions”. To find out where these definitions are used in the generated C# classes, the groups’ references are replaced by a reference to a new element:

  <xs:group name="char-instructions">
    <xs:choice>
<!--    
      <xs:element name="apply-templates" type="apply-templates" />
      <xs:element name="call-template" type="call-template" />
      <xs:element name="apply-imports" type="apply-imports" />
      <xs:element name="for-each" type="for-each" /> 
      <xs:element name="value-of" type="value-of" />
      <xs:element name="copy-of" type="copy-of" />
      <xs:element name="number" type="number" />
      <xs:element name="choose" type="choose" />
      <xs:element name="if" type="if" />
      <xs:element name="text" type="text" />
      <xs:element name="copy" type="copy" />
      <xs:element name="variable" type="variable" />
      <xs:element name="message" type="message" />
      <xs:element name="fallback" type="fallback" />
      -->
      <xs:any namespace="##other" processContents="lax" />
      <xs:element name="ci-dummy" type="ci-dummy" />
    </xs:choice>
  </xs:group>
  <xs:group name="instructions">
    <xs:choice>
      <xs:group ref="char-instructions" />
<!--    
      <xs:element name="processing-instruction" type="processing-instruction" />
      <xs:element name="comment" type="comment" />
      <xs:element name="element" type="element" />
-->
      <xs:element name="i-dummy" type="i-dummy" />
      <xs:element name="attribute" type="attribute" />
    </xs:choice>
  </xs:group>

Of course, the new dummy types also need to be declared in copy1.xsd:

  <xs:complexType name="ci-dummy" mixed="true">
    <xs:attribute name="dummy" type="xs:string" />
  </xs:complexType>
  <xs:complexType name="i-dummy" mixed="true">
    <xs:attribute name="dummy" type="xs:string" />
  </xs:complexType>

Running xsd.exe on copy1.xsd will now run successfully and generate copy1.cs.

Replace references to dummy classes by original classes

Search the generated classes for references to the dummy classes. You can start by commenting out the declaration of the dummy class and follow the compiler errors. In the example of xslt.xsd, replace

[System.Xml.Serialization.XmlElementAttribute("ci-dummy", typeof(cidummy))]

by

[System.Xml.Serialization.XmlElementAttribute("attribute-set", typeof(attributeset))]
[System.Xml.Serialization.XmlElementAttribute("decimal-format", typeof(decimalformat))]
[System.Xml.Serialization.XmlElementAttribute("include", typeof(include))]
[System.Xml.Serialization.XmlElementAttribute("key", typeof(key))]
[System.Xml.Serialization.XmlElementAttribute("namespace-alias", typeof(namespacealias))]
[System.Xml.Serialization.XmlElementAttribute("output", typeof(output))]
[System.Xml.Serialization.XmlElementAttribute("param", typeof(param))]
[System.Xml.Serialization.XmlElementAttribute("preserve-space", typeof(preservespace))]
[System.Xml.Serialization.XmlElementAttribute("strip-space", typeof(stripspace))]
[System.Xml.Serialization.XmlElementAttribute("template", typeof(template))]
[System.Xml.Serialization.XmlElementAttribute("variable", typeof(variable))]

However, the types that were previously referenced by other elements are not part of the generated code, as no reference to the elements exists anymore.

Add elements for previously referenced types

Create a second copy of the xsd file by copying copy1.xsd to copy2.xsd. Add the xs:element definitions that have been commented out in the first copy:

    <xs:element name="apply-templates" type="apply-templates" />
    <xs:element name="apply-imports" type="apply-imports" />
    <xs:element name="call-template" type="call-template" />
    <xs:element name="for-each" type="for-each" />
    <xs:element name="value-of" type="value-of" />
    <xs:element name="copy-of" type="copy-of" />
    <xs:element name="number" type="number" />
    <xs:element name="choose" type="choose" />
    <xs:element name="if" type="if" />
    <xs:element name="text" type="text" />
    <xs:element name="copy" type="copy" />
    <xs:element name="variable" type="variable" />
    <xs:element name="message" type="message" />
    <xs:element name="fallback" type="fallback" />
    <xs:element name="processing-instruction" type="processing-instruction" />
    <xs:element name="comment" type="comment" />
    <xs:element name="element" type="element" />

Run xsd on copy2.xsd generating copy2.cs. copy2.cs need not be part of the C# project.

Copy C# classes

Next, copy all C# classes missing in copy1.cs from copy2.cs until copy1.cs compiles successfully.

Clean up attributes

Your C# classes can now be compiled and will successfully load an XML file conforming to the original XSD.

In case of the xslt.xsd, I noticed that elements and texts are handled by two different arrays, namely

// many more XmlElementAttribute declarations
[System.Xml.Serialization.XmlElementAttribute("some-element-name", typeof(someelementname))]
public object[] Items {
  get { return this.itemsField; }
  set { this.itemsField = value; }
}
[System.Xml.Serialization.XmlTextAttribute()]
public string[] Text {
  get { return this.textField; }
  set { this.textField = value; }
}

This will cause your code to lose the original order of elements and texts in the XML file. To combine both types of data into one array, add the XmlTextAttribute to the Items property as well:

[System.Xml.Serialization.XmlTextAttribute(typeof(string))]
public object[] Items

and completely remove the declaration of the string[] Text property.

Official XSD definition for XSLT?

I tried to look for an official XSD definition for XSLT files for some XSLT experiments.

What I did find was an xslt.xsd file on w3.org, and an xsd in the Xml directory of my Visual Studio installation (“C:\Program Files\Microsoft Visual Studio 10.0\Xml\Schemas\xslt.xsd”). Unfortunately, neither version could be processed using the xsd.exe tool from VS; each resulted in a different error message.

Then I found an xslt10.dtd file on w3.org, and converted it to XSD using Trang:

java -jar trang.jar -I dtd -O xsd xslt10.dtd  xslt10.xsd

The resulting .xsd file can be converted to a C# file by xsd.exe:

xsd xslt10.xsd /classes /n:Xslt

However I am not sure if the DTD, the XSD and thus the C# definitions are really really standards-compliant, or if the .xsd requires some editing to produce correct code.

Please leave a comment if you have any information on this topic.

Altering XML Schema Collections

SQL Server first implemented XML Schema Collections in version 2005.

However, support for XML schema collections is not exactly developer-friendly: Once a schema collection has been created in a database, the ALTER XML SCHEMA COLLECTION command can only add to an existing schema, but nothing can be removed from it, or changed:

Use the ALTER XML SCHEMA COLLECTION to add new XML schemas whose namespaces are not already in the XML schema collection, or add new components to existing namespaces in the collection.

Change or removal of schema “components” (i.e. elements and attributes) is only possible by dropping and then re-creating the XML schema collection. This is not as straight-forward as it seems, and this lack of efficiency keeps developers from using XML-based data, as this Stack Overflow question shows.

To drop an XML schema collection, the following steps are necessary:

  • If a table column references the schema collection (i.e. typed XML), it has to be converted to plain XML type
  • If the table column has a default constraint, drop the default constraint
  • If a procedure or function has typed XML parameters, the procedure or function has to be dropped
  • If there is an XML index on a column referencing the schema collection, the index (primary and secondary indexes) has to be dropped
  • If there are computed columns based on a typed XML column, the computed columns have to be dropped
  • If there are indexes on these computed columns, the indexes have to be dropped
  • If there are schema-bound views, functions, or procedures based on tables containing typed XML columns, these objects have to be dropped

Of course, all these DROP commands have to be executed in the correct order.
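One way to think about the correct order is as a dependency graph that gets sorted topologically. The following Python sketch (graphlib requires Python 3.9 or later) only illustrates the idea: the object names are hypothetical, and the dependencies are my reading of the list above, not something queried from the catalog views.

```python
from graphlib import TopologicalSorter

# Maps each step to the steps that must run before it.
# Object names are hypothetical; dependencies follow the list above.
MUST_RUN_FIRST = {
    "convert column XsdData1 to untyped xml": {
        "drop XML indexes on XsdData1",
        "drop default constraint on XsdData1",
        "drop schema-bound view vXmlTable",
    },
    "drop procedure ImportXsd1": set(),
    "drop XML schema collection Xsd1": {
        "convert column XsdData1 to untyped xml",
        "drop procedure ImportXsd1",
    },
}


def plan_steps(must_run_first):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(must_run_first).static_order())


if __name__ == "__main__":
    for number, step in enumerate(plan_steps(MUST_RUN_FIRST), start=1):
        print(number, step)
```

In a real script, the graph would be filled from the catalog views, and each step would carry the DROP or ALTER statement to run; re-creating the objects afterwards follows the reverse order.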

After creating the new XML schema collection, all the dropped objects can be re-created using information initially stored in the SQL Server catalog views.

Handling XSD dateTime in SQL Server 2005

A client is running an application which exchanges XML documents with other installations of the same application. The application’s documentation also includes XSD schema definitions.

What an opportunity for me to get to study SQL Server 2005’s XML capabilities.

I open the .xsd files in a text editor, and invoke CREATE XML SCHEMA COLLECTION statements for each xsd file.

Next, I create a table which contains the XML data originally stored as ntext, and an untyped XML column. Additionally, each XSD gets a separate typed XML column:

CREATE TABLE [dbo].[XmlTable](
    [OID] [int] IDENTITY(1,1) NOT NULL,
    [Data] [ntext] NULL,
    [XmlData] [xml] NULL,
    [XsdData1] [xml] (CONTENT [dbo].[Xsd1]) NULL,
    ...
)

I import data from the production system into the Data column, and copy the data as untyped XML using

UPDATE XmlTable SET XmlData = Data

If every record contains valid XML data, the operation should complete successfully, with XmlData now containing the (untyped) XML documents. Remember that the Data column starts with an XML tag identifying the XSD. We can now start to copy the data to the typed (XSD-based) XML columns:

UPDATE XmlTable SET XsdData1 = XmlData WHERE Data LIKE '<Xsd1%'

which works fine unless you have an xsd:dateTime value in a format that SQL Server cannot process, generating an error message

XML Validation: Invalid simple type value

As this MS feedback page explains, SQL Server 2005 cannot handle datetime values with timezones. An xsd:dateTime value therefore has to end with a ‘Z’.

To make the XML data match SQL Server’s interpretation of the XSD schema information, we need to replace each xsd:dateTime from yyyy-mm-ddThh:mm:ss.sssssss+hh:mm (day, time, second fractions, timezone offset) to yyyy-mm-ddThh:mm:ssZ.

So, first find all XSD elements of type xsd:dateTime (Michael’s post was very inspiring):

SELECT sc.name XSD, sel.name Element
FROM sys.xml_schema_component_placements scp
INNER JOIN sys.xml_schema_types sct 
    ON scp.placed_xml_component_id=sct.xml_component_id
INNER JOIN sys.xml_schema_elements sel 
    ON scp.xml_component_id = sel.xml_component_id
INNER JOIN sys.xml_schema_collections sc 
    ON sel.xml_collection_id = sc.xml_collection_id
WHERE sct.name = 'dateTime' AND sct.xml_namespace_id = 1
ORDER BY 1, 2

The resultset contains the XML Schema Collection and Element names.

Next we take any one of the retrieved elements and check its values in the imported XmlData column using XQuery syntax:

SELECT XmlData.value('(//TimeStamp)[1]', 'nvarchar(50)')
FROM XmlTable

The date and time pattern yyyy-mm-ddThh:mm:ss requires 19 characters.

Using XML DML, we can now update the timestamp value by taking the first 19 characters and appending ‘Z’ using UPDATE .modify. The result is calculated using XQuery functions.

UPDATE XmlTable
SET XmlData.modify(
    'replace value of (//TimeStamp/text())[1]
     with concat(substring((//TimeStamp)[1], 1, 19), "Z")')
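The string operation performed by the XQuery expression can be tried outside the database. This Python sketch mirrors it under assumptions: the function name truncate_to_z and the sample value are made up for illustration. Like the statement above, it simply cuts off the fraction and the offset and does not convert the time to UTC, so the result is only exact if the original offset does not matter for your data.

```python
import re

# yyyy-mm-ddThh:mm:ss, i.e. the first 19 characters of the value
_PREFIX = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def truncate_to_z(value):
    """Keep the first 19 characters of an xsd:dateTime string and append 'Z'."""
    if not _PREFIX.match(value):
        raise ValueError(f"not an xsd:dateTime value: {value!r}")
    return value[:19] + "Z"


if __name__ == "__main__":
    print(truncate_to_z("2007-03-15T10:20:30.1234567+01:00"))
```

Running a check like this over a sample of exported values before the UPDATE can help to spot timestamps that do not follow the expected pattern.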

After this step is repeated for each dateTime XML element, the XML content can be inserted into the typed XML columns, as stated above:

UPDATE XmlTable SET XsdData1 = XmlData WHERE Data LIKE '<Xsd1%'