Remove Unused References with Visual Studio 2017

I liked the Reference Assistant extension very much, but unfortunately it only works for Visual Studio versions 2010 through 2013, and I always wanted to adapt the extension to later versions of Visual Studio.

Now I did.

So I forked the original code from the repository on GitHub (really originally hosted on CodePlex).

Since I had no experience in working with the VSSDK, I created a dummy project and clicked my way through it, and tried to figure out, where the original code hooked into VS, and how to migrate that code from 2010 to 2017.

One of the main obstacles was the setup of the Options page, which seems to have changed fundamentally. Fortunately, the Extensibility Samples also contain a project named Options covering the Tools->Options dialog.

Of course, migration also means upgrading .Net versions (implied by creating a VS2017 extension project), replacing assembly references with NuGet packages, and a little bit of tuning and tweaking.

You can find the upgraded Reference Assistant on GitHub, the first release version is here.

What I like about this tool is that is lets you preview the changes it is going to make, and exclude references from removal you know are required, as the tool does not consider occasional implied references and assemblies referenced in .config files.

Importing mysqldump into MS SQL Server

To import data from MySQL into a MS SQL database, we need to first analyze the output of mysqldump, the backup utility of MySQL. I have covered the most obvious differences to T-SQL in my previous post.

Fortunately, the output of mysqldump is line-oriented, i.e. every CREATE TABLE statement, table column or index, and INSERT statement is in a separate line.

In a brute-force approach, we can therefore parse each line using regular expressions, and do not require full parsing involving tokenizer and grammar.

I found a couple of interesting points that need to be considered when generating T-SQL statements:

  • As T-SQL did not support “DROP IF EXISTS” before SQL Server 2016, we need to provide the classical variant “IF OBJECT_ID() IS NOT NULL”
  • NVARCHAR(MAX) columns cannot be indexed
  • BINARY data needs to be 0x-encoded
  • UNIQUE INDEXes should not include NULLable columns
  • Multi-row INSERT INTO statements cannot contain more than 1000 rows
  • ‘0000-00-00’ is a valid date in MySQL, but not in MSSQL. The value is being replaced by ‘1800-01-01’ to avoid the error message

    The conversion of a varchar data type to a datetime data type resulted in an out-of-range value.

Some indexes may require manually editing column collations, e.g. for unique case-sensitive MySQL indexes.

When executing T-SQL statements in SSMS, a number of warnings or errors can occur:

  • A warning that indexed data is supported to a maximum length of 960 bytes
  • Inserting data may result in the error “String or binary data would be truncated.”
  • Some files are too big to execute in SSMS, and need to be executed using sqlcmd
  • A batch with too many too large INSERT INTO statements may raise the error

    There is insufficient system memory in resource pool ‘default’ to run this query

    so we break up the batch using “GO”

  • Multiple cascade paths

MS SQL Server does not support ON DELETE CASCADE clauses such that the grandchild record of a parent-child-grandchild relation cannot be uniquely identified. Creating such a FOREIGN KEY raises the error message

Msg 1785, Level 16, State 0, Line 1
Introducing FOREIGN KEY constraint ‘fk_xxx’ on table ‘xxx’ may cause cycles or multiple cascade paths. Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints.

In such cases, either drop the ON DELETE CASCADE if possible, and replace the deletion cascade with an INSTEAD OF DELETE trigger.

I developed mysqldump2mssql, a prototype implementing these rules, which is available on my GitHub. The program has been developed and tested against MySQL dumps of MediaWiki, Joomla, Bugzilla and Nextcloud.

The tool generates up to 4 files for each table defined in the source SQL file:

  • create.sql: the CREATE TABLE statement, optional CREATE UNIQUE INDEX statements
  • data.sql: the INSERT INTO statements
  • fk.sql: the FOREIGN KEY relations to other tables
  • fulltext.sql: the CREATE FULLTEXT statement

The filenames of the generated files consist of the sequence number of execution, the table name, and one of the endings mentioned in above table.

Length of UTF-8 VARCHAR in SQL Server

Foreword

For as long as I can remember, a VARCHAR (or CHAR) was always defined as “1 character equals 1 byte”. Different character sets (code pages) where implemented as COLLATIONs, so that you had basic database support for internationalization.

Then came Unicode, and we got NVARCHAR strings (or NCHAR), where the rule was “1 character equals 2 bytes”, and we could store any text from around the world without bothering with code pages, encodings, etc. The .Net framework brought us the string class with similar features and the world was beautiful.

Then, in 2001, came Unicode 3.1 and needed more space:

For the first time, characters are encoded beyond the original 16-bit codespace or Basic Multilingual Plane (BMP or Plane 0). These new characters, encoded at code positions of U+10000 or higher, are synchronized with the forthcoming standard ISO/IEC 10646-2. For further information, see Article IX, Relation to 10646. Unicode 3.1 and 10646-2 define three new supplementary planes.

These additional planes were immediately supported in SQL Server 2012. From now on, using an *_SC collation, NVARCHARs could be 2 or 4 bytes per character.

In C#, the StringInfo class handles supplementary planes, but it seems, they are still a bit behind:

Starting with the .NET Framework 4.6.2, character classification is based on The Unicode Standard, Version 8.0.0. For the .NET Framework 4 through the .NET Framework 4.6.1, it is based on The Unicode Standard, Version 6.3.0. In .NET Core, it is based on The Unicode Standard, Version 8.0.0.

(For the record, the current Unicode version is 12.1, and 13.0 is going to be released soon)

UTF-8 Collations

So now SQL Server 2019 supports UTF-8-enabled collations.

A question on SO quoted the documentation as

A common misconception is to think that CHAR(n) and VARCHAR(n), the n defines the number of characters. But in CHAR(n) and VARCHAR(n) the n defines the string length in bytes (0-8,000). n never defines numbers of characters that can be stored

(emphasis mine) which confused me a little bit, and the quote continues

The misconception happens because when using single-byte encoding, the storage size of CHAR and VARCHAR is n bytes and the number of characters is also n.

(emphasis mine).

This got me investigating, and I had a look into this issue. I create a UTF8-enabled database with a table with all kinds of N/VARCHAR columns

CREATE DATABASE [test-sc] COLLATE Latin1_General_100_CI_AI_KS_SC_UTF8

CREATE TABLE [dbo].[UTF8Test](
  [Id] [int] IDENTITY(1,1) NOT NULL,
  [VarcharText] [varchar](50) COLLATE Latin1_General_100_CI_AI NULL,
  [VarcharTextSC] [varchar](50) COLLATE Latin1_General_100_CI_AI_KS_SC NULL,
  [VarcharUTF8] [varchar](50) COLLATE Latin1_General_100_CI_AI_KS_SC_UTF8 NULL,
  [NVarcharText] [nvarchar](50) COLLATE Latin1_General_100_CI_AI_KS NULL,
  [NVarcharTextSC] [nvarchar](50) COLLATE Latin1_General_100_CI_AI_KS_SC NULL,
  [NVarcharUTF8] [nvarchar](50) COLLATE Latin1_General_100_CI_AI_KS_SC_UTF8 NULL
)

I inserted test data from various Unicode ranges

INSERT INTO [dbo].[UTF8Test] ([VarcharText],[VarcharTextSC],[VarcharUTF8],[NVarcharText],[NVarcharTextSC],[NVarcharUTF8])
    VALUES ('a','a','a','a','a','a')
INSERT INTO [dbo].[UTF8Test] ([VarcharText],[VarcharTextSC],[VarcharUTF8],[NVarcharText],[NVarcharTextSC],[NVarcharUTF8])
    VALUES ('ö','ö','ö',N'ö',N'ö',N'ö')
-- U+56D7
INSERT INTO [dbo].[UTF8Test] ([VarcharText],[VarcharTextSC],[VarcharUTF8],[NVarcharText],[NVarcharTextSC],[NVarcharUTF8])
    VALUES (N'囗',N'囗',N'囗',N'囗',N'囗',N'囗')
-- U+2000B
INSERT INTO [dbo].[UTF8Test] ([VarcharText],[VarcharTextSC],[VarcharUTF8],[NVarcharText],[NVarcharTextSC],[NVarcharUTF8])
    VALUES (N'𠀋',N'𠀋',N'𠀋',N'𠀋',N'𠀋',N'𠀋')

then selected the lengths and data lengths of each text field

SELECT TOP (1000) [Id]
    ,[VarcharText],[VarcharTextSC],[VarcharUTF8]
    ,[NVarcharText],[NVarcharTextSC],[NVarcharUTF8]
FROM [test-sc].[dbo].[UTF8Test]
SELECT TOP (1000) [Id]
    ,LEN([VarcharText]) VT,LEN([VarcharTextSC]) VTSC
    ,LEN([VarcharUTF8]) VU
    ,LEN([NVarcharText]) NVT,LEN([NVarcharTextSC]) NVTSC
    ,LEN([NVarcharUTF8]) NVU
FROM [test-sc].[dbo].[UTF8Test]
SELECT TOP (1000) [Id]
    ,DATALENGTH([VarcharText]) VT,DATALENGTH([VarcharTextSC]) VTSC
    ,DATALENGTH([VarcharUTF8]) VU
    ,DATALENGTH([NVarcharText]) NVT,DATALENGTH([NVarcharTextSC]) NVTSC
    ,DATALENGTH([NVarcharUTF8]) NVU
FROM [test-sc].[dbo].[UTF8Test]

Select Lengths.png

I was surprised to find that the old mantra “a VARCHAR only stores single byte characters” needs to be revised when using UTF8 collations.

Table data only

Note that only table columns are associated with collations, but not T-SQL variables, as you cannot declare a collation on a variable

SELECT @VarcharText = [VarcharText],@NVarcharText = [NVarcharText]
FROM [test-sc].[dbo].[UTF8Test]
WHERE [Id] = 4
SELECT @VarcharText, Len(@VarcharText), DATALENGTH(@VarcharText)
    , @NVarcharText, Len(@NVarcharText), DATALENGTH(@NVarcharText)

SELECT @VarcharText = [VarcharTextSC], @NVarcharText = [NVarcharTextSC]
FROM [test-sc].[dbo].[UTF8Test]
WHERE [Id] = 4
SELECT @VarcharText, Len(@VarcharText), DATALENGTH(@VarcharText)
    , @NVarcharText, Len(@NVarcharText), DATALENGTH(@NVarcharText)

SELECT @VarcharText = [VarcharUTF8], @NVarcharText = [NVarcharUTF8]
FROM [test-sc].[dbo].[UTF8Test]
WHERE [Id] = 4
SELECT @VarcharText, Len(@VarcharText), DATALENGTH(@VarcharText)
    , @NVarcharText, Len(@NVarcharText), DATALENGTH(@NVarcharText)

 

Select Variable Lengths.png

 

VS Solution Dependency Visualizer supports .NET Core Project Files

When I tried to analyze an unknown Visual Studio solution in my tool VS Solution Dependency Visualizer, I noticed that the project dependencies were not analyzed correctly. In fact, the dependencies were completely ignored.

A bit of research turned up that the solution in question was partly migrated to .NET Core, and the .csproj project files have a slightly different structure than good old .Net project files.

First, the classic .Net project files use a namespace

<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

whereas .NET Core project files come without XML namespace, and have the root attribute Sdk:

<Project Sdk="Microsoft.NET.Sdk">

Also, files are not compiled if they are explicitly contained in the .csproj file, but rather by a rule-based Default Compilation mechanism which decides whether files in or below the project directory are compiled, embedded, or ignored.

For a detailed documentation of .NET Core project files, see MS docs or the dotnet GitHub docs.

The latest version of VS Solution Dependency Visualizer with support for .NET Core projects is available for download here.

BTW: In previous versions of this tool, I tried to register VS Solution Dependency Visualizer as an External Tool in all installed versions of Visual Studio.

However, things change all the time, and VS 2013 was the first version that did not support registration via registry, and with VS 2017 things got even more different, and finally I had to give up.

Instructions on how to add VS Solution Dependency Visualizer as an External Tool in your version of Visual Studio can be found in the readme file that comes with the installer.

Handling the “Size Limit” in AD Queries

I created a small tool to mirror AD data into an SQL Server database. The AD queries essentially looked like this

var conn = new ADODB.Connection();
conn.Open("Provider=ADsDSOObject", "", "", 0);

const string where = "objectCategory='group' ";      

var qry = string.Format(@"SELECT objectCategory, displayName, [more attributes]
FROM 'LDAP://{0}/{1}' 
WHERE {2}", server, start, where);

object recs;
var rs = conn.Execute(qry, out recs, 0);

for (; !rs.EOF; rs.MoveNext())
{
    // ... process record
}

The attributes available in AD have been taken from here.

The code worked fine for many months, until one day it threw an exception:

System.Runtime.InteropServices.COMException (0x80072023): The size limit for this request was exceeded.
at ADODB.RecordsetClass.MoveNext()

It turned out that the query result suddenly exceeded 3000 records, which may or may not be a magic or configurable limit of fetches for a single AD query – probably also including the number of records each fetch returns. Who knows.

Thanks goes to the internetz, which provided me with a solution which now fetches more than 3000 records. Just replace the conn.Execute() part with

var cmd = new ADODB.Command {ActiveConnection = conn, CommandText = qry};
cmd.Properties["Page Size"].Value = 10000;
cmd.Properties["Timeout"].Value = 30;
cmd.Properties["Cache Results"].Value = false;

Fixing PDFSharp hangs

To analyse a couple of PDF files whether they contain only images, I used the latest release build of PDFsharp, version 1.32.

However, when processing a certain file (of unknown origin) using code found in an SO answer

public static IEnumerable ExtractText(this PdfPage page)
{       
    var content = ContentReader.ReadContent(page);      
    var text = content.ExtractText();
    return text;
}   

the ExtractText() function simply would not return.

I upgraded to the most current build 1.50 beta 3, included the source in my project, and ran it in Debug mode, where execution halted in the file PDFsharp\src\PdfSharp\Pdf.Content\CParser.cs line 163 failing an assertion:

#if DEBUG
    default:
        Debug.Assert(false);
        break;
#endif

Without digging too deep into the analysis of PDF files, it was clear that the PDF contained a CSymbol that is not being handled by the library, and thus (most likely) ended up in an infinite loop inside CParser.ParseObject().

I fixed this by replacing the Debug.Assert statement with

        throw new Exception("unhandled PDF symbol " + symbol);

which fixed the situation for me.

Unit Testing ASP.Net MVC 5 Controllers using Rhino Mocks

Unit Testing ASP.Net MVC Controllers is not as straight-forward as you may thing, because as you get started, you face questions such as: how to create an HttpContext, an HttpContextBase, should you prefer mocking, and which mocking framework, etc.

After a day of googling and tinkering, I finally came up with a working solution using Rhino.Mocks and MvcContrib, both downloaded via nuget. To analyze Owin black magic, I used the source code repositories of Katana on Codeplex and symbolsource.

[TestClass()]
public class HomeIndexControllerTests
{
  [TestMethod()]
  public void HomeIndexTest()
  {

The authentication routine uses async calls, such as FindByNameAsync(). I tried to declare the test method as public async Task, but after starting the test, the IDE claimed it’s busy, but did not execute the test.

Without Task.Run(), the async method would fail throwing an AggregateException.

I found the work-around to encapsulate the whole test inside a Task.Run():

    Task.Run(() =>
    {
      var builder = new TestControllerBuilder();

First, we mock the ApplicationUserManager

      var us = MockRepository.GenerateStub<IUserStore<AppUser>>();
      var aum = MockRepository.GenerateStub<ApplicationUserManager>(us);

My class for storing users in the user database is called AppUser. The variable appuser is set to null for unauthenticated requests, and contains a valid object if authenticated:

      AppUser appuser = null;
      //appuser = new AppUser { .... values .... };  // uncomment for authenticated request

These stubs return the defined user object:

      us.Stub(u => u.FindByNameAsync(""))
        .IgnoreArguments()
        .Return(Task.FromResult(appuser));
      aum.Stub(m => m.FindByNameAsync((ClaimsIdentity)null, ""))
        .IgnoreArguments()
        .Return(Task.FromResult(appuser));

Create the Owin context and register the user manager:

      var owin = new OwinContext();
      owin.Set(aum);

Use the builder to create the controller under test:

      var controller = builder.CreateController<C.Home.IndexController>();

To test unauthenticated requests, we check whether the class or the method have an [Authorize] attribute

      if (appuser == null)
      {
        var type = controller.GetType();
        var attributes = type.GetCustomAttributes(typeof(AuthorizeAttribute), true);
        var methodInfo = type.GetMethod("Execute" 
          /*, new Type[] { ....parameter types.... } */
        );
        var methodAttributes = methodInfo.GetCustomAttributes(typeof(AuthorizeAttribute), true);
        Assert.IsFalse(attributes.Any() || methodAttributes.Any(), "Unauthorized request not allowed");
      }

Next, we wire up the HttpContext and Owin, and set the request’s cookies:

      var context = controller.HttpContext; 
      // the context is already mocked, so we can stub() it

      var dict = new Dictionary<object, object>();
      dict.Add("owin.Environment", owin.Environment);
      context.Stub(c => c.Items).Return(dict);
      var request = controller.Request;
      request.Stub(r => r.Cookies).Return(new HttpCookieCollection());

Next, we create a principal from the user object (see here for setting the name claim)

      var user = new GenericPrincipal(
        new ClaimsIdentity(
          new Claim[] { 
            new Claim(ClaimTypes.Name, appuser == null ? "" : appuser.UserName) }),
        new string[0]);
      context.User = user;

Then we create a fake HttpContext which matches the mocked HttpContextBase object created above:

      HttpContext.Current = new HttpContext(
        new HttpRequest("", "http://tempuri.org", ""),
        new HttpResponse(new StringWriter())
      );
      HttpContext.Current.Items["owin.Environment"] = owin.Environment;
      if (appuser != null)
        HttpContext.Current.User = user;

We are now ready to execute the controller method

      builder.InitializeController(controller);
      var result = controller.Execute();

And check the results: if the request is successful, the result is typically a ViewResult.

      Assert.IsInstanceOfType(result, typeof(RedirectToRouteResult));
      Assert.IsInstanceOfType(result, typeof(ViewResult));

We can now test the result values, and end the Task definition:

    }).GetAwaiter().GetResult();
  }
}