ExpectOne and ExpectOneOrNull vs. First and FirstOrDefault

November 14, 2009

Linq provides the extension methods First() and FirstOrDefault(), which return the first object in a typed IEnumerable<T> collection, or default<T> if the collection is empty.

Sometimes you don’t only want to use the first element, but require the collection to have at most 1 element.

The methods ExpectOne and ExpectOneOrNull implement the requirement to have exactly 1 (or at most 1) element in an IEnumerable collection:

public static class LinqExtensions
{
    /*
     * ExpectOne: returns exactly one record. if 0 or >1 found then exception
     * ExpectOneOrNull: returns exactly one record or null. if >1 found then exception
     */

    public static TSource ExpectOne(this IEnumerable source)
    {
        TSource result = default(TSource);
        bool b = true;
        foreach (TSource o in source)
        {
            if (b)
                result = o;
            else
                throw new ExpectOneException("too many elements in result set. one element expected.");
            b = false;
        }
        if (b)
            throw new ExpectOneException("no element in result set. one element expected.");
        return result;
    }

    public static TSource ExpectOneOrNull(this IEnumerable source)
    {
        bool b = true;
        TSource result = default(TSource);
        foreach (TSource o in source)
        {
            if (b)
                result = o;
            else
                throw new ExpectOneException("too many elements in result set. one or no element expected.");
            b = false;
        }
        return result;
    }
}

A separate exception class makes it easier for the application code (and developers!) to handle cases where more than 1 record is returned, indicating an error in the query.

If you want to replace for calls to First(<condition>) by ExpectOne(), replace First(<condition>) by Where(<condition>).ExpectOne().

As ExpectOne() contains compiled code, you cannot use these methods in Compiled Queries.

This may seem as a disadvantage. In reality it makes program code simpler because compiled queries are now required to always return an IEnumerable<T>, and the calling code needs to specify the expected result set size of the query.

It may seem desirable to replace all occurrences of First() and FirstOrDefault() by ExpectOne() and ExpectOneOrNull(). There are cases where original Linq methods are still necessary and valid:

Any() to check whether a collection is not empty, i.e. elements exist that match criteria

First() and FirstOrDefault() to find the first element in a sorted collection (e.g. the greatest element less than the original value)

 


Profiling .Net Applications

November 1, 2009

I am working on replacing the current (ugly) SQL parser of dbscript by a nice grammar-based parser as outlined in previous posts. That part of the project has been idle for some months now, but it’s time to integrate a grammer-based parser.

During tests I found that the new parser takes several minutes to parse an SQL file, whereas the old one processed the same file within seconds. Where is this time spent?

Time to profile the code. There are a couple of .Net profilers around, also Microsoft also provides a product called CLR Profiler which can be downloaded here.

When you start the CLRProfiler.exe application, both a console window and a WinForm window open. (This may look unusual, but hey, it’s a developer tool, and it is the functionality that counts)

Click on Start Application to select the executable you want to profile. The application starts up and you perform the tasks to profile just as if you started the application directly. (It just takes a little longer than usual, because the profiler does its work in the background).

When you’re done, close the application. (You may also choose the profiler’s Kill Application button). The profiler will now compute its magic numbers while message “Progress loading (filename).log” is displayed. After computation has finished, a Summary window displays memory and garbage collection statistics.

Click on the Allocated bytes Histogram button, and find the classes with most objects instantiated during the run.

For my purposes, the View Function Graph command was the most useful. It shows the percentage of time spent by each called function relative to the calling function.

It is not obvious from the UI, but each box representing a function can be drilled down by double-clicking it, showing again the percentages of the sub functions. (My guess is that only functions of the same assembly are shown in one graph, and the drill-down is necessary to switch between assemblies).

Anyway, I like that tool, as it showed me in just two (!!) graphs where the code spent its time, and as it turned out, these were just debugging routines (create XML document, retrieve its OuterText property, write to Console).


Parsing Culture-Invariant Floating-Point Numbers

October 15, 2009

I just came across this non-obvious behavior of double.TryParse:

We develop on German Windows, but the input field for float values should be intelligent to recognize both German (“,”) and international (“.”) notations for the decimal point.

Of course the functions to call to parse a string s into a double value are:

double f; string s = "double value as string";

double.TryParse(s, System.Globalization.NumberStyles.Float,
    System.Globalization.CultureInfo.InvariantCulture, out f)

double.TryParse(s, out f)

But the results were surprising

InvariantCulture default culture
string value return (bool) output (double) return (bool) output (double)
“100,03″ false true 100.03
“100.03″ true 100.03 true 10003.0

My reasoning was, that if current (UI) culture could not parse the string, I let Invariant Culture try to parse it.

It turns out, however, that if the current culture simply removes all characters it does not recognize as valid input string for float numbers, and then parses the string, with the undesired result as seen above.

So the right way seems to try with invariant culture first, then use current culture:

double f;
double? Result;
if (double.TryParse(s, System.Globalization.NumberStyles.Float,
        System.Globalization.CultureInfo.InvariantCulture, out f))
    Result = f;
else if (double.TryParse(s, out f))
    Result = f;
else
    Result = null;

Reading LONG Oracle columns in .Net

August 24, 2009

If you use ODP.Net to access an Oracle database from .Net, and select LONG text columns, the DataReader’s GetString() method will return an empty string.

To work around this behavior, you need to set the InitialLONGFetchSize to a non-zero value, and use the OracleDataReader’s GetOracleString() method, and use ToString() to convert the result into a .Net string.

Another possibility is to set InitialLONGFetchSize to -1, so that GetString() works as expected.

The linked documentation also explains how to access LONG RAW data.


C# operators ‘is’, ‘as’, and type casts

July 19, 2009

Background information: starting with version 0.96, dbscript was capable of dealing with more than 1 database engine (i.e. other than MSSQL), and the question I had to solve was how the user interface would reflect the different capabilities of the database modules.

DBEngine
+-- MSSql.DBEngine
+-- Oracle.DBEngine
+.. other derived classes

For example, a database module might implement the “Import”, “Upload”, or “Generate Script” function, but how does the module let its capabilities be known to the rest of the code.

There are several possibilities:

  • Boolean flags (e.g. CanImport) and an implementing method (DoImport())
  • [Flags] enumeration and corresponding implementing methods
  • Interfaces

Both bools and enumerations are easy to implement, but they don’t scale if you have a lot of features to take care of, and there is no strong link between the information and the implementation.

Using interfaces, the capabilities problem is solved by declaring and implementing interfaces (such as IDBEngineImport) and checking whether the module implements an interface. The only question remaining is: How does it perform?

Well, other people have asked that question before, and others have answered it:

C# ‘is’ operator performance (stackoverflow)

What are the performance characteristics of ‘is’ reflection in C#? (stackoverflow)

Prefix-casting versus as-casting in C#

Type casting impact over execution performance in C#

and the answer seems to be: It does not really matter.

However the answers behind above links are the same:

  • If you just want to check for an interface implementation, use “is”.
  • If you need the type-cast object, use “as” and check for null.
  • Don’t use () type cast operators if you don’t know if the cast will always be successful, as exceptions have a performance penalty.

Multi-Value IComparer

July 18, 2009

In .Net, SortedList<>’s cannot contain multiple Key values. This may be caused by the fact that the SortedList index operator [] needs to uniquely identify a list entry. However the restriction is annoying if you just want to sort some arbitrary set of data.

I came across this post Advanced IComparer // Sorting on Multiple Values on creating custom comparers implementing IComparer<T>. I especially liked Simon’s source code he posted in his comment.

However I had to fix 2 problems to make it work in my code:

First was the mishandling of ASC/DESC: when you defined the properties to be compared, you needed to add “ASC” or “DESC” after the property name. “ASC” should be the default, as in SQL SELECTs, and should not be required. I changed args.Length>0 to args.Length>1 to avoid an out-of-range error.

The second problem was more difficult to solve, as it involved the reflection API. The original code supposed all comparable values of the objects to be Properties of the respective class. I wanted the comparison also to include member variables to avoid requiring to declare { get; set; } for every variable.

So I changed the property value handling from

object o1 = x.GetType().GetProperty(prop).GetValue(x, null);

to

pi = x.GetType().GetProperty(prop);
if (pi != null)
    o1 = pi.GetValue(x, null);
else
{
    fi = x.GetType().GetField(prop);
    if (fi != null)
        o1 = fi.GetValue(x);
    else
        throw new Exception("Field or Property " + prop + " not found in " + x.GetType().ToString());
}

which checks the declared names against property names and field (member variable) names.

So here’s the code:

Read the rest of this entry »


Implementing a Feedback form

May 28, 2009

If you want to implement a feedback form for your application or web application without relying on the user configuring mail accounts etc., the simplest solution may be a web service which the application will access to deliver the message.

We would want the user to enter name, email address, and the message. Minimal validation would require the user to enter non-blank values for each field using a RequiredFieldValidator.

Additionally, we can check the email address for formal constraints using regular expressions, but this is quite tricky. Checking the name field for validity seems impossible.

Additionally, the application should also provide its name and version.

Thus, our web service looks something like this:

[WebMethod]
 public string PostFeedback(string Application, string Version,
    string Name, string Email, string Message)

It’s up to you what logic you implement in the web service handler. I chose to check application and version, and send an email to the contact stored for the application.

The web service returns a message to the user notifying successful receipt of the feedback (or an error message) which the invoking application finally displays.


Tools and Libraries for .Net Developers

May 14, 2009

Managed Windows API library for .Net

February 18, 2009

I mentioned accessing native Win32 APIs by .Net some time ago.

Today I came across a Sourceforge project called Managed Windows API, which provides a set of class to access various aspects of Win32. A couple of tools illustrate the use of the library.

I like TreeSizeSharp, a re-implementation of the TreeSize tool I often use if I find there’s not enough free space on my disks (again).


Generating a C# Stored Procedure Wrapper in TSQL

January 23, 2009

I love compiling languages. Whenever you change something that consequently does not fit the rest of the program, something breaks and you get a compiler error. Great!

If you develop database applications, this nice change-it-break-it mechanism stops to work, as the relation between database objects and their usage by a C# application is in no way hard-coded.

Therefore I developed a small piece of T-SQL code which will generate a static C# class which encapsulates all stored procedures in a given database in the form:

public static class MyStoredProc {
    public static SqlCommand MyProc1(SqlConnection conn)
    {
        SqlCommand cmd = new SqlCommand("MyProc1", conn);
        ...
        return cmd;
    }
    ...
}

The strategy is simple: loop through all desired procedures, and for each procedure loop through its parameters twice: once for the declaration, second time for the parameter values.

Create file and class header:

PRINT 'using System;
using System.Data;
using System.Data.SqlClient;
'

PRINT    'namespace ' + @NameSpace
PRINT    '{'
PRINT    'public static class ' + @Class + '
{'

Now we declare the loop for the procedure names (the code uses SQL2000 dictionary objects, which are also present on SQL2005; adaption should be simple):

DECLARE cP CURSOR FOR
SELECT  dbo.sysobjects.name
FROM     dbo.sysobjects
WHERE    (dbo.sysobjects.name NOT LIKE 'dt[_]%')
AND    ... other conditions ...
AND     (dbo.sysobjects.xtype = 'P')
ORDER BY dbo.sysobjects.name

OPEN cP
FETCH cP INTO @Proc

WHILE @@FETCH_STATUS = 0 BEGIN

PRINT '    public static SqlCommand ' + @Proc + '(SqlConnection conn'

Next we declare the cursor for the procedure parameters in the correct order:

DECLARE c CURSOR FOR
SELECT  dbo.syscolumns.name, dbo.systypes.name, dbo.syscolumns.length
FROM     dbo.sysobjects
INNER JOIN dbo.syscolumns ON dbo.sysobjects.id = dbo.syscolumns.id
INNER JOIN dbo.systypes ON dbo.syscolumns.xtype = dbo.systypes.xtype
WHERE    (dbo.sysobjects.name = @Proc) AND (dbo.systypes.name <> N'sysname')
ORDER BY dbo.sysobjects.name, dbo.syscolumns.colid

OPEN c
FETCH c INTO @Column, @Type, @Length

WHILE @@FETCH_STATUS = 0 BEGIN

For parameter declaration, we need to translate the SQL datatypes into C# datatypes. In this example, we use the following mapping:

int -> int?
datetime -> DateTime?
n/varchar -> string
bit -> bool?

PRINT  '        , ' +
    CASE @Type
    WHEN 'int' THEN 'int?'
    WHEN 'datetime' THEN 'DateTime?'
    WHEN 'nvarchar' THEN 'string'
    WHEN 'varchar' THEN 'string'
    WHEN 'bit' THEN 'bool?'
    ELSE '**datatype-not-supported**' END +
    ' ' + SUBSTRING(@Column, 2, LEN(@Column)-1)

All the wrapping method does is to create a SqlCommand using the SqlConnection:

PRINT ' )'
PRINT ' {'
PRINT '   SqlCommand cmd = new SqlCommand("' + @Proc + '", conn);'
PRINT '   cmd.CommandType = CommandType.StoredProcedure;'
PRINT '   cmd.CommandTimeout = 0;'

Then we loop through the parameters again to set the SqlCommand’s Parameters’ values. As shown above, parameter types are mapped to C# Nullable<T> types (question-mark notation), except for strings, as they can have the value null. Thus we have to treat string parameters differently from nullable (in C#) parameters:

OPEN c
FETCH c INTO @Column, @Type, @Length

WHILE @@FETCH_STATUS = 0 BEGIN

    IF @Type='nvarchar' OR @Type='varchar'
        PRINT ' cmd.Parameters.AddWithValue("' + @Column + '", ' +
            SUBSTRING(@Column, 2, LEN(@Column)-1) + ' != null ? (object)' +
            SUBSTRING(@Column, 2, LEN(@Column)-1) + ' : DBNull.Value);'
    ELSE
        PRINT ' cmd.Parameters.AddWithValue("' + @Column + '", ' +
            SUBSTRING(@Column, 2, LEN(@Column)-1) + '.HasValue ? (object)' +
            SUBSTRING(@Column, 2, LEN(@Column)-1) + '.Value : DBNull.Value);'

Finally, return the new SqlCommand:

PRINT '        return cmd;'
PRINT '    }'

Run this and paste the result into your C# application ;)