Open Data

When I developed my YuJisho online dictionary web application, I was looking for freely available fonts and dictionary data related to CJK languages.

For my dbscript database schema management application, I tried to find as many database schema samples as possible to test the application against.

There is a lot of data (raw, processed and visualized) available on the Internet, but occasionally it is hard to find. This raised the idea of providing a collection of references to free data sets on the web like the Guardian Data Store, and I was thinking about a platform to provide such links.

Now news is out that data.gov plans to release its platform as open source software (GitHub), but the code is still labeled as alpha. (The data.gov HTML says it is based on Socrata, which also provides lots of links to open data.)

Let me know what your experience is with OpenData or similar platforms.

Evolving Architecture for Legacy Applications

I started developing database applications before the idea of using data access layers or ORMs became mainstream. If you happened to develop under Delphi or Visual Studio in the early days, your application architecture may look something like this:

There is a database (or any kind of data store, such as files or XML or whatever), and an application that operates on the data.

Any change to the database structure causes a mess in the application code:

  • Where was a now-deleted table or field accessed or processed?
  • Which views and procedures are accessed by the code?
  • Some business logic is implemented as stored procedures, some as application code.

Time to separate implicit database access from application code (I’m looking at you, DataGrid, GridView and FormView with your magic DataSources!)

Enter the data access layer, which provides the application code (in my case, C#) a type-safe interface to the database:

This gets rid of the data usage and access problem: changes to the database structure are mirrored into the data access layer, and will cause compiler errors if a table or field has been removed or its data type has changed.
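
To make the idea concrete, here is a minimal sketch of such a type-safe interface. The Customer table, the CustomerRecord class and the ICustomerDataAccess interface are made-up names for illustration, and the in-memory class only stands in for real database code so that the sketch runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data contract mirroring a "Customer" table.
// If a column is removed or retyped, this class changes and callers stop compiling.
public class CustomerRecord
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime CreatedOn { get; set; }
}

// The only way application code reaches the Customer table.
public interface ICustomerDataAccess
{
    CustomerRecord GetById(int id);
    IList<CustomerRecord> FindByName(string namePart);
    void Save(CustomerRecord record);
}

// In-memory stand-in so the sketch runs without a database;
// a real implementation would issue SQL or call stored procedures here.
public class InMemoryCustomerDataAccess : ICustomerDataAccess
{
    private readonly Dictionary<int, CustomerRecord> rows =
        new Dictionary<int, CustomerRecord>();

    // Returns null if no row with the given id exists.
    public CustomerRecord GetById(int id)
    {
        CustomerRecord record;
        return rows.TryGetValue(id, out record) ? record : null;
    }

    // Case-insensitive substring search, ordered by id.
    public IList<CustomerRecord> FindByName(string namePart)
    {
        return rows.Values
            .Where(r => r.Name != null &&
                        r.Name.IndexOf(namePart, StringComparison.OrdinalIgnoreCase) >= 0)
            .OrderBy(r => r.Id)
            .ToList();
    }

    // Inserts the record or overwrites an existing one with the same id.
    public void Save(CustomerRecord record)
    {
        rows[record.Id] = record;
    }
}
```

Application code written against ICustomerDataAccess never contains table or column names as strings, so a renamed field shows up as a compiler error rather than as a runtime surprise.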

Still, the problem of implementing business functions in the database or in code remains, and the application needs to be aware of where the function is located.

The business layer encapsulates business logic wherever it is implemented, calling business functions either in C# (object-oriented) or in the database (taking advantage of set-based operations or database-specific functionality). The application exchanges data with the business layer using the data contract classes.
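
A rough sketch of what this could look like follows. All names (OrderLine, IOrderDatabase, OrderBusiness, FakeOrderDatabase) are invented for this example, and the fake database class merely simulates a stored procedure so the code can run without a server.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data contract exchanged between application and business layer.
public class OrderLine
{
    public int Quantity { get; set; }
    public decimal UnitPrice { get; set; }
}

// Hypothetical database-side entry point, e.g. a wrapper around a stored procedure.
public interface IOrderDatabase
{
    int ArchiveOrdersOlderThan(DateTime cutoff);
}

// Stand-in for the database: pretends each stored order date is one row.
public class FakeOrderDatabase : IOrderDatabase
{
    private readonly List<DateTime> orderDates;

    public FakeOrderDatabase(IEnumerable<DateTime> orderDates)
    {
        this.orderDates = orderDates.ToList();
    }

    // Removes all rows older than the cutoff and returns how many were removed.
    public int ArchiveOrdersOlderThan(DateTime cutoff)
    {
        return orderDates.RemoveAll(d => d < cutoff);
    }
}

// The application only sees this class and does not know where each function runs.
public class OrderBusiness
{
    private readonly IOrderDatabase database;

    public OrderBusiness(IOrderDatabase database)
    {
        this.database = database;
    }

    // Implemented in C#: a simple calculation on objects.
    public decimal CalculateTotal(IEnumerable<OrderLine> lines)
    {
        return lines.Sum(l => l.Quantity * l.UnitPrice);
    }

    // Implemented in the database: a set-based operation over many rows.
    public int ArchiveOldOrders(DateTime cutoff)
    {
        return database.ArchiveOrdersOlderThan(cutoff);
    }
}
```

If ArchiveOldOrders were later moved from a stored procedure into C#, or CalculateTotal into the database, the calling application code would not have to change.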

As I mainly develop web applications, I notice that the object-oriented model occasionally breaks down in the front-end, since HTML is basically all-strings, and you need to parse and re-create your data access objects based on HTML data. Further, if you need to let your users edit hierarchically-structured data, you’ll soon get lost retrieving all your hierarchical business objects.

A user interface functions layer should handle the complexity of translating business objects to and from user interface objects:

Once you use a UI functions layer, it’s up to your aesthetic preferences whether you let the application still access the business data contracts, or pass everything through the UI data contracts. It is one additional step of mapping data, and you have been warned 😉
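
As an illustration of that mapping step, the following sketch converts a business object to all-string form fields and back. The Product class, the field names and the ProductUiFunctions helper are assumptions made for this example only; a real UI layer would also have to deal with validation messages and hierarchical data.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical business data contract.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

// Hypothetical UI functions: translate between Product and all-string form fields.
public static class ProductUiFunctions
{
    // Business object -> strings, ready to be rendered into HTML inputs.
    public static Dictionary<string, string> ToFormFields(Product product)
    {
        return new Dictionary<string, string>
        {
            { "id", product.Id.ToString(CultureInfo.InvariantCulture) },
            { "name", product.Name },
            { "price", product.Price.ToString("0.00", CultureInfo.InvariantCulture) }
        };
    }

    // Strings posted by the browser -> business object.
    // Returns false and sets product to null if a field is missing or cannot be parsed.
    public static bool TryFromFormFields(IDictionary<string, string> fields, out Product product)
    {
        product = null;
        string idText, name, priceText;
        int id;
        decimal price;

        if (!fields.TryGetValue("id", out idText) ||
            !fields.TryGetValue("name", out name) ||
            !fields.TryGetValue("price", out priceText))
        {
            return false;
        }

        if (!int.TryParse(idText, NumberStyles.Integer, CultureInfo.InvariantCulture, out id) ||
            !decimal.TryParse(priceText, NumberStyles.Number, CultureInfo.InvariantCulture, out price))
        {
            return false;
        }

        product = new Product { Id = id, Name = name, Price = price };
        return true;
    }
}
```

The sketch uses the invariant culture in both directions so that a value written by ToFormFields can always be read back by TryFromFormFields, regardless of the server's regional settings.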

Notice that, although I have a strong background in MS development tools (Visual Studio, SQL Server), I tried to keep the ideas and diagrams as technology-agnostic as possible. It does not matter whether your business functions reside in a web service (of any kind) or just in a separate library (assembly), or what database (if any) you use.

Fan Spam

All these comments originated from 109.230.246.23 (utrace, dnsstuff) and were posted on my Products page:

What an exciting article, preserve writing companion

Sweetheart, this site is without a doubt fabolous, i simply like it

hi there, your website is wonderful. We do thank you for job

Sweetheart, this amazing site can be fabolous, i recently fantastic

How much of an intriguing posting, keep producing better half

How much of an important article, continue to keep creating mate

What an unique posting, continue to keep crafting special someone

… accidentally, of course, naming a web address ending in .pl.

See this WordPress Support page on how to disable comments on your Pages, if you are hit by similar spam.

The Unicode Video

I just came across this video on YouTube showing every displayable (with some restrictions) character in the Unicode BMP. (1 character per frame)

Next, the Unicode code point of the day with Wikipedia links to corresponding alphabet or language, which of course reminds me of the series Every character has a story.

Update 11/04/18:

BabelStone comments on the video and hosts a page called Unicode 6.0 Slide Show implemented in JavaScript. Warning: Since the browser displays the Unicode characters, you need to have the required fonts installed on your machine.

Embedding Images in HTML using C#

In HTML pages, images are typically linked rather than embedded using the

<img src="[protocol]://[url]" />

tag. If you want to embed the image data, for example to handle just a single file containing text and images, you need to replace the URL in the src attribute with the Base64-encoded binary content of the image, using the data: URI scheme:

<img src="data:image/jpeg;base64,[data]">

To encode an image file in Base64, use the following function (originally found here):

// Requires: using System; using System.IO;
// Note: the MIME type is hard-coded to PNG; adjust it for other image formats.
string MakeImageSrcData(string filename)
{
  // File.ReadAllBytes opens the file, reads it completely and closes it again
  byte[] filebytes = File.ReadAllBytes(filename);
  return "data:image/png;base64," +
    Convert.ToBase64String(filebytes, Base64FormattingOptions.None);
}

This function can now be used to render an image in either aspx/ascx:

<img src="<%=MakeImageSrcData(@"c:\path\to\my.png") %>" />

or C#

Response.Output.WriteLine("<img src=\"" + 
    MakeImageSrcData(@"c:\path\to\my.png") + "\"/>");

The rendering result depends on the browser, though, as Wikipedia describes:

  • Firefox and Chrome render the embedded images correctly
  • Internet Explorer 7 (Vista) and 8 (Windows 7) render only some images because of a 32kB limit
  • Word 2007 only renders image placeholders

This is too bad, since I originally intended to generate WordHtml and include pictures directly in HTML.
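
Since the function above always emits image/png, and the 32kB limit silently drops larger images, a slightly extended helper might be useful. The following is only a sketch: the DataUri class and its method names are my own invention, the extension list is deliberately short, and I am assuming the limit applies to the length of the whole data: URI.

```csharp
using System;
using System.IO;

public static class DataUri
{
    // Assumption: the 32kB limit is counted in characters of the complete URI.
    public const int IeLimit = 32 * 1024;

    // Maps a few common file extensions to MIME types; unknown extensions throw.
    public static string GetMimeType(string filename)
    {
        switch (Path.GetExtension(filename).ToLowerInvariant())
        {
            case ".png": return "image/png";
            case ".jpg":
            case ".jpeg": return "image/jpeg";
            case ".gif": return "image/gif";
            default: throw new NotSupportedException("Unknown image type: " + filename);
        }
    }

    // Builds the data: URI from raw bytes and an explicit MIME type.
    public static string FromBytes(byte[] bytes, string mimeType)
    {
        return "data:" + mimeType + ";base64," + Convert.ToBase64String(bytes);
    }

    // Reads the file and derives the MIME type from its extension.
    public static string FromFile(string filename)
    {
        return FromBytes(File.ReadAllBytes(filename), GetMimeType(filename));
    }

    // True if the URI is short enough for the assumed limit.
    public static bool FitsIeLimit(string dataUri)
    {
        return dataUri.Length <= IeLimit;
    }
}
```

A page could call DataUri.FromFile for each image and fall back to a regular linked image whenever FitsIeLimit returns false.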

Deleting Controls (Checkboxes, Hidden, etc.) in Excel 2007

If you ever happened to copy web contents, such as tables or forms, into Excel (2007), you know it’s not straightforward to delete the form controls, such as checkboxes, hidden fields, or edit controls.

The function to delete these controls is located in the Developer tab which may not be visible in the default installation.

To display the Developer tab, click on the Office Button (“Jewel”), select Excel Options, Popular, and check the “Show Developer tab in Ribbon” check box.

Once the Developer tab is visible and selected, enable “Design Mode”.

If you move the mouse over a checkbox or any other control in this mode, the cursor changes to a cross of four arrows. Click to select the control, then press the Delete key to remove it.

Saving Embedded Pictures in Outlook 2007

Outlook 2000 used to provide a simple way to save an embedded image from an email: simply right-click on the image, select Save…, and you’re done.

In Outlook 2007, you can only Copy an image to the clipboard.

One widely circulating solution involves writing (or pasting from the source) a macro which is executed by pressing a customized button.

I just discovered a different solution which does not involve code or other tools:

  • Open the message in a window (i.e. not only in the preview pane)
  • In the Actions tab, select the Other Actions dropdown
  • Select View in Browser

This opens your default browser, which displays the original HTML email and automatically loads the embedded images. Use your browser’s menus and commands to save the images.

Missing Procedure Entry Point in TortoiseSVN

I tried to add version control on a machine with a quite dated installation of TortoiseSVN (1.5.5 Build 14361 of Oct 2008), but could not start any of the programs in the TortoiseSVN Programs menu due to the error message:

The procedure entry point ?_Xbad@tr1@std@@YAXW4error_type@regex_constants@12@@Z
could not be located in the dynamic link library MSVCP90.dll

I checked and found a library MSVCP90.dll in \windows\system32 (version 9.0.21022.8).

On a different PC, where I am using TortoiseSVN regularly, I found a DLL with the same name under \Windows\WinSxS\x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.30729.1_x-ww_6f74963e\, showing version number 9.0.30729.1.

Back on the original PC, the same DLL with the same version is found in the same directory. Surprise.

Well, my conclusion is that different versions of the MS C++ Runtime library install in different places, probably depending on the application installing the libraries.

My problem was solved by copying the most recent version of the 3 DLLs (msvcm90, msvcp90, msvcr90) from the WinSxS subdirectory to \Program Files\TortoiseSVN\bin. Works!

Copy Tab URLs in Firefox

I have no idea whether this function is common knowledge, but yesterday I discovered that copying the URLs of all open tabs of a window is quite simple in Firefox (3.5).

First, choose Bookmarks, Bookmark All Tabs... and save the tabs into a bookmark folder.

Next, locate the new folder in your bookmarks, right-click and select Copy.

The effect is that the clipboard now holds all the URLs contained in the folder.

If you paste as plain text (e.g. Notepad), only the URLs will show. If you paste as rich text (Word, Outlook), you get hyperlinks.

Searching the intertubes brings up a lot of add-ons that might save you a click or two, but it’s good to know that the function is essentially built-in.