Visualizing ECDC’s COVID-19 data

The ECDC (European Centre for Disease Prevention and Control) is providing world-wide data on Coronavirus infections and deaths.

The data format changed a couple of times, beginning with date-stamped Excel files (.xlsx), then .csv files.  then the column names  and as of today, no more .xlsx files are provided.

(I initially tried to analyze the .xlsx files with SheetJS, but always got the error message

Cannot find file [Content_Types].xml in zip

which I could not solve, therefore switching from .xlsx to .csv).

As file hosting moved to opendata.ecdc.europa.eu, column names changed (at least in the .csv), and the date has been removed from the filename.

So I started out with a little pen on CodePen.io (data as of March 25), consisting of loading data from .csv, selecting countries of interest, and the kind of data to be displayed in a Chart.js chart:

  • Number of cases, or number of deaths
  • Count per day (as directly from the data files); total sum per day; and change in cases per day, optionally averaged over a number of days

After experimenting with the pen, I exported it and created the full HTML page containing the original pen, and adjusting a couple of things.

Next, I created a Scheduled Task to fetch the latest ECDC data and store the file on the webserver. (Originally I had a problem with fetching the file directly in Javascript because the webserver did not provide a Access-Control-Allow-Origin header)

You can find the most current version of my Covid visualization on my website.

Opening .url Files in Ubuntu

When browsing the web with Chrome for Android, I save the URLs on my Nextcloud server by sharing using the Nextcloud App. Each URL is then stored as a .url file looking like this

[InternetShortcut]
URL=https://devio.wordpress.com/

Today I noticed that those .url files cannot be opened on Ubuntu, i.e. a double-click won’t start a browser with the contained URL.

Instead, I get a an error dialog

Could not display “<HTML page title>.url”.

There is no application installed for “Internet shortcut” files.
Do you want to search for an application to open this file?

No     Yes

Screenshot from 2020-03-22 07-40-58.png

Clicking the Yes button, a toast message appears

mimetype required.png

which you have to click before it disappears, which finally opens the software installer:

unable to find software.png

Not good.

Surprisingly, Firefox does not register itself as an application to handle the .url file extension on Ubuntu. It also does not know that the Windows Firefox would know how to open the file.

More surprisingly, Ubuntu knows that .url files are “Internet shortcut” files, and have the associated MIME type application/x-mswinurl.

So I had to solve two problems:

  • Retrieve the URL stored in a .url file
  • Start Firefox using this URL using Ubuntu’s MIME type handling

Retrieving the URL stored in a .url file

As shown above, a .url file is simply a text file in .ini format. In it’s simplest form, it contains a section [InternetShortcut] with a single Key “URL=”. The key’s value is the URL to navigate to.

With a little help from askubuntu, I figured out the command to extract the URL value

grep -Po 'URL=\K[^ ]+' *.url

Using the result of the grep operation as argument for firefox would look something like this:

firefox `grep -Po 'URL=\K[^ ]+' "$1"`

After a bit of digging, I found how you can manually add MIME type handlers in Ubuntu. Following those instructions, I created a file

/usr/share/applications/mswinurl.desktop

(you need sudo in this directory) with the following content (spoiler: don’t copy this yet!):

[Desktop Entry]
Name=Firefox Shortcut
GenericName=Firefox Shortcut

Type=Application
Exec=firefox `grep -Po 'URL=\K[^ ]+' %U`
TryExec=firefox
MimeType=application/x-mswinurl;
Icon=firefox

However, this did not work as intended, as I got an error message complaining about the backtick `. So, if I cannot have shell operations in the .desktop file, let’s create a batch file

/usr/local/bin/runurl

and place the shell magic there:

firefox `grep -Po 'URL=\K[^ ]+' "$1"` &

Don’t forget to make the batch file executable using

sudo chmod 755 runurl

and reference the runurl script rather than Firefox in /usr/share/applications/mswinurl.desktop:

[Desktop Entry]
Name=Firefox Shortcut
GenericName=Firefox Shortcut

Type=Application
Exec=runurl %U
TryExec=firefox
MimeType=application/x-mswinurl;
Icon=firefox

After creating the file, run

 sudo update-desktop-database

to register the new .desktop file.

Double-clicking a .url file now opens the URL in a new Firefox tab.

SMOscript 0.50

I added a couple of new options to my scripting tool SMOscript:

  • The new -td command-line switch generates two files for tables: a [table].create.sql file containing the CREATE TABLE statement, and a [table].details.sql file containing the CREATE statements all table details (foreign keys, indexes, triggers), thus allowing recreating of a database schema without referential integrity problems
  • The -o (object name) switch semantics has been extended to other commands than script
  • The new -ol command-line switch implements a filter on object names similar to T-SQL LIKE. The filter is applied on the object name, and if not matching, on the combination “schema.objectname”.

The new command dd operates on data diagrams (see my previous post on SQL Server Data Diagrams):

  • Using dd without parameters will list all data diagrams stored in the database.
  • Using dd -o [diagram name] will list all tables contained in the data diagram, even tables that have been DROPped.
  • Using dd -o [diagram name] -dt lists the DROPped tables only.

The latest version of SMOscript is available for download on my download website.

Accessing MS Access Databases from SQL Server

I recently installed WinTV 8.5, and after scanning TV channels, I noticed that editing channel numbers was incredibly cumbersome.

According to the FAQ, it would involve a lot of scrolling, and 4 clicks – *per channel*. I also noticed a couple other peculiarities in the UI, such as the EPG viewer, which made me research where the program’s data is stored.

Actually it’s all in the FAQ: it’s an MS Access database called hcwChanDB_5.mdb residing in a directory set during installation.

To save your channel database go to:

C:\Users\Public\WinTV\Channel Database (Windows 7 and Vista)

C:\Documents and Settings\All Users\Shared Documents\WinTV\Channel Database (Windows XP)

Copy the  hcwChanDB_5 file to a place  for safe keeping. To replace saved Channels just copy HcwChanDB_5 back to Channel Database Dir.

When I double-click the .mdb file, a nice reminder shows up that I only have a Test edition of MS Access, and not the fully activated Professional edition. Essentially this means that I can view the data, but cannot change it.

Time to start up SSMS, link the Access database and browse the data from SSMS. … which turned out to be not so easy.

Linked Server to MS Access via ODBC System DSN

WinTV creates an ODBC System DSN called HCW_mdb, and you can easily create a linked server in SSMS referencing this DSN. Running

exec sp_catalogs 'hcw'

on the linked server returns the path names of the MS Access databases, but using them as catalog names in a SELECT * FROM statement only raises an error message

OLE DB provider “MSDASQL” for linked server “hcw” returned message “[Microsoft][ODBC Microsoft Access Driver] Syntaxfehler (fehlender Operator) in Abfrageausdruck ‘`Tbl1002`.`id` `Col1005`’.”.
Msg 7321, Level 16, State 2, Line 4
An error occurred while preparing the query “SELECT `Tbl1002`.`id` `Col1005`,`Tbl1002`.`service_id` `Col1006`,`Tbl1002`.`preferred_name` `Col1007`,`Tbl1002`.`preferred_number` `Col1008`,`Tbl1002`.`description` `Col1009`,`Tbl1002`.`audio_lang` `Col1010`,`Tbl1002`.`subtitle_lang` `Col1011`,`Tbl1002`.`audio_mode` `Col1012`,`Tbl1002`.`irblaster_data` `Col1013`,`Tbl1002`.`channel_type` `Col1014`,`Tbl1002`.`language` `Col1015`,`Tbl1002`.`genre` `Col1016`,`Tbl1002`.`duplicate_channel_id` `Col1017`,`Tbl1002`.`duplicate_priority` `Col1018`,`Tbl1002`.`blocked_flag` `Col1019`,`Tbl1002`.`last_used` `Col1020`,`Tbl1002`.`epg_source` `Col1021`,`Tbl1002`.`preferred_minor` `Col1003` FROM `D:\data\WinTV\Channel Database\hcwChanDB_5`.`hcwChannels` `Tbl1002`” for execution against OLE DB provider “MSDASQL” for linked server “hcw”.

However, you can query the database using OPENQUERY()

select * from openquery(hcw, 'select * from hcwChannels')

where hcwChannels is one of the tables in the MS Access database.

Linked Server to MS Access via OLEDB provider

I kept searching, and fortunately found a forum post on social.MSDN in which the last comment provided a hint to another workable solution:

Note: The same issue is present using Sql Server 2012 RC0, and still forcing the use of the deprecated MSDASQL provider (ms ole db provider for odbc drivers) as a workaround.

Update: besides using msdasql, it works, in some cases, to do the following before creating the linked server:
right-click and select “properties” for “Microsoft.ACE.OLEDB.12.0” under Server objects, Linked servers, Providers in SSMS.  Check “nested queries” and “allow inprocess.”   Then create the linked server for access accdb using the ACE.OLEDB.12.0 provider, with the access filename under data source.

So I checked the OLEDB provider settings and changed them accordingly. Then I created a linked server using Microsoft.ACE.OLEDB.12.0 referencing the hcwChanDB_5.mdb database.

While I could not enumerate the catalogs of the new linked server, the database could now be queried using standard SELECT statements such as

select * from hcwole...hcwChannels

I could now create a SYNONYM for each table inside the MS Access database.

See my WinTV repository on GitHub for the SQL files.

 

Anonymizing Personal Data

Databases used to collect all the data ever entered, and running out of disk space was solved by adding some new storage device.

Then came GDPR and companies storing personal data (or personally identifyable data) suddenly had to consider Data Deletion Policies and Data Retention Policies.

The easiest way is of course to delete the records storing obsolete personal data. But sometimes (read: almost always) it is not possible to delete the record, because the mere existence of a record may be important for historical or statistical reasons (e.g. the query “number of purchases per time interval and region” would almost certainly refer to a person and their location).

So it’s more realistic that certain data of a person is kept, some is erased, and the personal identifiers (first name, last name) are simply anonymized or pseudonymized.

What would be easier to replace the name fields with an arbitrary combination of letters?

To make the pseudonyms more beautiful, it would be nice if we considered a typical syllable structure, the most basic would be C-V-C (consonant vowel consonant), or any extension thereof: C-V-C-V-C, C-V-C-C-V, etc.

So we define the letters for consonants and vowels as

DECLARE @cons NVARCHAR(100) SET @cons = 'bcdfghjklmnpqrstvwxyz'
DECLARE @vow NVARCHAR(100) SET @vow = 'aeiouäöüaeiouaeiou'

and select a random consonant or vowel using the expressions

SUBSTRING(@cons, (ABS(CHECKSUM(NEWID())) % LEN(@cons))+1, 1)
SUBSTRING(@vow, (ABS(CHECKSUM(NEWID())) % LEN(@vow))+1, 1)

We use the function ABS(CHECKSUM(NEWID())) to generate random numbers rather than RAND() because this expression is evaluated for every record in a multi-record statement, rather than once. To illustrate the difference, compare the result of the query

SELECT RAND(), ABS(CHECKSUM(NEWID())) FROM sys.objects

Note that by repeating a letter inside either strings you can change the relative frequency of that letter in the resulting name.

By prepending ‘Zy’ to the generated string we can make sure that the generated names are easily recognize, and occur at the end of lists.

Replacing a persons name with a generated pseudonym may be as easy as this code:

DECLARE @cons NVARCHAR(100) SET @cons = 'bcdfghjklmnpqrstvwxyz'
DECLARE @vow NVARCHAR(100) SET @vow = 'aeiouäöüaeiouaeiou'
DECLARE @cons1 NVARCHAR(100) SET @cons1 = 'bcdfghjklmnpqrstvw'
DECLARE @vow1 NVARCHAR(100) SET @vow1 = 'aeiou'

UPDATE dbo.PersonTable
SET LastName = ' Zy' 
    + SUBSTRING(@cons, (ABS(CHECKSUM(NewId())) % LEN(@cons))+1, 1)
    + SUBSTRING(@vow, (ABS(CHECKSUM(NewId())) % LEN(@vow))+1, 1)
    + SUBSTRING(@cons, (ABS(CHECKSUM(NewId())) % LEN(@cons))+1, 1)
    + SUBSTRING(@cons, (ABS(CHECKSUM(NewId())) % LEN(@cons))+1, 1)
    + SUBSTRING(@vow, (ABS(CHECKSUM(NewId())) % LEN(@vow))+1, 1)
    + SUBSTRING(@cons, (ABS(CHECKSUM(NewId())) % LEN(@cons))+1, 1),
    FirstName = 'Zy'
    + SUBSTRING(@cons1, (ABS(CHECKSUM(NewId())) % LEN(@cons1))+1, 1)
    + SUBSTRING(@vow1, (ABS(CHECKSUM(NewId())) % LEN(@vow1))+1, 1)
    + SUBSTRING(@cons1, (ABS(CHECKSUM(NewId())) % LEN(@cons1))+1, 1),
WHERE [AnonymizeCondition] = 1

 

AnonymizeCondition refers to the condition you determined when a personal record needs to be anonymized, and has not yet been anonymized.

See the code in my GitHub repository with T-SQL snippets.

Tracking down ScrewTurn Wiki

Back in 2012, the world was shocked by news that ScrewTurn Wiki would not be developed any further.

The universe of .Net-based open-source wikis was small then, and it is even smaller today.

Time to find out what happened to the original STW source code. I tracked down a couple of versions of the original STW3, as well as successors:

Compilation and initial startup showed the following results:

 

Version .Net Version VS Version License compiled run test
3.0.5.610 3.5 12 (2013) GPL V2 VS 2013 ok
3.0.5.629 3.5 11 (2010) GPL V2 VS 2013 ok
3.0.6.630 4.5 12 (2013) GPL V2 VS 2013 fails to load sql page provider
3.1.0.0 4.7.2 15 (2017) GPL V2 VS 2017 fails to load sql provider
4.0.5.145 4.0 11 (2010) GPL V2 VS 2017 with Azure SDK ok
6.0.0.0 4.6 14 (2015) GPL V3 VS 2017 with Azure SDK ok (run as site)

Comments:

  • VS Version is the version of Visual Studio the original code was developed with, while compiled shows the Visual Studio version I used to compile the code.
  • Versions 3.0.6.630 and 3.1.0.0 compiled and ran, but did not load the SQL Providers.
  • Versions 4 and above reference Azure libraries. Since the Azure SDK from Visual Studio 2010 is not available anymore, I used Visual Studio 2017 with its Azure SDK.
  • Version 6 is an ASP.Net MVC application. All previous versions are ASP.Net.
  • Version 6 requires ScrewTurn Wiki to run as an IIS site due to lack of application-relative paths in code and markup. All previous versions also run as IIS applications inside an IIS site.