SQL Server Sysobjects Types (Version 2019)

February 26, 2020

MS SQL Server stores the information about the objects (tables, views, functions, etc) stored in the database in a table, which is accessible by selecting from the sys.objects view. The type of object is given by the column “type”

I collected this cross-version information from Microsoft Docs online, but fixed a couple of historical mistakes there.

Object type: 2008 2012 2016 2019
sys.objects.type
AF Aggregate function (CLR) x x x x
C CHECK constraint x x x x
D DEFAULT (constraint or stand-alone) x x x x
EC Edge constraint x
ET External Table x x
F FOREIGN KEY constraint x x x x
FN SQL scalar function x x x x
FS Assembly (CLR) scalar function x x x x
FT Assembly (CLR) table-valued function x x x x
IF SQL inline table-valued function x x x x
IT Internal table x x x x
P SQL stored procedure x x x x
PC Assembly (CLR) stored procedure x x x x
PG Plan guide x x x x
PK PRIMARY KEY constraint x x x x
R Rule (old-style, stand-alone) x x x x
RF Replication-filter-procedure x x x x
S System base table x x x x
SN Synonym x x x x
SO Sequence object x x x
SQ Service queue x x x x
TA Assembly (CLR) DML trigger x x x x
TF SQL table-valued-function x x x x
TR SQL DML trigger x x x x
TT Table type x x x x
U Table (user-defined) x x x x
UQ UNIQUE constraint x x x x
V View x x x x
X Extended stored procedure x x x x

The sys.objects.type_desc column contains the description corresponding to the type:

type sys.objects.type_desc
AF AGGREGATE_FUNCTION
C CHECK_CONSTRAINT
D DEFAULT_CONSTRAINT
EC
ET
F FOREIGN_KEY_CONSTRAINT
FN SQL_SCALAR_FUNCTION
FS CLR_SCALAR_FUNCTION
FT CLR_TABLE_VALUED_FUNCTION
IF SQL_INLINE_TABLE_VALUED_FUNCTION
IT INTERNAL_TABLE
P SQL_STORED_PROCEDURE
PC CLR_STORED_PROCEDURE
PG PLAN_GUIDE
PK PRIMARY_KEY_CONSTRAINT
R RULE
RF REPLICATION_FILTER_PROCEDURE
S SYSTEM_TABLE
SN SYNONYM
SO SEQUENCE_OBJECT
SQ SERVICE_QUEUE
TA CLR_TRIGGER
TF SQL_TABLE_VALUED_FUNCTION
TR SQL_TRIGGER
TT TYPE_TABLE
U USER_TABLE
UQ UNIQUE_CONSTRAINT
V VIEW
X EXTENDED_STORED_PROCEDURE

My previous post on this topic only covered SQL Server up to version 2008.


Updating YuJisho Data Sources

February 25, 2020

After updating the web application of YuJisho, it was high time to provide the dictionary with current data.

The following data have been imported as of February 2020:

You can browse my online CJK dictionary YuJisho here.


Length of UTF-8 VARCHAR in SQL Server

February 21, 2020

Foreword

For as long as I can remember, a VARCHAR (or CHAR) was always defined as “1 character equals 1 byte”. Different character sets (code pages) where implemented as COLLATIONs, so that you had basic database support for internationalization.

Then came Unicode, and we got NVARCHAR strings (or NCHAR), where the rule was “1 character equals 2 bytes”, and we could store any text from around the world without bothering with code pages, encodings, etc. The .Net framework brought us the string class with similar features and the world was beautiful.

Then, in 2001, came Unicode 3.1 and needed more space:

For the first time, characters are encoded beyond the original 16-bit codespace or Basic Multilingual Plane (BMP or Plane 0). These new characters, encoded at code positions of U+10000 or higher, are synchronized with the forthcoming standard ISO/IEC 10646-2. For further information, see Article IX, Relation to 10646. Unicode 3.1 and 10646-2 define three new supplementary planes.

These additional planes were immediately supported in SQL Server 2012. From now on, using an *_SC collation, NVARCHARs could be 2 or 4 bytes per character.

In C#, the StringInfo class handles supplementary planes, but it seems, they are still a bit behind:

Starting with the .NET Framework 4.6.2, character classification is based on The Unicode Standard, Version 8.0.0. For the .NET Framework 4 through the .NET Framework 4.6.1, it is based on The Unicode Standard, Version 6.3.0. In .NET Core, it is based on The Unicode Standard, Version 8.0.0.

(For the record, the current Unicode version is 12.1, and 13.0 is going to be released soon)

UTF-8 Collations

So now SQL Server 2019 supports UTF-8-enabled collations.

A question on SO quoted the documentation as

A common misconception is to think that CHAR(n) and VARCHAR(n), the n defines the number of characters. But in CHAR(n) and VARCHAR(n) the n defines the string length in bytes (0-8,000). n never defines numbers of characters that can be stored

(emphasis mine) which confused me a little bit, and the quote continues

The misconception happens because when using single-byte encoding, the storage size of CHAR and VARCHAR is n bytes and the number of characters is also n.

(emphasis mine).

This got me investigating, and I had a look into this issue. I create a UTF8-enabled database with a table with all kinds of N/VARCHAR columns

CREATE DATABASE [test-sc] COLLATE Latin1_General_100_CI_AI_KS_SC_UTF8

CREATE TABLE [dbo].[UTF8Test](
  [Id] [int] IDENTITY(1,1) NOT NULL,
  [VarcharText] [varchar](50) COLLATE Latin1_General_100_CI_AI NULL,
  [VarcharTextSC] [varchar](50) COLLATE Latin1_General_100_CI_AI_KS_SC NULL,
  [VarcharUTF8] [varchar](50) COLLATE Latin1_General_100_CI_AI_KS_SC_UTF8 NULL,
  [NVarcharText] [nvarchar](50) COLLATE Latin1_General_100_CI_AI_KS NULL,
  [NVarcharTextSC] [nvarchar](50) COLLATE Latin1_General_100_CI_AI_KS_SC NULL,
  [NVarcharUTF8] [nvarchar](50) COLLATE Latin1_General_100_CI_AI_KS_SC_UTF8 NULL
)

I inserted test data from various Unicode ranges

INSERT INTO [dbo].[UTF8Test] ([VarcharText],[VarcharTextSC],[VarcharUTF8],[NVarcharText],[NVarcharTextSC],[NVarcharUTF8])
    VALUES ('a','a','a','a','a','a')
INSERT INTO [dbo].[UTF8Test] ([VarcharText],[VarcharTextSC],[VarcharUTF8],[NVarcharText],[NVarcharTextSC],[NVarcharUTF8])
    VALUES ('ö','ö','ö',N'ö',N'ö',N'ö')
-- U+56D7
INSERT INTO [dbo].[UTF8Test] ([VarcharText],[VarcharTextSC],[VarcharUTF8],[NVarcharText],[NVarcharTextSC],[NVarcharUTF8])
    VALUES (N'囗',N'囗',N'囗',N'囗',N'囗',N'囗')
-- U+2000B
INSERT INTO [dbo].[UTF8Test] ([VarcharText],[VarcharTextSC],[VarcharUTF8],[NVarcharText],[NVarcharTextSC],[NVarcharUTF8])
    VALUES (N'𠀋',N'𠀋',N'𠀋',N'𠀋',N'𠀋',N'𠀋')

then selected the lengths and data lengths of each text field

SELECT TOP (1000) [Id]
    ,[VarcharText],[VarcharTextSC],[VarcharUTF8]
    ,[NVarcharText],[NVarcharTextSC],[NVarcharUTF8]
FROM [test-sc].[dbo].[UTF8Test]
SELECT TOP (1000) [Id]
    ,LEN([VarcharText]) VT,LEN([VarcharTextSC]) VTSC
    ,LEN([VarcharUTF8]) VU
    ,LEN([NVarcharText]) NVT,LEN([NVarcharTextSC]) NVTSC
    ,LEN([NVarcharUTF8]) NVU
FROM [test-sc].[dbo].[UTF8Test]
SELECT TOP (1000) [Id]
    ,DATALENGTH([VarcharText]) VT,DATALENGTH([VarcharTextSC]) VTSC
    ,DATALENGTH([VarcharUTF8]) VU
    ,DATALENGTH([NVarcharText]) NVT,DATALENGTH([NVarcharTextSC]) NVTSC
    ,DATALENGTH([NVarcharUTF8]) NVU
FROM [test-sc].[dbo].[UTF8Test]

Select Lengths.png

I was surprised to find that the old mantra “a VARCHAR only stores single byte characters” needs to be revised when using UTF8 collations.

Table data only

Note that only table columns are associated with collations, but not T-SQL variables, as you cannot declare a collation on a variable

SELECT @VarcharText = [VarcharText],@NVarcharText = [NVarcharText]
FROM [test-sc].[dbo].[UTF8Test]
WHERE [Id] = 4
SELECT @VarcharText, Len(@VarcharText), DATALENGTH(@VarcharText)
    , @NVarcharText, Len(@NVarcharText), DATALENGTH(@NVarcharText)

SELECT @VarcharText = [VarcharTextSC], @NVarcharText = [NVarcharTextSC]
FROM [test-sc].[dbo].[UTF8Test]
WHERE [Id] = 4
SELECT @VarcharText, Len(@VarcharText), DATALENGTH(@VarcharText)
    , @NVarcharText, Len(@NVarcharText), DATALENGTH(@NVarcharText)

SELECT @VarcharText = [VarcharUTF8], @NVarcharText = [NVarcharUTF8]
FROM [test-sc].[dbo].[UTF8Test]
WHERE [Id] = 4
SELECT @VarcharText, Len(@VarcharText), DATALENGTH(@VarcharText)
    , @NVarcharText, Len(@NVarcharText), DATALENGTH(@NVarcharText)

 

Select Variable Lengths.png

 


Retrieving Active GitHub Forks

February 20, 2020

The problem with GitHub repositories is that, whenever there exists forks for a given repository, it is not immediately obvious which forks are active, containing newer commits, and which forks are just-for-fun forks, which should have been Starred rather than Forked.

Of course, you can always navigate to that forked repository, and check for “commits ahead” and “commits behind”, but that’s really painful.

Apparently there are others that think such a feature would be useful

but the functionality is still not built-in. Both tickets link to helper applications or scripts that might perform the task, though.

So I came across Active GitHub Forks which lists Stars, Forks and Last Push date for each forked repository of a given base repository. (Their are also other solutions such as this Active GitHub Forks page)

So I forked this repository, had a little look on the GitHub API, I learned that you can query the forks of a given repository, sorted by stars or newest or oldest, but the list of forked repositories does not contain the relevant attributes.

Instead, you have to query each of the forked repositories separately to retrieve their attributes “total_commits”, “ahead_by”, “behind_by”. And each separate query weighs on GitHub’s rate limit. (No wonder then that some of the tools screenscrape rather than query the API)

Visit my solution of Active GitHub Forks and query the forked repositories individually.

Active GitHub Forks screenshot


Updating YuJisho: a Unicode CJK Character web dictionary

February 17, 2020

My online CJK dictionary YuJisho got a facelift again – this time, from ASP.Net MVC3 to the current MVC 5, and from Bootstrap 2 to Bootstrap 4.

I hope this gets rid of the messages in the Google Search Console 😉 :

  •  Text too small to read
  • Clickable elements too close together
  • Viewport not set

There is a little change though: In times of GDPR &co, queries to Wikipedias and Wiktionaries need to invoked by clicking the “Query Wikipedia” button, rather than querying automatically.

click the button

click the button “Query Wikipedia”

results in links to various Wikipedias containing an article

If your browser / operating system fails to display certain Chinese characters, there is now a button “Load Glyphs” which tries to load the unsupported characters’ images as .svg from GlyphWiki.

after load glyphs

after load glyphs u

Please check the About page for more information.


Essential NuGet packages for ASP.Net MVC

February 13, 2020

Here are some useful packages I came across in recent years:

Topic Package NuGet Source
Logging
log4net nuget apache
Error Logger
elmah.corelibrary nuget github
Elmah.Mvc nuget github
ElmahAppender.log4net nuget github
Profiling
Glimpse nuget github
Glimpse.AspNet nuget
Glimpse.Mvc5 nuget
NHibernate.Glimpse nuget github
Minifier, Bundler
BundleTransformer.Core nuget github
BundleTransformer.Yui nuget github
Data Access
Nhibernate nuget github
Identity Management
NHibernate.AspNet.Identity nuget github
Templating
Handlebars.Net nuget github
RazorEngine nuget github
Documents
EPPlus nuget github
CsvHelper nuget github
PdfSharp nuget pdfsharp
HtmlRenderer.PdfSharp nuget github
Other
DocuVieware Lite
HTML5 Document Viewer For ASP.NET
homepage

SMOscript for Windows 10

February 9, 2020

I recently got notified that my tool SMOscript raises an error message when run on Windows 10.

I had a look at the problem, and indeed, as soon as the enumeration of database objects (e.g. using commands l (list) or s (script)) is to process a table, the program stop. In case of the script command, an error message is issued

Error: iterating objects: Attempt to retrieve data for object failed for Server ”.

The most recent version of SMOscript was still targeted on .Net 2.0, and used the SMO 100 assemblies. Time to upgrade! 😉

So I changed the .Net version to 4.5 and added the nuget SMO package and changed the .Net version to 4.7 to make the compiler happy.

Tests were performed successfully against SQL Server 2014 and 2019, but now SQL Server 2008 caused an error that I could work-around.

The tool now also supports options to sort database objects by schema and name, or by type and schema and name, rather than relying on the order resulting from the SMO method Database.EnumObjects().

Additionally, the -st (Sort Types) parameter allows to specify a custom sort order for database object types (rather than alphabetically), so that the parameter

smoscript -st Table,View

will process objects of types table and view first, and everything else afterwards. The parameter -st without a list of types will display the available types.

The latest version of SMOscript is available for download here.

Note that there is a regular build “V 0.40”, but also a build using the original SMO 100 assemblies.

smoscript 0.40 command-line parameters

smoscript 0.40 sort SMO Types