Concatenating a list of strings in T-SQL

Remember how clumsy it was to concatenate string identifiers of child records within a sub-SELECT and adding this artificial FOR XML PATH('')? (if you don’t remember)

Writing plain SQL again after a long time, I found the “new” function STRING_AGG (introduced in SQL Server 2017) to be a great simplification. I guess mainly because the sub-select can be inlined and the whole SELECT can be GROUP BY’d.

Case in point, I needed to find out which database tables have unique indexes defined, and which columns they indexed.

Here’s the magic:

select o.name, i.name,
    STRING_AGG(c.name, ', ') WITHIN GROUP (ORDER BY ic.index_column_id)
from sys.index_columns ic
inner join sys.indexes i 
    on ic.object_id = i.object_id and ic.index_id = i.index_id
inner join sys.objects o on i.object_id = o.object_id
inner join sys.columns c 
    on ic.object_id = c.object_id and ic.column_id = c.column_id
where o.is_ms_shipped = 0 
and o.name not like '%_history'
and i.is_primary_key = 0
and i.is_unique = 1
group by o.name, i.name
order by 1, 2

Checking Consistency of Database Column Data Types

Working on a database project, I noticed that column data types where not specified consistently.

For example, one table might have a column “Name” specified as nvarchar(50), and another one as nvarchar(100). Or a column “Description” specified nvarchar(500) in one table, and nvarchar(max) in others.

To find the inconsistencies, I wrote a script which finds inconsistently typed table columns, and then writes out the column specifications:

with cte as (
  select c.name, t.name tname, c.max_length, count(*) C
  from sys.columns c
  inner join sys.objects o on o.object_id = c.object_id
  inner join sys.types t on c.user_type_id = t.user_type_id
  where o.type = 'U'
  and o.name not in ('sysdiagrams', 'dtproperties')
  group by c.name, t.name, c.max_length
)

The first CTE retrieves all unique combinations of column names and type specifications

, cols as (
  select name from cte
  group by name
  having count(*) > 1
)

The second CTE, which operates on the result of the first CTE, filters out the column names with different type specifications

select c.name, o.name, t.name tname,
  case 
    when t.name = 'nvarchar' and c.max_length = -1 then null
    when t.name = 'nvarchar' then c.max_length / 2
    when t.name = 'varchar' then c.max_length 
    else null
  end max_length
from sys.columns c
inner join cols on c.name = cols.name
inner join sys.objects o on o.object_id = c.object_id
inner join sys.types t on c.user_type_id = t.user_type_id
where o.type = 'U'
and o.name not in ('sysdiagrams', 'dtproperties')
order by c.name, o.name

And finally, we select all table names and column specifications along with their effective data type length.

Note that we need to explicitly exclude the built-in tables “sysdiagrams” and “dtproperties”.

Series: Analyzing SQL Server Dependencies

A step-by-step introduction to the system views containing dependency information in SQL Server:

Analyzing SQL Server Dependencies – sys.sql_dependencies

Analyzing SQL Server Dependencies – View Dependencies

Analyzing SQL Server Dependencies – sys.sql_expression_dependencies

Analyzing SQL Server Dependencies – Unresolved Dependencies in sys.sql_expression_dependencies

Analyzing SQL Server Dependencies – Client-side Dependencies

Analyzing SQL Server Dependencies – Read-only Stored Procedures

Analyzing SQL Server Dependencies – Read-only Stored Procedures

Bonus post for my “Analyzing SQL Server Dependencies” series answering the SO question Is it possible to get list of stored procs which only read data?

This was my first try (SQL Server 2008)

select o.name
from sys.objects o
where o.type = 'P' and o.object_id not in (
    select d.object_id from sys.sql_dependencies d where is_updated = 1
)
order by 1

As SO user gvee pointed out, the system view sys.sql_dependencies is deprecated in SQL Server 2012:

This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature. Use sys.sql_expression_dependencies instead.

So I needed to rewrite the query for SQL Server 2012 and higher:

select o.name,  OBJECT_SCHEMA_NAME(o.object_id), *
from sys.objects o
where o.type = 'P' and o.name not in ('sp_upgraddiagrams')
and not exists (
    select 1
    from  sys.dm_sql_referenced_entities(
        OBJECT_SCHEMA_NAME(o.object_id) + '.' + o.name, 'OBJECT')
    where is_updated = 1
)
order by 1

OP noted that the is_updated flag is not set for stored procedures that TRUNCATE tables.

I’d suggest the following work-around:

TRUNCATE TABLE MyTable
IF 0=0 DELETE MyTable

Didn’t test, should work 😉

Analyzing SQL Server Dependencies – Unresolved Dependencies in sys.sql_expression_dependencies

SQL Server provides two system views, sys.sql_dependencies and sys.sql_expression_dependencies, to retrieve information about object dependencies.

As I noticed that the referenced_id column of sys.sql_expression_dependencies can be NULL, I wondered how these NULLed records could be useful.

A simple query returns the NULLed records:

select o.name, d.referenced_entity_name, *
from sys.sql_expression_dependencies  d
inner join sys.objects o on d.referencing_id = o.object_id
where referenced_id is null

The column referenced_entity_name contains the name of the object that could not be resolved to a reference_id when the referencing object was created.

At first glance, there seem to be two cases:

  • EXEC storedprocedure where the schema name of the SP is not given
  • UPDATE (et al) with table alias (on temp table)

Mysteriously, tables and views referenced without a schema name are not in the result set.

The case of the EXEC statements can be cleaned up easily by adding the schema name of the stored procedure. For the DML statements using table aliases, it’s best to choose alias names that are not used as object names.

Analyzing SQL Server Dependencies – sys.sql_expression_dependencies

After I noticed that there are differences between the contents of sys.sql_dependencies and the View Dependencies dialog, I analyzed the statements and queries that View Dependencies uses to retrieve the dependencies, to find that the dialog also queries the system view sys.sql_expression_dependencies.

Great, I thought, and modified my original query

select type, name, *
from sys.objects o
where o.object_id not in 
    (select referenced_major_id from sys.sql_dependencies)
and o.type not in ('S', 'D', 'F', 'IT', 'PK', 'TR')
and o.name not like 'dev[_]%'
and o.is_ms_shipped = 0
order by o.type, o.name

to reference sys.sql_expression_dependencies instead of sys.dependencies:

select type, name, *
from sys.objects o
where o.object_id not in 
    (select referenced_id from sys.sql_expression_dependencies)
and o.type not in ('S', 'D', 'F', 'IT', 'PK', 'TR')
and o.name not like 'dev[_]%'
and o.is_ms_shipped = 0
order by o.type, o.name

only to retrieve an empty result set.

Huh?

It turns out that the referenced_id column of sys.sql_expression_dependencies may contain NULL values, and a

expression NOT IN (SELECT NULL)

is always false.

Changing the WHERE condition to

where o.object_id not in 
    (select referenced_id from sys.sql_expression_dependencies 
        where referenced_id is not null)

finally returned a useful result: objects that are (presumably) not referenced by any other SQL Server objects.

Almost…

Analyzing SQL Server Dependencies – View Dependencies

My “exploration tour” through SQL Server dependencies started when I noticed that the sys.sql_dependencies view does not contain all the dependencies listed in the View Dependencies dialog that can be opened by right-clicking an object in SSMS.

To find out which queries SSMS (2008) uses to detect dependencies, I opened SQL Server Profiler. To get all the queries issued, simply select the SQL:BatchStarting EventClass. I won’t reproduce the statements here, but the ones required start with

CREATE TABLE #tempdep...
INSERT INTO #tempdep...
declare @find_referencing_objects...

In the original INSERT INTO, the parameters @_msparam_0 and @_msparam_1 refer to the object name and the object schema name.

Copying the queries from Profiler to SSMS, and additionally declaring and assigning the 2 parameters returns a result set that contains all references for the requested object.

However, since View Dependencies displays the dependency hierarchy in a tree view, the result contains the complete try, not just the direct dependencies.

To filter only for the direct dependencies, and maintain the order as in View Dependencies, the last SELECT statement must be changed to end in

where t.relative_id = OBJECT_ID('myschema.myobject')
order by 3, 2

Partial success… 😉

Analyzing SQL Server Dependencies – sys.sql_dependencies

To find out which SQL Server object depends on other objects, we can query the system view sys.sql_dependencies:

select o.name, ref.name, * from sys.sql_dependencies d
inner join sys.objects o on o.object_id = d.object_id
inner join sys.objects ref on ref.object_id = d.referenced_major_id

and additionally sort by the referencing or by the referenced objects using

order by ref.type, ref.name

or

order by o.type, o.name

respectively.

As objects are created during development, modified, replaced, or becoming out-dated, it may not be clear which objects are still in use and which are not, and the question arises whether some of the object not referenced by any other are accessed by a client application.

To find all objects that are not referenced by other database objects, we can use the query

select type, name, *
from sys.objects o
where o.object_id not in 
    (select referenced_major_id from sys.sql_dependencies)
and o.type not in ('S', 'D', 'F', 'IT', 'PK', 'TR')
and o.name not like 'dev[_]%'
and o.is_ms_shipped = 0
order by o.type, o.name

This statement selects all objects that are not referenced by sys.sql_dependencies, filters out the constraints, triggers, and primary keys, as well as the pre-installed objects and some well-known user-defined objects (beginning with “dev_”, in my case).

However the result also contained functions etc. that I knew where definitely referenced, as a right-click and View Dependencies confirmed, so there must be more information available. What does SSMS do that sql_dependencies alone does not achieve?

Checking MS SQL Server Objects

I have written about a method to check stored procedures in MS SQL Server using T-SQL before. I extended this method to also include views and stored functions, and wrote a small utility called checktsql.

The idea is to retrieve all views, stored procedures and functions from an MS SQL Server database, and execute:

  • SELECT TOP(1) * FROM [View]
  • EXECUTE [Procedure]
  • SELECT [Function]
  • SELECT * FROM [TableValuedFunction]

Each statement is executed within a transaction that is rolled back after execution. All expected parameters are set to NULL. All Exceptions are caught and displayed by the application making it easier to find out which objects are out-of-date and need to be fixed.

checktsql supports to modes of operation: plain execution and setting FMTONLY ON. In the latter case, SQL Server checks all conditional branches (i.e. ignores IF/ELSE).

The application opens one tab per connection/check, and displays the result for each object in a list, along with the type of exception and the exception message. The result list can be filtered to the entries containing errors, and be copied to the clipboard using the context menu. (see screenshots)

Some warnings

  • Even though a stored procedure may execute correctly in normal usage, passing NULL parameters may break its functionality and generate error messages like

SQL Exception    Cannot insert the value NULL into column ‘ID’, table ‘mydb.dbo.tablefoo’; column does not allow nulls. INSERT fails. INSERT fails. The statement has been terminated. The statement has been terminated.

This is caused by the NULL values being passed to the procedure.

  • Long-running stored procedures may time out
  • If you have stored procedure that “clean up” tables or reset or destroy data, you should add a dummy parameter if there is none, and distinguish the check mode from real execution by checking the parameter value for NOT NULL.
  • Setting FMTONLY will cause exceptions if the stored procedure uses temporary tables

SQL Exception    Invalid object name ‘#TBL’.

Finally, if you get a message stating

SQL Exception    Invalid object name ‘MyOtherTable’.

or

SQL Exception    Could not find stored procedure ‘sp_Foo’.

you know where to start to fix your T-SQL code…

checktsql is available for download here.

This is a real version 1.00, so send your comments if something is missing 😉

Listing ON DELETE CASCADE Constraints in MS SQL Server 2000

I needed to run my checks on ON DELETE CASCADE Constraints in MS SQL Server on SQL Server 2000, and needed to adjust the system table names and columns accordingly.

These are the queries adjusted for SQL Server 2000:

DECLARE @t NVARCHAR(128)
SET @t = 'MyTable'

SELECT p.name,
    objectproperty(fk.constid, 'CnstIsDeleteCascade') OnDelete,
    t.name
FROM dbo.sysforeignkeys fk
INNER JOIN dbo.sysobjects t ON fk.fkeyid = t.id
INNER JOIN dbo.sysobjects p ON fk.rkeyid = p.id
WHERE p.name = @t
ORDER BY 1, 3

SELECT DISTINCT p.name,
    objectproperty(fk.constid, 'CnstIsDeleteCascade') OnDelete,
    t.name,
    objectproperty(fk2.constid, 'CnstIsDeleteCascade') OnDelete,
    tt.name
FROM sysforeignkeys fk
INNER JOIN sysobjects t ON fk.fkeyid = t.id
INNER JOIN sysobjects p ON fk.rkeyid = p.id
INNER JOIN sysforeignkeys fk2 ON fk2.rkeyid = t.id
INNER JOIN sysobjects tt ON fk2.fkeyid = tt.id
WHERE p.name = @t
AND t.id <> p.id
ORDER BY 1, 3, 5