I recently developed an application, which, among other things, communicates with other already-existing applications via XML files. The files are passed through MQSeries queues and stored in an SQL Server database.
Today I needed to find database records with contain certain XML content, so I thought, OK, that’s SQL Server 2005, it has XML support, so that shouldn’t be a big deal.
The table stores the XML files in an NTEXT column, so I tried
SELECT ID, CAST(XmlFile AS XML) FROM TableName
which gave me an unexpected error message.
Msg 9402, Level 16, State 1, Line 1
XML parsing: line 1, character 43, unable to switch the encoding
True, SQL Server stores my files in Unicode if I declare the column ntext. But how should I ever be able to store foreign XML documents, if the server can’t handle them? Ignoring the encoding attribute would be very nice in that case.
Next step: try to get rid of the encoding attribute. Easy, just replace it with an empty string:
SELECT ID, CAST(REPLACE(XmlFile, encoding="ISO-8859-1"', '') AS XML) ...
with the undesired side-effect, that *any* occurrence would be replace, not just the one within the ?xml? declaration.
Luckily, that didn’t work either:
Msg 8116, Level 16, State 1, Line 1
Argument data type ntext is invalid for argument 1 of replace function.
You’d expect that, since ntext and nvarchar(max) *behave* the same (i.e. unlimited Unicode strings), functions operating on them shouldn’t make a difference either.
Finally, my solution looked like this:
SELECT ID, CAST(
REPLACE(CAST(XmlFile AS NVARCHAR(MAX)), 'encoding="ISO-8859-1"', '')
AS XML).query('xpath to nodes') AS Node
FROM TableName
Should I be happy that this works, or angry that there’s no simpler way to do it?