Importing mysqldump into MS SQL Server

To import data from MySQL into a MS SQL database, we need to first analyze the output of mysqldump, the backup utility of MySQL. I have covered the most obvious differences to T-SQL in my previous post.

Fortunately, the output of mysqldump is line-oriented, i.e. every CREATE TABLE statement, table column or index, and INSERT statement is in a separate line.

In a brute-force approach, we can therefore parse each line using regular expressions, and do not require full parsing involving tokenizer and grammar.

I found a couple of interesting points that need to be considered when generating T-SQL statements:

  • As T-SQL did not support “DROP IF EXISTS” before SQL Server 2016, we need to provide the classical variant “IF OBJECT_ID() IS NOT NULL”
  • NVARCHAR(MAX) columns cannot be indexed
  • BINARY data needs to be 0x-encoded
  • UNIQUE INDEXes should not include NULLable columns
  • Multi-row INSERT INTO statements cannot contain more than 1000 rows
  • ‘0000-00-00’ is a valid date in MySQL, but not in MSSQL. The value is being replaced by ‘1800-01-01’ to avoid the error message

    The conversion of a varchar data type to a datetime data type resulted in an out-of-range value.

Some indexes may require manually editing column collations, e.g. for unique case-sensitive MySQL indexes.

When executing T-SQL statements in SSMS, a number of warnings or errors can occur:

  • A warning that indexed data is supported to a maximum length of 960 bytes
  • Inserting data may result in the error “String or binary data would be truncated.”
  • Some files are too big to execute in SSMS, and need to be executed using sqlcmd
  • A batch with too many too large INSERT INTO statements may raise the error

    There is insufficient system memory in resource pool ‘default’ to run this query

    so we break up the batch using “GO”

  • Multiple cascade paths

MS SQL Server does not support ON DELETE CASCADE clauses such that the grandchild record of a parent-child-grandchild relation cannot be uniquely identified. Creating such a FOREIGN KEY raises the error message

Msg 1785, Level 16, State 0, Line 1
Introducing FOREIGN KEY constraint ‘fk_xxx’ on table ‘xxx’ may cause cycles or multiple cascade paths. Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints.

In such cases, either drop the ON DELETE CASCADE if possible, and replace the deletion cascade with an INSTEAD OF DELETE trigger.

I developed mysqldump2mssql, a prototype implementing these rules, which is available on my GitHub. The program has been developed and tested against MySQL dumps of MediaWiki, Joomla, Bugzilla and Nextcloud.

The tool generates up to 4 files for each table defined in the source SQL file:

  • create.sql: the CREATE TABLE statement, optional CREATE UNIQUE INDEX statements
  • data.sql: the INSERT INTO statements
  • fk.sql: the FOREIGN KEY relations to other tables
  • fulltext.sql: the CREATE FULLTEXT statement

The filenames of the generated files consist of the sequence number of execution, the table name, and one of the endings mentioned in above table.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.