After sketching my CTE-based multi-REPLACE procedure, I wanted to test in on a real-life scenario. First example is a translation table to convert the cyrillic alphabet into Latin.![]()
First we need to define the data and the translation tables using a case-sensitive cyrillic collation with COLLATE Cyrillic_General_CS_AS:
CREATE TABLE #T (
level int identity(1,1),
old NVARCHAR(50) COLLATE Cyrillic_General_CS_AS,
new NVARCHAR(50) COLLATE Cyrillic_General_CS_AS
)
CREATE TABLE #Data (
data NVARCHAR(50) COLLATE Cyrillic_General_CS_AS
)
Next, we define the transliterations. See here for a list of standardized transliterations.
INSERT INTO #T VALUES(N'ый', 'y') INSERT INTO #T VALUES(N'ЫЙ', 'Y') INSERT INTO #T VALUES(N'а', 'a') INSERT INTO #T VALUES(N'б', 'b') INSERT INTO #T VALUES(N'в', 'v') INSERT INTO #T VALUES(N'г', 'g') INSERT INTO #T VALUES(N'д', 'd') INSERT INTO #T VALUES(N'е', 'e') INSERT INTO #T VALUES(N'ё', 'yo') INSERT INTO #T VALUES(N'ж', 'zh') INSERT INTO #T VALUES(N'з', 'z') INSERT INTO #T VALUES(N'и', 'i') INSERT INTO #T VALUES(N'й', 'y') INSERT INTO #T VALUES(N'к', 'k') INSERT INTO #T VALUES(N'л', 'l') INSERT INTO #T VALUES(N'м', 'm') INSERT INTO #T VALUES(N'н', 'n') INSERT INTO #T VALUES(N'о', 'o') INSERT INTO #T VALUES(N'п', 'p') INSERT INTO #T VALUES(N'р', 'r') INSERT INTO #T VALUES(N'с', 's') INSERT INTO #T VALUES(N'т', 't') INSERT INTO #T VALUES(N'у', 'u') INSERT INTO #T VALUES(N'ф', 'f') INSERT INTO #T VALUES(N'х', 'kh') INSERT INTO #T VALUES(N'ц', 'c') INSERT INTO #T VALUES(N'ч', 'ch') INSERT INTO #T VALUES(N'ш', 'sh') INSERT INTO #T VALUES(N'щ', 'shch') INSERT INTO #T VALUES(N'ъ', '') INSERT INTO #T VALUES(N'ы', 'y') INSERT INTO #T VALUES(N'ь', '''') INSERT INTO #T VALUES(N'э', 'e') INSERT INTO #T VALUES(N'ю', 'yu') INSERT INTO #T VALUES(N'я', 'ya') INSERT INTO #T VALUES(N'А', 'A') INSERT INTO #T VALUES(N'Б', 'B') INSERT INTO #T VALUES(N'В', 'V') INSERT INTO #T VALUES(N'Г', 'G') INSERT INTO #T VALUES(N'Д', 'D') INSERT INTO #T VALUES(N'Е', 'E') INSERT INTO #T VALUES(N'Ё', 'YO') INSERT INTO #T VALUES(N'Ж', 'ZH') INSERT INTO #T VALUES(N'З', 'Z') INSERT INTO #T VALUES(N'И', 'I') INSERT INTO #T VALUES(N'Й', 'Y') INSERT INTO #T VALUES(N'К', 'K') INSERT INTO #T VALUES(N'Л', 'L') INSERT INTO #T VALUES(N'М', 'M') INSERT INTO #T VALUES(N'Н', 'N') INSERT INTO #T VALUES(N'О', 'O') INSERT INTO #T VALUES(N'П', 'P') INSERT INTO #T VALUES(N'Р', 'R') INSERT INTO #T VALUES(N'С', 'S') INSERT INTO #T VALUES(N'Т', 'T') INSERT INTO #T VALUES(N'У', 'U') INSERT INTO #T VALUES(N'Ф', 'F') INSERT INTO #T VALUES(N'Х', 'KH') INSERT INTO #T VALUES(N'Ц', 'C') INSERT INTO #T VALUES(N'Ч', 'CH') INSERT INTO #T VALUES(N'Ш', 'SH') INSERT INTO #T VALUES(N'Щ', 'SHCH') INSERT INTO #T VALUES(N'Ъ', '') INSERT INTO #T VALUES(N'Ы', 'Y') INSERT INTO #T VALUES(N'Ь', '''') INSERT INTO #T VALUES(N'Э', 'E') INSERT INTO #T VALUES(N'Ю', 'YU') INSERT INTO #T VALUES(N'Я', 'YA')
Note that the order of INSERT statements is relevant to the transliterated result. That’s we multi-character replacements are stated first.
I chose some random words from the Russian Wikipedia as test cases:
INSERT INTO #Data (data) VALUES (N'Добро пожаловать') INSERT INTO #Data (data) VALUES (N'Википедия') INSERT INTO #Data (data) VALUES (N'День Конституции Российской Федерации') INSERT INTO #Data (data) VALUES (N'География') INSERT INTO #Data (data) VALUES (N'Америка: Канада (Квебек) — Мексика') INSERT INTO #Data (data) VALUES (N'Философия')
Using the transliteration CTE, we get the following results:
| Америка: Канада (Квебек) — Мексика | Amerika: Kanada (Kvebek) — Meksika |
| Википедия | Vikipediya |
| География | Geografiya |
| День Конституции Российской Федерации | Den’ Konstitucii Rossiyskoy Federacii |
| Добро пожаловать | Dobro pozhalovat’ |
| Философия | Filosofiya |

[...] Latin alphabet using T-SQL One more example of replacing multiple strings using a CTE. After Russian / Cyrillic, it’s time for the Japanese [...]
Thank you! Very helpful!