I created two Python scripts to extract German IPA data from German and English Wiktionary dumps.
See my posts
and the corresponding GitHub repository german-ipa-dict
.
I created two Python scripts to extract German IPA data from German and English Wiktionary dumps.
See my posts
and the corresponding GitHub repository german-ipa-dict
.
The code for the latest version of dbscript has been sitting around for a while, and I did not find the time to release it until a recent bug report forced encouraged me to finally publish it. What’s more, I promised to provide an update!
So what’s new in dbscript 1.04?
Up to dbscript 1.03, every new supported database engine and every new object type stored their values in either base columns or object type-specific columns. The object type-specific columns are now mapped onto the same columns in the Object table (using some meta data tables, and an internal code generator which also creates the NHibernate classes).
Dependency Diagrams (based on T-SQL analysis of views, stored procedures and functions)
namespace parameter for C# and VB.Net value scripts
Recognize sql server 2008 and 2008R2
Multi-comparison is performed in C# rather than TSQL, reducing the chance of a TSQL timeout, and allows adding/removing project versions after a calculation has been performed.
XmlSchemaCollections are treated as types instead of objects
A couple of UI inconsistences have been fixed
Compare Object HTML-escapes correctly
Administrators can delete projects
dbscript user Mark made pointed out that dbscript does not work with current versions of MediaWiki anymore. A bit of research found that the login form adds a new hidden parameter wpLoginToken which needs to be submitted in the login POST.
dbscript 1.04 has been modified to provide this parameter, and hopefully this fix solves the issue for everybody having problems generating MediaWiki documentation of their databases.
As always, the latest version of dbscript is available for download here.
The upcoming version 1.02 of dbscript supports uploading, versioning and documenting Entity Framework Model Files (extension .edmx).
I have described earlier how to write XSLT files to convert the contents of edmx files into MediaWiki markup.
Follow these links to view sample output generated by dbscript:
In its first version, YuJisho provided a web search interface to a collection of freely available dictionaries. The obvious extension to that principle is to include other encyclopedias and online dictionaries as well.
MediaWiki wikis not only display their contents in the /wiki/ root directory, but also provide a Query API via the /w/api.php URL. This API provides results in various formats, among them JSON, which is typically used by JavaScript clients.
JavaScript code can query this API to search for article titles in a given wiki. jQuery implements the getJSON() method to asynchronously retrieve results. If more than one request is to be executed, the ajax() method has to used with the parameter mode set to ‘queue’.
Out of all available Wikimedia projects, wikipedia.org and wiktionary.org languages have been selected that are most closely related to CJK characters (Chinese, Japanese, Korean) or for which most translations exist in the data (English, German, French, Russian).
So from now on, if you search on YuJisho (for example: 東京 (Tokyo), 北京 (Beijing)), every result page will automatically perform a JavaScript search in various wikis, and provide links to the relevant wiki pages.
The major new feature of version 1.00 of dbscript is the new SQL parser with the following improvements:
The descriptions and dependencies extracted by parsing or importing from a live database are included in SQL scripts and generated XML files, and thus available in all generated documentation. XSL style sheets have been adapted to display object and column descriptions.
Output samples of generated documentation:
Single HTML file documentation
The new features lay the groundwork for future versions of dbscript the implement parsers for other SQL dialects (schema import from database is already supported for Oracle and PostgreSql), dependency diagrams, and more.
dbscript is available for download here.
Please send your feedback 😉
I wrote about my plans to add a new SQL parser engine into dbscript a couple of months ago. Now the time has come to actually implement it for T-SQL (MS SQL Server; Oracle and PostgreSQL will follow in future versions), and I found that I did not foresee all the consequences of my initial intent. The overall architecture remained the same though.
The grammar definition allows to define attributes on non-terminals. Using the SQL Server versions as attributes (2005, 2008), I can mark commands or clauses as their support or feature has been introduced in a specific version, and store the database version as property of the uploaded Project Version.
The parser skips the parts of the uploaded file that it cannot parse, and writes these parts to the upload log. Up to now, the parser would simply fail if it considered the SQL file somehow invalid.
The (C#) object representation of a parsed SQL command has a boolean flag IsHandled for each non-terminal. The code processing the object representation needs to mark every non-terminal object as being handled (i.e. translated into schema information stored in the database). Objects that have not been flagged will also be listed in the upload log.
Next, the new parser allowed me to rewrite the dependency analysis. Until now, dbscript only analyzed view dependencies to order the CREATE VIEW statements. Dependency analysis has now been extended to all database objects.
One more major issue that has been solved is parsing and interpreting EXECUTE statements. Thus an EXEC sp_addextendedproperty is interpreted as adding a description to a database object.
This obviously led to changes in the XML generation of database schemas. Each object now lists its descriptions and the dependencies on and references from other database objects.
The XSL stylesheets which translate a project version XML into markup or HTML have been revised to reflect the additional information in the generated XML.
All taken together, the new parser adds:
See the following links documenting AdventureWorks OLTP 2008 (version 2005) for the effects of the new functionality:
Single HTML file documentation
In both wikis, compare the (old-style) “wikibot” section with the new section “automatically generated” to see the changes.
The next version of dbscript with the mythical version number “1.0” will be released soon 😉
After introducing the XML format of Entity Framework’s edmx files, let’s use that knowledge to create a small XSLT style sheet which displays the mappings of tables and entities in a Wiki-style table (which can be used in MediaWiki and SharePoint wikis).
In the XSLT root, we need to declare all namespaces used by the edmx to access nodes and attributes:
<?xml version="1.0" encoding="utf-8" standalone="yes" ?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:edmx="http://schemas.microsoft.com/ado/2007/06/edmx" xmlns:store="http://schemas.microsoft.com/ado/2007/12/edm/ EntityStoreSchemaGenerator" xmlns:ssdl="http://schemas.microsoft.com/ado/2006/04/edm/ssdl" xmlns:cs="urn:schemas-microsoft-com:windows:storage:mapping:CS" xmlns:edm="http://schemas.microsoft.com/ado/2006/04/edm" xmlns:a="http://schemas.microsoft.com/ado/2006/04/codegeneration" xml:space="default" > <xsl:output method="html" omit-xml-declaration="yes" /> <!-- input file is C:\path\to\Model.edmx -->
This XSLT does not start with the mappings section, but with the tables and views inside the Schema definition, and then looks up their Mappings definition:
<xsl:template match="/"> <xsl:apply-templates select="edmx:Edmx/edmx:Runtime/edmx:StorageModels/ssdl:Schema" /> </xsl:template> <xsl:template match="ssdl:Schema"> <html> <body> <table width="100%"> <xsl:apply-templates select="ssdl:EntityType" > <xsl:with-param name="namespace" select="@Namespace" /> </xsl:apply-templates> </table> </body> </html> </xsl:template>
This code creates a table row for each database table and its class:
<xsl:template match="ssdl:EntityType" > <xsl:param name="namespace"></xsl:param> <xsl:variable name="table" select="@Name" ></xsl:variable> <xsl:variable name="map" select="/edmx:Edmx/edmx:Runtime/edmx:Mappings/cs:Mapping/ cs:EntityContainerMapping/ cs:EntitySetMapping[cs:EntityTypeMapping/ cs:MappingFragment/@StoreEntitySet=$table]" /> <xsl:variable name="s" select="$map/*/@TypeName" /> <xsl:variable name="p" select="concat('IsTypeOf(', substring($namespace, 1, string-length($namespace) - 5))" /> <xsl:variable name="class" select="substring($s, string-length($p) + 1, string-length($s) - string-length($p) - 1)"> </xsl:variable> <tr valign="top"> <td > [[<xsl:value-of select="@Name"/>]] </td> <td> <xsl:value-of select="$class" /> </td> </tr> </xsl:template> </xsl:stylesheet>
The [[ ]] notation creates a wiki hyperlink that allows developers to document tables and entities, and link to other documentation.
The latest version 0.94 of dbscript adds new functionality:
A user/role model has been implemented which allows the following definitions:
Each user can be assigned to several roles. A role is either a sysadmin role, or assigned to one or more projects. Within a project, a role is defined as project administrator or project user role. Users inherit sysadmin, project admin, and project user capability from their assigned roles.
New documentation generators create documentation as
Adding to existing functionality creating C# const declarations:
Added the following checks on database tables:
The latest version of dbscript is available for download here.
This is part 2 of the description of automating MediaWiki documentation using dbscript, and deals with documenting table data.
First, we need to specify which tables are to be documented: Go to the Project’s Value Scripts menu item, click New, select Script Type “XML”, enter an identifier for the script, and press Insert.
Next, go to the Tables item of the new Value Script. Select a table from the dropdown at the bottom of the page, and select Script Mode “Field Elements” (default).
I included the following tables of AdventureWorks in the documentation:
AddressType, AWBuildVersion, ContactType, CountryRegion, Culture, Currency, Location, SalesReason, SalesTerritory, ScrapReason, Shift, ShipMethod, UnitMeasure.
After you added the desired tables, they are displayed in the list Tables. Click each table name, and then its Columns menu.
From the Column dropdown, select the columns you want to add in your documentation, or press Insert All to include all columns. Columns can be deleted from the list by clicking Delete.
As the Value Script definition is complete, navigate to the Documentation Generator created in the previous post, and display its Parts page.
Under New Part, select Value Script, and click Create.
Enter a Sequence number, select an XSL (default “MediaWiki ValueScript DocGen”), select the newly created value script, and the database you imported the AdventureWorks schema from.
Specify an Alternate Page Section (e.g. “wikivalues”) to avoid overwriting the table definition in the default Page Section. Click Insert.
In the Documentation Generators menu click Generate, and generate the whole documentation, or just the newly created Part containing the data values.
Select the tables listed above to view data documentation samples on this project overview page. dbscript is available for download here.
I previously published a couple of posts on generating Wiki documentation in dbscript, and the current release 0.93 of dbscript puts all the steps together to automatically generate MediaWiki documentation of an MS SQL Server database schema, and values of selected tables.
This is an example of how to generate documentation for the AdventureWorks sample database. You need:
In dbscript, create a Project AdventureWorks. Add a Database with dbo login credentials, and press Test Connection to mark the database connectable. (Alternatively, you can script the AdventureWorks database in Management Studio, and upload the generated sql file)
The Documentation Generator needs to handle tables differently from other database objects, as it needs to apply different XSL style sheets to the schema information.
Thus we create an Object Selection which selects every object except of tables: create an Object Selection “everything except tables”, and click the Objects page. Add Object Type “Table” with the option “Select included objects” selected. Add all other available object types with the “Select excluded objects” option.
Create a Documentation Generator, and select type “MediaWiki” (currently the only available option). Select the newly created Project Version. Enter the Wiki’s URL (http://mywiki.example.com), a section name (optional), the wiki bot’s username and its password.
If no section name is given, the whole page for each database object will be replaced. If you specify a section name, only this section is created or changed.
Next, go to the Parts page.
Under “New Part”, select “Project Version” and click Create. Enter a sequence number (e.g. 100), select an XSL (MediaWiki ProjectVersion overview), and a Page Title (“AdventureWorks Overview”) and save. This will create a summary page.
Go to the Parts page again, select New Part “Object”, click Create. Now we add tables to the documentation. Enter a sequence number (200), select an XSL (MediaWiki Table DocGen), and an Object Type “Table”.
Now we add all database objects other than tables: On the Parts page, select “Object”, click Create. Enter sequence number (300), select XSL (MediaWiki SourceCode DocGen), and specify Object Select “everything except tables”.
Go to the Generate page. Press the Generate button, and watch the progress log for error messages or progress updates.
The results of the documentation generator can be viewed here. dbscript is available for download here.