As mentioned in my previous blog posts, multilingual capabilities in CiviCRM are implemented using a database-side approach of adding per-language columns and exposing them under their original names via per-language table views. This lets CiviCRM hide its multilingual nature from code that’s not interested in whether it talks to a single- or multilingual install, but introduces new challenges when it comes to upgrades between versions. This part of my project was aimed at polishing this area of CiviCRM development.

The approach of hiding per-language columns under per-language views (along with some fancy triggers for intelligent column population on INSERTs) allows for on-the-fly rewriting of simple, typical SQL queries for CRUD (create, read, update, delete) operations on table contents. As mentioned above, this makes CiviCRM’s multilingual capabilities transparent to any code that does not explicitly ask whether a given install is single- or multilingual.
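To make the column-plus-view idea concrete, here is a minimal sketch using Python and an in-memory SQLite database as a stand-in for MySQL. The table name `option_value`, the `label_<locale>` column naming, the `<table>_<locale>` view naming and the `rewrite` helper are all assumptions made for illustration, not CiviCRM’s actual schema or rewriting code, and the triggers mentioned above are left out.

```python
import re
import sqlite3

LOCALES = ["en_US", "pl_PL"]  # assumed set of enabled locales

conn = sqlite3.connect(":memory:")

# One physical column per locale for the localizable field `label`.
columns = ", ".join(f"label_{loc} TEXT" for loc in LOCALES)
conn.execute(f"CREATE TABLE option_value (id INTEGER PRIMARY KEY, {columns})")

# One view per locale, exposing that locale's column under the original name.
for loc in LOCALES:
    conn.execute(
        f"CREATE VIEW option_value_{loc} AS "
        f"SELECT id, label_{loc} AS label FROM option_value"
    )

conn.execute(
    "INSERT INTO option_value (id, label_en_US, label_pl_PL) "
    "VALUES (1, 'Phone', 'Telefon')"
)


def rewrite(query, locale):
    """Hypothetical, naive rewrite: point the query at the locale's view."""
    return re.sub(r"\boption_value\b", f"option_value_{locale}", query)


# The calling code keeps using the original table and column names.
plain_query = "SELECT label FROM option_value WHERE id = 1"
labels = {
    loc: conn.execute(rewrite(plain_query, loc)).fetchone()[0]
    for loc in LOCALES
}
print(labels)
```

The point of the sketch is that `plain_query` never mentions a locale; only the table name is swapped for a view, which is what keeps simple CRUD queries rewritable without parsing them.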

Unfortunately, this feature of hassle-free coding is a tradeoff, balanced by the ugly code needed to properly upgrade multilingual CiviCRM databases. The SQL queries for upgrading between given CiviCRM versions are usually hand-crafted, and not only adjust the database schema, but also quite often shift data from old table columns to new ones – sometimes even computing new table contents based on existing data. My first idea for addressing this was to create fancy logic that would rewrite single-language queries on-the-fly, much like the CRUD queries are rewritten. Unfortunately, real-life examples (especially those that compute new contents) quickly showed that this approach won’t fly with anything less than a full-blown SQL syntax parser.

Fortunately, after looking closely at the way CiviCRM handles single- vs. multilingual upgrade paths, I figured out that there’s a pattern applicable to roughly 95% of the cases: the SQL code for both code paths is almost the same, and the crucial difference is that a given SQL operation in a multilingual install should add/drop/alter several columns in parallel (where the same SQL code operates on a single column in single-language installs).

This conditional logic was done in Smarty templates by an {if $multilingual}…{else}…{/if} construct, which basically duplicated the SQL queries – a simple query in the else branch was copied to the if branch and all the localizable columns were wrapped into {foreach} blocks, iterating over the enabled locales. Or, actually, that is the approach of the past – since yesterday, all these abominable conditionals have been rewritten thanks to the newly-implemented {localize} Smarty block, which abstracts the whole copy/iterate-if-needed process. The queries look much better now, the code lacks the repetition, and while there is still a need for a human touch (to tell which places should get the multilingual treatment) it’s way subtler and easier to maintain than before: the patch introducing this feature in the latest upgrade script (we’re releasing CiviCRM 3.0.alpha1 really soon now…) has the lovely line stats of ‘+124 -312’. :)
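For the curious, the copy/iterate-if-needed idea can be sketched in a few lines of Python. This is not the Smarty implementation: the `expand_localize` function, the bare `{localize}…{/localize}` markup without parameters, the `{$locale}` placeholder and the `civicrm_example` table are assumptions made up for this illustration, and the real block’s syntax may well differ.

```python
import re

# Matches a hypothetical {localize}...{/localize} block in a query template.
LOCALIZE_BLOCK = re.compile(r"\{localize\}(.*?)\{/localize\}", re.DOTALL)


def expand_localize(template, locales):
    """Expand each localize block once per enabled locale.

    Inside a block, {$locale} becomes "_<locale>" on a multilingual
    install; with no locales (single-language install) the block is
    emitted once and the placeholder becomes an empty string.
    """
    def expand(match):
        body = match.group(1)
        suffixes = [f"_{loc}" for loc in locales] or [""]
        return ", ".join(body.replace("{$locale}", s) for s in suffixes)

    return LOCALIZE_BLOCK.sub(expand, template)


# One template serves both upgrade paths; only the marked clause repeats.
template = (
    "ALTER TABLE civicrm_example "
    "{localize}ADD title{$locale} VARCHAR(64){/localize};"
)

single = expand_localize(template, [])
multi = expand_localize(template, ["en_US", "pl_PL"])
print(single)
print(multi)
```

The human touch mentioned above corresponds to deciding where the block markers go: everything outside them is shared verbatim between single- and multilingual installs, so the query is written only once.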