InternationalizationOverview

Version 17, changed by tim 10/23/2006.

Note: some parts of the dojo.i18n.* package are marked EXPERIMENTAL, and their APIs are likely to change.

Internationalization (i18n)

Internationalization, also called i18n by omitting the 18 letters in the middle of the word, includes the localization of applications (l10n) into the user's native language as well as representing numbers, currencies, dates and other information in various formats used around the world.

Locale

dojo.locale

The locale is a short string, defined by the host environment, which conforms to RFC 3066 as used in the HTML specification. It consists of short identifiers, typically two characters long, which are case-insensitive. Note that Dojo uses dash separators, not underscores like Java (e.g. "en-us", not "en_US"). Typically a country code is used in the optional second identifier, and additional variants may be specified. For example, Japanese is "ja"; Japanese in Japan is "ja-jp". The lowercase is intentional: while Dojo will often convert locales to lowercase to normalize them, it is the lowercase form that must be used when defining your resources.
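The normalization described above can be sketched as follows. The helper name normalizeLocaleTag is hypothetical and is not part of Dojo; it only illustrates the lowercase, dash-separated form that resource names are expected to use.

```javascript
// Hypothetical helper, not a Dojo API: convert a locale tag to the
// lowercase, dash-separated form (e.g. Java-style "en_US" becomes "en-us").
function normalizeLocaleTag(tag) {
  return tag.replace(/_/g, "-").toLowerCase();
}

var examples = ["en_US", "ja-JP", "EN"];
for (var i = 0; i < examples.length; i++) {
  console.log(examples[i] + " -> " + normalizeLocaleTag(examples[i]));
}
```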

The locale in the browser is typically set during install and is not easily configurable. Note that this is not the same locale in the preferences dialog which can be used to accompany HTTP requests; there is unfortunately no way to access that locale from the client without a server round-trip.

The locale Dojo uses on a page may be overridden by setting djConfig.locale. This may be done to accommodate applications with a known user profile, or server pages which do manual assembly and assume a certain locale. You may also set djConfig.extraLocale to load localizations in addition to your own, in case you want to specify a particular translation or have multiple languages appear on your page.
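A minimal configuration sketch for the two settings above. The locale values are examples only, and the assumption here is that extraLocale accepts a list of locale strings and that djConfig is defined before dojo.js is loaded; check your Dojo version if in doubt.

```javascript
// Assumed to be declared before the dojo.js script is loaded on the page.
var djConfig = {
  locale: "ja-jp",        // override the locale detected from the host environment
  extraLocale: ["en-us"]  // assumption: additional locales given as a list
};
```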

Using localized string resources

dojo.requireLocalization() / dojo.i18n.getLocalization()

The DojoPackageSystem was leveraged to create a mechanism to load locale-specific resources. Resources are associated with JavaScript packages and structured as a JavaScript object (as in JSON notation) with identifiers as keys. The property values may be strings or any other type. Resources may be located within any JavaScript package beneath a specially named "nls" directory, with translations made available in directories named by their locale.
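As an illustration, a bundle named "prompts" for a package "myns.mywidget" might be laid out and populated roughly as below. The paths and the variable name are assumptions made for this sketch, and the exact syntax expected inside a resource file may differ from a plain variable assignment.

```javascript
// Hypothetical layout of translated bundles for the package "myns.mywidget":
//   myns/mywidget/nls/ja/prompts.js     <- Japanese translation
//   myns/mywidget/nls/ja-jp/prompts.js  <- Japanese as used in Japan

// Possible contents of the English-language bundle: identifiers mapped to values.
// "continue" is quoted because it is a reserved word in JavaScript.
var prompts = {
  "continue": "Click OK to continue",
  filenotfound: "The file '%{0}' is not found",
  wantTo: "Do you want to %{action}?"
};

console.log(prompts["continue"]); // Click OK to continue
```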

dojo.requireLocalization() is used to declare usage of these resources and load them in the same way that dojo.require() pulls in JavaScript packages. An optional locale argument can specify a particular translation to load; otherwise the one best matching the user agent's locale will be used. A "root" bundle is provided for the case where neither the requested localization nor any less-specific variant is available.

Note: requireLocalization() assumes synchronous loading and currently does NOT work with cross-domain package loading. A build optimization step is being implemented to avoid multiple hits in the locale search and might also be used with the xdLoader.

Use dojo.i18n.getLocalization() to get the object (hash) representing the resources. An optional locale may be specified as an argument or via djConfig.extraLocale, otherwise the user's environment will specify the locale and the best match loaded by the "requireLocalization" step will be used. The localized values will be available as properties on the returned object. For example:

dojo.requireLocalization("myns.mywidget", "prompts");
dojo.requireLocalization("myns.mywidget", "words");
...
var res = dojo.i18n.getLocalization("myns.mywidget", "prompts");
dojo.debug(res.continue); // "Click OK to continue" (or in your language, if a matching translation is provided)

Parameterization of strings is important in i18n as translations may use different ordering of words. dojo.strings.substituteParams() makes it easy to substitute using function arguments or a hash object as a source for substitutions:

var fnf = res.filenotfound; // "The file '%{0}' is not found"
dojo.debug(dojo.strings.substituteParams(fnf, "foo.html")); // "The file 'foo.html' is not found"

var template = res.wantTo; // "Do you want to %{action}?"
var anotherHash = dojo.i18n.getLocalization("myns.mywidget", "words"); // {action: "continue"}
dojo.debug(dojo.strings.substituteParams(template, anotherHash)); // "Do you want to continue?"
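To make the substitution behavior concrete, here is a standalone sketch of %{...} replacement that runs without Dojo. The function name substitute is hypothetical, and this is not Dojo's implementation; in particular, leaving unknown placeholders untouched is a choice made for this sketch only.

```javascript
// Hypothetical stand-in for parameter substitution; not Dojo's code.
// Accepts either a hash/array as the second argument or a list of values.
function substitute(template, source) {
  var map = (typeof source === "object" && source !== null)
    ? source
    : Array.prototype.slice.call(arguments, 1);
  return template.replace(/%\{(\w+)\}/g, function (match, key) {
    // Sketch-only choice: unknown placeholders are left as they are.
    return Object.prototype.hasOwnProperty.call(map, key) ? String(map[key]) : match;
  });
}

console.log(substitute("The file '%{0}' is not found", "foo.html"));
// The file 'foo.html' is not found
console.log(substitute("Do you want to %{action}?", { action: "continue" }));
// Do you want to continue?
console.log(substitute("%{1} before %{0}", "first", "second"));
// second before first
```

The last call shows why parameterization matters: a translation can reorder the placeholders without any change to the calling code.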



Builds

dojo.requireLocalization() works a bit like dojo.require() in that it results in additional web hits and can introduce latency. It's a bit worse than that, though. Loading a single bundle results in n+1 web hits, where n is the number of segments in your locale. Some of these hits might be 404s. For example, requesting a bundle with "en-us" will look in the directories for "en-us", "en", and "ROOT". Multiply this by the number of bundles you use and the performance hit can be significant. For this reason, it can be important to factor your resources to trade off between what's really needed by a particular piece of code versus the benefits of loading fewer files.
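The n+1 search can be sketched as below. Both function names are hypothetical and the code is only a model of the lookup order described above, not Dojo's loader.

```javascript
// Hypothetical model of the lookup order for one bundle:
// most specific locale first, then each less-specific variant, then ROOT.
function localeSearchPath(locale) {
  var segments = locale.toLowerCase().split("-");
  var path = [];
  for (var n = segments.length; n > 0; n--) {
    path.push(segments.slice(0, n).join("-"));
  }
  path.push("ROOT");
  return path;
}

// Worst-case number of web hits for m bundles loaded as loose files: m * (n + 1).
function looseFileHits(bundleCount, locale) {
  return bundleCount * localeSearchPath(locale).length;
}

console.log(localeSearchPath("en-us").join(", ")); // en-us, en, ROOT
console.log(looseFileHits(2, "en-us"));            // 6
```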


The flexibility of developing with loose files is valuable, and a build step has been introduced to optimize the loading of resources for deployment. Like dojo.js, which can include all JavaScript on a particular dependency path, the build will now generate an nls/ directory, if needed, which will include all the resources for a list of locales. (How to specify that list is TBD.) The nls/ directory must be deployed to your server along with the dojo.js file. The operation of loading your resources then goes from m(n+1) hits, where m is the number of bundles, down to a single hit, for a total of as few as two hits to load your application: one for dojo.js and one for nls/dojo_xx.js. Although only one prebuilt translation file is downloaded to the browser for any particular session, your server may host a long list of translations in the nls directory.

For those who are interested, this step is achieved by pulling out the dojo.requireLocalization() references into a temporary file, executing them with Dojo during the build (without the browser) and then sorting the references by locale using an AOP-like approach and writing them to disk using dojo.json.serialize(), in another Dojo script.

Double Byte Character Sets (DBCS) / Non-ASCII support

The web browser does all the work for us here. We simply must make sure that the declared encoding matches the encoding actually used in the data. UTF-8 is a good choice, as it is well supported by all browsers and can represent the characters of all languages.

HTML...

NOTE: XMLHttpRequest uses UTF-8 by default. Most browsers follow this convention except KHTML (Safari, Konqueror). See ticket #1010 for workarounds.

Bi-directional text (BiDi)

dojo.i18n.isLTR()

Some languages, mostly Middle Eastern in origin, have text that flows from right to left (e.g. Hebrew and Arabic). Again, the web browser generally takes care of this for us, provided the appropriate HTML attributes are set. However, sometimes the logic of an application or widget must change to accommodate bidi. For example, in a right-to-left layout, menu widgets drop from the upper right-hand corner of the screen and are right-justified. Some icons may also have a different orientation. isLTR() can be checked to provide alternate logic for true and false results.
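The kind of branching described above might look like the following standalone sketch. The function names and the short list of right-to-left language codes are assumptions for illustration; they are not taken from dojo.i18n.isLTR().

```javascript
// Assumption: a small, incomplete set of right-to-left language codes.
var RTL_LANGUAGES = { ar: true, fa: true, he: true, ur: true };

// Hypothetical direction check based on the language part of the locale.
function isLeftToRight(locale) {
  var language = locale.toLowerCase().split("-")[0];
  return !Object.prototype.hasOwnProperty.call(RTL_LANGUAGES, language);
}

// Example of widget logic that changes with text direction.
function menuAlignment(locale) {
  return isLeftToRight(locale) ? "left" : "right";
}

console.log(menuAlignment("en-us")); // left
console.log(menuAlignment("he"));    // right
```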

Common Locale Data Repository (CLDR) Project

The CLDR at unicode.org is a joint project by companies like Microsoft and IBM to compile the locale-specific information necessary to localize applications. It provides translated strings for things like time zones, month names, and country names, as well as formatting information for numbers, currencies, dates and times, and multiple calendaring systems. The Dojo i18n package seeks to leverage this repository so that virtually all available locales can be supported by code which is data driven and able to take updates from the repository. The data is available at the unicode.org site in XML format and is transformed into JSON string resources to be used by Dojo's loading mechanism. (The transformation is currently manual, but should someday be driven by a process or stylesheet.)

Numbers

dojo.i18n.number is EXPERIMENTAL and needs to be rewritten to use data from the CLDR and to follow the style of the datetime API.

Currency

dojo.i18n.currency is EXPERIMENTAL and needs to be rewritten to use data from the CLDR and to follow the style of the datetime API.

Dates and Times

dojo.date.format(dateObject, options)

dojo.date.parse(value, options)

These functions use date/time formats from the CLDR appropriate to the user's locale to format Date objects into Strings and to parse Strings into Dates. The actual formatting patterns are hidden in normal usage, so that it is possible to simply request a date in the "long", "short", or another known format. Special patterns may be specified, or additional resource bundles may be used to provide sets of localizations for custom formats. The patterns follow the CLDR specification and are similar to those used by the Java SimpleDateFormat class (e.g. MMddyyyy). Optionally, the date and time may be processed together or individually.
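To show how a locale-keyed pattern drives formatting, here is a small standalone sketch. It is not dojo.date.format(): the function names are hypothetical, only a handful of pattern letters are handled, and the per-locale patterns are illustrative values rather than data copied from the CLDR.

```javascript
// Left-pad a number with zeros to the given width.
function pad(value, width) {
  var text = String(value);
  while (text.length < width) { text = "0" + text; }
  return text;
}

// Hypothetical formatter supporting only yyyy, MM, dd, HH and mm.
function formatPattern(date, pattern) {
  return pattern.replace(/yyyy|MM|dd|HH|mm/g, function (token) {
    switch (token) {
      case "yyyy": return pad(date.getFullYear(), 4);
      case "MM": return pad(date.getMonth() + 1, 2);
      case "dd": return pad(date.getDate(), 2);
      case "HH": return pad(date.getHours(), 2);
      default: return pad(date.getMinutes(), 2); // "mm"
    }
  });
}

// Illustrative patterns only; real patterns would come from locale data.
var SHORT_DATE_PATTERNS = { "en-us": "MM/dd/yyyy", "de-de": "dd.MM.yyyy" };

var stamp = new Date(2006, 9, 23, 14, 5); // 23 October 2006, 14:05 local time
console.log(formatPattern(stamp, SHORT_DATE_PATTERNS["en-us"])); // 10/23/2006
console.log(formatPattern(stamp, SHORT_DATE_PATTERNS["de-de"])); // 23.10.2006
console.log(formatPattern(stamp, "yyyy-MM-dd HH:mm"));           // 2006-10-23 14:05
```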

dojo.date.strftime(dateObject, format)

Functionally similar to the format() method, strftime() follows the Unix convention (e.g. %m/%d/%y) for specifying date and time formats, where the format is typically given as an explicit string rather than derived from the locale. %x does request the default locale format, but without as many options as format(). strftime() is provided for legacy purposes but may also be popular with certain developer communities.
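For comparison, the Unix-style convention can be sketched with a tiny standalone function. The name miniStrftime is hypothetical, it handles only %m, %d, %y, %Y and %%, and it is not the dojo.date.strftime() implementation.

```javascript
// Format a number as at least two digits.
function two(value) {
  return value < 10 ? "0" + value : String(value);
}

// Hypothetical, minimal strftime-style formatter.
function miniStrftime(date, format) {
  return format.replace(/%([mdyY%])/g, function (match, code) {
    switch (code) {
      case "m": return two(date.getMonth() + 1);
      case "d": return two(date.getDate());
      case "y": return two(date.getFullYear() % 100);
      case "Y": return String(date.getFullYear());
      default: return "%"; // "%%" yields a literal percent sign
    }
  });
}

console.log(miniStrftime(new Date(2006, 9, 23), "%m/%d/%y")); // 10/23/06
console.log(miniStrftime(new Date(2006, 9, 23), "%Y-%m-%d")); // 2006-10-23
```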

dojo.date.getNames(item, type, use, locale)
dojo.date.getDayName(dateObject, locale), getDayShortName, getMonthName, getMonthShortName

The strings representing day and month names are available as localized strings via these methods, as provided by the CLDR. getNames() lets you specify the type of string you want, such as the day of the week, and whether you want the full version (Monday), an abbreviation (Mon), etc.
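The idea of selecting a name by type and width can be sketched with a small lookup table. The table layout, the function name dayName, and the width labels "wide" and "abbr" are assumptions for this sketch; they do not describe the data format or signatures Dojo actually uses.

```javascript
// Hypothetical name table, indexed by locale and then by width.
var DAY_NAMES = {
  en: {
    wide: ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"],
    abbr: ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
  },
  fr: {
    wide: ["dimanche", "lundi", "mardi", "mercredi", "jeudi", "vendredi", "samedi"]
  }
};

// Look up the name of the weekday for a date in the requested width.
function dayName(date, width, locale) {
  return DAY_NAMES[locale][width][date.getDay()];
}

var monday = new Date(2006, 9, 23); // 23 October 2006 was a Monday
console.log(dayName(monday, "wide", "en")); // Monday
console.log(dayName(monday, "abbr", "en")); // Mon
console.log(dayName(monday, "wide", "fr")); // lundi
```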

Calendars

TBD

Validation

TBD
