Monthly Archives: January 2014

Guessing game with languages, keyboards and timezones

For the last three years, I’ve been working as a developer of the Anaconda installer, the installer of Fedora, RHEL and other related GNU/Linux distributions. Among other things, I’ve been working on language, keyboard and timezone configuration in the installation process. Since there have been maybe 20 bug reports, complaints, etc. about how the pre-selection of these parameters works, I’d like to explain the background of that process a bit.

The very first thing the installer has to provide is language selection, so that everything else can be localized based on the user’s choice. Timezone and keyboard configuration follow, in the Anaconda installer as two screens (spokes) reachable from the summary screen (hub). To preselect the language (or, more precisely, the locale), the installer may use GeoIP information if it’s available. So we can preselect the most likely locale and let the user decide whether it is the right one or whether it should be changed to something else. Easy, isn’t it? But what’s next? What are the right keyboard and timezone?

At first glance it may seem easy. The user has chosen the “Czech” language, so we simply preset the “Czech” keyboard layout (cz) and “Europe/Prague” as the timezone. No problem, right? But what about other languages? E.g. there is the “French” keyboard layout (fr), but you know what? French people don’t use it very often; the ‘fr (oss)’ layout is the one that is usually used. How do we find out something like that? And how do we recognize “Greek” and “Greek, modern 1453-” as the same language (the former being the official language name, the latter the one used by the X server configuration files and ISO codes)? That’s why the langtable project [1] was born: to provide this kind of information and help Anaconda, and not only Anaconda, preselect sane defaults. Still, what about non-ASCII keyboard layouts like “Russian” (ru)? Should the “English” (us) layout be configured as well? And should “ru” or “us” be the default? What should the default switching option be?
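To make the name-matching problem concrete, here is a minimal sketch of the kind of lookup involved: every known spelling of a language name is normalized to one code, and that code maps to a ranked list of layouts. The tables and function names below are hypothetical and hand-written for illustration; they are not langtable’s actual data or API, and they only cover the examples mentioned above.

```python
# Hypothetical, hand-written tables for illustration only.
# langtable ships far more complete data and exposes a different API.
LANGUAGE_ALIASES = {
    "greek": "el",
    "greek, modern 1453-": "el",
    "greek, modern (1453-)": "el",
    "czech": "cs",
    "french": "fr",
    "russian": "ru",
}

# Layouts ordered by preference, most likely first.
KEYBOARDS_BY_LANGUAGE = {
    "cs": ["cz"],
    "fr": ["fr (oss)", "fr"],
    "ru": ["ru"],
}


def language_code(name):
    """Map any known spelling of a language name to one code, or None."""
    return LANGUAGE_ALIASES.get(name.strip().lower())


def preferred_keyboards(name):
    """Return the ranked layouts for a language name (empty if unknown)."""
    return list(KEYBOARDS_BY_LANGUAGE.get(language_code(name), []))


if __name__ == "__main__":
    print(language_code("Greek") == language_code("Greek, modern 1453-"))
    print(preferred_keyboards("French"))
```

Note that the sketch says nothing about whether “us” should be added next to “ru”; that question needs extra data about which layouts can type ASCII, which this toy table does not have.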

The situation with timezones is even more complicated. Formerly, users were asked to choose from languages, not locales. If the user chose “French” as the language, which timezone should have been preselected? “Europe/Paris”, “Europe/Zurich” or “America/Montreal”? Things got much better once users could choose the locale, i.e. one of “French (France)”, “French (Switzerland)”, “French (Canada)”, etc. Still, is the chosen language the best source of information for timezone preselection? If the user chooses the “Czech” language, it’s probably sane to expect they will type in Czech and thus use the “cz” keyboard layout even if GeoIP tells us they are, e.g., in Japan. But for timezone preselection, isn’t the GeoIP information a better hint? Of course, if there is no GeoIP information (no network), the chosen language is the only hint the installer has.

Wait, what about doing timezone preselection based on the chosen keyboard layout? One of the bug reports [2] says the following:

Anaconda deduces timezone from language.  Should use keyboard instead.  Many people, including me, whose native language is not English, prefer to use English as main distro language instead of dealing with inaccurate or bad translations along with glitschs when displaying non-AsII characters.  So if user selects US English as primary language it does not mean he lives in America.  The real clue is his keyboard.  Even for laptops: if he selects french keybord, it means the laptop has been bought in France and there is a 99,9% chance user lives in France.  So Anaconda should propose France's timezone not US East-Coast timezone.

Maybe this looks like a good idea. However, there are many wrong assumptions hidden in it. First of all, if the user selects a French keyboard, are they from France, Switzerland, Canada or some other French-speaking country? There is no clue. The other issue is that keyboard selection is done in a spoke reachable from the summary hub. Timezone preselection based on the chosen keyboard would thus cause silent changes in the configuration that the user would hardly notice. Let’s say the user visits the summary hub and sees e.g. the “America/Montreal” timezone in the Date&Time spoke’s status, then visits the Keyboard spoke and selects the “Czech” keyboard layout. If the timezone changed to “Europe/Prague” behind the scenes at that moment, the user would hardly notice the change in the Date&Time spoke’s status and would probably end up with the wrong configuration on the installed system.

So this is how the Anaconda installer plays the guessing game with languages, keyboards and timezones. To sum it up, hints are processed for pre-selections as follows:

  • language: GeoIP
  • keyboard: langtable using the selected language
  • timezone: GeoIP, if not available, langtable using the selected language
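The three rules above can be sketched as follows. This is only an illustration, not Anaconda’s actual code: the `GeoIPHint` class, the `DEFAULTS_BY_LOCALE` table (a stand-in for langtable data) and the “en_US” fallback used when there is no GeoIP information are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for langtable data, keyed by locale.
DEFAULTS_BY_LOCALE = {
    "cs_CZ": {"keyboard": "cz", "timezone": "Europe/Prague"},
    "fr_FR": {"keyboard": "fr (oss)", "timezone": "Europe/Paris"},
    "en_US": {"keyboard": "us", "timezone": "America/New_York"},
}


@dataclass
class GeoIPHint:
    """Hypothetical result of a GeoIP lookup."""
    locale: str     # e.g. "cs_CZ"
    timezone: str   # e.g. "Europe/Prague"


def preselect_locale(geoip: Optional[GeoIPHint], fallback: str = "en_US") -> str:
    """Language: GeoIP. The fallback value is an assumption of this sketch."""
    return geoip.locale if geoip is not None else fallback


def preselect_keyboard(locale: str) -> str:
    """Keyboard: always derived from the selected locale, never from GeoIP."""
    return DEFAULTS_BY_LOCALE[locale]["keyboard"]


def preselect_timezone(locale: str, geoip: Optional[GeoIPHint]) -> str:
    """Timezone: GeoIP if available, otherwise derived from the locale."""
    if geoip is not None:
        return geoip.timezone
    return DEFAULTS_BY_LOCALE[locale]["timezone"]


if __name__ == "__main__":
    # A Czech speaker installing from Japan who switched the locale to cs_CZ.
    geoip = GeoIPHint(locale="ja_JP", timezone="Asia/Tokyo")
    print(preselect_keyboard("cs_CZ"), preselect_timezone("cs_CZ", geoip))
    # The same user without network access.
    print(preselect_keyboard("cs_CZ"), preselect_timezone("cs_CZ", None))
```

The keyboard function deliberately takes no GeoIP argument, and nothing here takes the keyboard as input, so changing the layout in the Keyboard spoke can never silently change the timezone.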

Have an idea how to improve this process? Let me know in the comments! I’m truly open to changing the implementation so that it works better. But don’t forget that you and the people from your country, using your keyboard layout and living in your timezone, won’t be the only ones using the Anaconda installer.