
Don't ever use the original string as key in the localization table. That will force you to translate "high" difficulty the same as "high" resolution, for example.
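
A minimal sketch of that advice, assuming a table keyed by context-qualified identifiers instead of the English text. The key names, the `translate` helper, and the French strings are all hypothetical and only there for illustration.

```python
# Hypothetical localization table: keys carry context, not the English text.
TRANSLATIONS = {
    "en": {
        "difficulty.high": "High",
        "resolution.high": "High",
    },
    "fr": {
        "difficulty.high": "Difficile",
        "resolution.high": "Haute",
    },
}


def translate(key: str, lang: str) -> str:
    """Look up a key for a language, fall back to English, then to the key itself."""
    table = TRANSLATIONS.get(lang, {})
    return table.get(key) or TRANSLATIONS["en"].get(key, key)


# Both entries read "High" in English, yet they can diverge in French
# because the lookup never depended on the English wording.
print(translate("difficulty.high", "fr"))
print(translate("resolution.high", "fr"))
```

Had the table been keyed by "High", the two French entries would have collided on one key.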



It would make localizing a nightmare.

> the solution for which is to use icu

We ran headfirst into this issue at my company and we've actually been recommending the opposite (use the "C" locale on the database, treat collation as a render level concern).

I have a whole write up explaining the technical motivations behind that recommendation: https://gist.github.com/rraval/ef4e4bdc63e68fe3e83c9f98f56af...
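
Roughly what "collation as a render level concern" could look like, as a sketch only: the storage layer hands rows back in plain code point order (which is what a "C" locale gives you for UTF-8), and the display layer re-sorts them. The `render_sort_key` function is a made-up, crude stand-in for a real collator such as ICU; it only strips accents and case and is not locale-aware.

```python
import unicodedata


def render_sort_key(s: str) -> tuple:
    """Crude display-time sort key: ignore accents and case, tie-break on the raw string.

    This is an illustrative stand-in, NOT a real locale-aware collator.
    """
    decomposed = unicodedata.normalize("NFD", s)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), s)


rows = ["Zebra", "apple", "Émile", "eagle"]

# What a "C"-locale database would return: plain code point order.
storage_order = sorted(rows)

# What the UI shows after applying collation at render time.
display_order = sorted(rows, key=render_sort_key)

print(storage_order)
print(display_order)
```

The point of the split is that the stored order stays stable and cheap to index, while the human-facing order can change per user without touching the database.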


That's what I expect, too. In the end, the biggest problem is always to understand and change complex foreign code.

Hardcoding 'Fizz' and 'Buzz' inhibits internationalization.

There's also "translate to English", so you can instead treat it directly as a key and use that, or treat the English text as a key and change/add to the file if it's just something like a typo.

Don't recall about the second, we only had translations on one site and it's been a few years.


The article doesn't mention how to resolve string manipulation problems involving locales.

And then you have the issue of those small parts being so common that they could match even normal English. It's not an easy problem to solve :(

> The format does have some pretty major drawbacks too, like the msgid can become "fuzzy" which leads to a differing set of issues related to the unique keying between translations.

I am not sure how much of an issue this is in practice. The main problem of the PO format AFAICT is that it is quite outdated. For instance, it has no support for genders and you cannot "mix" plural rules within a phrase.

> It is interesting you call out cultural issues, did you have any specific examples?

The wikipedia entry on l10n[1] has some examples.

The process of localization is not merely about translating some strings, but adapting them to a specific language and culture, which is the hardest part. For instance, your home page is one of the most important pages in your app and is geared to make as many people as possible sign up. Do you think a simple translation would have the same effect on British, French, Arab, Japanese, and other audiences?

[1]: https://en.wikipedia.org/wiki/Internationalization_and_local...


Don't forget that full internationalization also includes taking care of layout issues, e.g. words that are longer in other languages, characters that are taller, and right-to-left text.

"Bad" characters in this context are control characters. So no, it would not affect internationalization at all.

Reminds me of the Pseudo-Locales Windows Vista added that "translate" English strings to things that look like English, but use unusual characters and end up with longer strings in an attempt to catch UI issues before having full localization versions ready.

https://docs.microsoft.com/en-us/windows/win32/intl/pseudo-l...
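
A toy version of the idea, for anyone who wants to try it without Windows. This is not how the Windows pseudo-locales are implemented; the lookalike table, the 40% expansion factor, and the bracket markers are all assumptions chosen for the sketch.

```python
# Hypothetical lookalike map; a real pseudo-locale would cover far more characters.
LOOKALIKES = str.maketrans({
    "a": "á", "e": "é", "i": "í", "o": "ö", "u": "ü",
    "A": "Å", "E": "É", "I": "Î", "O": "Ø", "U": "Ü",
    "c": "ç", "n": "ñ",
})


def pseudo_localize(text: str, expansion: float = 0.4) -> str:
    """Swap in accented lookalikes and pad the string to mimic longer translations."""
    swapped = text.translate(LOOKALIKES)
    padding = "~" * max(1, round(len(text) * expansion))
    # Brackets make truncated strings easy to spot in the UI.
    return f"[{swapped} {padding}]"


print(pseudo_localize("Save file"))
```

Note that this naive version would also mangle placeholders such as `{name}` inside a string, so a real tool has to skip those.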


I don't think you understand how localization works. They have a localization file and they send it off to a translation service. The translation service goes through the file and translates the individual strings (or string fragments).

Either the translators made a mistake, and thought it was referring to a ZIP code (regardless of capitalization) and translated accordingly, or a developer used the wrong key when assembling the string references, i.e. he used the equivalent of (this is pseudocode as I don't know how they handle localizations):

  localize("CompressToArchive", localize("Zip"), localize("File")) - i.e. with a reference to localization of "Zip" (or ZIP, or zip - the dev likely just searched for a string that matched what he wanted to localize)
instead of

  localize("CompressToArchive", "Zip", localize("File")) - i.e. with a string of "Zip"
where the strings are defined as:

  CompressToArchive: Compress to %1.%2 (same for us and uk)
  Zip: ZIP (us) or postcode (uk)
  File: File (same for us and uk)
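
The same scenario as a runnable sketch. The table mirrors the pseudocode above; the `localize` signature (with an explicit locale argument) and the `{0}`/`{1}` placeholders are assumptions standing in for whatever the real system uses.

```python
# Hypothetical string tables mirroring the pseudocode above.
STRINGS = {
    "us": {"CompressToArchive": "Compress to {0}.{1}", "Zip": "ZIP", "File": "File"},
    "uk": {"CompressToArchive": "Compress to {0}.{1}", "Zip": "postcode", "File": "File"},
}


def localize(locale: str, key: str, *args: str) -> str:
    """Look up a string for the locale and fill in its positional placeholders."""
    return STRINGS[locale][key].format(*args)


# Buggy: "Zip" is run through the table, so the UK entry for the postal
# meaning leaks into the archive menu item.
buggy = localize("uk", "CompressToArchive", localize("uk", "Zip"), localize("uk", "File"))

# Fixed: the archive format name is passed as a literal and never translated.
fixed = localize("uk", "CompressToArchive", "Zip", localize("uk", "File"))

print(buggy)
print(fixed)
```

The US table hides the bug, because there both meanings happen to map to the same letters.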

Nice.

I tried a small change

  "key": "Where is your key?"
==> de

  "key": "Wo ist dein Schlüssel?"
that shows that it correctly translates "key" in the string but not "key" in the field name.

==> es

  "key": "¿Dónde está tu llave?"
that shows that it adds a "¿" at the beginning that is correct and obvious, but I'm too used to bad translators (perhaps I'm too old).

Which languages are supported?


> Probably not something international software is aware of.

Collation rules that vary by locale exist for this reason, and all major programming languages and OSes support them. Of course, whether the software you use does this or not depends on the developers writing it.


Somewhat related to injecting unusual characters, in my experience in localization efforts:

Inject a Turkish 'I'. I don't know how to type or paste it here, but picture an English lower case 'i', dot included, that is upper case. It is a splendid way, among many, to shake out some loc bugs.
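
For anyone wanting to try this, a small sketch using the Unicode escapes for the two Turkish letters (U+0130, dotted capital I, and U+0131, dotless small i). The helper names are made up; the behaviour shown is Python's default, locale-independent case mapping, which is exactly what trips up code that assumes English rules.

```python
DOTTED_CAPITAL_I = "\u0130"  # Turkish capital I with a dot above
DOTLESS_SMALL_I = "\u0131"   # Turkish small i without a dot


def survives_case_round_trip(s: str) -> bool:
    """True if lowercasing and then uppercasing gives back the same string."""
    return s.lower().upper() == s


def naive_equals_ignore_case(a: str, b: str) -> bool:
    """The kind of comparison that quietly assumes English casing rules."""
    return a.upper() == b.upper()


# Lowercasing the dotted capital yields TWO code points: "i" plus a combining dot.
print([hex(ord(ch)) for ch in DOTTED_CAPITAL_I.lower()])

# The round trip does not restore the original character.
print(survives_case_round_trip(DOTTED_CAPITAL_I))

# Dotless and dotted small i are different letters in Turkish,
# but the naive comparison treats them as equal.
print(naive_equals_ignore_case(DOTLESS_SMALL_I, "i"))
```

Length changes and lossy round trips like these are what tend to break fixed-size buffers, case-insensitive lookups, and identifier matching.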


Spot on. It never occurred to me this was essentially a localisation issue. Blasted annoying one though.

Can’t localize that!

“The shorter/simpler/more obvious the sentence is, the easier it must be” isn’t actually the case with translations.

UI strings being short usually means hidden heavy context lies in visual elements, so it’ll just strengthen hilarity in mistakes like “Name: SQL Server, Province/Prefecture: Running” (because you know, equivalents to provinces in a region are called “State” in American English...).

“Province” is more or less harmless, but “(has/is/is in/to/like to)Start(ed/ing)” types of errors due to missing context can make a UI unusable. Oh, and it’s un-spottable by non-speakers because they make sense when translated back to the original language.


In practice, even the hardware arrangements of keys on stenotype machines (disregarding the actual assignment of mnemonic labels to keys) are quite region-specific. A big obstacle to i18n in this field.