Topic: Newbie question: Syncing old translations

tothxa
Topic Opener
Joined: 2021-03-24, 12:44
Posts: 485
OS: antix / Debian
Version: some new PR I'm testing
Ranking: Tribe Member
Posted at: 2021-03-24, 13:15

I've just started working on the Hungarian translations. When my daughter played the tutorials (build 21), I noticed that coverage is spotty: many messages seem to have been translated at some point, but not updated when the original text changed later. Browsing Transifex confirmed this. For older content, there are usually "suggestions" from 3 to 6 years ago, and there are even 100% matches that still haven't been carried over to the current version. I also tried to look at the revision history on GitHub, but that seems pretty useless, as it consists mostly of automated Transifex syncs, and it's hard to see whether they contain any actual changes. (Yes, I'm new to GitHub as well.)

So, I'd like to ask whether there is a semi-automatic way to find (and merge) strings where the translation was not updated when the original text changed. I don't mind using console tools (I actually prefer them), but I've never done this before. For example, I wouldn't mind learning to use GNU gettext at all.

Also, it seems that the 100% matches are caused by files being moved around without version control picking it up (the suggestion says the old resource was deleted). What could be the cause of this, and does anyone have an idea how to find these strings if the above doesn't cover them?


Nordfriese
Joined: 2017-01-17, 18:07
Posts: 2056
OS: Debian Testing
Version: Latest master
Ranking: One Elder of Players
Location: 0x55555d3a34c0
Posted at: 2021-03-24, 13:35

Hi tothxa and welcome to the forum :)

> Also, it seems that the 100% matches are caused by files being moved around without version control picking it up (the suggestion says the old resource was deleted). What could be the cause of this, and does anyone have an idea how to find these strings if the above doesn't cover them?

This is the easier part of the answer. Transifex categorizes strings by context and resource, so that the same string can be translated in multiple ways depending on the context. This is very useful e.g. for worker names and building helptexts, but it has the side effect that we also get some disambiguations where we don't need them: a string that ends up in a different resource counts as a new string, and its old translation is only offered as a suggestion.

> So, I'd like to ask whether there is a semi-automatic way to find (and merge) strings where the translation was not updated when the original text changed. I don't mind using console tools (I actually prefer them), but I've never done this before. For example, I wouldn't mind learning to use GNU gettext at all.

AFAIK Transifex offers no automated way to do this; you have to view every string individually and apply the suggestions. Console tools are probably no help here, since the deleted strings are no longer present in the current files. What I do in such cases is use the PoEdit program, which comes with its own translation memory and a pre-translate feature:

  • Load some other PO files into PoEdit to get their strings in your local translation memory. You can obtain such files by checking out any git commit dated five years ago (or whenever the translations of interest existed).

  • Download the resource's PO file from Transifex for offline translation

  • Open it in PoEdit

  • Catalog → Pre-translate fills in matching strings automatically

  • Go through the strings which were marked as Needs Work and rephrase if necessary

  • Save the file and re-upload it to Transifex
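
For readers who prefer a script, the pre-translate idea in the steps above can be sketched in a few lines of Python. This is only a simplified illustration, not what PoEdit does internally: the parser handles plain msgctxt/msgid/msgstr entries (plural forms, comments and obsolete entries are skipped), strings are kept in their escaped form, and the sample PO snippets and "HU_..." translations are made-up placeholders. The memory ignores context on purpose, so a string that moved to another resource or context still finds its old translation.

```python
import re

# A keyword line such as: msgid "text"   or a bare continuation line: "text"
_LINE = re.compile(r'^(?:(msgctxt|msgid|msgstr)\s+)?"(.*)"\s*$')


def parse_po(text):
    """Return a list of dicts with msgid/msgstr (and msgctxt if present).

    Simplified on purpose: plural entries, comments and the header are dropped.
    """
    entries, current, field = [], {}, None
    for raw in text.splitlines() + [""]:       # the extra "" flushes the last entry
        line = raw.strip()
        m = _LINE.match(line)
        if m:
            keyword, value = m.groups()
            if keyword:                         # start of a new field
                field = keyword
                current[field] = value
            elif field:                         # continuation of a wrapped string
                current[field] += value
        elif not line:                          # a blank line ends the entry
            if current.get("msgid") and "msgstr" in current and not current.get("plural"):
                entries.append(current)
            current, field = {}, None
        else:                                   # comment or unsupported keyword
            field = None
            if line.startswith(("msgid_plural", "msgstr[")):
                current["plural"] = True
    return entries


def build_memory(old_entries):
    """Map each msgid to the set of translations seen for it, ignoring context."""
    memory = {}
    for entry in old_entries:
        if entry["msgstr"]:
            memory.setdefault(entry["msgid"], set()).add(entry["msgstr"])
    return memory


def pretranslate(new_entries, memory):
    """Return (entry, suggestion) pairs for untranslated entries with one exact match."""
    hits = []
    for entry in new_entries:
        candidates = memory.get(entry["msgid"], set())
        if not entry["msgstr"] and len(candidates) == 1:
            hits.append((entry, next(iter(candidates))))
    return hits


# Made-up sample data: an old file (e.g. from an old git commit) and a current one.
OLD_PO = '''
msgctxt "old_context"
msgid "Lumberjack's Hut"
msgstr "HU_lumberjack_hut"

msgid ""
"Wrapped "
"text"
msgstr "HU_wrapped_text"
'''

NEW_PO = '''
msgctxt "new_context"
msgid "Lumberjack's Hut"
msgstr ""

msgid "Wrapped text"
msgstr ""

msgid "Brand new"
msgstr ""
'''

for entry, suggestion in pretranslate(parse_po(NEW_PO), build_memory(parse_po(OLD_PO))):
    print(entry["msgid"], "->", suggestion)
```

Strings with more than one old translation are left alone here, so they still need a manual decision, much like the Needs Work strings in PoEdit.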


tothxa
Posted at: 2021-03-28, 03:59

Thank you very much!

It took me a while to digest this, so I ended up doing the long campaign texts on Transifex, even though that probably meant rewriting some previously translated but lost parts from scratch.

But yesterday I finished most of the easy things on Transifex, so today I started working in Poedit on the repetitive things in tribes. I also learned the git basics, but I still couldn't quite figure out how I am supposed to efficiently explore the history of a single file.

So I just imported all the current hu.po files into Poedit's translation memory, and it did help quite a bit with the repetitive messages in tribes. However, it didn't handle the <some plant> (<tiny/small/.../ripe>) texts any better than Transifex. Actually, this might be better in Transifex with the glossary, except for all the time spent waiting for the server. And at least copy & paste from other strings is easier in Poedit... except that the lack of filtering and bookmarking means you still can't really find the places to copy from... Anyway, translation memory in both leaves much to be desired, above all the ability to match parts of strings.
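
A partial match for this kind of string could be scripted along the following lines. This is a rough sketch that assumes the strings really have the shape "Name (stage)" and that both parts are already in a glossary; the glossary entries and "HU_..." translations are hypothetical placeholders, and the target language may of course need a different word order than the one hard-coded here.

```python
import re

# Hypothetical glossary; the translations are placeholders, not real Hungarian.
GLOSSARY = {
    "Wheat": "HU_wheat",
    "tiny": "HU_tiny",
    "ripe": "HU_ripe",
}

# "Name (stage)": everything before the final parenthesised part is the name.
PATTERN = re.compile(r"^(?P<name>.+) \((?P<stage>[^()]+)\)$")


def compose(msgid, glossary):
    """Suggest a translation for 'Name (stage)' if both parts are known, else None."""
    m = PATTERN.match(msgid)
    if not m:
        return None
    name = glossary.get(m["name"])
    stage = glossary.get(m["stage"])
    if name is None or stage is None:
        return None
    return f"{name} ({stage})"


for source in ("Wheat (tiny)", "Wheat (ripe)", "Barley (ripe)"):
    print(source, "->", compose(source, GLOSSARY))
```

Anything the function cannot compose comes back as None, so those strings still have to be translated by hand.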

Next up are the gettext console tools, and maybe some scripting. :)

So some notes on my experience that might be helpful for other newcomers:

Offline Poedit pros:

  • much faster for basic use
  • you can do mass operations on multiple strings at once (like copying over ship names from the source language)

Transifex pros missing from Poedit:

  • integrated glossary
  • filtering strings to see all similar messages (e.g. military buildings occupied/attacked/taken/lost) to check consistency

Transifex annoyances for offline use:

  • you have to download resources one by one; there is no option to get them all in e.g. a zip
  • the download option is tiresome to get to, even after you figure out where they've hidden it

I don't know if it's a bug in my Poedit version, but I also couldn't easily diff my offline changes. Transifex keeps the line-wrapped original strings and references but doesn't wrap translations, while Poedit either rewraps or unwraps everything depending on its line wrap settings, despite the "keep original formatting" box being checked. GNU gettext's msgmerge doesn't help either, because it also (re)wraps everything.
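
One possible workaround for the wrapping problem is to normalize both files before diffing them. The sketch below joins wrapped string continuation lines onto their keyword line and then runs a normal unified diff, so only real text changes show up. It is a minimal illustration that assumes ordinary PO syntax; the file labels and sample strings are made up.

```python
import difflib


def unwrap_po(text):
    """Return the PO lines with wrapped string continuations joined together."""
    out = []
    for raw in text.splitlines():
        line = raw.strip()
        is_continuation = (
            line.startswith('"')
            and out
            and out[-1].endswith('"')
            and not out[-1].startswith("#")
        )
        if is_continuation:
            # Drop the closing quote of the previous line and the opening quote here.
            out[-1] = out[-1][:-1] + line[1:]
        else:
            out.append(line)
    return out


def po_diff(a_text, b_text):
    """Unified diff of two PO texts that ignores differences in line wrapping."""
    return list(
        difflib.unified_diff(
            unwrap_po(a_text), unwrap_po(b_text), "transifex", "local", lineterm=""
        )
    )


# Made-up samples: same content with different wrapping, then one real change.
WRAPPED = 'msgid ""\n"Hello "\n"world"\nmsgstr "x"\n'
UNWRAPPED = 'msgid "Hello world"\nmsgstr "x"\n'
CHANGED = 'msgid "Hello world"\nmsgstr "y"\n'

print(po_diff(WRAPPED, UNWRAPPED))
print("\n".join(po_diff(WRAPPED, CHANGED)))
```

The gettext tools also have a --no-wrap option (for example in msgcat), which may be another way to bring both files into the same shape before diffing; I have not checked how that interacts with the Poedit settings mentioned above.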

Edited: 2021-03-28, 04:03

tothxa
Posted at: 2021-04-10, 01:01

Meanwhile, I've found an offline tool that does everything that Transifex can: Lokalize from KDE. I think it should be mentioned on the TranslatingWidelands page. The only drawback is that it pulls in all the KDE libraries if you don't already have them installed, and that's a lot.

For the glossary, I wrote a probably quite ugly, hacky awk script to convert the CSV downloaded from Transifex into a TBX file that Lokalize can handle. I think I've seen it mentioned that the glossary can be downloaded as TBX from Transifex, but I couldn't find a way to do that, and I'm not sure Lokalize would have accepted it unchanged even if I could, as it seemed quite picky about the fields used in the TBX.
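
The awk script itself is not shown in this thread, so the following is not that script, only a rough Python sketch of the same kind of conversion. The CSV column names ("term", "translation_hu") are assumptions about the export and may well differ, the output is a bare-bones TBX skeleton (martif, text, body, termEntry, langSet, tig, term), and whether Lokalize accepts exactly this set of fields has not been verified.

```python
import csv
import io
import xml.etree.ElementTree as ET

# ElementTree writes this qualified name as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def csv_to_tbx(csv_text, source_lang="en", target_lang="hu",
               term_col="term", translation_col="translation_hu"):
    """Convert glossary CSV text to a minimal TBX document (as a string).

    term_col and translation_col are assumed column names; adjust them to
    whatever the downloaded CSV actually uses.
    """
    root = ET.Element("martif", {"type": "TBX", XML_LANG: source_lang})
    header = ET.SubElement(root, "martifHeader")
    source_desc = ET.SubElement(ET.SubElement(header, "fileDesc"), "sourceDesc")
    ET.SubElement(source_desc, "p").text = "Converted from a glossary CSV export"
    body = ET.SubElement(ET.SubElement(root, "text"), "body")

    for number, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        term = (row.get(term_col) or "").strip()
        translation = (row.get(translation_col) or "").strip()
        if not term or not translation:
            continue                      # skip terms without a translation
        entry = ET.SubElement(body, "termEntry", {"id": f"entry-{number}"})
        for lang, value in ((source_lang, term), (target_lang, translation)):
            lang_set = ET.SubElement(entry, "langSet", {XML_LANG: lang})
            ET.SubElement(ET.SubElement(lang_set, "tig"), "term").text = value

    return ET.tostring(root, encoding="unicode")


# Made-up sample rows; the second term has no translation and is skipped.
SAMPLE_CSV = "term,translation_hu\nwarehouse,HU_warehouse\nport,\n"
print(csv_to_tbx(SAMPLE_CSV))
```

Reading the CSV with the csv module rather than splitting on commas keeps quoted fields with embedded commas intact, which matters for glossary comments.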

I even figured out that I can pull and push translations with the Transifex command line client, so the download/upload annoyances are gone as well. Maybe this could also be mentioned, including the fact that the program source already contains the tx config file.
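
As a rough illustration of the pull/push workflow, a small wrapper around the tx client could look like the sketch below. The exact commands used in this thread are not given, so the flags here ("-l" for the language, "-t" to push translations) reflect my understanding of the client's options and should be checked against "tx --help" for the installed version. The helper names are made up, and run_tx assumes it is started inside a source checkout that contains the tx config.

```python
import shutil
import subprocess


def tx_command(action, lang):
    """Build the argument list for pulling or pushing one language."""
    if action == "pull":
        return ["tx", "pull", "-l", lang]
    if action == "push":
        return ["tx", "push", "-t", "-l", lang]
    raise ValueError(f"unknown action: {action!r}")


def run_tx(action, lang, repo_dir="."):
    """Run the tx client in repo_dir; raises if tx is missing or the call fails."""
    if shutil.which("tx") is None:
        raise RuntimeError("the tx client was not found on PATH")
    return subprocess.run(tx_command(action, lang), cwd=repo_dir, check=True)


# Only print the commands here; call run_tx("pull", "hu") to actually execute one.
print(" ".join(tx_command("pull", "hu")))
print(" ".join(tx_command("push", "hu")))
```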

Of course if there is an active team for a language, then offline work needs more coordination, so I can understand if you don't want to advertise it too much.


Nordfriese
Posted at: 2021-04-10, 12:34

Sounds nice, I had not known about it :) Could you add instructions about these tools to the TranslatingWidelands page, please? (It's a wiki page, so anyone can edit it.)


tothxa
Posted at: 2021-04-12, 16:06

OK, I did it... Please check it.


Nordfriese
Posted at: 2021-04-12, 16:41

LGTM. Thanks :)

