[pve-devel] [PATCH proxmox-i18n] use xgettext to extract translatable strings
Maximiliano Sandoval
m.sandoval at proxmox.com
Fri Dec 1 15:25:27 CET 2023
xgettext is a robust tool to extract translatable strings from source
code.
Using msgcat to concatenate pot files is not recommended, and it also
added garbage when there were comments, hence we switch to xgettext for
that step as well.
What we get for free:
- It de-escapes strings. There are 3 cases in our code base where
single-quoted strings were used and the `'` inside had to be escaped;
these were not de-escaped properly when presented to translators. This
is one such example:
```diff
#: proxmox-widget-toolkit/src/panel/EmailRecipientPanel.js:39
-msgid "The notification will be sent to the user\\'s configured mail address"
+#, fuzzy
+msgid "The notification will be sent to the user's configured mail address"
msgstr "La notificación sera enviada a el correo configurado del usuario"
```
- xgettext can detect when strings use a certain style of substitutions
and flags them accordingly, but I was not able to determine the exact
conditions, and it only affects a single string in the entire code base.
```diff
#: proxmox-widget-toolkit/src/Utils.js:995
+#, javascript-format
msgid "{0}% of {1}"
msgstr "{0}% de {1}"
```
- Correct POT-Creation-Date, note how the new one matches the
Revision-Date's format.
```diff
@@ -7,7 +7,7 @@ msgid ""
msgstr ""
"Project-Id-Version: proxmox translations\n"
"Report-Msgid-Bugs-To: <support at proxmox.com>\n"
-"POT-Creation-Date: Wed Nov 22 18:17:30 2023\n"
+"POT-Creation-Date: 2023-12-01 11:25+0100\n"
"PO-Revision-Date: 2023-11-27 16:43+0100\n"
"Last-Translator: Maximiliano Sandoval <m.sandoval at proxmox.com>\n"
"Language-Team: Spanish\n"
```
- Extraction of strings using ngettext, pgettext, etc. Even if we don't
have js wrappers for these at the moment, they are critical to provide
good-quality translations and could be added in the future.
- We can extract comments from the source code with `xgettext -c`.
Newly added comments won't mark strings as fuzzy but can provide
helpful context to translators.
Comments are additive: if, for example, two sources contain the same
string with different comments and it appears a third time without
comments, all three sources and both comments will be shown to
translators.
These are a few examples that could be implemented in our codebase:
It is not clear if "Prune Options" prunes the options or configures
pruning.
```js
// TRANSLATORS: Opens the panel that allows configuring how Pruning works
let s = gettext("Prune Options");
```
Adding a source for a concept or its expanded name can help
translators decide what the gender of a word is in their language.
```js
// TRANSLATORS: TOTP stands for Time-based one-time password
let s = gettext("Add a TOTP login factor");
```
Some strings are not marked for translation to avoid translating
certain parts of them. This is a change that could be made:
```diff
-fieldLabel: 'Crush Rule', // do not localize
+// TRANSLATORS: Do not translate 'Crush', it's a proper name
+fieldLabel: gettext('Crush Rule'),
```
Or simply to give more context when substitutions are involved.
```js
// TRANSLATORS: For example 'Join CLUSTER_NAME'
return Ext.String.format(gettext('Join {0}'), `'${cn}'`);
```
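Regarding the ngettext/pgettext point above: should JS wrappers be added at some point, they might look roughly like the following sketch. Everything in it is an assumption for illustration only. The `catalog` object, its layout, the sample Spanish entries and the function bodies are hypothetical and do not reflect how the existing `gettext` wrapper is implemented. The only borrowed convention is that GNU gettext joins context and msgid with the EOT character (`\u0004`) when building lookup keys.

```js
// Sketch only: a hypothetical in-memory catalog, not the format used by proxmox-i18n.
const catalog = {
    // Plural rule shared by Spanish and English: singular only for exactly 1.
    pluralIndex: (n) => (n === 1 ? 0 : 1),
    messages: {
        // Plural entry: one translation per plural form.
        '{0} node': ['{0} nodo', '{0} nodos'],
        // Context entries: key is "<context>\u0004<msgid>".
        'verb\u0004Open': ['Abrir'],
        'state\u0004Open': ['Abierto'],
    },
};

// Pick the translated plural form for n, falling back to the source strings.
function ngettext(singular, plural, n) {
    const forms = catalog.messages[singular];
    if (forms) {
        const form = forms[catalog.pluralIndex(n)];
        if (form !== undefined) {
            return form;
        }
    }
    return n === 1 ? singular : plural;
}

// Look up msgid within a context, falling back to the untranslated msgid.
function pgettext(context, msgid) {
    const forms = catalog.messages[`${context}\u0004${msgid}`];
    return forms ? forms[0] : msgid;
}

console.log(ngettext('{0} node', '{0} nodes', 3)); // {0} nodos
console.log(pgettext('verb', 'Open'));             // Abrir
console.log(pgettext('state', 'Open'));            // Abierto
```

The context variant is what lets the same English word ("Open" as a verb versus "Open" as a state) receive two different translations, which plain `gettext` cannot express.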
Cons:
- In total 3 translations were marked as fuzzy. Translators will have to
review and mark them as translated again.
- If using -c, xgettext can't tell whether the comment above a string is
meant for translators. The common practice is to prefix such comments
with a `TRANSLATORS:` tag.
- The reordering of sources for each msgstr will create an unnecessarily
massive (yet ultimately harmless) diff (approx. 50k insertions(+) 50k
deletions(-)).
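As a rough illustration of the de-escaping discussed above (the example entry shown there became fuzzy because its msgid changed), the sketch below contrasts capturing a single-quoted literal verbatim with resolving its escapes. It is not how xgettext is implemented. `rawLiteralBody` and `deEscape` are hypothetical helper names, the regex only loosely mirrors the single-quote branch of the pattern in the removed jsgettext.pl, and only the simple escapes `\'`, `\"` and `\\` are handled.

```js
// Sketch only: hypothetical helpers, not part of proxmox-i18n or xgettext.

// Capture the body of a single-quoted gettext() argument verbatim,
// similar to what a regex-based extractor does.
function rawLiteralBody(line) {
    const m = /\bgettext\s*\(\s*'((?:[^'\\]|\\.)*)'\s*\)/.exec(line);
    return m ? m[1] : null;
}

// Resolve simple escapes so the msgid matches the string the program
// actually sees at runtime.
function deEscape(body) {
    return body.replace(/\\(['"\\])/g, '$1');
}

// The source line as it would appear in a .js file (backslash doubled
// here because this sample is itself a JS string).
const line = "text: gettext('the user\\'s configured mail address'),";

const raw = rawLiteralBody(line);
console.log(raw);           // the user\'s configured mail address
console.log(deEscape(raw)); // the user's configured mail address
```

With the verbatim capture the backslash ends up in the msgid, which is what translators used to see; after de-escaping the msgid differs from the old one, so existing translations for it are flagged fuzzy once.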
Signed-off-by: Maximiliano Sandoval <m.sandoval at proxmox.com>
Thomas: Should this be merged, please run `make do_update` and commit the
changes to each .po{,t} file. I am not sure if it is possible to even
send an email with over 100k lines of text.
---
Makefile | 10 +++-
jsgettext.pl | 135 ---------------------------------------------------
2 files changed, 8 insertions(+), 137 deletions(-)
delete mode 100755 jsgettext.pl
diff --git a/Makefile b/Makefile
index 1d7af6e..4776e02 100644
--- a/Makefile
+++ b/Makefile
@@ -97,7 +97,13 @@ pbs-lang-%.js: %.po
# parameter 1 is the name
# parameter 2 is the directory
define potupdate
- ./jsgettext.pl -p "$(1) $(shell cd $(2);git rev-parse HEAD)" -o $(1).pot $(2)
+ find . -iname "*.js" -path "./$(2)*" | xargs xgettext -c -s \
+ --from-code="UTF-8" \
+ --package-name="$(1)" \
+ --package-version="$(shell cd $(2);git rev-parse HEAD)" \
+ --msgid-bugs-address="<support@proxmox.com>" \
+ --copyright-holder="Copyright (C) Proxmox Server Solutions GmbH <support@proxmox.com> & the translation contributors." \
+ --output="$(1)".pot
endef
.PHONY: update update_pot do_update
@@ -124,7 +130,7 @@ init-%.po: messages.pot
.INTERMEDIATE: messages.pot
messages.pot: proxmox-widget-toolkit.pot proxmox-mailgateway.pot pve-manager.pot proxmox-backup.pot
- msgcat $^ > $@
+ xgettext $^ --msgid-bugs-address="<support@proxmox.com>" -o $@
.PHONY: distclean
distclean: clean
diff --git a/jsgettext.pl b/jsgettext.pl
deleted file mode 100755
index 7f758fd..0000000
--- a/jsgettext.pl
+++ /dev/null
@@ -1,135 +0,0 @@
-#!/usr/bin/perl
-
-use strict;
-use warnings;
-
-use Encode;
-use Getopt::Long;
-use Locale::PO;
-use Time::Local;
-
-my $options = {};
-GetOptions($options, 'o=s', 'b=s', 'p=s') or die "unable to parse options\n";
-
-my $dirs = [@ARGV];
-
-die "no directory specified\n" if !scalar(@$dirs);
-
-foreach my $dir (@$dirs) {
- die "no such directory '$dir'\n" if ! -d $dir;
-}
-
-my $projectId = $options->{p} || die "missing project ID\n";
-
-my $basehref = {};
-if (my $base = $options->{b}) {
- my $aref = Locale::PO->load_file_asarray($base) ||
- die "unable to load '$base'\n";
-
- my $charset;
- my $hpo = $aref->[0] || die "no header";
- my $header = $hpo->dequote($hpo->msgstr);
- if ($header =~ m|^Content-Type:\s+text/plain;\s+charset=(\S+)$|im) {
- $charset = $1;
- } else {
- die "unable to get charset\n" if !$charset;
- }
-
- foreach my $po (@$aref) {
- my $qmsgid = decode($charset, $po->msgid);
- my $msgid = $po->dequote($qmsgid);
- $basehref->{$msgid} = $po;
- }
-}
-
-sub find_js_sources {
- my ($base_dirs) = @_;
-
- my $find_cmd = 'find ';
- # shell quote heuristic, with the (here safe) assumption that the dirs don't contain single-quotes
- $find_cmd .= join(' ', map { "'$_'" } $base_dirs->@*);
- $find_cmd .= ' -name "*.js"';
- open(my $find_cmd_output, '-|', "$find_cmd | sort") or die "Failed to execute command: $!";
-
- my $sources = [];
- while (my $line = <$find_cmd_output>) {
- chomp $line;
- print "F: $line\n";
- push @$sources, $line;
- }
- close($find_cmd_output);
-
- return $sources;
-}
-
-my $header = <<'__EOD';
-Proxmox message catalog.
-
-Copyright (C) Proxmox Server Solutions GmbH
-
-This file is free software: you can redistribute it and/or modify it under the terms of the GNU
-Affero General Public License as published by the Free Software Foundation, either version 3 of the
-License, or (at your option) any later version.
--- Proxmox Support Team <support\@proxmox.com>
-__EOD
-
-my $ctime = scalar localtime;
-
-my $href = {
- '' => Locale::PO->new(
- -msgid => '',
- -comment => $header,
- -fuzzy => 1,
- -msgstr => "Project-Id-Version: $projectId\n"
- ."Report-Msgid-Bugs-To: <support\@proxmox.com>\n"
- ."POT-Creation-Date: $ctime\n"
- ."PO-Revision-Date: YEAR-MO-DA HO:MI +ZONE\n"
- ."Last-Translator: FULL NAME <EMAIL\@ADDRESS>\n"
- ."Language-Team: LANGUAGE <support\@proxmox.com>\n"
- ."MIME-Version: 1.0\n"
- ."Content-Type: text/plain; charset=UTF-8\n"
- ."Content-Transfer-Encoding: 8bit\n",
- ),
-};
-
-sub extract_msg {
- my ($filename, $linenr, $line) = @_;
-
- my $count = 0;
-
- while(1) {
- my $text;
- if ($line =~ m/\bgettext\s*\((("((?:[^"\\]++|\\.)*+)")|('((?:[^'\\]++|\\.)*+)'))\)/g) {
- $text = $3 || $5;
- }
- last if !$text;
- return if $basehref->{$text};
- $count++;
-
- my $ref = "$filename:$linenr";
-
- if (my $po = $href->{$text}) {
- $po->reference($po->reference() . " $ref");
- } else {
- $href->{$text} = Locale::PO->new(-msgid=> $text, -reference=> $ref, -msgstr=> '');
- }
- }
- die "can't extract gettext message in '$filename' line $linenr\n" if !$count;
- return;
-}
-
-my $sources = find_js_sources($dirs);
-
-foreach my $s (@$sources) {
- open(my $SRC_FH, '<', $s) || die "unable to open file '$s' - $!\n";
- while(defined(my $line = <$SRC_FH>)) {
- if ($line =~ m/gettext\s*\(/ && $line !~ m/^\s*function gettext/) {
- extract_msg($s, $., $line);
- }
- }
- close($SRC_FH);
-}
-
-my $filename = $options->{o} // "messages.pot";
-Locale::PO->save_file_fromhash($filename, $href);
-
--
2.39.2