I'm trying to translate my bash scripts using the gettext tools but I have a problem where the encoding seems to be wrong.

Let's say I have the following file called fr.po:

# French translations for my-package package
# Traductions françaises du paquet my-package.
# Copyright (C) 2025 THE my-package'S COPYRIGHT HOLDER
# This file is distributed under the same license as the my-package package.
# Automatically generated, 2025.
#
msgid ""
msgstr ""
"Project-Id-Version: my-package v0.0.1\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-25 20:36-0500\n"
"PO-Revision-Date: 2025-11-25 17:58-0500\n"
"Last-Translator: Automatically generated\n"
"Language-Team: none\n"
"Language: fr\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n > 1);\n"

msgid "test-message"
msgstr "a à e é è ê ë i î ï o ô ö u ù û ü c ç n ñ"

Then I execute the following:

file --mime ./fr.po # output: ./fr.po: text/x-po; charset=utf-8
msgfmt --output-file='/usr/share/locale/fr/LC_MESSAGES/my-test.mo' ./fr.po

export TEXTDOMAINDIR=/usr/share/locale
export TEXTDOMAIN=my-test
export LANG=fr_CA.UTF-8
export LC_ALL=fr_CA.UTF-8

# The following command works as intended and prints this:
# a à e é è ê ë i î ï o ô ö u ù û ü c ç n ñ
gettext test-message


# However if I use the same command within a string or in a pipeline I get this:
# a □ e □ □ □ □ i □ □ o □ □ u □ □ □ c □ n
printf "$(gettext test-message)"
echo "$(gettext test-message)"
gettext test-message | cat
cat <(gettext test-message)

####

gettext test-message > out.txt
cat out.txt # output: a □ e □ □ □ □ i □ □ o □ □ u □ □ □ c □ n
file --mime out.txt # output: out.txt: text/plain; charset=iso-8859-1

As you can see in the last 3 lines above, gettext seems to encode my message in ISO-8859-1 which is not what I want.
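To double-check what file reports, the raw bytes can be compared directly. This is only a minimal sketch: the printf calls use hard-coded bytes as stand-ins for the gettext output, and hex_of is a helper name made up for this example. The letter "é" should show up as c3 a9 if the output is UTF-8 and as a single e9 if it is ISO-8859-1.

```shell
# hex_of is a hypothetical helper: dumps stdin as a compact hex string.
hex_of() { od -An -tx1 | tr -d ' \n'; }

# Stand-in bytes (assumptions, not real gettext output):
printf '\303\251' | hex_of; echo   # "é" as UTF-8      -> c3a9
printf '\351' | hex_of; echo       # "é" as ISO-8859-1 -> e9

# With the real catalog installed, the same check would be:
# gettext test-message | hex_of; echo
```

If the piped gettext output shows single bytes such as e0 or e9 for the accented letters, the conversion to ISO-8859-1 is happening inside gettext and not in the terminal.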

How can I force gettext to give me my message in UTF-8? Or how can I work around this issue?

I tried changing the terminal encoding with chcp.com 65001, but it didn't change anything. I also tried placing the .mo file in /usr/share/locale/fr.utf-8/..., but to no avail.

I also saw this question, which seems very close to my problem, but I couldn't find any equivalent of bind_textdomain_codeset that I could call from a bash script.


Here's the content of my-test.mo in UTF-8:

ޒ          ,      <       H      I   3  V   7                test-message Project-Id-Version: my-package v0.0.1
Report-Msgid-Bugs-To: 
PO-Revision-Date: 2025-11-25 17:58-0500
Last-Translator: Automatically generated
Language-Team: none
Language: fr
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Plural-Forms: nplurals=2; plural=(n > 1);
 a à e é è ê ë i î ï o ô ö u ù û ü c ç n ñ  

Update

I found a workaround using iconv. For example:

printf '%s' "$(gettext test-message | iconv -f iso-8859-1 -t utf-8)"

However, I suspect this workaround should only be used when the script is executed with git-bash (Windows), since the problem doesn't exist on Linux. On a Linux system (and probably WSL), the output of gettext is already UTF-8, so converting it again would probably produce an error or garbled output.
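To illustrate the double-conversion concern: since ISO-8859-1 assigns a character to every byte value, I would expect iconv not to fail on UTF-8 input but to double-encode it silently. A small sketch, with hard-coded bytes standing in for a gettext result that is already UTF-8:

```shell
# "é" already encoded as UTF-8 (bytes c3 a9), wrongly treated as ISO-8859-1.
# Each input byte is converted on its own, so two bytes become four.
printf '\303\251' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n'; echo
# -> c383c2a9, which a UTF-8 terminal displays as "Ã©"
```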

So I'm still looking for a better alternative where I wouldn't have to wrap gettext in a function that checks which OS the script is being executed on.
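For reference, a wrapper could at least test the bytes instead of the OS. This is only a rough sketch under a few assumptions: ensure_utf8 is a name I made up, the only other encoding that can show up is ISO-8859-1, and iconv exits with a non-zero status on invalid UTF-8 input.

```shell
# ensure_utf8 (hypothetical helper): reads stdin and writes UTF-8 to stdout.
# Input that already validates as UTF-8 is passed through unchanged;
# anything else is assumed to be ISO-8859-1 and converted.
ensure_utf8() {
  local raw
  raw=$(cat)   # note: command substitution drops trailing newlines
  if printf '%s' "$raw" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    printf '%s' "$raw"
  else
    printf '%s' "$raw" | iconv -f ISO-8859-1 -t UTF-8
  fi
}

# Intended usage with the catalog from above:
# printf '%s\n' "$(gettext test-message | ensure_utf8)"
```

The check is a heuristic: an ISO-8859-1 string whose bytes happen to form valid UTF-8 would be passed through unconverted, although that seems unlikely for ordinary French text.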

  • Your example worked for me as expected in all the test cases (but I had to put sudo before msgfmt). As far as I'm aware, I also haven't set any other environment variables affecting my locale, and all entries printed by locale show the value fr_CA.UTF-8. Using Bash 5.3.3 and gettext-tools 0.26. Commented Nov 26 at 19:06
  • @pmf I think my problem only occurs on Windows (git-bash), and I assume you're on Linux or WSL since you had to use sudo, which doesn't work in git-bash, or at least not out of the box. Also, I have the latest version of Git, which comes with bash 5.2.37. Commented Nov 26 at 19:12
  • Right, I must have overlooked the windows and git-bash tags, sorry. Commented Nov 26 at 19:15
  • What is the content of my-test.mo? Commented Nov 26 at 20:59
  • @Philippe It's the output of msgfmt. I added it to the question. Commented Nov 26 at 21:47
