Tutorial: Translation with gettext + PHP
The translation workflow
The following image shows the translation workflow for PHP and gettext. The basic steps are:
- Extract messages to translate with
xgettext
, stored as Portable Object Template file (.pot) - Use
msginit
andmsgmerge
to create a Portable Object file (.po) for each language - Convert the PO files into optimized Machine Object Files (.mo) using
msgfmt
Prepare PHP for translations
Important notice: Before you start using gettext check if your development and production servers both support it:
- PHP modules
gettext
andmbstring
must be installed on your servers - the locales you want to support must be installed
The next 2 sections explain how you can check the requirements:
Check PHP modules
The translation functionality in PHP requires the modules gettext
and mbstring
.
Call phpInfo()
from a script to see if these modules are
both activated. If you can't see them both, install and activate them in the php.ini file.
Set the language
Start by setting the language you want to display. It does not matter where you do that: It's just vital that you set the language before you request the first translated string.
setlocale(LC_ALL, 'de_DE.UTF-8');
The gettext module in PHP only supports languages and language identifiers that are installed on your server.
Check the available locales by calling
locale -a
on the command line of your server (e.g., by logging in with SSH).
You can install missing locales with
locale-gen <language>
The command may require admin privileges and maybe not work on a hosted shared server. If the language you want to use is not available, you have to ask your admin or switch to a different solution.
Set the translation files
Tell PHP the location of your translations:
bindtextdomain("myapp", "locale");
myapp
is called "text domain" and corresponds to the translation file name. You can callbindtextdomain()
for multiple files and switch between them later.locale
is the directory from which the translation files will be loaded. Its path is relative to the PHP file.
In our example PHP will try to load the following translation file: locale/de_DE.UTF-8/LC_MESSAGES/myapp.mo
LC_MESSAGES
is a predefined folder name, it is mandatory.
de_DE.UTF-8
is the locale identifier you've passed to setlocale()
before.
In the folder name, country code, and the encoding are optional. The
file loader will also try to find these translation files:
- locale/de_DE/LC_MESSAGES/myapp.mo
- locale/de/LC_MESSAGES/myapp.mo
Set text domain
Select the text domain you want to use for translation lookups for all
following _()
calls:
textdomain("myapp");
The passed name must be bound to a translation file before with bindtextdomain()
,
see above.
Add text markup
Now you can wrap all translatable strings with the _()
function. This function
is an alias for the function gettext
.
It replaces the passed string with the translation of the language you set before
with setlocale()
:
<h1>
<?php echo _("Translating PHP pages with gettext")?>
</h1>
The _()
function is also used to extract translatable strings from your source
code automatically.
Create translation files
Extract strings with xgettext
With the xgettext
tool you can extract all translatable strings from the
PHP source code, store them in a PO template file with file extension .pot
.
Don't edit the POT file - xgettext
overwrites it each time you start it.
xgettext --add-comments *.php -o myapp.pot
With the --add-comments
option, comment blocks preceding the _(...)
expression
are copied from the source code to the .pot file. With such comments, you can give
a hint to the translator how he should translate the text.
Create initial PO files with msginit
The POT file is the template file used to create a new PO file for each target language. Place them in directories as mentioned above:
msginit --locale de --input myapp.pot --output locale/de/LC_MESSAGES/myapp.po
msginit --locale fr --input myapp.pot --output locale/fr/LC_MESSAGES/myapp.po
This step is only needed once when you set up your project. As msginit
overwrites an existing PO file, you shouldn't call it if you have already PO files
containing translations. In this case use msgmerge
as described later.
Translate the PO files
Now you can start translating the PO files. You can use a text editor or a special translation editor for that. If your translations contain non-ASCII characters, change the charset in the header from ASCII to UTF-8:
"Content-Type: text/plain; charset=UTF-8\n"
Convert PO to MO files using msgfmt
You next have to convert the PO files into "Message Object" files (with the extension .mo) which can be loaded on the server:
cd locale/de/LC_MESSAGES/
msgfmt myapp.po --output-file=myapp.mo
Enjoy
Start your PHP script and enjoy its translated output. Keep in mind that gettext caches the translation data. If you add/remove/update MO files you might have to restart/reload your web server.
Managing changes with msgmerge
If you're updating your PHP code, you might change translatable texts, too.
This requires an update of the PO/MO files. There's, of course, a gettext tool
to simplify this task, it's called msgmerge
. First, you have to extract the
translatable strings from the new PHP sources using xgettext
. This overwrites
your old POT file with a new one.
Then you can merge the changes into the language-specific PO files:
xgettext --add-comments *.php -o myapp.pot
msgmerge --update locale/de/LC_MESSAGES/myapp.po myapp.pot
... translate new / changed texts in myapp.po ...
msgfmt locale/de/LC_MESSAGES/myapp.po --output-file=locale/de/LC_MESSAGES/myapp.mo
msgmerge
doesn't delete any translation: If a string is no longer
found in the source code, it is marked as deleted in the PO file:
#~ msgid "Draft"
#~ msgstr "Entwurf"
New strings are added to the PO file, for unchanged strings the source code line number, and the extracted comment are updated, if necessary.
msgmerge
also tries to detect changed strings using a fuzzy string comparison.
If only a small part of the string has changed the new string is saved together
with the old translation in the PO file, and the entry is marked as "fuzzy".
Example: If you change "Translation of PHP pages with gettext" to "Translate your PHP pages with gettext" you will get this PO file entry. The translation is outdated and should be updated in the next step:
#: index.php:17
#, fuzzy
msgid "Translate your PHP pages with gettext"
msgstr "Übersetzen von PHP-Seiten mit gettext"
If msgmerge
cannot detect a similarity between old and new string,
the old one is marked as deleted, and the new one is added as new entry,
without translation.