gh-87281: Improve documentation for locale.setlocale() and locale.getlocale() #137313

Open
wants to merge 6 commits into
base: main
84 changes: 74 additions & 10 deletions Doc/library/locale.rst
@@ -34,12 +34,17 @@ The :mod:`locale` module defines the following exception and functions:

If *locale* is given and not ``None``, :func:`setlocale` modifies the locale
setting for the *category*. The available categories are listed in the data
-description below. *locale* may be a string, or an iterable of two strings
-(language code and encoding). If it's an iterable, it's converted to a locale
-name using the locale aliasing engine. An empty string specifies the user's
+description below. *locale* may be a :ref:`string <locale_name>` or a pair of
+language code and encoding. An empty string specifies the user's
default settings. If the modification of the locale fails, the exception
:exc:`Error` is raised. If successful, the new locale setting is returned.

+If *locale* is a pair, it is converted to a locale name using
+the locale aliasing engine.
+The language code has the same format as a :ref:`locale name <locale_name>`,
+but without the encoding and ``@``-modifier.
+Either or both of the language code and encoding can be ``None``.

If *locale* is omitted or ``None``, the current setting for *category* is
returned.
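
A minimal sketch of the calls this hunk describes, restricted to the "C" locale because it is the only one stated to exist on every platform. The variable names and the save-and-restore pattern are illustrative assumptions, not part of the patch; the pair ``(None, None)`` is assumed to select the "C" locale, as the patched text states elsewhere.

```python
import locale

# With *locale* omitted, setlocale() only queries: it returns the current
# setting for the category as a string and changes nothing.
saved = locale.setlocale(locale.LC_CTYPE)

try:
    # *locale* given as a string (a locale name).
    from_string = locale.setlocale(locale.LC_CTYPE, "C")

    # *locale* given as a pair of language code and encoding. Both elements
    # may be None; the pair is converted to a locale name before it is set.
    from_pair = locale.setlocale(locale.LC_CTYPE, (None, None))
finally:
    # The locale is process-wide, so put the previous setting back.
    locale.setlocale(locale.LC_CTYPE, saved)

print(from_string, from_pair)
```

A locale name that the platform does not support raises `locale.Error` instead of returning, so real code would usually wrap the call in `try`/`except locale.Error`.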

@@ -345,22 +350,26 @@ The :mod:`locale` module defines the following exception and functions:
``'LANG'``. The GNU gettext search path contains ``'LC_ALL'``,
``'LC_CTYPE'``, ``'LANG'`` and ``'LANGUAGE'``, in that order.

-Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
-*language code* and *encoding* may be ``None`` if their values cannot be
+The language code has the same format as a :ref:`locale name <locale_name>`,
+but without the encoding and ``@``-modifier.
+The language code and encoding may be ``None`` if their values cannot be
determined.
+The "C" locale is represented as ``(None, None)``.

.. deprecated-removed:: 3.11 3.15


.. function:: getlocale(category=LC_CTYPE)

-Returns the current setting for the given locale category as sequence containing
-*language code*, *encoding*. *category* may be one of the :const:`!LC_\*` values
-except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
+Returns the current setting for the given locale category as a tuple containing
+the language code and encoding. *category* may be one of the :const:`!LC_\*`
+values except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.

-Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
-*language code* and *encoding* may be ``None`` if their values cannot be
+The language code has the same format as a :ref:`locale name <locale_name>`,
+but without the encoding and ``@``-modifier.
+The language code and encoding may be ``None`` if their values cannot be
determined.
+The "C" locale is represented as ``(None, None)``.
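
To illustrate the return shape described for `getlocale()`, the sketch below switches `LC_CTYPE` to the "C" locale and reads it back. It is only an illustration under that one assumption (the "C" locale); what `getlocale()` returns for other locales depends on the platform and is not shown.

```python
import locale

saved = locale.setlocale(locale.LC_CTYPE)
try:
    locale.setlocale(locale.LC_CTYPE, "C")
    # getlocale() returns a (language code, encoding) tuple; the category
    # argument defaults to LC_CTYPE.
    language, encoding = locale.getlocale()
finally:
    # Restore the previous process-wide setting.
    locale.setlocale(locale.LC_CTYPE, saved)

print(language, encoding)
```

Under the "C" locale both elements are `None`, so callers that format or compare the result should handle `None` explicitly rather than assume two strings.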


.. function:: getpreferredencoding(do_setlocale=True)
@@ -615,6 +624,61 @@ whose high bit is set (i.e., non-ASCII bytes) are never converted or considered
part of a character class such as letter or whitespace.


.. _locale_name:

Locale names
------------

The format of the locale name is platform dependent, and the set of supported
locales can depend on the system configuration.

On POSIX platforms, it usually has the format [1]_:

.. productionlist:: locale_name
: language ["_" territory] ["." charset] ["@" modifier]

where *language* is a two- or three-letter language code from `ISO 639`_,
*territory* is a two-letter country or region code from `ISO 3166`_,
*charset* is a locale encoding, and *modifier* is a script name,
a language subtag, a sort order identifier, or other locale modifier
(for example, "latin", "valencia", "stroke" and "euro").

On Windows, several formats are supported. [2]_ [3]_
The first is a subset of `IETF BCP 47`_ tags:

.. productionlist:: locale_name
: language ["-" script] ["-" territory] ["." charset]
: language ["-" script] "-" territory "-" modifier

where *language* and *territory* have the same meaning as on POSIX,
*script* is a four-letter script code from `ISO 15924`_,
and *modifier* is a language subtag, a sort order identifier
or custom modifier (for example, "valencia", "stroke" or "x-python").
Both hyphen (``'-'``) and underscore (``'_'``) separators are supported.
Only UTF-8 encoding is allowed for BCP 47 tags.

Windows also supports locale names in the format:

.. productionlist:: locale_name
: language ["_" territory] ["." charset]

where *language* and *territory* are full names, such as "English" and
"United States", and *charset* is either a code page number (for example, "1252")
or UTF-8.
Only the underscore separator is supported in this format.

The "C" locale is supported on all platforms.

.. _ISO 639: https://www.iso.org/iso-639-language-code
.. _ISO 3166: https://www.iso.org/iso-3166-country-codes.html
.. _IETF BCP 47: https://www.rfc-editor.org/info/bcp47
.. _ISO 15924: https://www.unicode.org/iso15924/

.. [1] `IEEE Std 1003.1-2024; 8.2 Internationalization Variables <https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap08.html#tag_08_02>`_
.. [2] `UCRT Locale names, Languages, and Country/Region strings <https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings>`_
.. [3] `Locale Names <https://learn.microsoft.com/en-us/windows/win32/intl/locale-names>`_
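
The POSIX production in this section, ``language ["_" territory] ["." charset] ["@" modifier]``, can be read as a small regular expression. The sketch below is an illustration only: `PosixLocaleName` and `split_posix_locale_name` are hypothetical names, not part of the :mod:`locale` module, and the pattern follows the letter counts given in the text rather than anything a particular platform enforces.

```python
import re
from typing import NamedTuple, Optional


class PosixLocaleName(NamedTuple):
    """Components of a POSIX-style locale name (hypothetical helper type)."""
    language: str
    territory: Optional[str]
    charset: Optional[str]
    modifier: Optional[str]


# language ["_" territory] ["." charset] ["@" modifier]
_POSIX_NAME = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"       # two- or three-letter language code
    r"(?:_(?P<territory>[A-Za-z]{2}))?"  # optional two-letter territory
    r"(?:\.(?P<charset>[^@]+))?"         # optional charset, up to any "@"
    r"(?:@(?P<modifier>.+))?"            # optional modifier
)


def split_posix_locale_name(name: str) -> Optional[PosixLocaleName]:
    """Split *name* into its components, or return None if it does not match.

    Names outside the production, such as "C", are not matched.
    """
    match = _POSIX_NAME.fullmatch(name)
    if match is None:
        return None
    return PosixLocaleName(**match.groupdict())


print(split_posix_locale_name("ca_ES.UTF-8@valencia"))
print(split_posix_locale_name("sr@latin"))
```

Matching this pattern says nothing about whether the system actually provides the locale; only `setlocale()` can answer that.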


.. _embedding-locale:

For extension writers and programs that embed Python