path: root/util/locale_database/enumdata.py
Commit message | Author | Age | Files | Lines
* Include new languages for CLDR v48 | Edward Welbourne | 2025-11-18 | 1 | -0/+42
  Include two new languages, Ladin and Shan, and document the various languages and scripts that show up in the cldr2qlocalexml.py output, that I have checked and seen to contain inadequate information. This may make it easier for future updaters to spot new unknown codes when they show up.

  These are not picked back to past versions because they're naturally documented as [since 6.11] and picking would involve each past branch getting a minor version as its since.

  Fixes: QTBUG-141949
  Change-Id: If0cb3e3b33cd3ce636fd29e904a9ddd617940314
  Reviewed-by: Mate Barany <mate.barany@qt.io>
* Update CLDR to v46 | Mate Barany | 2025-01-06 | 1 | -1/+4
  New languages added with v46
  - Kara-Kalpak
  - Swampy Cree

  Several new Chinese-language locales have been added, including one using Latin script, which invalidated some prior QLocale tests, which have been adjusted to fit.

  Some obsolete time-zone identifiers are now treated as deprecated aliases. These have lost their AnyTerritory association, implying changes to QTimeZone tests.

  Many redundant likely sub-tag rules for unspecified language have been dropped, in favor of simpler rules.

  [ChangeLog][Third-Party Code] Updated CLDR data, used by QLocale, to v46.

  Task-number: QTBUG-130877
  Pick-to: 6.9 6.8
  Change-Id: I92cf210422c7759dd829a7ca2f845d20e263d25b
  Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* Update CLDR to v45, adding language Kuvi | Edward Welbourne | 2024-07-11 | 1 | -0/+2
  This was in fact present in v44, but we overlooked it somehow. The new version also fixes some inconsistencies in the data, that I reported against v44.1; in particular, Tamil no longer claims to override the root AM/PM markers (probably because it uses 24-hour time so doesn't need them).

  Add the test-file under util to the list of files containing generated content.

  [ChangeLog][Third-Party Code] Updated CLDR data, used by QLocale, to v45.

  Task-number: QTBUG-126060
  Pick-to: 6.8 6.7 6.5 6.2
  Change-Id: I81a5bcca49519b55091fc541de6b73b606661bb4
  Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Fix spacing inconsistencies brought to light by flake8 | Edward Welbourne | 2024-04-23 | 1 | -1/+1
  It has many grumbles about spacing, but at least this code is currently consistent about its departure from PEP8's spacing rules (and closer to Qt's) for the present. We can review whether to do a drastic spacing revolution later.

  Change-Id: Ife4e8a5b02b63434bd9c7ac7ba4cbc11b6311f9f
  Reviewed-by: Mate Barany <mate.barany@qt.io>
* Rework enumdata.py's comments | Edward Welbourne | 2024-03-18 | 1 | -28/+54
  Turn the large comment at the start into a doc-string and add some more details to it. Fix the Ivory Coast comment's indent and a typo in it.

  Change-Id: I36b4e5094d3c3d5c5b91809424b424bcac5daafa
  Reviewed-by: Friedemann Kleint <Friedemann.Kleint@qt.io>
* Update QLocale and calendar data to CLDR v44.1 | Edward Welbourne | 2024-02-02 | 1 | -1/+6
  (This turns out to be identical to v44, for our purposes.)

  The CLDR license has been revised at v44 to "UNICODE LICENSE V3", which is now included (as LICENSES/UNICODE-3.0.txt) in addition to the old license (still in use, presumably, by UCD - at least until its next update).

  Some new QLocale::Language entries are needed. There is no change to the time-zone data.

  Some tests needed changes:
  * Various Arabic locales now use U+0623 (Arabic letter aleph with hamza above) in exponent separator, replacing plain U+0627 (Arabic letter aleph); it is still followed by U+0633 (Arabic letter seen).
  * Where likely sub-tags used to fill in world, 001, as territory for a language, they now (e.g. for Prussian and Yiddish) give specific countries.
  * Tamil locales now have something of a mix of inherited and localized forms for AM/PM, which looks a lot like a mistake in CLDR.
  * New likely sub-tag rules fix ctor(und_US) and ctor(und_GB), which previously failed.

  [ChangeLog][Third-Party Code] Updated QLocale's data extracted from the Unicode Common Locale Data Repository (CLDR) to v44.1. The license changed to Unicode License V3.

  Pick-to: 6.7 6.6 6.5
  Fixes: QTBUG-121485
  Task-number: QTBUG-121325
  Change-Id: Ide1a68016129526d7a5aa3fc67f1a674858696bc
  Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
  Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
* Change enumdata.py names so comments read more naturally | Edward Welbourne | 2023-08-09 | 1 | -13/+13
  Now that the "and" is only seen in enumdata.py and comments, we can s/And/and/ in all the various territory names that used it.

  Change-Id: Ic376d5904b6f5ab54931f96230c1dd5b7f357b8d
  Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
* Revise enumdata.py's names to more closely match CLDR's | Edward Welbourne | 2023-08-09 | 1 | -12/+12
  We could already use dashes in some, rather than spaces, and now no longer need to capitalize each word. This changes the *_name_list[] entries for affected languages to more closely match what CLDR gives as their names. It also amends various comments.

  Added tests for the QLocale::*ToString() functions to cover the entries changed.

  Task-number: QTBUG-94460
  Change-Id: I0163795cb282881f15a97be00a5311c1936c3a09
  Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Move enum-name-munging from LocaleHeaderWriter to QLocaleXmlReader | Edward Welbourne | 2023-08-09 | 1 | -10/+14
  The former needed the latter's .dupes to do the job, so can now just take a method as a tool to do the job instead, letting .dupes become private. In the process refine the munging to free enumdata.py from having to capitalize each word in its names. This will, in due course, let us use more natural forms in various comments. This causes no change to generated data.

  Update enumdata.py's introduction doc, mainly to reflect this but also fixing the out-of-date names (old *_list have long been *_map) and adding some details to other paragraphs.

  Task-number: QTBUG-94460
  Change-Id: If195b2e94a53a495fc4f1f216bed07a910439fa7
  Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
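The name munging described in the entry above can be sketched roughly as follows. This is a minimal illustration, assuming the enum member is formed by upper-casing the first letter of each word and dropping spaces and dashes; the function name `enum_name` is hypothetical, and the duplicate-name handling the commit mentions (`.dupes`) is left out.

```python
def enum_name(name):
    """Hypothetical helper: squash a natural-language name into CamelCase."""
    # Treat dashes like spaces, so "Kara-Kalpak" splits into two words.
    words = name.replace('-', ' ').split()
    # Upper-case only the first letter; the rest of each word is kept as-is.
    return ''.join(word[0].upper() + word[1:] for word in words)


print(enum_name('Antigua and Barbuda'))  # AntiguaAndBarbuda
print(enum_name('Kara-Kalpak'))          # KaraKalpak
```

With munging of this kind done by the reader, the data file can hold the lower-case "and" and dashed forms that the two later entries above introduce.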
* Add new languages and a script for CLDR v43 | Edward Welbourne | 2023-08-01 | 1 | -0/+10
  Also add a comment to check the locales new additions enable do have substantial data. Some of those added in the past are more or less stubs, for all that they're officially present.

  Task-number: QTBUG-111550
  Change-Id: I04d46ee96303ecec56c056a0deff6a9457b863e9
  Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
  Reviewed-by: Mate Barany <mate.barany@qt.io>
* Correct enumdata.py's new-at-v42 entries to match generated data | Edward Welbourne | 2023-08-01 | 1 | -6/+6
  Amends commit 9a8b9473d5f0fd4639193481ba9b344d91f3f00a - apparently the enumdata.py entries were tidied up after the data had been generated, leading to them being inconsistent (and I missed that in review). That, in turn, meant the next update would have changed the public API enum members, backwards-incompatibly; so make enumdata.py consistent with the released public API. We'll be tidying the order up at Qt 7 in any case.

  Task-number: QTBUG-110333
  Change-Id: I3eed2924ce8b69deb552e923d9b0dc142c5f3a65
  Reviewed-by: Konrad Kujawa <konrad.kujawa@qt.io>
  Reviewed-by: Mate Barany <mate.barany@qt.io>
  Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
* Update CLDR to v42 | Mate Barany | 2023-02-07 | 1 | -1/+10
  New languages (and one locale for each) added with v42
  - Haryanvi
  - Moksha
  - Northern Frisian
  - Obolo
  - Pijin
  - Rajasthani
  - Toki Pona

  It also appears that Canada has changed its date format. Modify the relevant test case to reflect this change.

  Task-number: QTBUG-110333
  Pick-to: 6.5
  Change-Id: Ia8975c2866cd54c9e565543d05bacd52f4987909
  Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
  Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* Use SPDX license identifiers | Lucie Gérard | 2022-05-16 | 1 | -28/+2
  Replace the current license disclaimer in files by a SPDX-License-Identifier. Files that have to be modified by hand are modified. License files are organized under LICENSES directory.

  Task-number: QTBUG-67283
  Change-Id: Id880c92784c40f3bbde861c0d93f58151c18b9f1
  Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
  Reviewed-by: Lars Knoll <lars.knoll@qt.io>
  Reviewed-by: Jörg Bornemann <joerg.bornemann@qt.io>
* QLocale: Add support for Kaingang and Nheengatu languages | Ievgenii Meshcheriakov | 2021-11-10 | 1 | -0/+2
  Update the locale generation script to support Kaingang and Nheengatu languages. These are new in CLDR v40. Regenerate the locale data.

  Task-number: QTBUG-94358
  Change-Id: I5195d5161d8c4d9f17129bbcfde39dfd3fcf1cd5
  Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
* Nomenclature change: s/countr/territor/g in locale scripts | Edward Welbourne | 2021-05-26 | 1 | -4/+4
  Change the nomenclature used in the scripts and the QLocaleXML data format to use "territory" and "territories" in place of "country" and "countries". Does not change the generated source files.

  Change-Id: I4b208d8d01ad2bfc70d289fa6551f7e0355df5ef
  Reviewed-by: JiDe Zhang <zhangjide@uniontech.com>
  Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Rename util/locale_database/enumdata.py's various *_list to *_map | Edward Welbourne | 2021-05-26 | 1 | -4/+4
  These variables provide mappings, not lists, so name them non-deceptively.

  Change-Id: Idf15e78ad73790bc86dd8b9d4f248d1c4f73993c
  Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
  Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
  Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
* Remove unused functions from enumdata.py | Edward Welbourne | 2021-05-18 | 1 | -24/+0
  It's now a data-only module. The callers of its code-to-ID functions have, for some time now, been rearranging its mappings to get at data efficiently.

  Change-Id: Ia16dcaa767203cdf3b81a96bd51793491ad41563
  Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
  Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
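To illustrate what "rearranging its mappings" might look like on the caller's side, here is a minimal sketch. It assumes each map has the shape numeric enum value -> (name, code); the entries and the names `sample_map` and `id_by_code` are made up for the example and are not taken from enumdata.py.

```python
# Illustrative data only, in the assumed shape of an enumdata.py map:
# numeric enum value -> (name, code).
sample_map = {
    1: ("Example Language", "xa"),
    2: ("Another Language", "xb"),
}

# Invert the mapping once, so that each later code-to-ID lookup is a single
# dict access rather than a scan over every entry.
id_by_code = {code: key for key, (name, code) in sample_map.items()}

print(id_by_code["xb"])  # 2
```

A caller that builds such an inverted dict up front has no need for a lookup function in the data module, which fits the module becoming data-only.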
* Add the "Territory" enumerated type for QLocale | JiDe Zhang | 2021-04-15 | 1 | -1/+7
  The use of "Country" is misleading as some entries in the enumeration are not countries (eg, HongKong), for all that most are. The Unicode Consortium's Common Locale Data Repository (CLDR, from which QLocale's data is taken) calls these territories, so introduce territory-based names and prepare to deprecate the country-based ones in due course.

  [ChangeLog][QtCore][QLocale] QLocale now has Territory as an alias for its Country enumeration, and associated territory-based names to match its country-named methods, to better match the usage in relevant standards. The country-based names shall in due course be deprecated in favor of the territory-based names.

  Fixes: QTBUG-91686
  Change-Id: Ia1ae1ad7323867016186fb775c9600cd5113aa42
  Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
  Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Add a note explaining what a macrolanguage is | Edward Welbourne | 2020-11-24 | 1 | -0/+5
  The comments in enumdata.py indicating macrolanguages meant nothing to me, until I stumbled on a reference that led me to ISO 639's usage of the term. Add a minimal explanation to save such confusion for others.

  Change-Id: Ia1d849d93a1d94c04c8c461debdecf879e9a7db5
  Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Reorder locale enums alphabetically | Edward Welbourne | 2020-11-08 | 1 | -733/+740
  Binary-incompatible change: change the numeric values of QLocale's Language, Script and Country enums, as encouraged by a comment in the generator script enumdata.py and clarify documentation around that.

  In the process (since I was changing almost every line anyway), convert the dictionary values from (mutable) lists of length two to tuples, since they are (and should be) immutable data.

  Change-Id: I26222bce45b9f5074b1d81ed70015a75ac34adcd
  Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
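The point about lists versus tuples can be shown with a small sketch. The entries and the helper `try_rename` below are hypothetical; only the list-to-tuple distinction comes from the commit message.

```python
# The same illustrative entry, first as a list and then as a tuple.
as_list = {1: ["Example Language", "xa"]}
as_tuple = {1: ("Example Language", "xa")}


def try_rename(entry, new_name):
    """Attempt to overwrite the name in place; report whether it succeeded."""
    try:
        entry[0] = new_name
    except TypeError:
        # Tuples do not support item assignment.
        return False
    return True


print(try_rename(as_list[1], "Oops"))   # True: the shared data was altered
print(try_rename(as_tuple[1], "Oops"))  # False: the tuple is unchanged
```

With tuples, a script that imports the data cannot alter an entry in place by accident; it would have to replace the whole value in the dict.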
* Use newer names for various languages, territories and scripts | Edward Welbourne | 2020-11-08 | 1 | -32/+64
  Our enumdata.py namings of countries had fallen somewhat out of sync with CLDR's names. In the process, support including hyphenation in the unsquashed name, along with spacing. Distinguish, in comments, between older renamings and those first seen in Qt6.

  Change-Id: I91ec444bf35222ab6a9332e389ace19cca0e4fdf
  Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
* Purge deprecated language and country codes from QLocale | Edward Welbourne | 2020-10-29 | 1 | -52/+0
  Requires subsequent re-numbering of the enum tables to eliminate gaps, before locale data can be regenerated. However, it will work with the present locale data, since it merely loses the means to use some names for which the available data was just the name and code.

  This implies a transient issue of recognising some codes for which there is no actual enum member; but relevant code will work as before, finding nothing but the code and its name. This shall be resolved by a coming BiC change to resort the language, country and script codes, changing the numbering (almost) completely.

  [ChangeLog][QtCore][QLocale] Various obsolete language and country codes have been removed. Some lacked locale data, others were obsolete aliases. All have been deprecated in 5.15.

  Task-number: QTBUG-84669
  Change-Id: I45fc76a5f2f6c3b0ea3c1bb61e917da984183783
  Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
* Update CLDR to v37, adding Nigerian Pidgin as a new language | Edward Welbourne | 2020-10-26 | 1 | -1/+9
  Routine update by running scripts, ignoring clang-format's extensive grumbles. Added notes to util/locale_database/'s README, on the need for that, and enumdata.py, on when to add entries.

  As usual, several new locales are also added, for existing languages, territories and scripts.

  [ChangeLog][QtCore][QLocale] Updated to new version of CLDR (the Unicode Consortium's Common Locale Data Repository) v37.

  Fixes: QTBUG-84669
  Pick-to: 5.15
  Change-Id: Ib76848bf4bd1219180faf46820077e8d8049a4e3
  Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
* Update CLDR to v36 | Edward Welbourne | 2019-10-25 | 1 | -1/+4
  Released on October 4th. Adds Windows names for two time zones, Qyzylorda and Volgograd. Added languages Chickasaw (cic), Muscogee (mus) and Silesian (szl).

  Norwegian number formatting has flipped back to using colon rather than dot as time separator; it's flipped back and forth over the last several CLDR releases. The dot form is present as a variant, the colon form was long given as the normal pattern, then went away; but now it's back as a contributed draft and that's what we pick up.

  The MS-Win time-zone ID script was iterating a dict, causing random reshuffling when new entries are added. Fixed that by doing the critical iteration in sorted order.

  Omitted locales ccp_BD and ccp_IN due to QTBUG-69324.

  Task-number: QTBUG-79418
  Change-Id: I43869ee1810ecc1fe876523947ddcbcddf4e550a
  Reviewed-by: Lars Knoll <lars.knoll@qt.io>
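The dict-iteration fix mentioned in the entry above follows a common pattern, sketched here under stated assumptions: the table contents and the function name `emit_rows` are illustrative only and are not taken from the time-zone script. Before Python 3.7 the language did not guarantee any dict iteration order, and even with insertion order guaranteed, sorting keeps generated output independent of the order in which entries were added.

```python
def emit_rows(table):
    """Return one output row per entry, in an order that depends only on the keys."""
    # sorted() over a dict yields its keys in sorted order, so adding an
    # entry later cannot reshuffle the rows that were already present.
    return [f"{key} -> {table[key]}" for key in sorted(table)]


# Two tables with the same (illustrative) entries, inserted in different orders.
first = {"Zone B": "Area/Two", "Zone A": "Area/One"}
second = {"Zone A": "Area/One", "Zone B": "Area/Two"}

print(emit_rows(first) == emit_rows(second))  # True
```

Generated files that come from sorted iteration produce smaller diffs between updates, because a new entry shows up as one added row.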
* Add locale support for Cebuano and Erzya languages (new in CLDR v35.1) | Edward Welbourne | 2019-05-20 | 1 | -0/+2
  Change-Id: I5d0ee7bc27eeca1c046d442b0410128ea5abbdb3
  Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
  Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
* Rename util/locale_database/ to include the e that was missing | Edward Welbourne | 2019-05-20 | 1 | -0/+878
  It was misnamed local_database, quite missing the point of its name.

  Change-Id: I73a4fdf24f53daac12304de1f443636d89afacb2
  Reviewed-by: Lars Knoll <lars.knoll@qt.io>
  Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>