postgresql.git - This is the main PostgreSQL git repository.

Age	Commit message (Collapse)	Author
35 hours	downcase_identifier(): use method table from locale provider.	Jeff Davis
	Previously, libc's tolower() was always used for lowercasing identifiers, regardless of the database locale (though only characters beyond 127 in single-byte encodings were affected). Refactor to allow each provider to supply its own implementation of identifier downcasing. For historical compatibility, when using a single-byte encoding, ICU still relies on tolower(). One minor behavior change is that, before the database default locale is initialized, it uses ASCII semantics to downcase the identifiers. Previously, it would use the postmaster's LC_CTYPE setting from the environment. While that could have some effect during GUC processing, for example, it would have been fragile to rely on the environment setting anyway. (Also, it only matters when the encoding is single-byte.) Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
3 days	Remove unused single-byte char_is_cased() API.	Jeff Davis
	https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
8 days	Add pg_iswcased().	Jeff Davis
	True if character has multiple case forms. Will be a useful multibyte-aware replacement for char_is_cased(). Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
2025-10-29	Use C11 char16_t and char32_t for Unicode code points.	Jeff Davis
	Reviewed-by: Tatsuo Ishii <ishii@postgresql.org> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/bedcc93d06203dfd89815b10f815ca2de8626e85.camel%40j-davis.com
2025-10-18	Add pg_iswxdigit(), useful for tsearch.	Jeff Davis
	Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com
2025-09-30	Make some use of anonymous unions [pg_locale_t]	Peter Eisentraut
	Make some use of anonymous unions, which are allowed as of C11, as examples and encouragement for future code, and to test compilers. This commit changes the pg_locale_t type. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/f00a9968-388e-4f8c-b5ef-5102e962d997%40eisentraut.org
2025-07-01	Remove provider field from pg_locale_t.	Jeff Davis
	The behavior of pg_locale_t is specified by methods, so a separate provider field is no longer necessary. Reviewed-by: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/2830211e1b6e6a2e26d845780b03e125281ea17b.camel%40j-davis.com
2025-07-01	Control ctype behavior internally with a method table.	Jeff Davis
	Previously, pattern matching and case mapping behavior branched based on the provider. Refactor to use a method table, which is less error-prone. This is also a step toward multiple provider versions, which we may want to support in the future. Reviewed-by: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/2830211e1b6e6a2e26d845780b03e125281ea17b.camel%40j-davis.com
2025-06-30	Remove unused #include's in src/backend/utils/adt/*	Peter Eisentraut
	Author: Aleksander Alekseev <aleksander@timescale.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAJ7c6TOowVbR-0NEvvDm6a_mag18krR0XJ2FKrc9DHXj7hFRtQ%40mail.gmail.com
2025-04-21	Fix INITCAP() word boundaries for PG_UNICODE_FAST.	Jeff Davis
	Word boundaries are based on whether a character is alphanumeric or not. For the PG_UNICODE_FAST collation, alphanumeric includes non-ASCII digits; whereas for the PG_C_UTF8 collation, it only includes digits 0-9. Pass down the right information from the pg_locale_t into initcap_wbnext to differentiate the behavior. Reported-by: Noah Misch <noah@leadboat.com> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/20250417135841.33.nmisch@google.com
2025-04-12	Harmonize function parameter names for Postgres 18.	Peter Geoghegan
	Make sure that function declarations use names that exactly match the corresponding names from function definitions in a few places. These inconsistencies were all introduced during Postgres 18 development. This commit was written with help from clang-tidy, by mechanically applying the same rules as similar clean-up commits (the earliest such commit was commit 035ce1fe).
2025-01-24	Add SQL function CASEFOLD().	Jeff Davis
	Useful for caseless matching. Similar to LOWER(), but avoids edge-case problems with using LOWER() for caseless matching. For collations that support it, CASEFOLD() handles characters with more than two case variations or multi-character case variations. Some characters may fold to uppercase. The results of case folding are also more stable across Unicode versions than LOWER() or UPPER(). Discussion: https://postgr.es/m/a1886ddfcd8f60cb3e905c93009b646b4cfb74c5.camel%40j-davis.com Reviewed-by: Ian Lawrence Barwick
2025-01-17	Support PG_UNICODE_FAST locale in the builtin collation provider.	Jeff Davis
	The PG_UNICODE_FAST locale uses code point sort order (fast, memcmp-based) combined with Unicode character semantics. The character semantics are based on Unicode full case mapping. Full case mapping can map a single codepoint to multiple codepoints, such as "ß" uppercasing to "SS". Additionally, it handles context-sensitive mappings like the "final sigma", and it uses titlecase mappings such as "ǅ" when titlecasing (rather than plain uppercase mappings). Importantly, the uppercasing of "ß" as "SS" is specifically mentioned by the SQL standard. In Postgres, UCS_BASIC uses plain ASCII semantics for case mapping and pattern matching, so if we changed it to use the PG_UNICODE_FAST locale, it would offer better compliance with the standard. For now, though, do not change the behavior of UCS_BASIC. Discussion: https://postgr.es/m/ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com Discussion: https://postgr.es/m/27bb0e52-801d-4f73-a0a4-02cfdd4a9ada@eisentraut.org Reviewed-by: Peter Eisentraut, Daniel Verite
2025-01-17	Support Unicode full case mapping and conversion.	Jeff Davis
	Generate tables from Unicode SpecialCasing.txt to support more sophisticated case mapping behavior: * support case mappings to multiple codepoints, such as "ß" uppercasing to "SS" * support conditional case mappings, such as the "final sigma" * support titlecase variants, such as "ǆ" uppercasing to "Ǆ" but titlecasing to "ǅ" Discussion: https://postgr.es/m/ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com Discussion: https://postgr.es/m/27bb0e52-801d-4f73-a0a4-02cfdd4a9ada@eisentraut.org Reviewed-by: Peter Eisentraut, Daniel Verite
2025-01-08	Move code for collation version into provider-specific files.	Jeff Davis
	Author: Andreas Karlsson Discussion: https://postgr.es/m/4548a168-62cd-457b-8d06-9ba7b985c477%40proxel.se
2025-01-01	Update copyright for 2025	Bruce Momjian
	Backpatch-through: 13
2024-12-16	Refactor string case conversion into provider-specific files.	Jeff Davis
	Create API entry points pg_strlower(), etc., that work with any provider and give the caller control over the destination buffer. Then, move provider-specific logic into pg_locale_builtin.c, pg_locale_icu.c, and pg_locale_libc.c as appropriate. Discussion: https://postgr.es/m/7aa46d77b377428058403723440862d12a8a129a.camel@j-davis.com
2024-12-03	Perform provider-specific initialization in new functions.	Jeff Davis
	Reviewed-by: Andreas Karlsson Discussion: https://postgr.es/m/4548a168-62cd-457b-8d06-9ba7b985c477@proxel.se