summaryrefslogtreecommitdiff
path: root/src/backend/optimizer/util
AgeCommit message (Collapse)Author
6 daysFix infer_arbiter_index for partitioned tablesÁlvaro Herrera
The fix for concurrent index operations in bc32a12e0db2 started considering indexes that are not yet marked indisvalid as arbiters for INSERT ON CONFLICT. For partitioned tables, this leads to including indexes that may not exist in partitions, causing a trivially reproducible "invalid arbiter index list" error to be thrown because of failure to match the index. To fix, it suffices to ignore !indisvalid indexes on partitioned tables. There should be no risk that the set of indexes will change for concurrent transactions, because in order for such an index to be marked valid, an ALTER INDEX ATTACH PARTITION must run which requires AccessExclusiveLock. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/17622f79-117a-4a44-aa8e-0374e53faaf0%40gmail.com
8 daysUse palloc_object() and palloc_array() in backend codeMichael Paquier
The idea is to encourage more the use of these new routines across the tree, as these offer stronger type safety guarantees than palloc(). This batch of changes includes most of the trivial changes suggested by the author for src/backend/. A total of 334 files are updated here. Among these files, 48 of them have their build change slightly; these are caused by line number changes as the new allocation formulas are simpler, shaving around 100 lines of code in total. Similar work has been done in 0c3c5c3b06a3 and 31d3847a37be. Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com
8 daysFix const-simplification for index expressions and predicateRichard Guo
Similar to the issue with constraint and statistics expressions fixed in 317c117d6, index expressions and predicate can also suffer from incorrect reduction of NullTest clauses during const-simplification, due to unfixed varnos and the use of a NULL root. It has been reported that this issue can cause the planner to fail to pick up a partial index that it previously matched successfully. Because we need to cache the const-simplified index expressions and predicate in the relcache entry, we cannot fix the Vars before applying eval_const_expressions. To ensure proper reduction of NullTest clauses, this patch runs eval_const_expressions a second time -- after the Vars have been fixed and with a valid root. It could be argued that the additional call to eval_const_expressions might increase planning time, but I don't think that's a concern. It only runs when index expressions and predicate are present; it is relatively cheap when run on small expression trees (which is typically the case for index expressions and predicate), and it runs on expressions that have already been const-simplified once, making the second pass even cheaper. In return, in cases like the one reported, it allows the planner to match and use partial indexes, which can lead to significant execution-time improvements. Bug: #19007 Reported-by: Bryan Fox <bryfox@gmail.com> Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/19007-4cc6e252ed8aa54a@postgresql.org
2025-12-02Remove doc and code comments about ON CONFLICT deficienciesÁlvaro Herrera
They have been fixed, so we don't need this text anymore. This reverts commit 8b18ed6dfbb8. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Discussion: https://postgr.es/m/CADzfLwWo+FV9WSeOah9F1r=4haa6eay1hNvYYy_WfziJeK+aLQ@mail.gmail.com
2025-12-02Remove useless casting to same typePeter Eisentraut
This removes some casts where the input already has the same type as the type specified by the cast. Their presence could cause risks of hiding actual type mismatches in the future or silently discarding qualifiers. It also improves readability. Same kind of idea as 7f798aca1d5 and ef8fe693606. (This does not change all such instances, but only those hand-picked by the author.) Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/aSQy2JawavlVlEB0%40ip-10-97-1-34.eu-west-3.compute.internal
2025-12-01Fix ON CONFLICT ON CONSTRAINT during REINDEX CONCURRENTLYÁlvaro Herrera
When REINDEX CONCURRENTLY is processing the index that supports a constraint, there are periods during which multiple indexes match the constraint index's definition. Those must all be included in the set of inferred index for INSERT ON CONFLICT, in order to avoid spurious "duplicate key" errors. To fix, we set things up to match all indexes against attributes, expressions and predicates of the constraint index, then return all indexes that match those, rather than just the one constraint index. This is more onerous than before, where we would just test the named constraint for validity, but it's not more onerous than processing "conventional" inference (where a list of attribute names etc is given). This is closely related to the misbehaviors fixed by bc32a12e0db2, for a different situation. We're not backpatching this one for now either, for the same reasons. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CANtu0ojXmqjmEzp-=aJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg@mail.gmail.com
2025-11-29Don't call simplify_aggref with a NULL PlannerInfoDavid Rowley
42473b3b3 added prosupport infrastructure to allow simplification of Aggrefs during constant-folding. In some cases the context->root that's given to eval_const_expressions_mutator() can be NULL. 42473b3b3 failed to take that into account, which could result in a crash. To fix, add a check and only call simplify_aggref() when the PlannerInfo is set. Author: David Rowley <dgrowleyml@gmail.com> Reported-by: Birler, Altan <altan.birler@tum.de> Discussion: https://postgr.es/m/132d4da23b844d5ab9e352d34096eab5@tum.de
2025-11-27Add parallelism support for TID Range ScansDavid Rowley
In v14, bb437f995 added support for scanning for ranges of TIDs using a dedicated executor node for the purpose. Here, we allow these scans to be parallelized. The range of blocks to scan is divvied up similarly to how a Parallel Seq Scans does that, where 'chunks' of blocks are allocated to each worker and the size of those chunks is slowly reduced down to 1 block per worker by the time we're nearing the end of the scan. Doing that means workers finish at roughly the same time. Allowing TID Range Scans to be parallelized removes the dilemma from the planner as to whether a Parallel Seq Scan will cost less than a non-parallel TID Range Scan due to the CPU concurrency of the Seq Scan (disk costs are not divided by the number of workers). It was possible the planner could choose the Parallel Seq Scan which would result in reading additional blocks during execution than the TID Scan would have. Allowing Parallel TID Range Scans removes the trade-off the planner makes when choosing between reduced CPU costs due to parallelism vs additional I/O from the Parallel Seq Scan due to it scanning blocks from outside of the required TID range. There is also, of course, the traditional parallelism performance benefits to be gained as well, which likely doesn't need to be explained here. Author: Cary Huang <cary.huang@highgo.ca> Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com> Reviewed-by: Steven Niu <niushiji@gmail.com> Discussion: https://postgr.es/m/18f2c002a24.11bc2ab825151706.3749144144619388582@highgo.ca
2025-11-26Have the planner replace COUNT(ANY) with COUNT(*), when possibleDavid Rowley
This adds SupportRequestSimplifyAggref to allow pg_proc.prosupport functions to receive an Aggref and allow them to determine if there is a way that the Aggref call can be optimized. Also added is a support function to allow transformation of COUNT(ANY) into COUNT(*). This is possible to do when the given "ANY" cannot be NULL and also that there are no ORDER BY / DISTINCT clauses within the Aggref. This is a useful transformation to do as it is common that people write COUNT(1), which until now has added unneeded overhead. When counting a NOT NULL column. The overheads can be worse as that might mean deforming more of the tuple, which for large fact tables may be many columns in. It may be possible to add prosupport functions for other aggregates. We could consider if ORDER BY could be dropped for some calls, e.g. the ORDER BY is quite useless in MAX(c ORDER BY c). There is a little bit of passing fallout from adjusting expr_is_nonnullable() to handle Const which results in a plan change in the aggregates.out regression test. Previously, nothing was able to determine that "One-Time Filter: (100 IS NOT NULL)" was always true, therefore useless to include in the plan. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/CAApHDvqGcPTagXpKfH=CrmHBqALpziThJEDs_MrPqjKVeDF9wA@mail.gmail.com
2025-11-24Fix infer_arbiter_index during concurrent index operationsÁlvaro Herrera
Previously, we would only consider indexes marked indisvalid as usable for INSERT ON CONFLICT. But that's problematic during CREATE INDEX CONCURRENTLY and REINDEX CONCURRENTLY, because concurrent transactions would end up with inconsistents lists of inferred indexes, leading to deadlocks and spurious errors about unique key violations (because two transactions are operating on different indexes for the speculative insertion tokens). Change this function to return indexes even if invalid. This fixes the spurious errors and deadlocks. Because such indexes might not be complete, we still need uniqueness to be verified in a different way. We do that by requiring that at least one index marked valid is part of the set of indexes returned. It is that index that is going to help ensure that the inserted tuple is indeed unique. This does not fix similar problems occurring with partitioned tables or with named constraints. These problems will be fixed in follow-up commits. We have no user report of this problem, even though it exists in all branches. Because of that and given that the fix is somewhat tricky, I decided not to backpatch for now. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CANtu0ogv+6wqRzPK241jik4U95s1pW3MCZ3rX5ZqbFdUysz7Qw@mail.gmail.com
2025-11-23Add SupportRequestInlineInFrom planner support request.Tom Lane
This request allows a support function to replace a function call appearing in FROM (typically a set-returning function) with an equivalent SELECT subquery. The subquery will then be subject to the planner's usual optimizations, potentially allowing a much better plan to be generated. While the planner has long done this automatically for simple SQL-language functions, it's now possible for extensions to do it for functions outside that group. Notably, this could be useful for functions that are presently implemented in PL/pgSQL and work by generating and then EXECUTE'ing a SQL query. Author: Paul A Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/09de6afa-c33d-4d94-a5cb-afc6cea0d2bb@illuminatedcomputing.com
2025-11-03Doc: cover index CONCURRENTLY causing errors in INSERT ... ON CONFLICT.Noah Misch
Author: Mikhail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/CANtu0ojXmqjmEzp-=aJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg@mail.gmail.com Backpatch-through: 13
2025-11-02Improve planner's estimates of tuple hash table sizes.Tom Lane
For several types of plan nodes that use TupleHashTables, the planner estimated the expected size of the table as basically numEntries * (MAXALIGN(dataWidth) + MAXALIGN(SizeofHeapTupleHeader)). This is pretty far off, especially for small data widths, because it doesn't account for the overhead of the simplehash.h hash table nor for any per-tuple "additional space" the plan node may request. Jeff Janes noted a case where the estimate was off by about a factor of three, even though the obvious hazards such as inaccurate estimates of numEntries or dataWidth didn't apply. To improve matters, create functions provided by the relevant executor modules that can estimate the required sizes with reasonable accuracy. (We're still not accounting for effects like allocator padding, but this at least gets the first-order effects correct.) I added functions that can estimate the tuple table sizes for nodeSetOp and nodeSubplan; these rely on an estimator for TupleHashTables in general, and that in turn relies on one for simplehash.h hash tables. That feels like kind of a lot of mechanism, but if we take any short-cuts we're violating modularity boundaries. The other places that use TupleHashTables are nodeAgg, which took pains to get its numbers right already, and nodeRecursiveunion. I did not try to improve the situation for nodeRecursiveunion because there's nothing to improve: we are not making an estimate of the hash table size, and it wouldn't help us to do so because we have no non-hashed alternative implementation. On top of that, our estimate of the number of entries to be hashed in that module is so suspect that we'd likely often choose the wrong implementation if we did have two ways to do it. Reported-by: Jeff Janes <jeff.janes@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAMkU=1zia0JfW_QR8L5xA2vpa0oqVuiapm78h=WpNsHH13_9uw@mail.gmail.com
2025-10-21Use CompactAttribute more often, when possibleDavid Rowley
5983a4cff added CompactAttribute for storing commonly used fields from FormData_pg_attribute. 5983a4cff didn't go to the trouble of adjusting every location where we can use CompactAttribute rather than FormData_pg_attribute, so here we change the remaining ones. There are some locations where I've left the code using FormData_pg_attribute. These are mostly in the ALTER TABLE code. Using CompactAttribute here seems more risky as often the TupleDesc is being changed and those changes may not have been flushed to the CompactAttribute yet. I've also left record_recv(), record_send(), record_cmp(), record_eq() and record_image_eq() alone as it's not clear to me that accessing the CompactAttribute is a win here due to the FormData_pg_attribute still having to be accessed for most cases. Switching the relevant parts to use CompactAttribute would result in having to access both for common cases. Careful benchmarking may reveal that something can be done to make this better, but in absence of that, the safer option is to leave these alone. In ReorderBufferToastReplace(), there was a check to skip attnums < 0 while looping over the TupleDesc. Doing this is redundant since TupleDescs don't store < 0 attnums. Removing that code allows us to move to using CompactAttribute. The change in validateDomainCheckConstraint() just moves fetching the FormData_pg_attribute into the ERROR path, which is cold due to calling errstart_cold() and results in code being moved out of the common path. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAApHDvrMy90o1Lgkt31F82tcSuwRFHq3vyGewSRN=-QuSEEvyQ@mail.gmail.com
2025-10-16Remove partColsUpdated.Nathan Bossart
This information appears to have been unused since commit c5b7ba4e67. We could not find any references in third-party code, either. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aO_CyFRpbVMtgJWM%40nathan
2025-10-14Rename apply_at to apply_agg_at for clarityRichard Guo
The field name "apply_at" in RelAggInfo was a bit ambiguous. Rename it to "apply_agg_at" to improve clarity and make its purpose clearer. Per complaint from David Rowley, Robert Haas. Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA+TgmoZ0KR2_XCWHy17=HHcQ3p2Mamc9c6Dnnhf1J6wPYFD9ng@mail.gmail.com
2025-10-08Implement Eager AggregationRichard Guo
Eager aggregation is a query optimization technique that partially pushes aggregation past a join, and finalizes it once all the relations are joined. Eager aggregation may reduce the number of input rows to the join and thus could result in a better overall plan. In the current planner architecture, the separation between the scan/join planning phase and the post-scan/join phase means that aggregation steps are not visible when constructing the join tree, limiting the planner's ability to exploit aggregation-aware optimizations. To implement eager aggregation, we collect information about aggregate functions in the targetlist and HAVING clause, along with grouping expressions from the GROUP BY clause, and store it in the PlannerInfo node. During the scan/join planning phase, this information is used to evaluate each base or join relation to determine whether eager aggregation can be applied. If applicable, we create a separate RelOptInfo, referred to as a grouped relation, to represent the partially-aggregated version of the relation and generate grouped paths for it. Grouped relation paths can be generated in two ways. The first method involves adding sorted and hashed partial aggregation paths on top of the non-grouped paths. To limit planning time, we only consider the cheapest or suitably-sorted non-grouped paths in this step. Alternatively, grouped paths can be generated by joining a grouped relation with a non-grouped relation. Joining two grouped relations is currently not supported. To further limit planning time, we currently adopt a strategy where partial aggregation is pushed only to the lowest feasible level in the join tree where it provides a significant reduction in row count. This strategy also helps ensure that all grouped paths for the same grouped relation produce the same set of rows, which is important to support a fundamental assumption of the planner. For the partial aggregation that is pushed down to a non-aggregated relation, we need to consider all expressions from this relation that are involved in upper join clauses and include them in the grouping keys, using compatible operators. This is essential to ensure that an aggregated row from the partial aggregation matches the other side of the join if and only if each row in the partial group does. This ensures that all rows within the same partial group share the same "destiny", which is crucial for maintaining correctness. One restriction is that we cannot push partial aggregation down to a relation that is in the nullable side of an outer join, because the NULL-extended rows produced by the outer join would not be available when we perform the partial aggregation, while with a non-eager-aggregation plan these rows are available for the top-level aggregation. Pushing partial aggregation in this case may result in the rows being grouped differently than expected, or produce incorrect values from the aggregate functions. If we have generated a grouped relation for the topmost join relation, we finalize its paths at the end. The final paths will compete in the usual way with paths built from regular planning. The patch was originally proposed by Antonin Houska in 2017. This commit reworks various important aspects and rewrites most of the current code. However, the original patch and reviews were very useful. Author: Richard Guo <guofenglinux@gmail.com> Author: Antonin Houska <ah@cybertec.at> (in an older version) Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> (in an older version) Reviewed-by: Andy Fan <zhihuifan1213@163.com> (in an older version) Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> (in an older version) Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
2025-10-07Allow private state in certain planner data structures.Robert Haas
Extension that make extensive use of planner hooks may want to coordinate their efforts, for example to avoid duplicate computation, but that's currently difficult because there's no really good way to pass data between different hooks. To make that easier, allow for storage of extension-managed private state in PlannerGlobal, PlannerInfo, and RelOptInfo, along very similar lines to what we have permitted for ExplainState since commit c65bc2e1d14a2d4daed7c1921ac518f2c5ac3d17. Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com
2025-10-05Don't include access/htup_details.h in executor/tuptable.hÁlvaro Herrera
This is not at all needed; I suspect it was a simple mistake in commit 5408e233f066. It causes htup_details.h to bleed into a huge number of places via execnodes.h. Remove it and fix fallout. Discussion: https://postgr.es/m/202510021240.ptc2zl5cvwen@alvherre.pgsql
2025-10-03Add IGNORE NULLS/RESPECT NULLS option to Window functions.Tatsuo Ishii
Add IGNORE NULLS/RESPECT NULLS option (null treatment clause) to lead, lag, first_value, last_value and nth_value window functions. If unspecified, the default is RESPECT NULLS which includes NULL values in any result calculation. IGNORE NULLS ignores NULL values. Built-in window functions are modified to call new API WinCheckAndInitializeNullTreatment() to indicate whether they accept IGNORE NULLS/RESPECT NULLS option or not (the API can be called by user defined window functions as well). If WinGetFuncArgInPartition's allowNullTreatment argument is true and IGNORE NULLS option is given, WinGetFuncArgInPartition() or WinGetFuncArgInFrame() will return evaluated function's argument expression on specified non NULL row (if it exists) in the partition or the frame. When IGNORE NULLS option is given, window functions need to visit and evaluate same rows over and over again to look for non null rows. To mitigate the issue, 2-bit not null information array is created while executing window functions to remember whether the row has been already evaluated to NULL or NOT NULL. If already evaluated, we could skip the evaluation work, thus we could get better performance. Author: Oliver Ford <ojford@gmail.com> Co-authored-by: Tatsuo Ishii <ishii@postgresql.org> Reviewed-by: Krasiyan Andreev <krasiyan@gmail.com> Reviewed-by: Andrew Gierth <andrew@tao11.riddles.org.uk> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Fetter <david@fetter.org> Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/flat/CAGMVOdsbtRwE_4+v8zjH1d9xfovDeQAGLkP_B6k69_VoFEgX-A@mail.gmail.com
2025-09-20Re-allow using statistics for bool-valued functions in WHERE.Tom Lane
Commit a391ff3c3, which added the ability for a function's support function to provide a custom selectivity estimate for "WHERE f(...)", unintentionally removed the possibility of applying expression statistics after finding there's no applicable support function. That happened because we no longer fell through to boolvarsel() as before. Refactor to do so again, putting the 0.3333333 default back into boolvarsel() where it had been (cf. commit 39df0f150). I surely wouldn't have made this error if 39df0f150 had included a test case, so add one now. At the time we did not have the "extended statistics" infrastructure, but we do now, and it is also unable to work in this scenario because of this error. So make use of that for the test case. This is very clearly a bug fix, but I'm afraid to put it into released branches because of the likelihood of altering plan choices, which we avoid doing in minor releases. So, master only. Reported-by: Frédéric Yhuel <frederic.yhuel@dalibo.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/a8b99dce-1bfb-4d97-af73-54a32b85c916@dalibo.com
2025-09-16Treat JsonConstructorExpr as non-strictRichard Guo
JsonConstructorExpr can produce non-NULL output with a NULL input, so it should be treated as a non-strict construct. Failing to do so can lead to incorrect query behavior. For example, in the reported case, when pulling up a subquery that is under an outer join, if the subquery's target list contains a JsonConstructorExpr that uses subquery variables and it is mistakenly treated as strict, it will be pulled up without being wrapped in a PlaceHolderVar. As a result, the expression will be evaluated at the wrong place and will not be forced to null when the outer join should do so. Back-patch to v16 where JsonConstructorExpr was introduced. Bug: #19046 Reported-by: Runyuan He <runyuan@berkeley.edu> Author: Tender Wang <tndrwang@gmail.com> Co-authored-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/19046-765b6602b0a8cfdf@postgresql.org Backpatch-through: 16
2025-09-15Expand virtual generated columns in constraint expressionsPeter Eisentraut
Virtual generated columns in constraint expressions need to be expanded because the optimizer matches these expressions to qual clauses. Failing to do so can cause us to miss opportunities for constraint exclusion. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/204804c0-798f-4c72-bd1f-36116024fda3%40eisentraut.org
2025-09-12Fix misuse of Relids for storing attribute numbersRichard Guo
The typedef Relids (Bitmapset *) is intended to represent set of relation identifiers, but was incorrectly used in several places to store sets of attribute numbers. This is my oversight in e2debb643. Fix that by replacing such usages with Bitmapset * to reflect the correct semantics. Author: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAEG8a3LJhp_xriXf39iCz0TsK+M-2biuhDhpLC6Baxw8+ZYT3A@mail.gmail.com
2025-08-30Fix const-simplification for constraints and statsRichard Guo
Constraint expressions and statistics expressions loaded from the system catalogs need to be run through const-simplification, because the planner will be comparing them to similarly-processed qual clauses. Without this step, the planner may fail to detect valid matches. Currently, NullTest clauses in these expressions may not be reduced correctly during const-simplification. This happens because their Var nodes do not yet have the correct varno when eval_const_expressions is applied. Since eval_const_expressions relies on varno to reduce NullTest quals, incorrect varno can cause problems. Additionally, for statistics expressions, eval_const_expressions is called with root set to NULL, which also inhibits NullTest reduction. This patch fixes the issue by ensuring that Vars are updated to have the correct varno before const-simplification, and that a valid root is passed to eval_const_expressions when needed. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/19007-4cc6e252ed8aa54a@postgresql.org
2025-08-29Fix possible use after free in expand_partitioned_rtentry()David Rowley
It's possible that if the only live partition is concurrently dropped and try_table_open() fails, that the bms_del_member() will pfree the live_parts Bitmapset. Since the bms_del_member() call does not assign the result back to the live_parts local variable, the while loop could segfault as that variable would still reference the pfree'd Bitmapset. Backpatch to 15. 52f3de874 was backpatched to 14, but there's no bms_del_member() there due to live_parts not yet existing in RelOptInfo in that version. Technically there's no bug in version 15 as bms_del_member() didn't pfree when the set became empty prior to 00b41463c (from v16). Applied to v15 anyway to keep the code similar and to avoid the bad coding pattern. Author: Bernd Reiß <bd_reiss@gmx.at> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/6b88f27a-c45c-4826-8e37-d61a04d90182@gmx.at Backpatch-through: 15
2025-08-19Pathify RHS unique-ification for semijoin planningRichard Guo
There are two implementation techniques for semijoins: one uses the JOIN_SEMI jointype, where the executor emits at most one matching row per left-hand side (LHS) row; the other unique-ifies the right-hand side (RHS) and then performs a plain inner join. The latter technique currently has some drawbacks related to the unique-ification step. * Only the cheapest-total path of the RHS is considered during unique-ification. This may cause us to miss some optimization opportunities; for example, a path with a better sort order might be overlooked simply because it is not the cheapest in total cost. Such a path could help avoid a sort at a higher level, potentially resulting in a cheaper overall plan. * We currently rely on heuristics to choose between hash-based and sort-based unique-ification. A better approach would be to generate paths for both methods and allow add_path() to decide which one is preferable, consistent with how path selection is handled elsewhere in the planner. * In the sort-based implementation, we currently pay no attention to the pathkeys of the input subpath or the resulting output. This can result in redundant sort nodes being added to the final plan. This patch improves semijoin planning by creating a new RelOptInfo for the RHS rel to represent its unique-ified version. It then generates multiple paths that represent elimination of distinct rows from the RHS, considering both a hash-based implementation using the cheapest total path of the original RHS rel, and sort-based implementations that either exploit presorted input paths or explicitly sort the cheapest total path. All resulting paths compete in add_path(), and those deemed worthy of consideration are added to the new RelOptInfo. Finally, the unique-ified rel is joined with the other side of the semijoin using a plain inner join. As a side effect, most of the code related to the JOIN_UNIQUE_OUTER and JOIN_UNIQUE_INNER jointypes -- used to indicate that the LHS or RHS path should be made unique -- has been removed. Besides, the T_Unique path now has the same meaning for both semijoins and upper DISTINCT clauses: it represents adjacent-duplicate removal on presorted input. This patch unifies their handling by sharing the same data structures and functions. This patch also removes the UNIQUE_PATH_NOOP related code along the way, as it is dead code -- if the RHS rel is provably unique, the semijoin should have already been simplified to a plain inner join by analyzejoins.c. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://postgr.es/m/CAMbWs4-EBnaRvEs7frTLbsXiweSTUXifsteF-d3rvv01FKO86w@mail.gmail.com
2025-08-08Disallow collecting transition tuples from child foreign tables.Etsuro Fujita
Commit 9e6104c66 disallowed transition tables on foreign tables, but failed to account for cases where a foreign table is a child table of a partitioned/inherited table on which transition tables exist, leading to incorrect transition tuples collected from such foreign tables for queries on the parent table triggering transition capture. This occurred not only for inherited UPDATE/DELETE but for partitioned INSERT later supported by commit 3d956d956, which should have handled it at least for the INSERT case, but didn't. To fix, modify ExecAR*Triggers to throw an error if the given relation is a foreign table requesting transition capture. Also, this commit fixes make_modifytable so that in case of an inherited UPDATE/DELETE triggering transition capture, FDWs choose normal operations to modify child foreign tables, not DirectModify; which is needed because they would otherwise skip the calls to ExecAR*Triggers at execution, causing unexpected behavior. Author: Etsuro Fujita <etsuro.fujita@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CAPmGK14QJYikKzBDCe3jMbpGENnQ7popFmbEgm-XTNuk55oyHg%40mail.gmail.com Backpatch-through: 13
2025-07-29Display Memoize planner estimates in EXPLAINDavid Rowley
There've been a few complaints that it can be overly difficult to figure out why the planner picked a Memoize plan. To help address that, here we adjust the EXPLAIN output to display the following additional details: 1) The estimated number of cache entries that can be stored at once 2) The estimated number of unique lookup keys that we expect to see 3) The number of lookups we expect 4) The estimated hit ratio Technically #4 can be calculated using #1, #2 and #3, but it's not a particularly obvious calculation, so we opt to display it explicitly. The original patch by Lukas Fittl only displayed the hit ratio, but there was a fear that might lead to more questions about how that was calculated. The idea with displaying all 4 is to be transparent which may allow queries to be tuned more easily. For example, if #2 isn't correct then maybe extended statistics or a manual n_distinct estimate can be used to help fix poor plan choices. Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CAP53Pky29GWAVVk3oBgKBDqhND0BRBN6yTPeguV_qSivFL5N_g%40mail.gmail.com
2025-07-22Reduce "Var IS [NOT] NULL" quals during constant foldingRichard Guo
In commit b262ad440, we introduced an optimization that reduces an IS [NOT] NULL qual on a NOT NULL column to constant true or constant false, provided we can prove that the input expression of the NullTest is not nullable by any outer joins or grouping sets. This deduction happens quite late in the planner, during the distribution of quals to rels in query_planner. However, this approach has some drawbacks: we can't perform any further folding with the constant, and it turns out to be prone to bugs. Ideally, this deduction should happen during constant folding. However, the per-relation information about which columns are defined as NOT NULL is not available at that point. This information is currently collected from catalogs when building RelOptInfos for base or "other" relations. This patch moves the collection of NOT NULL attribute information for relations before pull_up_sublinks, storing it in a hash table keyed by relation OID. It then uses this information to perform the NullTest deduction for Vars during constant folding. This also makes it possible to leverage this information to pull up NOT IN subqueries. Note that this patch does not get rid of restriction_is_always_true and restriction_is_always_false. Removing them would prevent us from reducing some IS [NOT] NULL quals that we were previously able to reduce, because (a) the self-join elimination may introduce new IS NOT NULL quals after constant folding, and (b) if some outer joins are converted to inner joins, previously irreducible NullTest quals may become reducible. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAMbWs4-bFJ1At4btk5wqbezdu8PLtQ3zv-aiaY3ry9Ymm=jgFQ@mail.gmail.com
2025-07-08Consider explicit incremental sort for Append and MergeAppendRichard Guo
For an ordered Append or MergeAppend, we need to inject an explicit sort into any subpath that is not already well enough ordered. Currently, only explicit full sorts are considered; incremental sorts are not yet taken into account. In this patch, for subpaths of an ordered Append or MergeAppend, we choose to use explicit incremental sort if it is enabled and there are presorted keys. The rationale is based on the assumption that incremental sort is always faster than full sort when there are presorted keys, a premise that has been applied in various parts of the code. In addition, the current cost model tends to favor incremental sort as being cheaper than full sort in the presence of presorted keys, making it reasonable not to consider full sort in such cases. No backpatch as this could result in plan changes. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CAMbWs4_V7a2enTR+T3pOY_YZ-FU8ZsFYym2swOz4jNMqmSgyuw@mail.gmail.com
2025-07-03Simplify COALESCE() with one surviving argument.Tom Lane
If, after removal of useless null-constant arguments, a CoalesceExpr has exactly one remaining argument, we can just take that argument as the result, without bothering to wrap a new CoalesceExpr around it. This isn't likely to produce any great improvement in runtime per se, but it can lead to better plans since the planner no longer has to treat the expression as non-strict. However, there were a few regression test cases that intentionally wrote COALESCE(x) as a shorthand way of creating a non-strict subexpression. To avoid ruining the intent of those tests, write COALESCE(x,x) instead. (If anyone ever proposes de-duplicating COALESCE arguments, we'll need another iteration of this arms race. But it seems pretty unlikely that such an optimization would be worthwhile.) Author: Maksim Milyutin <maksim.milyutin@tantorlabs.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8e8573c3-1411-448d-877e-53258b7b2be0@tantorlabs.ru
2025-06-29Fix some new issues with planning of PlaceHolderVars.Tom Lane
In the wake of commit a16ef313f, we need to deal with more cases involving PlaceHolderVars in NestLoopParams than we did before. For one thing, a16ef313f was incorrect to suppose that we could rely on the required-outer relids of the lefthand path to decide placement of nestloop-parameter PHVs. As Richard Guo argued at the time, we must look at the required-outer relids of the join path itself. For another, we have to apply replace_nestloop_params() to such a PHV's expression, in case it contains references to values that will be supplied from NestLoopParams of higher-level nestloops. For another, we need to be more careful about the phnullingrels of the PHV than we were being. identify_current_nestloop_params only bothered to ensure that the phnullingrels didn't contain "too many" relids, but now it has to be exact, because setrefs.c will apply both NRM_SUBSET and NRM_SUPERSET checks in different places. We can compute the correct relids by determining the set of outer joins that should be able to null the PHV and then subtracting whatever's been applied at or below this join. Do the same for plain Vars, too. (This should make it possible to use NRM_EQUAL to process nestloop params in setrefs.c, but I won't risk making such a change in v18 now.) Lastly, if a nestloop parameter PHV was pulled up out of a subquery and it contains a subquery that was originally pushed down from this query level, then that will still be represented as a SubLink, because SS_process_sublinks won't recurse into outer PHVs, so it didn't get transformed during expression preprocessing in the subquery. We can substitute the version of the PHV's expression appearing in its PlaceHolderInfo to ensure that that preprocessing has happened. (Seems like this processing sequence could stand to be redesigned, but again, late in v18 development is not the time for that.) It's not very clear to me why the old have_dangerous_phv join-order restriction prevented us from seeing the last three of these problems. But given the lack of field complaints, it must have done so. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18953-1c9883a9d4afeb30@postgresql.org
2025-06-20Remove planner's have_dangerous_phv() join-order restriction.Tom Lane
Commit 85e5e222b, which added (a forerunner of) this logic, argued that Adding the necessary complexity to make this work doesn't seem like it would be repaid in significantly better plans, because in cases where such a PHV exists, there is probably a corresponding join order constraint that would allow a good plan to be found without using the star-schema exception. The flaw in this claim is that there may be other join-order restrictions that prevent us from finding a join order that doesn't involve a "dangerous" PHV. In particular we now recognize that small join_collapse_limit or from_collapse_limit could prevent it. Therefore, let's bite the bullet and make the case work. We don't have to extend the executor's support for nestloop parameters as I thought at the time, because we can instead push the evaluation of the placeholder's expression into the left-hand input of the NestLoop node. So there's not really a lot of downside to this solution, and giving the planner more join-order flexibility should have value beyond just avoiding failure. Having said that, there surely is a nonzero risk of introducing new bugs. Since this failure mode escaped detection for ten years, such cases don't seem common enough to justify a lot of risk. Therefore, let's put this fix into master but leave the back branches alone (for now anyway). Bug: #18953 Reported-by: Alexander Lakhin <exclusion@gmail.com> Diagnosed-by: Richard Guo <guofenglinux@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18953-1c9883a9d4afeb30@postgresql.org
2025-05-08Track the number of presorted outer pathkeys in MergePathRichard Guo
When creating an explicit Sort node for the outer path of a mergejoin, we need to determine the number of presorted keys of the outer path to decide whether explicit incremental sort can be applied. Currently, this is done by repeatedly calling pathkeys_count_contained_in. This patch caches the number of presorted outer pathkeys in MergePath, allowing us to save several calls to pathkeys_count_contained_in. It can be considered a complement to the changes in commit 828e94c9d. Reported-by: David Rowley <dgrowleyml@gmail.com> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAApHDvqvBireB_w6x8BN5txdvBEHxVgZBt=rUnpf5ww5P_E_ww@mail.gmail.com
2025-04-07Allow NOT NULL constraints to be added as NOT VALIDÁlvaro Herrera
This allows them to be added without scanning the table, and validating them afterwards without holding access exclusive lock on the table after any violating rows have been deleted or fixed. Doing ALTER TABLE ... SET NOT NULL for a column that has an invalid not-null constraint validates that constraint. ALTER TABLE .. VALIDATE CONSTRAINT is also supported. There are various checks on whether an invalid constraint is allowed in a child table when the parent table has a valid constraint; this should match what we do for enforced/not enforced constraints. pg_attribute.attnotnull is now only an indicator for whether a not-null constraint exists for the column; whether it's valid or invalid must be queried in pg_constraint. Applications can continue to query pg_attribute.attnotnull as before, but now it's possible that NULL rows are present in the column even when that's set to true. For backend internal purposes, we cache the nullability status in CompactAttribute->attnullability that each tuple descriptor carries (replacing CompactAttribute.attnotnull, which was a mirror of Form_pg_attribute.attnotnull). During the initial tuple descriptor creation, based on the pg_attribute scan, we set this to UNRESTRICTED if pg_attribute.attnotnull is false, or to UNKNOWN if it's true; then we update the latter to VALID or INVALID depending on the pg_constraint scan. This flag is also copied when tupledescs are copied. Comparing tuple descs for equality must also compare the CompactAttribute.attnullability flag and return false in case of a mismatch. pg_dump deals with these constraints by storing the OIDs of invalid not-null constraints in a separate array, and running a query to obtain their properties. The regular table creation SQL omits them entirely. They are then dealt with in the same way as "separate" CHECK constraints, and dumped after the data has been loaded. Because no additional pg_dump infrastructure was required, we don't bump its version number. I decided not to bump catversion either, because the old catalog state works perfectly in the new world. (Trying to run with new catalog state and the old server version would likely run into issues, however.) System catalogs do not support invalid not-null constraints (because commit 14e87ffa5c54 didn't allow them to have pg_constraint rows anyway.) Author: Rushabh Lathia <rushabh.lathia@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Tested-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/CAGPqQf0KitkNack4F5CFkFi-9Dqvp29Ro=EpcWt=4_hs-Rt+bQ@mail.gmail.com
2025-04-06Relax ordering-related hardcoded btree requirements in planningPeter Eisentraut
There were several places in ordering-related planning where a requirement for btree was hardcoded but an amcanorder index could suffice. This fixes that. We just need to do the necessary mapping between strategy numbers and compare types and adjust some related APIs so that this works independent of btree strategy numbers. For instance, non-btree amcanorder indexes can now be used to support sorting and merge joins. Also, predtest.c works independent of btree strategy numbers now. To avoid performance regressions, some details on btree and other built-in index types are still hardcoded as shortcuts, but other index types now have access to the same features by providing the required flags and callbacks. Author: Mark Dilger <mark.dilger@enterprisedb.com> Co-authored-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-04-04Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriateAlexander Korotkov
This commit implements the automatic conversion of 'x IN (VALUES ...)' into ScalarArrayOpExpr. That simplifies the query tree, eliminating the appearance of an unnecessary join. Since VALUES describes a relational table, and the value of such a list is a table row, the optimizer will likely face an underestimation problem due to the inability to estimate cardinality through MCV statistics. The cardinality evaluation mechanism can work with the array inclusion check operation. If the array is small enough (< 100 elements), it will perform a statistical evaluation element by element. We perform the transformation in the convert_ANY_sublink_to_join() if VALUES RTE is proper and the transformation is convertible. The conversion is only possible for operations on scalar values, not rows. Also, we currently support the transformation only when it ends up with a constant array. Otherwise, the evaluation of non-hashed SAOP might be slower than the corresponding Hash Join with VALUES. Discussion: https://postgr.es/m/0184212d-1248-4f1f-a42d-f5cb1c1976d2%40tantorlabs.com Author: Alena Rybakina <a.rybakina@postgrespro.ru> Author: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Ivan Kush <ivan.kush@tantorlabs.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-04-04Extract make_SAOP_expr() function from match_orclause_to_indexcol()Alexander Korotkov
This commit extracts the code to generate ScalarArrayOpExpr on top of the list of expressions from match_orclause_to_indexcol() into a separate function make_SAOP_expr(). This function was extracted to be used in optimization for conversion of 'x IN (VALUES ...)' to 'x = ANY ...'. make_SAOP_expr() is placed in clauses.c file as only two additional headers were needed there compared with other places. Discussion: https://postgr.es/m/0184212d-1248-4f1f-a42d-f5cb1c1976d2%40tantorlabs.com Author: Alena Rybakina <a.rybakina@postgrespro.ru> Author: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Ivan Kush <ivan.kush@tantorlabs.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-04-03Remove duplicated comment in get_relation_constraintsRichard Guo
The check for non-inheritable constraints is performed later, and the same comment is included at that point. While we're here, remove one extraneous blank line. Author: jian he <jian.universality@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CACJufxETi6x86S8EkH8mRfOcm2AenoE9t1pyCFVMpU34gVhF3w@mail.gmail.com
2025-04-02Change SQL-language functions to use the plan cache.Tom Lane
In the historical implementation of SQL functions (if they don't get inlined), we built plans for all the contained queries at first call within an outer query, and then re-used those plans for the duration of the outer query, and then forgot everything. This was not ideal, not least because the plans could not be customized to specific values of the function's parameters. Our plancache infrastructure seems mature enough to be used here. That will solve both the problem with not being able to build custom plans and the problem with not being able to share work across successive outer queries. Aside from those performance concerns, this change fixes a longstanding bugaboo with SQL functions: you could not write DDL that would affect later statements in the same function. That's mostly still true with new-style SQL functions, since the results of parse analysis are baked into the stored query trees (and protected by dependency records). But for old-style SQL functions, it will now work much as it does with PL/pgSQL functions, because we delay parse analysis and planning of each query until we're ready to run it. Some edge cases that require replanning are now handled better too; see for example the new rowsecurity test, where we now detect an RLS context change that was previously missed. One other edge-case change that might be worthy of a release note is that we now insist that a SQL function's result be generated by the physically-last query within it. Previously, if the last original query was deleted by a DO INSTEAD NOTHING rule, we'd be willing to take the result from the preceding query instead. This behavior was undocumented except in source-code comments, and it seems hard to believe that anyone's relying on it. Along the way to this feature, we needed a few infrastructure changes: * The plancache can now take either a raw parse tree or an analyzed-but-not-rewritten Query as the starting point for a CachedPlanSource. If given a Query, it is caller's responsibility that nothing will happen to invalidate that form of the query. We use this for new-style SQL functions, where what's in pg_proc is serialized Query(s) and we trust the dependency mechanism to disallow DDL that would break those. * The plancache now offers a way to invoke a post-rewrite callback to examine/modify the rewritten parse tree when it is rebuilding the parse trees after a cache invalidation. We need this because SQL functions sometimes adjust the parse tree to make its output exactly match the declared result type; if the plan gets rebuilt, that has to be re-done. * There is a new backend module utils/cache/funccache.c that abstracts the idea of caching data about a specific function usage (a particular function and set of input data types). The code in it is moved almost verbatim from PL/pgSQL, which has done that for a long time. We use that logic now for SQL-language functions too, and maybe other PLs will have use for it in the future. Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop
2025-04-02Add support for NOT ENFORCED in foreign key constraintsPeter Eisentraut
This expands the NOT ENFORCED constraint flag, previously only supported for CHECK constraints (commit ca87c415e2f), to foreign key constraints. Normally, when a foreign key constraint is created on a table, action and check triggers are added to maintain data integrity. With this patch, if a constraint is marked as NOT ENFORCED, integrity checks are no longer required, making these triggers unnecessary. Consequently, when creating a NOT ENFORCED foreign key constraint, triggers will not be created, and the constraint will be marked as NOT VALID. Similarly, if an existing foreign key constraint is changed to NOT ENFORCED, the associated triggers will be dropped, and the constraint will also be marked as NOT VALID. Conversely, if a NOT ENFORCED foreign key constraint is changed to ENFORCED, the necessary triggers will be created, and the will be changed to VALID by performing necessary validation. Since not-enforced foreign key constraints have no triggers, the shortcut used for example in psql and pg_dump to skip looking for foreign keys if the relation is known not to have triggers no longer applies. (It already didn't work for partitioned tables.) Author: Amul Sul <sulamul@gmail.com> Reviewed-by: Joel Jacobson <joel@compiler.org> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Isaac Morland <isaac.morland@gmail.com> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Tested-by: Triveni N <triveni.n@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-04-01Fix planner's failure to identify multiple hashable ScalarArrayOpExprsDavid Rowley
50e17ad28 (v14) and 29f45e299 (v15) made it so the planner could identify IN and NOT IN clauses which have Const lists as right-hand arguments and when an appropriate hash function is available for the data types, mark the ScalarArrayOpExpr as hashable so the executor could execute it more optimally by building and probing a hash table during expression evaluation. These commits both worked correctly when there was only a single ScalarArrayOpExpr in the given expression being processed by the planner, but when there were multiple, only the first was checked and any subsequent ones were not identified, which resulted in less optimal expression evaluation during query execution for all but the first found ScalarArrayOpExpr. Backpatch to 14, where 50e17ad28 was introduced. Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/29a76f51-97b0-4c07-87b7-ec8e3b5345c9@gmail.com Backpatch-through: 14
2025-03-15Remove table AM callback scan_bitmap_next_blockMelanie Plageman
After pushing the bitmap iterator into table-AM specific code (as part of making bitmap heap scan use the read stream API in 2b73a8cd33b7), scan_bitmap_next_block() no longer returns the current block number. Since scan_bitmap_next_block() isn't returning any relevant information to bitmap table scan code, it makes more sense to get rid of it. Now, bitmap table scan code only calls table_scan_bitmap_next_tuple(), and the heap AM implementation of scan_bitmap_next_block() is a local helper in heapam_handler.c. Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/flat/CAAKRu_ZwCwWFeL_H3ia26bP2e7HiKLWt0ZmGXPVwPO6uXq0vaA%40mail.gmail.com
2025-01-16Add OLD/NEW support to RETURNING in DML queries.Dean Rasheed
This allows the RETURNING list of INSERT/UPDATE/DELETE/MERGE queries to explicitly return old and new values by using the special aliases "old" and "new", which are automatically added to the query (if not already defined) while parsing its RETURNING list, allowing things like: RETURNING old.colname, new.colname, ... RETURNING old.*, new.* Additionally, a new syntax is supported, allowing the names "old" and "new" to be changed to user-supplied alias names, e.g.: RETURNING WITH (OLD AS o, NEW AS n) o.colname, n.colname, ... This is useful when the names "old" and "new" are already defined, such as inside trigger functions, allowing backwards compatibility to be maintained -- the interpretation of any existing queries that happen to already refer to relations called "old" or "new", or use those as aliases for other relations, is not changed. For an INSERT, old values will generally be NULL, and for a DELETE, new values will generally be NULL, but that may change for an INSERT with an ON CONFLICT ... DO UPDATE clause, or if a query rewrite rule changes the command type. Therefore, we put no restrictions on the use of old and new in any DML queries. Dean Rasheed, reviewed by Jian He and Jeff Davis. Discussion: https://postgr.es/m/CAEZATCWx0J0-v=Qjc6gXzR=KtsdvAE7Ow=D=mu50AgOe+pvisQ@mail.gmail.com
2025-01-15Rename RowCompareType to CompareTypePeter Eisentraut
RowCompareType served as a way to describe the fundamental meaning of an operator, notionally independent of an operator class (although so far this was only really supported for btrees). Its original purpose was for use inside RowCompareExpr, and it has also found some small use outside, such as for get_op_btree_interpretation(). We want to expand this now, as a more general way to describe operator semantics for other index access methods, including gist (to improve GistTranslateStratnum()) and others not written yet. To avoid future confusion, we rename the type to CompareType and the symbols from ROWCOMPARE_XXX to COMPARE_XXX to reflect their more general purpose. Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-01-11Add support for NOT ENFORCED in CHECK constraintsPeter Eisentraut
This adds support for the NOT ENFORCED/ENFORCED flag for constraints, with support for check constraints. The plan is to eventually support this for foreign key constraints, where it is typically more useful. Note that CHECK constraints do not currently support ALTER operations, so changing the enforceability of an existing constraint isn't possible without dropping and recreating it. This could be added later. Author: Amul Sul <amul.sul@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: jian he <jian.universality@gmail.com> Tested-by: Triveni N <triveni.n@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-01-01Update copyright for 2025Bruce Momjian
Backpatch-through: 13
2024-12-20Introduce CompactAttribute array in TupleDesc, take 2David Rowley
The new compact_attrs array stores a few select fields from FormData_pg_attribute in a more compact way, using only 16 bytes per column instead of the 104 bytes that FormData_pg_attribute uses. Using CompactAttribute allows performance-critical operations such as tuple deformation to be performed without looking at the FormData_pg_attribute element in TupleDesc which means fewer cacheline accesses. For some workloads, tuple deformation can be the most CPU intensive part of processing the query. Some testing with 16 columns on a table where the first column is variable length showed around a 10% increase in transactions per second for an OLAP type query performing aggregation on the 16th column. However, in certain cases, the increases were much higher, up to ~25% on one AMD Zen4 machine. This also makes pg_attribute.attcacheoff redundant. A follow-on commit will remove it, thus shrinking the FormData_pg_attribute struct by 4 bytes. Author: David Rowley Reviewed-by: Andres Freund, Victor Yegorov Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
2024-12-19Improve planner's handling of SetOp plans.Tom Lane
Remove the code for inserting flag columns in the inputs of a SetOp. That was the only reason why there would be resjunk columns in a set-operations plan tree, so we can get rid of some code that supported that, too. Get rid of choose_hashed_setop() in favor of building Paths for the hashed and sorted alternatives, and letting them fight it out within add_path(). Remove set_operation_ordered_results_useful(), which was giving wrong answers due to examining the wrong ancestor node: we need to examine the immediate SetOperationStmt parent not the topmost node. Instead make each caller of recurse_set_operations() pass down the relevant parent node. (This thinko seems to have led only to wasted planning cycles and possibly-inferior plans, not wrong query answers. Perhaps we should back-patch it, but I'm not doing so right now.) Teach generate_nonunion_paths() to consider pre-sorted inputs for sorted SetOps, rather than always generating a Sort node. Patch by me; thanks to Richard Guo and David Rowley for review. Discussion: https://postgr.es/m/1850138.1731549611@sss.pgh.pa.us