path: root/src/backend/access/heap
Age    Commit message    Author
5 hours    Change pgstat_report_vacuum() to use Relation (HEAD, master)    Michael Paquier
This change makes pgstat_report_vacuum() more consistent with pgstat_report_analyze(), which also uses a Relation. It enforces a policy that callers must open and lock the relation whose statistics are updated before calling this routine. This routine is unlikely to gain many in-tree callers, but establishing the requirement now seems like a good idea in the long run. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aUEA6UZZkDCQFgSA@ip-10-97-1-34.eu-west-3.compute.internal
15 hours    Add explanatory comment to prune_freeze_setup()    Melanie Plageman
heap_page_prune_and_freeze() fills in PruneState->deadoffsets, the array of OffsetNumbers of dead tuples, which is returned to the caller in the PruneFreezeResult. To avoid keeping two copies of the array, PruneState stores only a pointer to it. This was a bit unusual and confusing, so add a clarifying comment. Author: Melanie Plageman <melanieplageman@gmail.com> Suggested-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAEoWx2=jiD1nqch4JQN+odAxZSD7mRvdoHUGJYN2r6tQG_66yQ@mail.gmail.com
15 hours    Fix const qualification in prune_freeze_setup()    Melanie Plageman
The const qualification of the presult argument to prune_freeze_setup() is later cast away, so it was not correct. Remove it and add a comment explaining that presult should not be modified. Author: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/fb97d0ae-a0bc-411d-8a87-f84e7e146488%40eisentraut.org
35 hours    Revisit cosmetics of "For inplace update, send nontransactional invalidations."    Noah Misch
This removes a never-used CacheInvalidateHeapTupleInplace() parameter. It adds README content about inplace update visibility in logical decoding. It rewrites other comments. Back-patch to v18, where commit 243e9b40f1b2dd09d6e5bf91ebf6e822a2cd3704 first appeared. Since this removes a CacheInvalidateHeapTupleInplace() parameter, expect a v18 ".abi-compliance-history" edit to follow. PGXN contains no calls to that function. Reported-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Reported-by: Ilyasov Ian <ianilyasov@outlook.com> Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Surya Poondla <s_poondla@apple.com> Discussion: https://postgr.es/m/CA+renyU+LGLvCqS0=fHit-N1J-2=2_mPK97AQxvcfKm+F-DxJA@mail.gmail.com Backpatch-through: 18
35 hours    Correct comments of "Fix data loss at inplace update after heap_update()".    Noah Misch
This corrects commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257. Reported-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Reported-by: Surya Poondla <s_poondla@apple.com> Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Discussion: https://postgr.es/m/CA+renyWCW+_2QvXERBQ+mna6ANwAVXXmHKCA-WzL04bZRsjoBA@mail.gmail.com
37 hours    Add offnum range checks to suppress compile warnings with UBSAN.    Tom Lane
Late-model gcc with -fsanitize=undefined enabled issues warnings about uses of PageGetItemId() when it can't prove that the offsetNumber is > 0. The call sites where this happens are checking that the offnum is <= PageGetMaxOffsetNumber(page), so it seems reasonable to add an explicit check that offnum >= 1 too. While at it, rearrange the code to be less contorted and avoid duplicate checks on PageGetMaxOffsetNumber. Maybe the compiler would optimize away the duplicate logic or maybe not, but the existing coding has little to recommend it anyway. There are multiple instances of this identical coding pattern in heapam.c and heapam_xlog.c. Current gcc only complains about two of them, but I fixed them all in the name of consistency. Potentially this could be back-patched in the name of silencing warnings; but I think enabling UBSAN is mainly something people would do on HEAD, so for now it seems not worth the trouble. Discussion: https://postgr.es/m/1699806.1765746897@sss.pgh.pa.us
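Roughly, the added check looks like the sketch below; the function and variable names are illustrative, not the committed hunks in heapam.c or heapam_xlog.c.
    #include "postgres.h"
    #include "storage/bufpage.h"
    #include "storage/off.h"

    /* Look up a line pointer only after validating both bounds. */
    static ItemId
    get_item_id_checked(Page page, OffsetNumber offnum)
    {
        OffsetNumber maxoff = PageGetMaxOffsetNumber(page);

        if (offnum < FirstOffsetNumber || offnum > maxoff)
            elog(ERROR, "invalid offset number %u", offnum);

        return PageGetItemId(page, offnum);
    }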
2 days    Update typedefs.list to match what the buildfarm currently reports.    Tom Lane
The current list from the buildfarm includes quite a few typedef names that it used to miss. The reason is a bit obscure, but it seems likely to have something to do with our recent increased use of palloc_object and palloc_array. In any case, this makes the relevant struct declarations be much more nicely formatted, so I'll take it. Install the current list and re-run pgindent to update affected code. Syncing with the current list also removes some obsolete typedef names and fixes some alphabetization errors. Discussion: https://postgr.es/m/1681301.1765742268@sss.pgh.pa.us
5 days    Replace most StaticAssertStmt() with StaticAssertDecl()    Peter Eisentraut
Similar to commit 75f49221c22, it is preferable to use StaticAssertDecl() instead of StaticAssertStmt() when possible. Discussion: https://www.postgresql.org/message-id/flat/CA%2BhUKGKvr0x_oGmQTUkx%3DODgSksT2EtgCA6LmGx_jQFG%3DsDUpg%40mail.gmail.com
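The practical difference between the two macros, as a minimal sketch (the asserted condition is only an example):
    #include "postgres.h"

    /* StaticAssertDecl() may appear at file scope, next to declarations. */
    StaticAssertDecl(sizeof(uint32) == 4, "uint32 must be 4 bytes");

    static void
    static_assert_in_statement_context(void)
    {
        /* StaticAssertStmt() must sit where a statement is allowed. */
        StaticAssertStmt(sizeof(uint32) == 4, "uint32 must be 4 bytes");
    }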
7 days    Add comment about keeping PD_ALL_VISIBLE and VM in sync    Melanie Plageman
The comment above heap_xlog_visible() about the critical integrity requirement for PD_ALL_VISIBLE and the visibility map should also be in heap_xlog_prune_freeze() where we set PD_ALL_VISIBLE. Oversight in add323da40a6bf9e Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com
7 days    Simplify vacuum visibility assertion    Melanie Plageman
Phase I vacuum gives the page a once-over after pruning and freezing to check that the values of all_visible and all_frozen agree with the result of heap_page_is_all_visible(). This is meant to keep the logic in phase I for determining visibility in sync with the logic in phase III. Rewrite the assertion to avoid an Assert(false). Suggested by Andres Freund. Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/mhf4vkmh3j57zx7vuxp4jagtdzwhu3573pgfpmnjwqa6i6yj5y%40sy4ymcdtdklo
7 days    Use palloc_object() and palloc_array() in backend code    Michael Paquier
The idea is to further encourage the use of these new routines across the tree, as they offer stronger type-safety guarantees than palloc(). This batch includes most of the trivial changes suggested by the author for src/backend/. A total of 334 files are updated here. Among them, 48 produce slightly different builds, caused by line-number changes as the new allocation calls are simpler, shaving around 100 lines of code in total. Similar work has been done in 0c3c5c3b06a3 and 31d3847a37be. Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com
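The shape of the conversion, as a minimal sketch (the struct here is made up for illustration; palloc_object() and palloc_array() come from utils/palloc.h):
    #include "postgres.h"

    typedef struct DemoState
    {
        int         nitems;
        int        *items;
    } DemoState;

    static DemoState *
    make_demo_state(int nitems)
    {
        /* Old style: DemoState *state = (DemoState *) palloc(sizeof(DemoState)); */
        DemoState  *state = palloc_object(DemoState);

        /* Old style: state->items = (int *) palloc(sizeof(int) * nitems); */
        state->items = palloc_array(int, nitems);
        state->nitems = nitems;
        return state;
    }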
8 days    Add mode and started_by columns to pg_stat_progress_vacuum view.    Masahiko Sawada
The new columns, mode and started_by, indicate the vacuum mode ('normal', 'aggressive', or 'failsafe') and the initiator of the vacuum ('manual', 'autovacuum', or 'autovacuum_wraparound'), respectively. This allows users and monitoring tools to better understand VACUUM behavior. Bump catalog version. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Yu Wang <wangyu_runtime@163.com> Discussion: https://postgr.es/m/CAOzEurQcOY-OBL_ouEVfEaFqe_md3vB5pXjR_m6L71Dcp1JKCQ@mail.gmail.com
12 days    Suppress spurious Coverity warning in prune freeze logic    Melanie Plageman
Adjust the prune_freeze_setup() parameter types of new_relfrozen_xid and new_relmin_mxid to prevent misleading Coverity analysis. heap_page_prune_and_freeze() compared these values against NULL when passing them to prune_freeze_setup(), causing Coverity to assume they could be NULL and flag a possible null-pointer dereference later, even though it occurs inside a directly related conditional. Reported-by: Coverity Author: Melanie Plageman <melanieplageman@gmail.com>
13 days    Remove no longer needed casts to Pointer    Peter Eisentraut
These casts used to be required when Pointer was char *, but now it's void * (commit 1b2bb5077e9), so they are not needed anymore. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org
2025-11-27    Fix possibly uninitialized HeapScanDesc.rs_startblock    David Rowley
The solution used in 0ca3b1697 to determine the Parallel TID Range Scan's start location was to modify the signature of table_block_parallelscan_startblock_init() to allow the startblock to be passed in as a parameter. This allows the scan limits to be adjusted before that function is called so that the limits are picked up when the parallel scan starts. The commit made it so the call to table_block_parallelscan_startblock_init uses the HeapScanDesc's rs_startblock to pass the startblock to the parallel scan. That all works ok for Parallel TID Range scans as the HeapScanDesc rs_startblock gets set by heap_setscanlimits(), but for Parallel Seq Scans, initscan() does not initialize rs_startblock, and that results in passing an uninitialized value to table_block_parallelscan_startblock_init() as noted by the buildfarm member skink, running Valgrind. To fix this issue, make it so initscan() sets the rs_startblock for parallel scans unless we're doing a rescan. This makes it so table_block_parallelscan_startblock_init() will be called with the startblock set to InvalidBlockNumber, and that'll allow the syncscan code to find the correct start location (when enabled). For Parallel TID Range Scans, this InvalidBlockNumber value will be overwritten in the call to heap_setscanlimits(). initscan() is a bit light on documentation on what's meant to get initialized where for parallel scans. From what I can tell, it looks like it just didn't matter prior to 0ca3b1697 that rs_startblock was left uninitialized for parallel scans. To address the light documentation, I've also added some comments to mention that the syncscan location for parallel scans is figured out in table_block_parallelscan_startblock_init. I've also taken the liberty to adjust the if/else if/else code in initscan() to make it clearer which parts apply to parallel scans and which parts are for the serial scans. Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvqALm+k7FyfdQdCw1yF_8HojvR61YRrNhwRQPE=zSmnQA@mail.gmail.com
2025-11-27    Add parallelism support for TID Range Scans    David Rowley
In v14, bb437f995 added support for scanning ranges of TIDs using a dedicated executor node. Here, we allow these scans to be parallelized. The range of blocks to scan is divvied up similarly to a Parallel Seq Scan: 'chunks' of blocks are allocated to each worker, and the chunk size is gradually reduced to 1 block per worker as the scan nears its end, so that workers finish at roughly the same time. Previously, the planner faced a trade-off: a Parallel Seq Scan could cost less than a non-parallel TID Range Scan because of the CPU concurrency of the Seq Scan (disk costs are not divided by the number of workers), yet choosing it could mean reading more blocks during execution than the TID Range Scan would have, since the Parallel Seq Scan reads blocks outside the required TID range. Allowing Parallel TID Range Scans removes that trade-off between reduced CPU costs due to parallelism and the additional I/O of a Parallel Seq Scan. There are also, of course, the traditional performance benefits of parallelism, which likely don't need to be explained here. Author: Cary Huang <cary.huang@highgo.ca> Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com> Reviewed-by: Steven Niu <niushiji@gmail.com> Discussion: https://postgr.es/m/18f2c002a24.11bc2ab825151706.3749144144619388582@highgo.ca
2025-11-26    Split heap_page_prune_and_freeze() into helpers    Melanie Plageman
Refactor the setup and planning phases of pruning and freezing into helpers. This streamlines heap_page_prune_and_freeze() and makes it more clear when the examination of tuples ends and page modifications begin. No code change beyond what was required to extract the code into helper functions. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/mhf4vkmh3j57zx7vuxp4jagtdzwhu3573pgfpmnjwqa6i6yj5y%40sy4ymcdtdklo
2025-11-25    Assert that cutoffs are provided if freezing will be attempted    Melanie Plageman
heap_page_prune_and_freeze() requires the caller to initialize PruneFreezeParams->cutoffs so that the function can correctly evaluate whether tuples should be frozen. This requirement previously existed only in comments and was easy to miss, especially after “cutoffs” was converted from a direct function parameter to a field of the newly introduced PruneFreezeParams struct (added in 1937ed70621). Adding an assert makes this requirement explicit and harder to violate. Also, fix a minor typo while we're at it. Author: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/0AC177F5-5E26-45EE-B273-357C51212AC5%40gmail.com
2025-11-20    Split PruneFreezeParams initializers to one field per line    Melanie Plageman
This conforms more closely with the style of other struct initializers in the code base. Initializing multiple fields on a single line is unpopular in part because pgindent won't permit a space between the comma and the next field's leading period. Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://postgr.es/m/87see87fnq.fsf%40wibble.ilmari.org
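For illustration, with a made-up struct rather than the real PruneFreezeParams fields:
    typedef struct DemoParams
    {
        int         options;
        const char *reason;
    } DemoParams;

    /* Avoid: several fields on one line; pgindent closes up ", ." on reflow. */
    static const DemoParams one_line = {.options = 0, .reason = "on access"};

    /* Preferred: one field per line. */
    static const DemoParams one_field_per_line = {
        .options = 0,
        .reason = "vacuum scan",
    };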
2025-11-20    Update PruneState.all_[visible|frozen] earlier in pruning    Melanie Plageman
During pruning and freezing in phase I of vacuum, we delay clearing all_visible and all_frozen in the presence of dead items. This allows opportunistic freezing if the page would otherwise be fully frozen, since those dead items are later removed in vacuum phase III. To move the VM update into the same WAL record that prunes and freezes tuples, we must know whether the page will be marked all-visible/all-frozen before emitting WAL. Previously we waited until after emitting WAL to update all_visible/all_frozen to their correct values. The only barrier to updating these flags immediately after deciding whether to opportunistically freeze was that while emitting WAL for a record freezing tuples, we use the pre-corrected value of all_frozen to compute the snapshot conflict horizon. By determining the conflict horizon earlier, we can update the flags immediately after making the opportunistic freeze decision. This is required to set the VM in the XLOG_HEAP2_PRUNE_VACUUM_SCAN record emitted by pruning and freezing. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com
2025-11-20    Keep all_frozen updated in heap_page_prune_and_freeze    Melanie Plageman
Previously, we relied on all_visible and all_frozen being used together to ensure that all_frozen was correct, but it is better to keep both fields updated. Future changes will separate their usage, so we should not depend on all_visible for the validity of all_frozen. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com
2025-11-20    Refactor heap_page_prune_and_freeze() parameters into a struct    Melanie Plageman
heap_page_prune_and_freeze() had accumulated an unwieldy number of input parameters and upcoming work to handle VM updates in this function will add even more. Introduce a new PruneFreezeParams struct to group the function’s input parameters, improving readability and maintainability. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/yn4zp35kkdsjx6wf47zcfmxgexxt4h2og47pvnw2x5ifyrs3qc%407uw6jyyxuyf7
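The general shape of such a refactor; the field and function names below are hypothetical, not the actual PruneFreezeParams definition:
    #include "postgres.h"
    #include "storage/buf.h"
    #include "utils/relcache.h"

    /* Before: a long and still-growing parameter list. */
    extern void demo_prune_page(Relation relation, Buffer buffer,
                                int options, bool mark_unused_now);

    /* After: callers fill in a params struct and pass that instead. */
    typedef struct DemoPruneParams
    {
        Relation    relation;
        Buffer      buffer;
        int         options;
        bool        mark_unused_now;
    } DemoPruneParams;

    extern void demo_prune_page_ex(const DemoPruneParams *params);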
2025-11-12    Prefer spelling "cacheable" over "cachable".    Thomas Munro
Previously we had both in code and comments. Keep the more common and accepted variant. Author: Chao Li <lic@highgo.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/5EBF1771-0566-4D08-9F9B-CDCDEF4BDC98@gmail.com
2025-11-06    Use XLogRecPtrIsValid() in various places    Álvaro Herrera
Now that commit 06edbed47862 has introduced XLogRecPtrIsValid(), we can use that instead of: - XLogRecPtrIsInvalid() - direct comparisons with InvalidXLogRecPtr - direct comparisons with literal 0 This makes the code more consistent. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aQB7EvGqrbZXrMlg@ip-10-97-1-34.eu-west-3.compute.internal
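The spellings being consolidated, as a sketch (XLogRecPtrIsValid() is the helper named above; the wrapper function is illustrative):
    #include "postgres.h"
    #include "access/xlogdefs.h"

    static bool
    lsn_is_set(XLogRecPtr lsn)
    {
        /* Older spellings, all equivalent: */
        bool        a = !XLogRecPtrIsInvalid(lsn);
        bool        b = (lsn != InvalidXLogRecPtr);
        bool        c = (lsn != 0);

        /* New, consistent spelling: */
        return XLogRecPtrIsValid(lsn) && a && b && c;
    }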
2025-11-03    Add wal_fpi_bytes to VACUUM and ANALYZE logs    Michael Paquier
The new wal_fpi_bytes counter tracks the total size, in bytes, of the full-page images inserted into WAL records. This commit adds this information to VACUUM and ANALYZE logs alongside the existing counters, building upon f9a09aa29520. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aQMMSSlFXy4Evxn3@paquier.xyz
2025-10-31    Mark function arguments of type "Datum *" as "const Datum *" where possible    Peter Eisentraut
Several functions in the codebase accept "Datum *" parameters but do not modify the pointed-to data. These have been updated to take "const Datum *" instead, improving type safety and making the interfaces clearer about their intent. This change helps the compiler catch accidental modifications and better documents the immutability of arguments. Most "Datum *" parameters have a paired "bool *isnull" parameter; those are constified as well. No functional behavior is changed by this patch. Author: Chao Li <lic@highgo.com> Discussion: https://www.postgresql.org/message-id/flat/CAEoWx2msfT0knvzUa72ZBwu9LR_RLY4on85w2a9YpE-o2By5HQ@mail.gmail.com
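The shape of the constification, shown on a hypothetical declaration rather than one of the functions actually touched:
    #include "postgres.h"
    #include "access/tupdesc.h"

    /*
     * Before:
     *   extern Size demo_row_width(TupleDesc tupdesc, Datum *values, bool *isnull);
     *
     * After: the data is never modified, so say so in the signature.
     */
    extern Size demo_row_width(TupleDesc tupdesc,
                               const Datum *values, const bool *isnull);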
2025-10-30    Mark ItemPointer arguments as const throughout    Peter Eisentraut
This is a follow-up to 991295f. I searched over src/ and made ItemPointer arguments const wherever possible. Note: We cut from the original patch the pieces that would have created incompatibilities in the index or table AM APIs. Those could be considered separately. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/CAEoWx2nBaypg16Z5ciHuKw66pk850RFWw9ACS2DqqJ_AkKeRsw%40mail.gmail.com
2025-10-27    Remove Item type    Peter Eisentraut
This type is just char * underneath; it provides no real value, no type safety, and just makes the code one level more mysterious. It is more idiomatic to refer to blobs of memory by a combination of void * and size_t, so change it to that. Also, since this type hides the pointerness, we can't apply qualifiers to what is pointed to, which requires some unconstify nonsense. This change allows fixing that. Extension code that uses the Item type can switch to void * to stay backward compatible. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/flat/c75cccf5-5709-407b-a36a-2ae6570be766@eisentraut.org
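For extension code, the suggested backward-compatible change is essentially a type swap; a minimal sketch with illustrative names:
    #include "postgres.h"
    #include "storage/bufpage.h"
    #include "storage/itemid.h"

    static Size
    copy_item(Page page, ItemId lp, void *dest)
    {
        /* Before: Item item = PageGetItem(page, lp);   (Item was just char *) */
        void       *item = PageGetItem(page, lp);
        Size        len = ItemIdGetLength(lp);

        memcpy(dest, item, len);
        return len;
    }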
2025-10-15    Add log_autoanalyze_min_duration    Peter Eisentraut
The log output controlled by log_autovacuum_min_duration applies to both VACUUM and ANALYZE, so it has not been possible to set separate log output thresholds for the two. With a single threshold, logs are likely to be emitted only for VACUUM and not for ANALYZE. Therefore, separate the threshold for logging VACUUMs run by autovacuum (log_autovacuum_min_duration) from the threshold for logging ANALYZEs run by autovacuum (log_autoanalyze_min_duration). Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Kasahara Tatsuhito <kasaharatt@oss.nttdata.com> Discussion: https://www.postgresql.org/message-id/flat/CAOzEurQtfV4MxJiWT-XDnimEeZAY+rgzVSLe8YsyEKhZcajzSA@mail.gmail.com
2025-10-14    Make heap_page_is_all_visible independent of LVRelState    Melanie Plageman
This function only requires a few fields from LVRelState, so pass them in individually. This change allows calling heap_page_is_all_visible() from code such as pruneheap.c, which does not have access to an LVRelState. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/2wk7jo4m4qwh5sn33pfgerdjfujebbccsmmlownybddbh6nawl%40mdyyqpqzxjek
2025-10-14    Add helper for freeze determination to heap_page_prune_and_freeze    Melanie Plageman
After scanning the line pointers on a heap page during the first phase of vacuum, we use the information collected to decide whether to use the assembled freeze plans. Move this decision logic into a helper function to improve readability. While here, rename a PruneState member and disambiguate some local variables in heap_page_prune_and_freeze(). Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/2wk7jo4m4qwh5sn33pfgerdjfujebbccsmmlownybddbh6nawl%40mdyyqpqzxjek
2025-10-13    Eliminate XLOG_HEAP2_VISIBLE from vacuum phase III    Melanie Plageman
Instead of emitting a separate XLOG_HEAP2_VISIBLE WAL record for each page that becomes all-visible in vacuum's third phase, specify the VM changes in the already emitted XLOG_HEAP2_PRUNE_VACUUM_CLEANUP record. Visibility checks are now performed before marking dead items unused. This is safe because the heap page is held under exclusive lock for the entire operation. This reduces the number of WAL records generated by VACUUM phase III by up to 50%. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com
2025-10-09    Eliminate COPY FREEZE use of XLOG_HEAP2_VISIBLE    Melanie Plageman
Instead of emitting a separate WAL XLOG_HEAP2_VISIBLE record for setting bits in the VM, specify the VM block changes in the XLOG_HEAP2_MULTI_INSERT record. This halves the number of WAL records emitted by COPY FREEZE. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com
2025-10-08    bufmgr: fewer calls to BufferDescriptorGetContentLock    Andres Freund
We're planning to merge buffer content locks into BufferDesc.state. To reduce the size of that patch, centralize calls to BufferDescriptorGetContentLock(). The biggest part of the change is in assertions, which now use the newly introduced BufferIsLockedByMe[InMode]() (replacing BufferIsExclusiveLocked()). This seems like an improvement even without the aforementioned plans. Additionally, replace some direct calls to LWLockAcquire() with calls to LockBuffer(). Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff
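Approximately what the centralization looks like at a call site; a sketch only, since the exact new helper signatures are not shown in this entry:
    #include "postgres.h"
    #include "storage/buf_internals.h"
    #include "storage/bufmgr.h"
    #include "storage/lwlock.h"

    static void
    lock_both_ways(Buffer buffer, BufferDesc *bufHdr)
    {
        /* Before: reach into the descriptor for the content lock. */
        LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_EXCLUSIVE);
        LWLockRelease(BufferDescriptorGetContentLock(bufHdr));

        /* After: go through the regular buffer API where possible. */
        LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
        LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
    }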
2025-10-03    Fix reuse-after-free hazard in dead_items_reset    John Naylor
In a similar vein to commit ccc8194e427, a reset instance of a shared-memory TID store happened to occupy the same private memory as the old one for the entry point, since the chunk freed after the last round of index vacuuming was put on the context's freelist. The failure to update the vacrel->dead_items pointer became evident when nudging the system to allocate memory in a different area. This was not discovered at the time of the earlier commit since our regression tests didn't cover multiple index passes with parallel vacuum. Backpatch to v17, when TidStore came in. Author: Kevin Oommen Anish <kevin.o@zohocorp.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Tested-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/199a07cbdfc.7a1c4aac25838.1675074408277594551%40zohocorp.com Backpatch-through: 17
2025-09-24    Correct prune WAL record opcode name in comment    Melanie Plageman
f83d709760d8 incorrectly refers to a XLOG_HEAP2_PRUNE_FREEZE WAL record opcode. No such code exists. The relevant opcodes are XLOG_HEAP2_PRUNE_ON_ACCESS, XLOG_HEAP2_PRUNE_VACUUM_SCAN, and XLOG_HEAP2_PRUNE_VACUUM_CLEANUP. Correct it. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/yn4zp35kkdsjx6wf47zcfmxgexxt4h2og47pvnw2x5ifyrs3qc%407uw6jyyxuyf7
2025-09-24    Fix incorrect and inconsistent comments in tableam.h and heapam.c.    Fujii Masao
This commit corrects several issues in function comments:
* The parameter "rel" was incorrectly referred to as "relation" in the comments for table_tuple_delete(), table_tuple_update(), and table_tuple_lock().
* In table_tuple_delete(), "changingPart" was listed as an output parameter in the comments but is actually input.
* In table_tuple_update(), "slot" was listed as an input parameter in the comments but is actually output.
* The comment for "update_indexes" in table_tuple_update() was mis-indented.
* The comments for heap_lock_tuple() incorrectly referenced a non-existent "tid" parameter.
Author: Chao Li <lic@highgo.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAEoWx2nB6Ay8g=KEn7L3qbYX_4+sLk9XOMkV0XZqHR4cTY8ZvQ@mail.gmail.com
2025-09-08    Add error codes when vacuum discovers VM corruption    Melanie Plageman
Commit fd6ec93bf890314a and other previous work established the principle that when an error is potentially reachable in case of on-disk corruption but is not expected to be reached otherwise, ERRCODE_DATA_CORRUPTED should be used. This allows log monitoring software to search for evidence of corruption by filtering on the error code. Enhance the existing log messages emitted when the heap page is found to be inconsistent with the VM by adding this error code. Suggested-by: Andrey Borodin <x4mmm@yandex-team.ru> Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/87DD95AA-274F-4F4F-BAD9-7738E5B1F905%40yandex-team.ru
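The reporting pattern, approximately (the function and message text here are illustrative, not the exact strings emitted by vacuum):
    #include "postgres.h"
    #include "storage/block.h"

    static void
    report_vm_inconsistency(const char *relname, BlockNumber blkno)
    {
        ereport(WARNING,
                (errcode(ERRCODE_DATA_CORRUPTED),
                 errmsg("page is not all-visible but visibility map bit is set in relation \"%s\" page %u",
                        relname, blkno)));
    }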
2025-09-08    Remove unneeded VM pin from VM replay    Melanie Plageman
Previously, heap_xlog_visible() called visibilitymap_pin() even after getting a buffer from XLogReadBufferForRedoExtended() -- which returns a pinned buffer containing the specified block of the visibility map. This would just have resulted in visibilitymap_pin() returning early since the specified page was already present and pinned, but it was confusing extraneous code, so remove it. It doesn't seem worth backporting, though. It appears to be an oversight in 2c03216. While we are at it, remove two VM-related redundant asserts in the COPY FREEZE code path. visibilitymap_set() already asserts that PD_ALL_VISIBLE is set on the heap page and checks that the vmbuffer contains the bits corresponding to the specified heap block, so callers do not also need to check this. Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CALdSSPhu7WZd%2BEfQDha1nz%3DDC93OtY1%3DUFEdWwSZsASka_2eRQ%40mail.gmail.com
2025-09-05    Add assert and log message to visibilitymap_set    Melanie Plageman
Add an assert to visibilitymap_set() that the provided heap buffer is exclusively locked, which is expected. Also, enhance the debug logging message to specify which VM flags were set. Based on a related suggestion by Kirill Reshke on an in-progress patchset. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CALdSSPhAU56g1gGVT0%2BwG8RrSWE6qW8TOfNJS1HNAWX6wPgbFA%40mail.gmail.com
2025-08-29    Remove unneeded casts of BufferGetPage() result    Peter Eisentraut
BufferGetPage() already returns type Page, so casting it to Page doesn't achieve anything. A sizable number of call sites does this casting; remove that. This was already done inconsistently in the code in the first import in 1996 (but didn't exist in the pre-1995 code), and it was then apparently just copied around. Author: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/CALdSSPgFhc5=vLqHdk-zCcnztC0zEY3EU_Q6a9vPEaw7FkE9Vw@mail.gmail.com
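A typical call site changes like this (the wrapper function is illustrative):
    #include "postgres.h"
    #include "storage/bufmgr.h"
    #include "storage/bufpage.h"

    static Page
    get_page(Buffer buffer)
    {
        /* Before: Page page = (Page) BufferGetPage(buffer);  -- the cast is a no-op */
        Page        page = BufferGetPage(buffer);

        return page;
    }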
2025-08-28    Avoid including commands/dbcommands.h in so many places    Álvaro Herrera
This has been done historically because of get_database_name (which since commit cb98e6fb8fd4 belongs in lsyscache.c/h, so let's move it there) and get_database_oid (which is in the right place, but whose declaration should appear in pg_database.h rather than dbcommands.h). Clean this up. Also, xlogreader.h and stringinfo.h are no longer needed by dbcommands.h since commit f1fd515b393a, so remove them. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/202508191031.5ipojyuaswzt@alvherre.pgsql
2025-08-22    Use ereport() rather than elog()    Heikki Linnakangas
Noah pointed this out before I committed 50f770c3d9, but I accidentally pushed the old version with elog() anyway. Oops. Reported-by: Noah Misch <noah@leadboat.com> Discussion: https://www.postgresql.org/message-id/20250820003756.31.nmisch@google.com
2025-08-22    Revert GetTransactionSnapshot() to return historic snapshot during LR    Heikki Linnakangas
Commit 1585ff7387 changed GetTransactionSnapshot() to throw an error if it's called during logical decoding, instead of returning the historic snapshot. I made that change for extra protection, because a historic snapshot can only be used to access catalog tables while GetTransactionSnapshot() is usually called when you're executing arbitrary queries. You might get very subtle visibility problems if you tried to use the historic snapshot for arbitrary queries. There's no built-in code in PostgreSQL that calls GetTransactionSnapshot() during logical decoding, but it turns out that the pglogical extension does just that, to evaluate row filter expressions. You would get weird results if the row filter runs arbitrary queries, but it is sane as long as you don't access any non-catalog tables. Even though there are no checks to enforce that in pglogical, a typical row filter expression does not access any tables and works fine. Accessing tables marked with the user_catalog_table = true option is also OK. To fix pglogical with row filters, and any other extensions that might do similar things, revert GetTransactionSnapshot() to return a historic snapshot during logical decoding. To try to still catch the unsafe usage of historic snapshots, add checks in heap_beginscan() and index_beginscan() to complain if you try to use a historic snapshot to scan a non-catalog table. We're very close to the version 18 release however, so add those new checks only in master. Backpatch-through: 18 Reported-by: Noah Misch <noah@leadboat.com> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://www.postgresql.org/message-id/20250809222338.cc.nmisch@google.com
2025-07-24    Fix return value of visibilitymap_get_status().    Nathan Bossart
This function is declared as returning a uint8, but it returns a bool in one code path. To fix, return (uint8) 0 instead of false there. This should behave exactly the same as before, but it might prevent future compiler complaints. Oversight in commit a892234f83. Author: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://postgr.es/m/aIHluT2isN58jqHV%40jrouhaud
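A sketch of that kind of fix, on a made-up function rather than visibilitymap_get_status() itself:
    #include "postgres.h"

    static uint8
    demo_status_flags(bool bit_is_set)
    {
        if (!bit_is_set)
            return (uint8) 0;   /* previously written as: return false; */
        return (uint8) 1;
    }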
2025-07-02    Fix missing FSM vacuum opportunities on tables without indexes.    Masahiko Sawada
Commit c120550edb86 optimized the vacuuming of relations without indexes (a.k.a. one-pass strategy) by directly marking dead item IDs as LP_UNUSED. However, the periodic FSM vacuum was still checking if dead item IDs had been marked as LP_DEAD when attempting to vacuum the FSM every VACUUM_FSM_EVERY_PAGES blocks. This condition was never met due to the optimization, resulting in missed FSM vacuum opportunities. This commit modifies the periodic FSM vacuum condition to use the number of tuples deleted during HOT pruning. This count includes items marked as either LP_UNUSED or LP_REDIRECT, both of which are expected to result in new free space to report. Back-patch to v17 where the vacuum optimization for tables with no indexes was introduced. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/CAD21AoBL8m6B9GSzQfYxVaEgvD7-Kr3AJaS-hJPHC+avm-29zw@mail.gmail.com Backpatch-through: 17
2025-06-30    Rationalize handling of VacuumParams    Michael Paquier
This commit refactors the vacuum routines that rely on VacuumParams, adding const markers where necessary to force a new policy in the code. This structure should not be passed around as a pointer, as it may be used across multiple relations and its contents should never be updated. vacuum_rel() stands as an exception, since it touches the "index_cleanup" and "truncate" options. VacuumParams was introduced in 0d831389749a, and 661643dedad9 fixed a bug impacting VACUUM operating on multiple relations. The changes done in tableam.h break ABI compatibility, so this commit can only happen on HEAD. Author: Shihao Zhong <zhong950419@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://postgr.es/m/CAGRkXqTo+aK=GTy5pSc-9cy8H2F2TJvcrZ-zXEiNJj93np1UUw@mail.gmail.com
2025-06-28    Message style improvements    Peter Eisentraut
2025-06-26    Remove unused check in heap_xlog_insert()    Melanie Plageman
8e03eb92e9a reverted commit 39b66a91bd, which had allowed freezing in the heap_insert() code path, but forgot to remove the corresponding check in heap_xlog_insert(). This code is extraneous but not harmful. However, cleaning it up makes it very clear that, as of now, we do not support any freezing of pages in the heap_insert() path. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/flat/CAAKRu_Zp4Pi-t51OFWm1YZ-cctDfBhHCMZ%3DEx6PKxv0o8y2GvA%40mail.gmail.com Backpatch-through: 14
2025-06-26    Simplify vacuum VM update logging counters    Melanie Plageman
We can simplify the VM counters added in dc6acfd910b8 to lazy_vacuum_heap_page() and lazy_scan_new_or_empty(). We won't invoke lazy_vacuum_heap_page() unless there are dead line pointers, so we know the page can't be all-visible. In lazy_scan_new_or_empty(), we only update the VM if the page-level hint PD_ALL_VISIBLE is clear, and the VM bit cannot already be set while the page-level bit is clear, because a subsequent page update would then fail to clear the visibility map bit. Simplify the logic for determining which log counters to increment based on this knowledge. Doing so is worthwhile because the old logic was confusing and misguided. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_a9w_n2mwY%3DG4LjfWTvRTJtjbfvnYAKi4WjO8QXHHrA0g%40mail.gmail.com