Commit ce2a644

Author: Commitfest Bot
[CF 6265] v4 - Add MODE parameter to WAIT FOR LSN command
This branch was automatically generated by a robot using patches from an
email thread registered at:

https://commitfest.postgresql.org/patch/6265

The branch will be overwritten each time a new patch version is posted to
the thread, and also periodically to check for bitrot caused by changes
on the master branch.

Patch(es): https://www.postgresql.org/message-id/CABPTF7U2cYN=bMZirqj93Zv-aqBdw4f=wPRwovTzWKP=adYhDg@mail.gmail.com
Author(s): Xuneng Zhou
2 parents 4f7dacc + 87ac09a commit ce2a644

14 files changed: +662 -127 lines changed


doc/src/sgml/ref/wait_for.sgml

Lines changed: 143 additions & 49 deletions
@@ -16,12 +16,13 @@ PostgreSQL documentation

 <refnamediv>
 <refname>WAIT FOR</refname>
-<refpurpose>wait for target <acronym>LSN</acronym> to be replayed, optionally with a timeout</refpurpose>
+<refpurpose>wait for WAL to reach a target <acronym>LSN</acronym> on a replica</refpurpose>
 </refnamediv>

 <refsynopsisdiv>
 <synopsis>
-WAIT FOR LSN '<replaceable class="parameter">lsn</replaceable>' [ WITH ( <replaceable class="parameter">option</replaceable> [, ...] ) ]
+WAIT FOR LSN '<replaceable class="parameter">lsn</replaceable>' [ MODE { REPLAY | FLUSH | WRITE } ]
+[ WITH ( <replaceable class="parameter">option</replaceable> [, ...] ) ]

 <phrase>where <replaceable class="parameter">option</replaceable> can be:</phrase>

@@ -34,20 +35,22 @@ WAIT FOR LSN '<replaceable class="parameter">lsn</replaceable>' [ WITH ( <replac
 <title>Description</title>

 <para>
-Waits until recovery replays <parameter>lsn</parameter>.
-If no <parameter>timeout</parameter> is specified or it is set to
-zero, this command waits indefinitely for the
-<parameter>lsn</parameter>.
-On timeout, or if the server is promoted before
-<parameter>lsn</parameter> is reached, an error is emitted,
-unless <literal>NO_THROW</literal> is specified in the WITH clause.
-If <parameter>NO_THROW</parameter> is specified, then the command
-doesn't throw errors.
+Waits until the specified <parameter>lsn</parameter> is reached
+according to the specified <parameter>mode</parameter>,
+which determines whether to wait for WAL to be written, flushed, or replayed.
+If no <parameter>timeout</parameter> is specified or it is set to
+zero, this command waits indefinitely for the
+<parameter>lsn</parameter>.
+On timeout, or if the server is promoted before
+<parameter>lsn</parameter> is reached, an error is emitted,
+unless <literal>NO_THROW</literal> is specified in the WITH clause.
+If <parameter>NO_THROW</parameter> is specified, then the command
+doesn't throw errors.
 </para>

 <para>
-The possible return values are <literal>success</literal>,
-<literal>timeout</literal>, and <literal>not in recovery</literal>.
+The possible return values are <literal>success</literal>,
+<literal>timeout</literal>, and <literal>not in recovery</literal>.
 </para>
 </refsect1>

@@ -64,6 +67,61 @@ WAIT FOR LSN '<replaceable class="parameter">lsn</replaceable>' [ WITH ( <replac
 </listitem>
 </varlistentry>

+<varlistentry>
+<term><literal>MODE</literal></term>
+<listitem>
+<para>
+Specifies the type of LSN processing to wait for. If not specified,
+the default is <literal>REPLAY</literal>. The valid modes are:
+</para>
+
+<variablelist>
+<varlistentry>
+<term><literal>REPLAY</literal></term>
+<listitem>
+<para>
+Wait for the LSN to be replayed (applied to the database).
+After successful completion, <function>pg_last_wal_replay_lsn()</function>
+will return a value greater than or equal to the target LSN.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><literal>FLUSH</literal></term>
+<listitem>
+<para>
+Wait for the WAL containing the LSN to be received from the
+primary and flushed to disk. This provides a durability guarantee
+without waiting for the WAL to be applied. After successful
+completion, <function>pg_last_wal_receive_lsn()</function>
+will return a value greater than or equal to the target LSN.
+This value is also available as the <structfield>flushed_lsn</structfield>
+column in <link linkend="monitoring-pg-stat-wal-receiver-view">
+<structname>pg_stat_wal_receiver</structname></link>.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><literal>WRITE</literal></term>
+<listitem>
+<para>
+Wait for the WAL containing the LSN to be received from the
+primary and written to disk, but not yet flushed. This is faster
+than <literal>FLUSH</literal> but provides weaker durability
+guarantees since the data may still be in operating system buffers.
+After successful completion, the <structfield>written_lsn</structfield>
+column in <link linkend="monitoring-pg-stat-wal-receiver-view">
+<structname>pg_stat_wal_receiver</structname></link> will show
+a value greater than or equal to the target LSN.
+</para>
+</listitem>
+</varlistentry>
+</variablelist>
+</listitem>
+</varlistentry>
+
 <varlistentry>
 <term><literal>WITH ( <replaceable class="parameter">option</replaceable> [, ...] )</literal></term>
 <listitem>
@@ -135,9 +193,12 @@ WAIT FOR LSN '<replaceable class="parameter">lsn</replaceable>' [ WITH ( <replac
 <listitem>
 <para>
 This return value denotes that the database server is not in a recovery
-state. This might mean either the database server was not in recovery
-at the moment of receiving the command, or it was promoted before
-reaching the target <parameter>lsn</parameter>.
+state. This might mean either the database server was not in recovery
+at the moment of receiving the command (i.e., executed on a primary),
+or it was promoted before reaching the target <parameter>lsn</parameter>.
+In the promotion case, this status indicates a timeline change occurred,
+and the application should re-evaluate whether the target LSN is still
+relevant.
 </para>
 </listitem>
 </varlistentry>
@@ -148,25 +209,33 @@ WAIT FOR LSN '<replaceable class="parameter">lsn</replaceable>' [ WITH ( <replac
 <title>Notes</title>

 <para>
-<command>WAIT FOR</command> command waits till
-<parameter>lsn</parameter> to be replayed on standby.
-That is, after this command execution, the value returned by
-<function>pg_last_wal_replay_lsn</function> should be greater or equal
-to the <parameter>lsn</parameter> value. This is useful to achieve
-read-your-writes-consistency, while using async replica for reads and
-primary for writes. In that case, the <acronym>lsn</acronym> of the last
-modification should be stored on the client application side or the
-connection pooler side.
+<command>WAIT FOR</command> waits until the specified
+<parameter>lsn</parameter> is reached according to the specified
+<parameter>mode</parameter>. The <literal>REPLAY</literal> mode waits
+for the LSN to be replayed (applied to the database), which is useful
+to achieve read-your-writes consistency while using an async replica
+for reads and the primary for writes. The <literal>FLUSH</literal> mode
+waits for the WAL to be flushed to durable storage on the replica,
+providing a durability guarantee without waiting for replay. The
+<literal>WRITE</literal> mode waits for the WAL to be written to the
+operating system, which is faster than flush but provides weaker
+durability guarantees. In all cases, the <acronym>LSN</acronym> of the
+last modification should be stored on the client application side or
+the connection pooler side.
 </para>

 <para>
-<command>WAIT FOR</command> command should be called on standby.
-If a user runs <command>WAIT FOR</command> on primary, it
-will error out unless <parameter>NO_THROW</parameter> is specified in the WITH clause.
-However, if <command>WAIT FOR</command> is
-called on primary promoted from standby and <literal>lsn</literal>
-was already replayed, then the <command>WAIT FOR</command> command just
-exits immediately.
+<command>WAIT FOR</command> should be called on a standby.
+If a user runs <command>WAIT FOR</command> on the primary, it
+will error out unless <parameter>NO_THROW</parameter> is specified
+in the WITH clause. However, if <command>WAIT FOR</command> is
+called on a primary promoted from standby and <literal>lsn</literal>
+was already reached, then the <command>WAIT FOR</command> command
+just exits immediately. If the replica is promoted while waiting,
+the command will return <literal>not in recovery</literal> (or throw
+an error if <literal>NO_THROW</literal> is not specified). Promotion
+creates a new timeline, and the LSN being waited for may refer to
+WAL from the old timeline.
 </para>

 </refsect1>
@@ -175,21 +244,21 @@ WAIT FOR LSN '<replaceable class="parameter">lsn</replaceable>' [ WITH ( <replac
 <title>Examples</title>

 <para>
-You can use <command>WAIT FOR</command> command to wait for
-the <type>pg_lsn</type> value. For example, an application could update
-the <literal>movie</literal> table and get the <acronym>lsn</acronym> after
-changes just made. This example uses <function>pg_current_wal_insert_lsn</function>
-on primary server to get the <acronym>lsn</acronym> given that
-<varname>synchronous_commit</varname> could be set to
-<literal>off</literal>.
+You can use <command>WAIT FOR</command> command to wait for
+the <type>pg_lsn</type> value. For example, an application could update
+the <literal>movie</literal> table and get the <acronym>lsn</acronym> after
+changes just made. This example uses <function>pg_current_wal_insert_lsn</function>
+on primary server to get the <acronym>lsn</acronym> given that
+<varname>synchronous_commit</varname> could be set to
+<literal>off</literal>.

 <programlisting>
 postgres=# UPDATE movie SET genre = 'Dramatic' WHERE genre = 'Drama';
 UPDATE 100
 postgres=# SELECT pg_current_wal_insert_lsn();
-pg_current_wal_insert_lsn
---------------------
-0/306EE20
+pg_current_wal_insert_lsn
+---------------------------
+0/306EE20
 (1 row)
 </programlisting>

@@ -198,9 +267,9 @@ pg_current_wal_insert_lsn
 changes made on primary should be guaranteed to be visible on replica.

 <programlisting>
-postgres=# WAIT FOR LSN '0/306EE20';
+postgres=# WAIT FOR LSN '0/306EE20' MODE REPLAY;
 status
---------
+---------
 success
 (1 row)
 postgres=# SELECT * FROM movie WHERE genre = 'Drama';
@@ -211,21 +280,46 @@ postgres=# SELECT * FROM movie WHERE genre = 'Drama';
 </para>

 <para>
-If the target LSN is not reached before the timeout, the error is thrown.
+Wait for flush (data durable on replica):

 <programlisting>
-postgres=# WAIT FOR LSN '0/306EE20' WITH (TIMEOUT '0.1s');
+postgres=# WAIT FOR LSN '0/306EE20' MODE FLUSH;
+status
+---------
+success
+(1 row)
+</programlisting>
+</para>
+
+<para>
+Wait for write with timeout:
+
+<programlisting>
+postgres=# WAIT FOR LSN '0/306EE20' MODE WRITE WITH (TIMEOUT '100ms', NO_THROW);
+status
+---------
+success
+(1 row)
+</programlisting>
+</para>
+
+<para>
+If the target LSN is not reached before the timeout, an error is thrown:
+
+<programlisting>
+postgres=# WAIT FOR LSN '0/306EE20' MODE REPLAY WITH (TIMEOUT '0.1s');
 ERROR: timed out while waiting for target LSN 0/306EE20 to be replayed; current replay LSN 0/306EA60
 </programlisting>
 </para>

 <para>
 The same example uses <command>WAIT FOR</command> with
-<parameter>NO_THROW</parameter> option.
+<parameter>NO_THROW</parameter> option:
+
 <programlisting>
-postgres=# WAIT FOR LSN '0/306EE20' WITH (TIMEOUT '100ms', NO_THROW);
+postgres=# WAIT FOR LSN '0/306EE20' MODE REPLAY WITH (TIMEOUT '100ms', NO_THROW);
 status
---------
+---------
 timeout
 (1 row)
 </programlisting>
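
For readers who want to try the new syntax from a client, the following is a minimal sketch of how an application might assemble the statement shown in the patched synopsis. The helper name `build_wait_for_lsn` and its parameters are hypothetical and not part of this patch or of PostgreSQL; the sketch only builds the SQL text and does not connect to a server. It assumes the `TIMEOUT` value is passed as a quoted string, as in the documentation examples above.

```python
import re

# Modes accepted by the MODE clause in the patched synopsis.
VALID_MODES = ("REPLAY", "FLUSH", "WRITE")

# pg_lsn text form: two hexadecimal numbers (up to 8 digits each) joined by "/".
_LSN_RE = re.compile(r"[0-9A-Fa-f]{1,8}/[0-9A-Fa-f]{1,8}")

# Conservative check for timeout strings such as '100ms' or '0.1s'.
# This is an assumption of the sketch, not the server's own grammar.
_TIMEOUT_RE = re.compile(r"[0-9]+(\.[0-9]+)?\s*[A-Za-z]*")


def build_wait_for_lsn(lsn, mode=None, timeout=None, no_throw=False):
    """Return a WAIT FOR LSN statement as text (hypothetical helper)."""
    if not _LSN_RE.fullmatch(lsn):
        raise ValueError(f"not a valid pg_lsn literal: {lsn!r}")
    parts = [f"WAIT FOR LSN '{lsn}'"]

    # MODE is optional; per the patched docs the server defaults to REPLAY.
    if mode is not None:
        mode = mode.upper()
        if mode not in VALID_MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        parts.append(f"MODE {mode}")

    # Collect WITH (...) options in the order used by the doc examples.
    options = []
    if timeout is not None:
        if not _TIMEOUT_RE.fullmatch(timeout):
            raise ValueError(f"unexpected timeout value: {timeout!r}")
        options.append(f"TIMEOUT '{timeout}'")
    if no_throw:
        options.append("NO_THROW")
    if options:
        parts.append("WITH (" + ", ".join(options) + ")")

    return " ".join(parts) + ";"
```

With `NO_THROW`, the caller would then inspect the returned `status` column (`success`, `timeout`, or `not in recovery`) rather than catching an error.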

src/backend/access/transam/xlog.c

Lines changed: 5 additions & 3 deletions
@@ -6238,10 +6238,12 @@ StartupXLOG(void)
 LWLockRelease(ControlFileLock);

 /*
-* Wake up all waiters for replay LSN. They need to report an error that
-* recovery was ended before reaching the target LSN.
+* Wake up all waiters. They need to report an error that recovery was
+* ended before reaching the target LSN.
 */
-WaitLSNWakeup(WAIT_LSN_TYPE_REPLAY, InvalidXLogRecPtr);
+WaitLSNWakeup(WAIT_LSN_TYPE_REPLAY_STANDBY, InvalidXLogRecPtr);
+WaitLSNWakeup(WAIT_LSN_TYPE_WRITE_STANDBY, InvalidXLogRecPtr);
+WaitLSNWakeup(WAIT_LSN_TYPE_FLUSH_STANDBY, InvalidXLogRecPtr);

 /*
 * Shutdown the recovery environment. This must occur after

src/backend/access/transam/xlogrecovery.c

Lines changed: 2 additions & 2 deletions
@@ -1846,8 +1846,8 @@ PerformWalRecovery(void)
 */
 if (waitLSNState &&
 (XLogRecoveryCtl->lastReplayedEndRecPtr >=
-pg_atomic_read_u64(&waitLSNState->minWaitedLSN[WAIT_LSN_TYPE_REPLAY])))
-WaitLSNWakeup(WAIT_LSN_TYPE_REPLAY, XLogRecoveryCtl->lastReplayedEndRecPtr);
+pg_atomic_read_u64(&waitLSNState->minWaitedLSN[WAIT_LSN_TYPE_REPLAY_STANDBY])))
+WaitLSNWakeup(WAIT_LSN_TYPE_REPLAY_STANDBY, XLogRecoveryCtl->lastReplayedEndRecPtr);

 /* Else, try to fetch the next WAL record */
 record = ReadRecord(xlogprefetcher, LOG, false, replayTLI);
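
The wake-up check in the hunk above compares two 64-bit LSN values with `>=`. As an illustration only, the same ordering can be reproduced on the textual `pg_lsn` form used in the documentation examples, where the part before the slash is the high 32 bits and the part after it is the low 32 bits. The function names `parse_lsn` and `lsn_reached` are hypothetical and do not exist in PostgreSQL; this is a client-side sketch, not the server's implementation.

```python
import string

_HEX = set(string.hexdigits)


def parse_lsn(text):
    """Convert a pg_lsn string such as '0/306EE20' to a 64-bit integer."""
    hi, sep, lo = text.partition("/")
    if not sep:
        raise ValueError(f"missing '/' in LSN: {text!r}")
    for part in (hi, lo):
        # Each half is 1 to 8 hexadecimal digits.
        if not (1 <= len(part) <= 8) or not set(part) <= _HEX:
            raise ValueError(f"bad LSN component in {text!r}")
    # High 32 bits come from the part before the slash.
    return (int(hi, 16) << 32) | int(lo, 16)


def lsn_reached(current, target):
    """True when the current LSN is at or past the target LSN."""
    return parse_lsn(current) >= parse_lsn(target)
```

Applied to the timeout example in the documentation hunk, `lsn_reached("0/306EA60", "0/306EE20")` is false, which is consistent with the error reporting that the current replay LSN had not yet reached the target.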
