引入

表级别的storage parameter参数vacuum_truncate是在PG12后引入的,加入该参数用来控制是否回收表尾部的空闲空间,在之前的版本中默认是一直回收表尾部的空闲空间。在执行表的尾部空闲空间时,会对表添加 ACCESS EXCLUSIVE lock,这是最严格的锁,会堵塞对表的所有操作。

Add vacuum_truncate reloption.
vacuum_truncate controls whether vacuum tries to truncate off
any empty pages at the end of the table. Previously vacuum always
tried to do the truncation. However, the truncation could cause
some problems; for example, ACCESS EXCLUSIVE lock needs to
be taken on the table during the truncation and can cause
the query cancellation on the standby even if hot_standby_feedback
is true. Setting this reloption to false can be helpful to avoid
such problems.

Author: Tsunakawa Takayuki
Reviewed-By: Julien Rouhaud, Masahiko Sawada, Michael Paquier, Kirk Jamison and Fujii Masao
Discussion: https://postgr.es/m/CAHGQGwE5UqFqSq1=kV3QtTUtXphTdyHA-8rAj4A=Y+e4kyp3BQ@mail.gmail.com

执行truncate操作部分是在函数lazy_truncate_heap中执行的,大题流程如下:

需要注意的是,表尾部空闲页的truncate,pg是通过扫描整个buffer pool来实现的,具体代码调用路径如下:

lazy_truncate_heap->RelationTruncate->smgrtruncate->DropRelFileNodeBuffers

所以,buffer pool越大,独占锁加的时间越久,对表的堵塞也就越久。这里和执行drop/truncate操作都是一样的。

/*
 * lazy_truncate_heap - try to truncate off any empty pages at the end
 */
static void
lazy_truncate_heap(LVRelState *vacrel)
{
	BlockNumber old_rel_pages = vacrel->rel_pages;
	BlockNumber new_rel_pages;
	bool		lock_waiter_detected;
	int			lock_retry;

	/* Report that we are now truncating */
	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
								 PROGRESS_VACUUM_PHASE_TRUNCATE);

	/*
	 * Loop until no more truncating can be done.
	 */
	do
	{
		PGRUsage	ru0;

		pg_rusage_init(&ru0);

		/*
		 * We need full exclusive lock on the relation in order to do
		 * truncation. If we can't get it, give up rather than waiting --- we
		 * don't want to block other backends, and we don't want to deadlock
		 * (which is quite possible considering we already hold a lower-grade
		 * lock).
		 */
		lock_waiter_detected = false;
		lock_retry = 0;
		while (true)
		{
			if (ConditionalLockRelation(vacrel->rel, AccessExclusiveLock))
				break;

			/*
			 * Check for interrupts while trying to (re-)acquire the exclusive
			 * lock.
			 */
			CHECK_FOR_INTERRUPTS();

			if (++lock_retry > (VACUUM_TRUNCATE_LOCK_TIMEOUT /
								VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL))
			{
				/*
				 * We failed to establish the lock in the specified number of
				 * retries. This means we give up truncating.
				 */
				ereport(elevel,
						(errmsg("\"%s\": stopping truncate due to conflicting lock request",
								vacrel->relname)));
				return;
			}

			pg_usleep(VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL * 1000L);
		}

		/*
		 * Now that we have exclusive lock, look to see if the rel has grown
		 * whilst we were vacuuming with non-exclusive lock.  If so, give up;
		 * the newly added pages presumably contain non-deletable tuples.
		 */
		new_rel_pages = RelationGetNumberOfBlocks(vacrel->rel);
		if (new_rel_pages != old_rel_pages)
		{
			/*
			 * Note: we intentionally don't update vacrel->rel_pages with the
			 * new rel size here.  If we did, it would amount to assuming that
			 * the new pages are empty, which is unlikely. Leaving the numbers
			 * alone amounts to assuming that the new pages have the same
			 * tuple density as existing ones, which is less unlikely.
			 */
			UnlockRelation(vacrel->rel, AccessExclusiveLock);
			return;
		}

		/*
		 * Scan backwards from the end to verify that the end pages actually
		 * contain no tuples.  This is *necessary*, not optional, because
		 * other backends could have added tuples to these pages whilst we
		 * were vacuuming.
		 */
		new_rel_pages = count_nondeletable_pages(vacrel, &lock_waiter_detected);
		vacrel->blkno = new_rel_pages;

		if (new_rel_pages >= old_rel_pages)
		{
			/* can't do anything after all */
			UnlockRelation(vacrel->rel, AccessExclusiveLock);
			return;
		}

		/*
		 * Okay to truncate.
		 */
		RelationTruncate(vacrel->rel, new_rel_pages);

		/*
		 * We can release the exclusive lock as soon as we have truncated.
		 * Other backends can't safely access the relation until they have
		 * processed the smgr invalidation that smgrtruncate sent out ... but
		 * that should happen as part of standard invalidation processing once
		 * they acquire lock on the relation.
		 */
		UnlockRelation(vacrel->rel, AccessExclusiveLock);

		/*
		 * Update statistics.  Here, it *is* correct to adjust rel_pages
		 * without also touching reltuples, since the tuple count wasn't
		 * changed by the truncation.
		 */
		vacrel->pages_removed += old_rel_pages - new_rel_pages;
		vacrel->rel_pages = new_rel_pages;

		ereport(elevel,
				(errmsg("table \"%s\": truncated %u to %u pages",
						vacrel->relname,
						old_rel_pages, new_rel_pages),
				 errdetail_internal("%s",
									pg_rusage_show(&ru0))));
		old_rel_pages = new_rel_pages;
	} while (new_rel_pages > vacrel->nonempty_pages && lock_waiter_detected);
}

对表尾部空闲页的truncate属于vacuum的倒数第二步,vaccum的操作步骤如下:

image-20250109175206920

相比于表数据的空闲页truncate,索引清理的操作在倒数第三步,索引中的空闲页并不会执行truncate操作,被回收到操作系统,而是放入到fsm中,等待下次数据写入或是索引分裂。

影响

由于执行vacuum的truncate操作时,会对表加独占锁,所以存在下面几种影响:

上面的三种情况,第三种可能不容易遇见,不过遇到后可能不容易排查。

优化

由于目前官方并没有全局参数控制,如果要关闭只能一个一个表的执行alter进行关闭。

建议对于在主库频繁执行vacuum和从库经常查询的表上禁止vacuum的truncate操作,减少对主库和从库上操作的影响。

参考:

https://www.postgresql.org/message-id/flat/Z2DE4lDX4tHqNGZt%40dev.null

https://mp.weixin.qq.com/s/adkqKqNBP9b_yJhZu4i1Jw

https://www.cybertec-postgresql.com/en/drop-table-killing-shared_buffers/

https://blog.summercat.com/postgres-vacuum-taking-an-access-exclusive-lock.html

PG14.4:vacuumlazy.c:lazy_truncate_heap

❤️ 转载文章请注明出处,谢谢!❤️