PostgreSQL Source Code Analysis: WAL Archiving

Published: 2024-06-26

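PostgreSQL provides WAL archiving, whose main purpose is backup and recovery, in particular point-in-time recovery (PITR). Why archive WAL at all? Because checkpoints remove or recycle WAL segments that are no longer needed, and once those segments are gone the database can no longer be restored to an arbitrary point in time. With archiving enabled, every WAL segment is preserved, from the starting state of the database (in practice, a base backup) up to the present, which effectively gives us a continuous backup. After a failure or an operator mistake, the database can then be restored to its state at a chosen moment.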

Enabling WAL archiving

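Set archive_mode = on in the configuration file (postgresql.conf) to enable WAL archiving. At startup the server then launches the archiver process, which archives segments by running the command configured in archive_command.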

# - Archiving -
archive_mode = on               # enables archiving; off, on, or always (change requires restart)
archive_command = 'cp %p /home/postgres/pgsql/archive/%f'               # command to use to archive a logfile segment
                                # placeholders: %p = path of file to archive
                                #               %f = file name only
                                # e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
archive_timeout = 1800          # force a logfile segment switch after this
                                # number of seconds; 0 disables
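
To make the %p and %f placeholders concrete, here is a minimal C sketch of how such a template could be expanded before the command is run. expand_archive_command is a hypothetical helper written for this article, not the server's own routine; treating %% as a literal percent sign is an assumption of the sketch, and the /archive/ directory in the usage note is just an example path.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL's own code): expand %p, %f and %%
 * in an archive_command template.
 * Returns 0 on success, -1 if the result does not fit into 'out'.
 */
static int
expand_archive_command(const char *tmpl, const char *path, const char *fname,
                       char *out, size_t outlen)
{
    size_t used = 0;

    assert(tmpl != NULL && path != NULL && fname != NULL && out != NULL);
    if (outlen == 0)
        return -1;

    for (const char *c = tmpl; *c != '\0'; c++)
    {
        char        single[2] = {*c, '\0'};   /* the current char as a string */
        const char *piece = single;

        if (*c == '%' && c[1] == 'p')
        {
            piece = path;       /* %p: path of the file to archive */
            c++;
        }
        else if (*c == '%' && c[1] == 'f')
        {
            piece = fname;      /* %f: file name only */
            c++;
        }
        else if (*c == '%' && c[1] == '%')
        {
            piece = "%";        /* assumed escape for a literal percent sign */
            c++;
        }

        size_t len = strlen(piece);

        if (used + len + 1 > outlen)
            return -1;          /* would overflow the output buffer */
        memcpy(out + used, piece, len);
        used += len;
    }
    out[used] = '\0';
    return 0;
}
```

Called with the template cp %p /archive/%f, the path pg_wal/000000010000000000000001 and the matching file name, the helper fills the buffer with cp pg_wal/000000010000000000000001 /archive/000000010000000000000001.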

Archiver process source code

Let's look at the source code of the archiver process, which lives in src/backend/postmaster/pgarch.c:

PgArchiverMain(void)
--> pgarch_MainLoop();	// enter the archiver main loop
	--> pgarch_ArchiverCopyLoop(); 
		--> pgarch_readyXlog

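The central question in WAL archiving is when a segment gets archived. The key point is that archiving is triggered by a WAL segment switch, so let's look at what causes a switch: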

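  • When a WAL segment (a WAL file) is full and writing must move on to the next segment, the archiver can be told to archive the finished one. Before sending that notification (through the postmaster in older releases, directly to the archiver in newer ones), the process that performed the switch creates a status file in pg_wal/archive_status, named after the segment to be archived with a .ready suffix.
  • If archive_timeout elapses without a segment switch, a switch is forced so that the current segment gets archived.
  • Calling pg_switch_wal() triggers a switch manually.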
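
The naming relationship between a segment and its status file can be sketched in a few lines of C. This is an illustrative sketch, not the server's code: wal_file_name, ready_file_path and is_ready_file are hypothetical names, and the 16 MB segment size is an assumption (it is the default, but a cluster can be initialized with a different size).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: the default 16 MB WAL segment size */
#define WAL_SEGMENT_SIZE    (16u * 1024 * 1024)
/* Number of segments in each 4 GB range of WAL addresses */
#define SEGMENTS_PER_XLOGID (UINT64_C(0x100000000) / WAL_SEGMENT_SIZE)

/* Build the 24-hex-digit segment file name from timeline and segment number */
static void
wal_file_name(char *buf, size_t len, uint32_t tli, uint64_t segno)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%08X%08X%08X",
             (unsigned int) tli,
             (unsigned int) (segno / SEGMENTS_PER_XLOGID),
             (unsigned int) (segno % SEGMENTS_PER_XLOGID));
}

/* Path of the status file that marks the segment as ready for archiving */
static void
ready_file_path(char *buf, size_t len, uint32_t tli, uint64_t segno)
{
    char name[32];

    wal_file_name(name, sizeof(name), tli, segno);
    snprintf(buf, len, "pg_wal/archive_status/%s.ready", name);
}

/* True if a directory entry name ends in ".ready" */
static int
is_ready_file(const char *fname)
{
    size_t len = strlen(fname);
    size_t suffix = strlen(".ready");

    return len > suffix && strcmp(fname + len - suffix, ".ready") == 0;
}
```

For timeline 1 and segment number 1, ready_file_path produces pg_wal/archive_status/000000010000000000000001.ready. A scan of the status directory, which is roughly what pgarch_readyXlog has to do, could use a check like is_ready_file to pick out segments that are waiting to be archived.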

Now let's look at the archiver's main loop. It waits for an archive notification and then copies the WAL:

static void pgarch_MainLoop(void)
{
	pg_time_t	last_copy_time = 0;
	bool		time_to_stop;

	// Main loop: wait to be notified that WAL segments are ready to archive
	do
	{
		ResetLatch(MyLatch);

		/* When we get SIGUSR2, we do one more archive cycle, then exit */
		time_to_stop = ready_to_stop;

		/* Check for barrier events and config update */
		HandlePgArchInterrupts();

		// ...

		/* Do what we're here for */
		pgarch_ArchiverCopyLoop();		// archive the segments, i.e. copy the WAL files
		last_copy_time = time(NULL);

		/* Sleep until a signal is received, or until a poll is forced by
		 * PGARCH_AUTOWAKE_INTERVAL having passed since last_copy_time, or until postmaster dies. */
		if (!time_to_stop)		/* Don't wait during last iteration */
		{
			pg_time_t	curtime = (pg_time_t) time(NULL);
			int			timeout;

			timeout = PGARCH_AUTOWAKE_INTERVAL - (curtime - last_copy_time);
			if (timeout > 0)
			{
				int			rc;

				rc = WaitLatch(MyLatch,
							   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
							   timeout * 1000L,
							   WAIT_EVENT_ARCHIVER_MAIN);
				if (rc & WL_POSTMASTER_DEATH)
					time_to_stop = true;
			}
		}
	} while (!time_to_stop);
}
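
The wait at the bottom of the loop boils down to a small piece of arithmetic: how much of the auto-wake interval is left since the last copy. The sketch below isolates that calculation. archiver_wait_ms and its interval_secs parameter are illustrative names, and the 60-second interval used in the usage note is an assumption about the value of PGARCH_AUTOWAKE_INTERVAL.

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper: milliseconds the archiver may sleep before the next
 * forced poll, or 0 if the interval has already elapsed and the next cycle
 * should start right away.
 */
static long
archiver_wait_ms(time_t last_copy_time, time_t now, int interval_secs)
{
    long timeout;

    assert(interval_secs > 0);

    /* Remaining part of the interval, in seconds */
    timeout = interval_secs - (long) (now - last_copy_time);

    /* The latch wait takes milliseconds; skip the wait if nothing is left */
    return timeout > 0 ? timeout * 1000L : 0;
}
```

For instance, archiver_wait_ms(1000, 1010, 60) yields 50000: ten seconds after the last copy, the archiver would sleep for at most another 50 seconds, unless its latch is set earlier.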

When archiving is triggered (1)

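The most important question is when the notification that a segment can be archived is sent: it is sent when the WAL segment is switched. When does that happen? A user can force a switch by calling pg_switch_wal(). In normal operation, records are continuously inserted into the WAL, and a switch occurs once the current segment fills up. Let's look at the code that handles this.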

For the full XLogWrite call path, see the article "PostgreSQL Source Code Analysis: WAL (Part 2)".

static void XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
{
	// ...
			/*
			 * If we just wrote the whole last page of a logfile segment,
			 * fsync the segment immediately.  This avoids having to go back
			 * and re-open prior segments when an fsync request comes along
			 * later. Doing it here ensures that one and only one backend will
			 * perform this fsync.
			 *
			 * This is also the right place to notify the Archiver that the
			 * segment is ready to copy to archival storage, and to update the
			 * timer for archive_timeout, and to signal for a checkpoint if
			 * too many logfile segments have been used since the last checkpoint. */
			if (finishing_seg)	// the segment is full
			{
				// flush the segment to disk so that the archived copy is complete
				issue_xlog_fsync(openLogFile, openLogSegNo);

				// request a walsender wakeup so the WAL can be shipped to standbys
				/* signal that we need to wakeup walsenders later */
				WalSndWakeupRequest();

				LogwrtResult.Flush = LogwrtResult.Write;	/* end of page */

				if (XLogArchivingActive())
					XLogArchiveNotifySeg(openLogSegNo);	// notify that the segment is ready to archive

				// record the time of the segment switch, used for archive_timeout
				XLogCtl->lastSegSwitchTime = (pg_time_t) time(NULL);
				XLogCtl->lastSegSwitchLSN = LogwrtResult.Flush;

				/*
				 * Request a checkpoint if we've consumed too much xlog since
				 * the last one.  For speed, we first check using the local
				 * copy of RedoRecPtr, which might be out of date; if it looks
				 * like a checkpoint is needed, forcibly update RedoRecPtr and
				 * recheck.
				 */
				if (IsUnderPostmaster && XLogCheckpointNeeded(openLogSegNo))
				{
					(void) GetRedoRecPtr();
					if (XLogCheckpointNeeded(openLogSegNo))
						RequestCheckpoint(CHECKPOINT_CAUSE_XLOG);
				}
			}
}
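
The lastSegSwitchTime recorded above is what makes the archive_timeout trigger possible. As a rough illustration only, a time-based check could look like the sketch below. archive_timeout_expired is a hypothetical name; the real check lives outside XLogWrite and takes more state into account (for example, whether any WAL has been written since the last switch), so treat this as a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical, simplified check: has archive_timeout elapsed since the
 * last segment switch? A timeout of 0 disables forced switches.
 */
static bool
archive_timeout_expired(time_t now, time_t last_seg_switch_time,
                        int archive_timeout)
{
    assert(archive_timeout >= 0);

    if (archive_timeout == 0)
        return false;           /* forced switches are disabled */

    return (now - last_seg_switch_time) >= archive_timeout;
}
```

With archive_timeout = 1800 as in the configuration shown earlier, the function starts returning true once 1800 seconds have passed since the last switch, at which point a segment switch (and therefore an archive notification) would be forced.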


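Next, let's look at how the archive notification is implemented. XLogArchiveNotifySeg builds the segment file name and passes it to XLogArchiveNotify, which creates a .ready file to mark the segment as ready for archiving. Once archiving completes, the corresponding .ready file is renamed to .done.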

void XLogArchiveNotifySeg