PostgreSQL Source Code Walkthrough (134) - MVCC#18 (the vacuum process: the HeapTupleSatisfiesVacuum function)

Original post | Category: PostgreSQL | Author: husthxd | Date: 2019-02-01 20:13:07

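This installment briefly covers how PostgreSQL handles a manually issued VACUUM. It focuses on the implementation of HeapTupleSatisfiesVacuum, reached through the call chain ExecVacuum->vacuum->vacuum_rel->heap_vacuum_rel->lazy_scan_heap->HeapTupleSatisfiesVacuum. The function determines whether a tuple may still be visible to any running transaction.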

I. Data Structures

Enum definition
Options for the VACUUM and ANALYZE commands


/* ----------------------
 *      Vacuum and Analyze Statements
 *
 * Even though these are nominally two statements, it's convenient to use
 * just one node type for both.  Note that at least one of VACOPT_VACUUM
 * and VACOPT_ANALYZE must be set in options.
 * ----------------------
 */
typedef enum VacuumOption
{
    VACOPT_VACUUM = 1 << 0,     /* do VACUUM */
    VACOPT_ANALYZE = 1 << 1,    /* do ANALYZE */
    VACOPT_VERBOSE = 1 << 2,    /* print progress info */
    VACOPT_FREEZE = 1 << 3,     /* FREEZE option */
    VACOPT_FULL = 1 << 4,       /* FULL (non-concurrent) vacuum */
    VACOPT_SKIP_LOCKED = 1 << 5,    /* skip if cannot get lock */
    VACOPT_SKIPTOAST = 1 << 6,  /* don't process the TOAST table, if any */
    VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7   /* don't skip any pages */
} VacuumOption;
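
Since every VacuumOption value is a distinct bit, a set of options fits in a single integer that is built with bitwise OR and tested with bitwise AND. The sketch below illustrates that, including the rule from the comment that at least one of VACOPT_VACUUM and VACOPT_ANALYZE must be set. The helpers vacopt_is_valid and vacopt_has are hypothetical names made up for this illustration; they are not PostgreSQL functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Copied from the listing above so the sketch compiles on its own. */
typedef enum VacuumOption
{
    VACOPT_VACUUM = 1 << 0,     /* do VACUUM */
    VACOPT_ANALYZE = 1 << 1,    /* do ANALYZE */
    VACOPT_VERBOSE = 1 << 2,    /* print progress info */
    VACOPT_FREEZE = 1 << 3,     /* FREEZE option */
    VACOPT_FULL = 1 << 4,       /* FULL (non-concurrent) vacuum */
    VACOPT_SKIP_LOCKED = 1 << 5,    /* skip if cannot get lock */
    VACOPT_SKIPTOAST = 1 << 6,  /* don't process the TOAST table, if any */
    VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7   /* don't skip any pages */
} VacuumOption;

/* Hypothetical helper: true when VACUUM and/or ANALYZE is requested. */
static bool
vacopt_is_valid(int options)
{
    return (options & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0;
}

/* Hypothetical helper: true when one specific flag is present. */
static bool
vacopt_has(int options, VacuumOption flag)
{
    return (options & flag) != 0;
}
```

For example, a command such as VACUUM (VERBOSE) would roughly correspond to the bitmask VACOPT_VACUUM | VACOPT_VERBOSE, for which vacopt_has(options, VACOPT_VERBOSE) is true and vacopt_has(options, VACOPT_FULL) is false.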

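II. Source Code Walkthrough

HeapTupleSatisfiesVacuum
HeapTupleSatisfiesVacuum determines the status of a tuple for VACUUM purposes. What we mainly want to know is whether the tuple is potentially visible to any running transaction. If it is, VACUUM cannot remove it yet.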
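
The function reports its verdict as an HTSV_Result value. As a reading aid, the sketch below reproduces that enum as I understand it from the PostgreSQL 11 headers (check your own source tree) and adds two helpers, htsv_is_removable and htsv_name, which are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Result codes, mirroring HTSV_Result (assumed to match PostgreSQL 11). */
typedef enum
{
    HEAPTUPLE_DEAD,               /* tuple is dead and deletable */
    HEAPTUPLE_LIVE,               /* tuple is live (committed, no deleter) */
    HEAPTUPLE_RECENTLY_DEAD,      /* tuple is dead, but not deletable yet */
    HEAPTUPLE_INSERT_IN_PROGRESS, /* inserting xact is still in progress */
    HEAPTUPLE_DELETE_IN_PROGRESS  /* deleting xact is still in progress */
} HTSV_Result;

/* Hypothetical helper: only DEAD tuples can be removed right away. */
static bool
htsv_is_removable(HTSV_Result r)
{
    return r == HEAPTUPLE_DEAD;
}

/* Hypothetical helper: printable name, e.g. for debug output. */
static const char *
htsv_name(HTSV_Result r)
{
    switch (r)
    {
        case HEAPTUPLE_DEAD:               return "DEAD";
        case HEAPTUPLE_LIVE:               return "LIVE";
        case HEAPTUPLE_RECENTLY_DEAD:      return "RECENTLY_DEAD";
        case HEAPTUPLE_INSERT_IN_PROGRESS: return "INSERT_IN_PROGRESS";
        case HEAPTUPLE_DELETE_IN_PROGRESS: return "DELETE_IN_PROGRESS";
    }
    return "UNKNOWN";
}
```

Note that RECENTLY_DEAD is deliberately not removable: the deleter has committed, but some open transaction might still see the tuple.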

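The main flow is as follows:
0. Fetch the tuple header and run sanity checks
1. Condition: the xmin-committed hint bit is not set (the inserter is not yet known to have committed)
1.1 Condition: xmin is marked invalid; the tuple is dead and removable
1.2 Condition: legacy checks for tuples from pre-9.0 versions (HEAP_MOVED_OFF / HEAP_MOVED_IN)
1.3 Condition: xmin is the current transaction
1.4 Condition: the inserter is another transaction that is still in progress
1.5 Condition: the xmin transaction did commit (according to clog); set the hint bit
1.6 Otherwise: the inserter aborted or crashed; return DEAD
At this point xmin is known to have committed
2. Condition: xmax is invalid; return LIVE
3. Condition: xmax only locked the tuple
3.1 Condition: the xmax-committed hint bit is not set; handle the multixact and plain-XID cases separately
3.2 Only locked; return LIVE
4. Condition: xmax is a MultiXactId
4.1 Condition: the updating transaction is in progress; return DELETE_IN_PROGRESS
4.2 Condition: the updating transaction committed; return RECENTLY_DEAD if xmax >= OldestXmin, otherwise DEAD
4.3 Condition: the updater is neither in progress nor committed (it aborted or crashed) and the multixact is no longer running; set the HEAP_XMAX_INVALID hint bit
4.4 Return LIVE by default
5. Condition: the xmax-committed hint bit is not set
5.1 Condition: the deleter is still in progress; return DELETE_IN_PROGRESS
5.2 Condition: clog shows the deleter committed; set the HEAP_XMAX_COMMITTED hint bit
5.3 Otherwise: the deleter aborted or crashed; set HEAP_XMAX_INVALID and return LIVE
At this point xmax is known to have committed
6. xmax >= OldestXmin: recently deleted, return RECENTLY_DEAD
7. Otherwise the tuple is DEAD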
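
The steps above can be condensed into a toy model. This is only a sketch under strong simplifying assumptions: it ignores hint bits, multixacts (step 4), the pre-9.0 paths (step 1.2), the current-transaction case (step 1.3), and XID wraparound (a plain >= comparison stands in for the real check). All type and function names (ToyTuple, toy_make, toy_satisfies_vacuum, and so on) are made up for illustration and do not exist in PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy transaction states; the real code consults hint bits, procarray and clog. */
typedef enum { XACT_IN_PROGRESS, XACT_COMMITTED, XACT_ABORTED } ToyXactState;

typedef enum
{
    TOY_DEAD, TOY_LIVE, TOY_RECENTLY_DEAD,
    TOY_INSERT_IN_PROGRESS, TOY_DELETE_IN_PROGRESS
} ToyResult;

typedef struct ToyTuple
{
    ToyXactState xmin_state;     /* state of the inserting transaction */
    bool         has_xmax;       /* false models HEAP_XMAX_INVALID */
    bool         xmax_lock_only; /* true models a locker, not a deleter */
    ToyXactState xmax_state;     /* state of the deleting transaction */
    uint32_t     xmax;           /* XID of the deleter */
} ToyTuple;

/* Convenience constructor for the toy tuple. */
static ToyTuple
toy_make(ToyXactState xmin_state, bool has_xmax, bool lock_only,
         ToyXactState xmax_state, uint32_t xmax)
{
    ToyTuple t = { xmin_state, has_xmax, lock_only, xmax_state, xmax };
    return t;
}

static ToyResult
toy_satisfies_vacuum(ToyTuple t, uint32_t oldest_xmin)
{
    /* Step 1: what happened to the inserter? */
    if (t.xmin_state == XACT_IN_PROGRESS)
        return TOY_INSERT_IN_PROGRESS;
    if (t.xmin_state == XACT_ABORTED)
        return TOY_DEAD;            /* never visible to anyone */

    /* Step 2: no deleter at all. */
    if (!t.has_xmax)
        return TOY_LIVE;

    /* Step 3: xmax only locked the tuple. */
    if (t.xmax_lock_only)
        return TOY_LIVE;

    /* Step 5: what happened to the deleter? */
    if (t.xmax_state == XACT_IN_PROGRESS)
        return TOY_DELETE_IN_PROGRESS;
    if (t.xmax_state == XACT_ABORTED)
        return TOY_LIVE;            /* the delete never took effect */

    /* Steps 6 and 7: the deleter committed; compare against the cutoff. */
    if (t.xmax >= oldest_xmin)
        return TOY_RECENTLY_DEAD;
    return TOY_DEAD;
}
```

With a cutoff of 100, a tuple whose committed deleter has XID 90 is classified as TOY_DEAD, while one deleted by XID 100 or later is TOY_RECENTLY_DEAD.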


/*
 * HeapTupleSatisfiesVacuum
 *
 *  Determine the status of tuples for VACUUM purposes.  Here, what
 *  we mainly want to know is if a tuple is potentially visible to *any*
 *  running transaction.  If so, it can't be removed yet by VACUUM.
 *
 * OldestXmin is a cutoff XID (obtained from GetOldestXmin()).  Tuples
 * deleted by XIDs >= OldestXmin are deemed "recently dead"; they might
 * still be visible to some open transaction, so we can't remove them,
 * even if we see that the deleting transaction has committed.
 */
HTSV_Result
HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
                         Buffer buffer)
{
    //fetch the tuple header
    HeapTupleHeader tuple = htup->t_data;
    //sanity checks
    Assert(ItemPointerIsValid(&htup->t_self));
    Assert(htup->t_tableOid != InvalidOid);
    /*
     * Has inserting transaction committed?
     *
     * If the inserting transaction aborted, then the tuple was never visible
     * to any other transaction, so we can delete it immediately.
     */
    if (!HeapTupleHeaderXminCommitted(tuple))
    {
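        //1. xmin-committed hint bit not set
        if (HeapTupleHeaderXminInvalid(tuple))
            //1-1. xmin is invalid: the tuple is dead and removable
            return HEAPTUPLE_DEAD;
        /* Used by pre-9.0 binary upgrades */
        //1-2. legacy path for pre-9.0 upgrades; HEAP_MOVED_OFF and HEAP_MOVED_IN are no longer set by current versions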
        else if (tuple->t_infomask & HEAP_MOVED_OFF)
        {
            TransactionId xvac = HeapTupleHeaderGetXvac(tuple);
            if (TransactionIdIsCurrentTransactionId(xvac))
                return HEAPTUPLE_DELETE_IN_PROGRESS;
            if (TransactionIdIsInProgress(xvac))
                return HEAPTUPLE_DELETE_IN_PROGRESS;
            if (TransactionIdDidCommit(xvac))
            {
                SetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
                            InvalidTransactionId);
                return HEAPTUPLE_DEAD;
            }
            SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
                        InvalidTransactionId);
        }
        /* Used by pre-9.0 binary upgrades */
        else if (tuple->t_infomask & HEAP_MOVED_IN)
        {
            TransactionId xvac = HeapTupleHeaderGetXvac(tuple);
            if (TransactionIdIsCurrentTransactionId(xvac))
                return HEAPTUPLE_INSERT_IN_PROGRESS;
            if (TransactionIdIsInProgress(xvac))
                return HEAPTUPLE_INSERT_IN_PROGRESS;
            if (TransactionIdDidCommit(xvac))
                SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
                            InvalidTransactionId);
            else
            {
                SetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
                            InvalidTransactionId);
                return HEAPTUPLE_DEAD;
            }
        }
        else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmin(tuple)))
        {
            //1-3. xmin is the current transaction
            if (tuple->t_infomask & HEAP_XMAX_INVALID)  /* xid invalid */
                //1-3-1. xmax is invalid: the insert is still in progress
                return HEAPTUPLE_INSERT_IN_PROGRESS;
            /* only locked? run infomask-only check first, for performance */
            if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask) ||
                HeapTupleHeaderIsOnlyLocked(tuple))
                //1-3-2. only locked (e.g. SELECT ... FOR UPDATE): the insert is still in progress
                return HEAPTUPLE_INSERT_IN_PROGRESS;
            /* inserted and then deleted by same xact */
            if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetUpdateXid(tuple)))
                //1-3-3. inserted and then deleted by the same transaction
                return HEAPTUPLE_DELETE_IN_PROGRESS;
            /* deleting subtransaction must have aborted */
            //default: the insert is still in progress
            return HEAPTUPLE_INSERT_IN_PROGRESS;
        }
        else if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmin(tuple)))
        {
            //1-4. the inserter is another transaction that is still in progress
            /*
             * It'd be possible to discern between INSERT/DELETE in progress
             * here by looking at xmax - but that doesn't seem beneficial for
             * the majority of callers and even detrimental for some. We'd
             * rather have callers look at/wait for xmin than xmax. It's
             * always correct to return INSERT_IN_PROGRESS because that's
             * what's happening from the view of other backends.
             */
            return HEAPTUPLE_INSERT_IN_PROGRESS;
        }
        else if (TransactionIdDidCommit(HeapTupleHeaderGetRawXmin(tuple)))
            //1-5. xmin did commit (according to clog)
            SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
                        HeapTupleHeaderGetRawXmin(tuple));
        else
        {
            //1-6. otherwise
            /*
             * Not in Progress, Not Committed, so either Aborted or crashed
             */
            //set the hint bit
            SetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
                        InvalidTransactionId);
            //report the tuple as dead
            return HEAPTUPLE_DEAD;
        }
        /*
         * At this point the xmin is known committed, but we might not have
         * been able to set the hint bit yet; so we can no longer Assert that
         * it's set.
         */
    }
    /*
     * Okay, the inserter committed, so it was good at some point.  Now what
     * about the deleting transaction?
     */
    if (tuple->t_infomask & HEAP_XMAX_INVALID)
        //------- 2. xmax is invalid: return LIVE
        return HEAPTUPLE_LIVE;
    if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))
    {
        //------- 3. lock only
        /*
         * "Deleting" xact really only locked it, so the tuple is live in any
         * case.  However, we should make sure that either XMAX_COMMITTED or
         * XMAX_INVALID gets set once the xact is gone, to reduce the costs of
         * examining the tuple for future xacts.
         */
        if (!(tuple->t_infomask & HEAP_XMAX_COMMITTED))
        {
            //3.1 xmax-committed hint bit not set
            if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
            {
                //3.1.1 multixact
                /*
                 * If it's a pre-pg_upgrade tuple, the multixact cannot
                 * possibly be running; otherwise have to check.
                 */
                if (!HEAP_LOCKED_UPGRADED(tuple->t_infomask) &&
                    MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple),
                                         true))
                    return HEAPTUPLE_LIVE;
                //no locker is still running: set the HEAP_XMAX_INVALID hint bit
                SetHintBits(tuple, buffer, HEAP_XMAX_INVALID, InvalidTransactionId);
            }
            else
            {
                //3.1.2 plain XID
                if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmax(tuple)))
                    //the locker is still in progress: return LIVE
                    return HEAPTUPLE_LIVE;
                //the locker has finished: set the HEAP_XMAX_INVALID hint bit
                SetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
                            InvalidTransactionId);
            }
        }
        /*
         * We don't really care whether xmax did commit, abort or crash. We
         * know that xmax did lock the tuple, but it did not and will never
         * actually update it.
         */
        //3.2 only locked: return LIVE
        return HEAPTUPLE_LIVE;
    }
    if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
    {
        //4. xmax is a MultiXactId
        //fetch the XID of the updating/deleting member
        TransactionId xmax = HeapTupleGetUpdateXid(tuple);
        /* already checked above */
        Assert(!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask));
        /* not LOCKED_ONLY, so it has to have an xmax */
        Assert(TransactionIdIsValid(xmax));
        if (TransactionIdIsInProgress(xmax))
            //4.1 the updater is in progress: return DELETE_IN_PROGRESS
            return HEAPTUPLE_DELETE_IN_PROGRESS;
        else if (TransactionIdDidCommit(xmax))
        {
            //4.2 the updater committed
            /*
             * The multixact might still be running due to lockers.  If the
             * updater is below the xid horizon, we have to return DEAD
             * regardless -- otherwise we could end up with a tuple where the
             * updater has to be removed due to the horizon, but is not pruned
             * away.  It's not a problem to prune that tuple, because any
             * remaining lockers will also be present in newer tuple versions.
             */
            if (!TransactionIdPrecedes(xmax, OldestXmin))
                //4.2.1 xmax >= OldestXmin: the delete is too recent,
                //return HEAPTUPLE_RECENTLY_DEAD
                return HEAPTUPLE_RECENTLY_DEAD;
            //4.2.2 xmax precedes OldestXmin: return DEAD
            return HEAPTUPLE_DEAD;
        }
        else if (!MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple), false))
        {
            /*
             * Not in Progress, Not Committed, so either Aborted or crashed.
             * Mark the Xmax as invalid.
             */
            //4.3 the updater aborted or crashed and the multixact is no longer running: mark xmax invalid
            SetHintBits(tuple, buffer, HEAP_XMAX_INVALID, InvalidTransactionId);
        }
        //4.4 return LIVE by default
        return HEAPTUPLE_LIVE;
    }
    if (!(tuple->t_infomask & HEAP_XMAX_COMMITTED))
    {
        //5. xmax-committed hint bit not set
        if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmax(tuple)))
            //5.1 the delete is in progress
            return HEAPTUPLE_DELETE_IN_PROGRESS;
        else if (TransactionIdDidCommit(HeapTupleHeaderGetRawXmax(tuple)))
            //5.2 clog shows the deleter committed: set the hint bit
            SetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED,
                        HeapTupleHeaderGetRawXmax(tuple));
        else
        {
            /*
             * Not in Progress, Not Committed, so either Aborted or crashed
             */
            //5.3 otherwise: mark xmax invalid
            SetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
                        InvalidTransactionId);
            //return LIVE
            return HEAPTUPLE_LIVE;
        }
        /*
         * At this point the xmax is known committed, but we might not have
         * been able to set the hint bit yet; so we can no longer Assert that
         * it's set.
         */
        //at this point xmax is known to have committed
    }
    /*
     * Deleter committed, but perhaps it was recent enough that some open
     * transactions could still see the tuple.
     */
    if (!TransactionIdPrecedes(HeapTupleHeaderGetRawXmax(tuple), OldestXmin))
        //6. xmax >= OldestXmin: recently deleted
        return HEAPTUPLE_RECENTLY_DEAD;
    /* Otherwise, it's dead and removable */
    //7. otherwise the tuple is DEAD
    return HEAPTUPLE_DEAD;
}
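
The final DEAD versus RECENTLY_DEAD decision hinges on TransactionIdPrecedes(xmax, OldestXmin). XIDs are 32-bit counters that wrap around, so the comparison cannot be a plain "<". The sketch below shows the modulo-2^32 comparison as I understand it from the PostgreSQL sources. The names Xid, FIRST_NORMAL_XID, xid_precedes and committed_delete_is_removable are stand-ins chosen for this example, and the cast to int32_t assumes the usual two's-complement behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;               /* stand-in for TransactionId */
#define FIRST_NORMAL_XID ((Xid) 3)  /* XIDs below this are reserved special values */

/* Sketch of a wraparound-aware "a is older than b" test. */
static bool
xid_precedes(Xid a, Xid b)
{
    /* Special (permanent) XIDs are compared as plain unsigned numbers. */
    if (a < FIRST_NORMAL_XID || b < FIRST_NORMAL_XID)
        return a < b;

    /* Normal XIDs: the signed difference decides which one is older. */
    return (int32_t) (a - b) < 0;
}

/*
 * Hypothetical helper mirroring steps 6 and 7: a tuple whose deleter
 * committed is removable only if xmax is older than the cutoff.
 */
static bool
committed_delete_is_removable(Xid xmax, Xid oldest_xmin)
{
    return xid_precedes(xmax, oldest_xmin);
}
```

Under this scheme an XID just below 2^32 counts as older than a small XID assigned after the counter wrapped, which is why the function above never compares xmax and OldestXmin directly.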

III. Trace Analysis

See PostgreSQL Source Code Walkthrough (118) - MVCC#3 (Tuple visibility checks)

IV. References

PG Source Code
PostgreSQL Source Code Walkthrough (118) - MVCC#3 (Tuple visibility checks)
PostgreSQL Source Code Walkthrough (130) - MVCC#14 (the vacuum process: the lazy_scan_heap function)

Source: ITPUB Blog, link: http://blog.itpub.net/6906/viewspace-2565402/. If you repost this article, please credit the source; otherwise legal action may be taken.

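About the author: long-time product developer and architect in the government and finance sectors, with in-depth experience in Oracle, PostgreSQL, and big-data technologies. Currently a system architect at 广州云图数据技术有限公司.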
