博客 - VF2++ 算法 - 科学 Python 博客

最后一篇讨论**VF2++ 辅助函数**的文章可以在这里找到这里。现在我们已经弄清楚了如何解决**VF2++**包含的所有子问题，我们准备将我们实现的功能结合起来，为**图同构**问题创建最终的求解器。

引言#

我们应该快速回顾一下 VF2++ 算法中使用的各个功能

**节点排序**，它找到访问节点的最佳顺序，以便将更有可能匹配的节点放在顺序中的第一个位置。这减少了首先进行不可行搜索的可能性。
**候选选择**，使得给定来自 $G_1$ 的节点 $u$，我们获得来自 $G_2$ 的候选节点 $v$。
**可行性规则** 引入易于检查的切割和一致性条件，如果来自 $G_1$ 的节点 $u$ 和来自 $G_2$ 的节点 $v$ 的候选节点对满足这些条件，则扩展映射。
**$T_i$ 更新**，如果将新节点对添加到映射中，则更新 $T_i$ 和 $\tilde{T}_i$，$i=1,2$ 参数，并在从映射中弹出节点对时恢复它们。

我们将使用所有这些功能来形成我们的**同构求解器**。

VF2++#

首先，让我们用简单的术语描述该算法，然后再介绍伪代码。该算法将如下所示

在调用实际求解器之前，检查是否满足所有**先决条件**。例如，没有必要检查两个节点数不同的图是否同构。
初始化所有必要的**参数**（$T_i$，$\tilde{T}_i$，$i=1,2$）并可能缓存一些稍后将使用的信息。
从排序中获取下一个未检查的节点 $u$。
找到它的候选对象，并检查是否存在一个候选对象 $v$ 使得节点对 $u-v$ 满足**可行性规则**
如果有，则扩展映射并**转到步骤3**。
如果没有，则从映射中弹出最后一个节点对 $\hat{u}-\hat{v}$，并尝试来自 $\hat{u}$ 的剩余候选对象中的一个不同的候选对象 $\hat{v}$
如果**映射节点**的数量等于两个图的节点数，则这两个图是**同构的**。
如果排序的第一个节点（根节点）没有剩余的候选对象，则这两个图是**不同构的**。

下面给出**VF2++**的正式代码。

# Check if there's a graph with no nodes in it
if G1.number_of_nodes() == 0 or G2.number_of_nodes() == 0:
    return False

# Check that both graphs have the same number of nodes and degree sequence
if not nx.faster_could_be_isomorphic(G1, G2):
    return False

# Initialize parameters (Ti/Ti_tilde, i=1,2) and cache necessary information about degree and labels
graph_params, state_params = _initialize_parameters(G1, G2, node_labels, default_label)

# Check if G1 and G2 have the same labels, and that number of nodes per label is equal between the two graphs
if not _precheck_label_properties(graph_params):
    return False

# Calculate the optimal node ordering
node_order = _matching_order(graph_params)

# Initialize the stack to contain node-candidates pairs
stack = []
candidates = iter(_find_candidates(node_order[0], graph_params, state_params))
stack.append((node_order[0], candidates))

mapping = state_params.mapping
reverse_mapping = state_params.reverse_mapping

# Index of the node from the order, currently being examined
matching_node = 1

while stack:
    current_node, candidate_nodes = stack[-1]

    try:
        candidate = next(candidate_nodes)
    except StopIteration:
        # If no remaining candidates, return to a previous state, and follow another branch
        stack.pop()
        matching_node -= 1
        if stack:
            # Pop the previously added u-v pair, and look for a different candidate _v for u
            popped_node1, _ = stack[-1]
            popped_node2 = mapping[popped_node1]
            mapping.pop(popped_node1)
            reverse_mapping.pop(popped_node2)
            _restore_Tinout(popped_node1, popped_node2, graph_params, state_params)
        continue

    if _feasibility(current_node, candidate, graph_params, state_params):
        # Terminate if mapping is extended to its full
        if len(mapping) == G2.number_of_nodes() - 1:
            cp_mapping = mapping.copy()
            cp_mapping[current_node] = candidate
            yield cp_mapping
            continue

        # Feasibility rules pass, so extend the mapping and update the parameters
        mapping[current_node] = candidate
        reverse_mapping[candidate] = current_node
        _update_Tinout(current_node, candidate, graph_params, state_params)
        # Append the next node and its candidates to the stack
        candidates = iter(
            _find_candidates(node_order[matching_node], graph_params, state_params)
        )
        stack.append((node_order[matching_node], candidates))
        matching_node += 1

性能#

本节专门用于**VF2**和**VF2++**之间的性能比较。比较是在**随机图**（无标签）上进行的，节点数在 $(100-2000)$ 范围内。结果显示在以下两个图表中。

我们注意到，实现的最大加速比为**14倍**，并且随着节点数的增加而继续增加。同样非常突出的是，与对**VF2**性能的巨大影响相比，节点数的增加似乎对**VF2++**的性能没有显著影响。我们的结果与原始**VF2++ 论文**中提出的结果几乎相同，验证了文献的理论分析和前提。

优化#

实现的提升是由于一些关键改进和优化，具体来说：

**最佳节点排序**，避免遵循会导致不可行状态的毫无结果的分支。我们确保首先访问最有可能匹配的节点。
**以非递归方式实现**，避免 Python 的最大递归限制，同时减少函数调用的开销。
一开始就**缓存**节点度和每个度的节点，这样我们就不必在每次度检查中都访问这些特征。例如，而不是执行

res = []
for node in G2.nodes():
    if G1.degree[u] == G2.degree[node]:
        res.append(node)
# do stuff with res ...

获取与 u 度数相同的节点（在实现中发生很多次），我们只需执行

res = G2_nodes_of_degree[G1.degree[u]]
# do stuff with res ...

其中“G2_nodes_of_degree”存储给定度数的节点集。节点标签也是如此。

通过在候选选择方法中添加更多检查并从可行性检查中删除一些检查，**进一步缩减每个节点的候选集**。简单来说，与其在更大的候选集上检查很多条件，不如在更目标化且更小的候选集上检查更少的条件。例如，在此代码中

candidates = set(G2.nodes())
for candidate in candidates:
    if feasibility(u, candidate):
        do_stuff()

我们获取一组庞大的候选对象，由于最大化“可行性”的调用次数，导致性能不佳，从而在非常大的集合中执行可行性检查。现在将其与以下替代方案进行比较

candidates = [
    n
    for n in G2_nodes_of_degree[G1.degree[u]].intersection(
        G2_nodes_of_label[G1_labels[u]]
    )
]
for candidate in candidates:
    if feasibility(u, candidate):
        do_stuff()

我们立即大大减少了执行的检查次数和对函数的调用次数，因为现在我们仅将它们应用于与 $u$ 度数和标签相同的节点。这只是一个为了演示目的的简化。在实际实现中，有更多的检查和候选集的额外缩减。

演示#

让我们在实际图上演示我们的**VF2++**求解器。我们将使用图同构维基百科中的图。

让我们从上面图像中构建图开始。我们将左侧的图称为G，右侧的图称为H

import networkx as nx

G = nx.Graph(
    [
        ("a", "g"),
        ("a", "h"),
        ("a", "i"),
        ("g", "b"),
        ("g", "c"),
        ("b", "h"),
        ("b", "j"),
        ("h", "d"),
        ("c", "i"),
        ("c", "j"),
        ("i", "d"),
        ("d", "j"),
    ]
)

H = nx.Graph(
    [
        (1, 2),
        (1, 5),
        (1, 4),
        (2, 6),
        (2, 3),
        (3, 7),
        (3, 4),
        (4, 8),
        (5, 6),
        (5, 8),
        (6, 7),
        (7, 8),
    ]
)

使用 VF2++ 而不考虑标签#

res = nx.vf2pp_is_isomorphic(G, H, node_label=None)
# res: True

res = nx.vf2pp_isomorphism(G, H, node_label=None)
# res: {1: "a", 2: "h", 3: "d", 4: "i", 5: "g", 6: "b", 7: "j", 8: "c"}

res = list(nx.vf2pp_all_isomorphisms(G, H, node_label=None))
# res: all isomorphic mappings (there might be more than one). This function is a generator.

使用 VF2++ 考虑标签#

# Assign some label to each node
G_node_attributes = {
    "a": "blue",
    "g": "green",
    "b": "pink",
    "h": "red",
    "c": "yellow",
    "i": "orange",
    "d": "cyan",
    "j": "purple",
}

nx.set_node_attributes(G, G_node_attributes, name="color")

H_node_attributes = {
    1: "blue",
    2: "red",
    3: "cyan",
    4: "orange",
    5: "green",
    6: "pink",
    7: "purple",
    8: "yellow",
}

nx.set_node_attributes(H, H_node_attributes, name="color")

res = nx.vf2pp_is_isomorphic(G, H, node_label="color")
# res: True

res = nx.vf2pp_isomorphism(G, H, node_label="color")
# res: {1: "a", 2: "h", 3: "d", 4: "i", 5: "g", 6: "b", 7: "j", 8: "c"}

res = list(nx.vf2pp_all_isomorphisms(G, H, node_label="color"))
# res: {1: "a", 2: "h", 3: "d", 4: "i", 5: "g", 6: "b", 7: "j", 8: "c"}

请注意，在第一种情况下，我们的求解器每次可能返回不同的映射，因为标签的缺失导致节点可以映射到多个其他节点。例如，节点 1 可以映射到 a 和 h，因为图是对称的。但在第二种情况下，每个节点只有一个唯一标签的存在规定每个节点只有一个匹配，因此返回的映射是确定性的。这很容易从list(nx.vf2pp_all_isomorphisms)的输出中观察到，在第一种情况下，它返回所有可能的映射，而在后者中，它返回一个唯一的同构映射。