博客 - NetworkX 函数存根 - 科学 Python 博客

现在我的提案已经被 NetworkX 接受，用于 2021 年的 Google Summer of Code (GSoC)，我可以更深入地了解如何在 NetworkX 中实现 Asadpour 算法的技术细节。

在这篇文章中，我将概述我对实现控制方案的思路，并根据我的 GSoC 提案创建函数存根。该项目的大部分工作将在 netowrkx.algorithms.approximation.traveling_salesman.py 中进行，在那里我将完成旅行商问题最后一个算法，以便将其合并到项目中。traveling_salesman.py 中的主要函数是

def traveling_salesman_problem(G, weight="weight", nodes=None, cycle=True, method=None):
    """
    ...

    Parameters
    ----------
    G : NetworkX graph
        Undirected possibly weighted graph

    nodes : collection of nodes (default=G.nodes)
        collection (list, set, etc.) of nodes to visit

    weight : string, optional (default="weight")
        Edge data key corresponding to the edge weight.
        If any edge does not have this attribute the weight is set to 1.

    cycle : bool (default: True)
        Indicates whether a cycle should be returned, or a path.
        Note: the cycle is the approximate minimal cycle.
        The path simply removes the biggest edge in that cycle.

    method : function (default: None)
        A function that returns a cycle on all nodes and approximates
        the solution to the traveling salesman problem on a complete
        graph. The returned cycle is then used to find a corresponding
        solution on `G`. `method` should be callable; take inputs
        `G`, and `weight`; and return a list of nodes along the cycle.

        Provided options include :func:`christofides`, :func:`greedy_tsp`,
        :func:`simulated_annealing_tsp` and :func:`threshold_accepting_tsp`.

        If `method is None`: use :func:`christofides` for undirected `G` and
        :func:`threshold_accepting_tsp` for directed `G`.

        To specify parameters for these provided functions, construct lambda
        functions that state the specific value. `method` must have 2 inputs.
        (See examples).

    ...
    """

所有用户调用以查找旅行商问题的近似解都将通过此函数进行。我实现的 Asadpour 算法也需要与该函数兼容。traveling_salesman_problem 将处理使用节点 $u$ 和 $v$ 之间最短路径的权重作为该弧的权重来创建一个新的完整图，因此我们知道，当图传递到 Asadpour 算法时，它是一个完整的有向图，满足三角形不等式。该主函数还通过仅将必要的节点复制到完整的有向图中（在调用请求的方法之前），并在之后搜索并删除返回循环中最大的弧，来处理 nodes 和 cycles 参数。因此，Asadpour 算法的父函数只需要处理图本身以及图中弧的权重或成本。

我的控制函数将具有以下签名，我还包括了文档字符串的草稿。

def asadpour_tsp(G, weight="weight"):
    """
    Returns an O( log n / log log n ) approximate solution to the traveling
    salesman problem.

    This approximate solution is one of the best known approximations for
    the asymmetric traveling salesman problem developed by Asadpour et al,
    [1]_. The algorithm first solves the Held-Karp relaxation to find a
    lower bound for the weight of the cycle. Next, it constructs an
    exponential distribution of undirected spanning trees where the
    probability of an edge being in the tree corresponds to the weight of
    that edge using a maximum entropy rounding scheme. Next we sample that
    distribution $2 \\\\\\log n$ times and saves the minimum sampled tree once
    the direction of the arcs is added back to the edges. Finally,
    we argument then short circuit that graph to find the approximate tour
    for the salesman.

    Parameters
    ----------
    G : nx.DiGraph
        The graph should be a complete weighted directed graph.
        The distance between all pairs of nodes should be included.

    weight : string, optional (default="weight")
        Edge data key corresponding to the edge weight.
        If any edge does not have this attribute the weight is set to 1.

    Returns
    -------
    cycle : list of nodes
        Returns the cycle (list of nodes) that a salesman can follow to minimize
        the total weight of the trip.

    Raises
    ------
    NetworkXError
        If `G` is not complete, the algorithm raises an exception.

    References
    ----------
    .. [1] A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi,
       An o(log n/log log n)-approximation algorithm for the asymmetric
       traveling salesman problem, Operations research, 65 (2017),
       pp. 1043–1061
    """
    pass

根据我的 GSoC 提案，下一个函数是 held_karp，它将使用椭球法在完整的有向图上求解 Held-Karp 松弛（请参阅我的最后两篇文章这里和这里，了解我对为什么以及如何实现这一点的想法）。求解 Held-Karp 松弛是该算法的第一步。

回想一下，Held-Karp 松弛定义为以下线性规划

$$ \begin{array}{c l l} \text{min} & \sum_{a} c(a)x_a \\\ \text{s.t.} & x(\delta^+(U)) \geqslant 1 & \forall\ U \subset V \text{ and } U \not= \emptyset \\\ & x(\delta^+(v)) = x(\delta^-(v)) = 1 & \forall\ v \in V \\\ & x_a \geqslant 0 & \forall\ a \end{array} $$

并且它是一个半无限规划，因此它太大而无法以传统形式求解。该算法使用 Held-Karp 松弛的解来创建一个向量 $z^*$，该向量是对真实 Held-Karp 解 $x^*$ 进行对称化和略微缩小的版本。$z^*$ 定义为

$$ z^*_{{u, v}} = \frac{n - 1}{n} \left(x^*_{uv} + x^*_{vu}\right) $$

并且由于这是该算法用于构建近似解其余部分的内容，因此这应该是 held_karp 的返回值之一。我还将返回 $x^*$ 的成本值，在 Asadpour 论文 [1] 中记为 $c(x^*)$ 或 $OPT_{HK}$。

此外，分离预言机将被定义为 held_karp 内部的内部函数。目前，我不确定分离预言机 sep_oracle 的确切参数，但它应该是算法想要测试的点，并且需要访问算法正在松弛的图。特别是，我现在还不确定我将如何表示由分离预言机返回的超平面。

def _held_karp(G, weight="weight"):
    """
    Solves the Held-Karp relaxation of the input complete digraph and scales
    the output solution for use in the Asadpour [1]_ ASTP algorithm.

    The Held-Karp relaxation defines the lower bound for solutions to the
    ATSP, although it does return a fractional solution. This is used in the
    Asadpour algorithm as an initial solution which is later rounded to a
    integral tree within the spanning tree polytopes. This function solves
    the relaxation with the ellipsoid method for linear programs.

    Parameters
    ----------
    G : nx.DiGraph
        The graph should be a complete weighted directed graph.
        The distance between all paris of nodes should be included.

    weight : string, optional (default="weight")
        Edge data key corresponding to the edge weight.
        If any edge does not have this attribute the weight is set to 1.

    Returns
    -------
    OPT : float
        The cost for the optimal solution to the Held-Karp relaxation
    z_star : numpy array
        A symmetrized and scaled version of the optimal solution to the
        Held-Karp relaxation for use in the Asadpour algorithm

    References
    ----------
    .. [1] A. Asadpour, M. X. Goemans, A. Madry, S. O. Gharan, and A. Saberi,
       An o(log n/log log n)-approximation algorithm for the asymmetric
       traveling salesman problem, Operations research, 65 (2017),
       pp. 1043–1061
    """

    def sep_oracle(point):
        """
        The separation oracle used in the ellipsoid algorithm to solve the
        Held-Karp relaxation.

        This 'black-box' takes a point and check to see if it violates any
        of the Held-Karp constraints, which are defined as

            - The out-degree of all non-empty subsets of $V$ is at lest one.
            - The in-degree and out-degree of each vertex in $V$ is equal to
              one. Note that if a vertex has more than one incoming or
              outgoing arcs the values of each could be less than one so long
              as they sum to one.
            - The current value for each arc is greater
              than zero.

        Parameters
        ----------
        point : numpy array
            The point in n dimensional space we will to test to see if it
            violations any of the Held-Karp constraints.

        Returns
        -------
        numpy array
            The hyperplane which was the most violated by `point`, i.e the
            hyperplane defining the polytope of spanning trees which `point`
            was farthest from, None if no constraints are violated.
        """
        pass

    pass

接下来，该算法使用 Held-Karp 解的对称化和缩小版本来构建一个无向生成树的指数分布，该分布保留了边际概率。

def _spanning_tree_distribution(z_star):
    """
    Solves the Maximum Entropy Convex Program in the Asadpour algorithm [1]_
    using the approach in section 7 to build an exponential distribution of
    undirected spanning trees.

    This algorithm ensures that the probability of any edge in a spanning
    tree is proportional to the sum of the probabilities of the trees
    containing that edge over the sum of the probabilities of all spanning
    trees of the graph.

    Parameters
    ----------
    z_star : numpy array
        The output of `_held_karp()`, a scaled version of the Held-Karp
        solution.

    Returns
    -------
    gamma : numpy array
        The probability distribution which approximately preserves the marginal
        probabilities of `z_star`.
    """
    pass

现在该算法已经有了生成树的分布，我们需要对其进行采样。每个采样的树都是一个 $\lambda$ 随机树，可以使用 [2] 中的算法 A8 进行采样。

def _sample_spanning_tree(G, gamma):
    """
    Sample one spanning tree from the distribution defined by `gamma`,
    roughly using algorithm A8 in [1]_ .

    We 'shuffle' the edges in the graph, and then probabilistically
    determine whether to add the edge conditioned on all of the previous
    edges which were added to the tree. Probabilities are calculated using
    Kirchhoff's Matrix Tree Theorem and a weighted Laplacian matrix.

    Parameters
    ----------
    G : nx.Graph
        An undirected version of the original graph.

    gamma : numpy array
        The probabilities associated with each of the edges in the undirected
        graph `G`.

    Returns
    -------
    nx.Graph
        A spanning tree using the distribution defined by `gamma`.

    References
    ----------
    .. [1] V. Kulkarni, Generating random combinatorial objects, Journal of
       algorithms, 11 (1990), pp. 185–207
    """
    pass

此时，只剩下一个函数需要讨论，那就是 laplacian_matrix。该函数已经在 NetworkX 中的 networkx.linalg.laplacianmatrix.laplacian_matrix 中存在，即使实现起来相对简单，我宁愿使用现有的版本，而不是在项目中创建重复代码。更深入地了解函数签名会发现

@not_implemented_for("directed")
def laplacian_matrix(G, nodelist=None, weight="weight"):
    """Returns the Laplacian matrix of G.

    The graph Laplacian is the matrix L = D - A, where
    A is the adjacency matrix and D is the diagonal matrix of node degrees.

    Parameters
    ----------
    G : graph
       A NetworkX graph

    nodelist : list, optional
       The rows and columns are ordered according to the nodes in nodelist.
       If nodelist is None, then the ordering is produced by G.nodes().

    weight : string or None, optional (default='weight')
       The edge data key used to compute each value in the matrix.
       If None, then each edge has weight 1.

    Returns
    -------
    L : SciPy sparse matrix
      The Laplacian matrix of G.

    Notes
    -----
    For MultiGraph/MultiDiGraph, the edges weights are summed.

    See Also
    --------
    to_numpy_array
    normalized_laplacian_matrix
    laplacian_spectrum
    """

这正是我需要的，除了装饰器指出它不支持有向图，而该算法处理的是这些类型的图。幸运的是，我们的生成树分布是针对有向图中的树（一旦方向被忽略），因此我们实际上可以使用现有的函数。Asadpour 论文 [1] 中给出的定义是

$$ L_{i,j} = \left\{ \begin{array}{l l} -\lambda_e & e = (i, j) \in E \\\ \sum_{e \in \delta({i})} \lambda_e & i = j \\\ 0 & \text{otherwise} \end{array} \right. $$

其中 $E$ 定义为“令 $E$ 为当忽略弧的方向时 $z^*$ 图的支撑”，参见 Asadpour 论文的第 5 页。因此，我可以使用现有的方法，而无需创建新的方法，这将节省这个 GSoC 项目的时间和精力。

除了在这里讨论之外，这些函数存根也已添加到我在 NetworkX 中的 fork 上的 bothTSP 分支中。该提交，添加了 Asadpour 算法的函数存根和文档字符串草稿 可以使用该链接在我的 GitHub 上看到。

参考文献#

[1] A. Asadpour, M. X. Goemans, A. Mardry, S. O. Ghran 和 A. Saberi，非对称旅行商问题的 o(log n / log log n) 近似算法，运筹学，65 (2017)，第 1043-1061 页，https://homes.cs.washington.edu/~shayan/atsp.pdf.

[2] V. Kulkarni，生成随机组合对象，算法杂志，11 (1990)，第 185–207 页