Skip to content


重新审视力导向图布局:一种基于 t 分布的新力

Abstract

In this paper, we propose the t-FDP model, a force-directed placement method based on a novel bounded short-range force (t-force) defined by Student's t-distribution. Our formulation is flexible, exerts limited repulsive forces for nearby nodes and can be adapted separately in its short- and long-range effects. Using such forces in force-directed graph layouts yields better neighborhood preservation than current methods, while maintaining low stress errors. Our efficient implementation using a Fast Fourier Transform is one order of magnitude faster than state-of-the-art methods and two orders faster on the GPU, enabling us to perform parameter tuning by globally and locally adjusting the t-force in real-time for complex graphs. We demonstrate the quality of our approach by numerical evaluation against state-of-the-art approaches and extensions for interactive exploration.

本文提出 t-FDP 模型,这是一种基于 Student t 分布定义的新型有界短程力(t-force)的力导向布局方法。 该公式具有灵活性:它会限制近邻节点之间的排斥力,并且可以分别调节短程效应和长程效应。 将这类力用于力导向图布局,可以在保持较低 stress error 的同时,比现有方法更好地保留邻域结构。 作者基于快速傅里叶变换实现了高效算法:CPU 上比现有先进方法快一个数量级,在 GPU 上快两个数量级,因此可以在复杂图上实时地通过全局或局部调整 t-force 来调参。 作者通过与先进方法的定量比较,以及面向交互式探索的扩展实验,展示了该方法的质量。

1. Introduction

t-FDP teaser comparison
图1:t-FDP 与标准 FDP 和 tsNET 在带标签合作者网络上的对比。底部小提琴图总结了图节点在布局空间中的归一化欧氏距离。t-FDP 既能让图空间中相近的点在布局中保持较短距离,也能让远距离点分离,因此八个类别分得更清楚。

Graphs are a ubiquitous form of data, whenever objects and their relations have to be represented; lots of effort has been spent on finding good ways to visualize them. Arguably one of the most prevalent representations for graphs are node-link diagrams, which are intuitive and efficient in supporting various graph analysis tasks.

只要需要表示对象及其关系,图就是一种无处不在的数据形式;因此,人们投入了大量精力来寻找有效的图可视化方式。 节点-连线图可以说是最常见的图表示之一,它直观,并且能够高效支持多种图分析任务。

Despite the existence of many other techniques, force-directed placement (FDP) methods have gained a lot of attention for automatically layouting node-link diagrams. Here, a graph is modeled as a physical system of bodies with forces acting between them, the layout is computed by minimizing the overall energy of the system. While some variants of forces have been proposed, the spring-electric model is the most popular method. It assigns spring-like attractive forces between connected nodes and repulsive electrical forces between all node pairs to keep them at a distance. For most graphs, the method creates layouts with compelling performance, especially when applying its multi-level version, the scalable force-directed placement (SFDP). Since it is intuitive and easy to implement, the method is a key component in many visualization systems.

尽管还存在许多其他技术,力导向布局(FDP)方法仍然因为能够自动布局节点-连线图而受到大量关注。 在这类方法中,图被建模为一个由物体及其相互作用力组成的物理系统,布局则通过最小化系统总能量来计算。 虽然已有许多力的变体被提出,但弹簧-电荷模型仍是最流行的方法。 它在相连节点之间设置类似弹簧的吸引力,并在所有节点对之间设置电荷排斥力,以保持节点间距。 对于大多数图,这种方法可以得到表现良好的布局,尤其是在使用其多层版本 SFDP 时。 由于直观且易于实现,该方法已经成为许多可视化系统中的关键组件。

A typical problem of spring-electric models is that large repulsive electric forces are exerted on each nearby pair of connected nodes, but only small attractive spring forces. In a physical simulation this is reasonable since repelling contact forces for real bodies should be extremely large. However, such forces might destroy graph structures by preventing connected nodes in the graph from being neighbors in the layout (see Figure 1(a)). Although recently proposed graph layout methods such as tsNET allow to better preserve 2-ring neighborhoods, they fail in efficiently maintaining low stress errors and 1-ring neighborhoods. DRGraph improves the scalability of tsNET but has a similar issue with yielding large stress errors.

弹簧-电荷模型的一个典型问题是:对每一对距离很近的相连节点,模型会施加很大的电荷排斥力,却只有较小的弹簧吸引力。 在真实物理仿真中,这样做是合理的,因为真实物体之间的接触排斥力本来就应当非常大。 但在图布局中,这类力可能会破坏图结构,使图中相连的节点无法在布局中成为邻居(见 图1(a))。 虽然 tsNET 等较新的图布局方法能够更好地保留二阶邻域,但它们难以同时高效保持低 stress error 和一阶邻域。 DRGraph 提升了 tsNET 的可扩展性,但同样容易产生较大的 stress error。

In this paper, we revisit the spring-electric model and propose a new heavily-tailed force based on the t-distribution (referred to as t-force), for better maintaining local graph structures while keeping the intuitiveness and simplicity of the model. Deviating from traditional spring-electric models gives us the flexibility to freely explore the interplay between attractive and repulsive forces. This is achieved by dividing forces into their short- and long-range components and improving their short-range aspects. The t-force has an upper bound at short-range and behaves similar to the electrical repulsive force at long-ranges. The attractive force at short-range is kept unaltered. By using t-forces to define the repulsive force within FDP and combining t-forces with spring forces as the attractive forces, we are able to compose a new t-force-based FDP model (t-FDP for short), which allows us at the same time better preserving graph neighborhoods, distances of nodes and clusters (see Figure 1(c)).

本文重新审视弹簧-电荷模型,并提出一种基于 t 分布的重尾新力,称为 t-force,用于在保持模型直观性和简洁性的同时,更好地维护图的局部结构。 偏离传统弹簧-电荷模型,使作者能够更自由地探索吸引力和排斥力之间的相互作用。 具体做法是把力拆分为短程分量和长程分量,并重点改进短程部分。 t-force 在短程范围内有上界,而在长程范围内表现得类似电荷排斥力。 短程吸引力则保持不变。 通过使用 t-force 定义 FDP 中的排斥力,并将 t-force 与弹簧力组合为吸引力,作者构造出新的基于 t-force 的 FDP 模型,即 t-FDP,从而同时更好地保留图邻域、节点距离和簇结构(见 图1(c))。

Being one of two aspects of the t-SNE gradient, the t-force is similar to the repulsive force of the t-SNE model, a widely used method for dimensionality reduction. This allows us to employ an interpolation-based acceleration strategy for t-SNE using the fast Fourier transformation which can be implemented on the GPU. Due to the parallel nature of the FFT, our whole model can be efficiently implemented on GPUs and this allows us to perform interactive parameter optimization. To enable a wide range of users to use our method, we provide an implementation as a drop-in force for the ``d3-force'' library.

t-force 对应于 t-SNE 梯度中的一个方面,因此与广泛用于降维的 t-SNE 模型中的排斥力相似。 这使作者能够借用 t-SNE 的基于插值的 FFT 加速策略,而且该策略可以在 GPU 上实现。 由于 FFT 天然具有并行性,整个模型可以在 GPU 上高效实现,从而支持交互式参数优化。 为了让更多用户使用该方法,作者还提供了一个可作为 ``d3-force'' 库中 drop-in force 的实现。

We evaluated our approach using 50 graphs with node numbers ranging from 72 to 4 million and edge numbers ranging from 75 to 100 million. We quantitatively measured the quality of our results using various structural and readability metrics. For the tested data, our method performs similarly or even outperforms all existing methods in all metrics except the stress error, where our method is the second-best behind the stress model. Regarding performance, the CPU version of t-FDP is one order of magnitude faster than SFDP and DRGraph on a desktop computer with Intel i7-8700 CPU, while reducing the memory size considerably. Our GPU-based CUDA implementation is able to further improve performance by another order of magnitude.

作者使用 50 个图评估该方法,节点数量从 72 到 400 万,边数量从 75 到 1 亿不等。 他们使用多种结构性和可读性指标,对结果质量进行定量测量。 在测试数据上,除了 stress error 指标中该方法仅次于 stress model 之外,t-FDP 在所有指标上都与现有方法相当甚至更优。 在性能方面,在 Intel i7-8700 桌面 CPU 上,t-FDP 的 CPU 版本比 SFDP 和 DRGraph 快一个数量级,同时显著减少内存占用。 基于 GPU 的 CUDA 实现还能进一步提升一个数量级的性能。

In addition, we present two extensions of t-FDP for better exploring graph structures. First, we show how we reveal structures at different scales by re-applying our force to existing layouts with repulsive forces of different magnitudes. Second, we allow users to explore nodes of interest with fisheye-like visualizations by interactively changing force parameters. Since our method is very fast, interactivity can be maintained even for larger graphs.

此外,作者还提出 t-FDP 的两个扩展,用于更好地探索图结构。 首先,他们展示如何在已有布局上重新施加不同强度的排斥力,从而揭示不同尺度下的结构。 其次,他们允许用户通过交互式调整力参数,以类似鱼眼视图的方式探索感兴趣节点。 由于该方法速度很快,即便面对较大的图,也可以保持交互性。

In summary, our main contributions are as follows:

  • we revisit the spring-electric model and show that it does not properly capture local neighborhoods and cluster structures in graphs;
  • we propose a new short-range force based on the t-distribution and use it to compose a new FDP model that overcomes drawbacks of existing FDP models in neighborhood preservation; and
  • we provide an FFT-based fast implementation and an interactive demo of our method, and demonstrate its effectiveness through a quantitative evaluation and extensions.

总的来说,本文主要贡献如下:

  • 作者重新审视弹簧-电荷模型,并说明它不能充分捕获图中的局部邻域和簇结构;
  • 作者提出一种基于 t 分布的新短程力,并用它构造新的 FDP 模型,以克服现有 FDP 模型在邻域保持方面的缺陷;
  • 作者提供了基于 FFT 的快速实现和交互式 demo,并通过定量评估与扩展应用展示了方法的有效性。

Related work falls into two categories: force-directed graph layouts and the t-distribution and its applications in graph layout.

相关工作分为两类:力导向图布局,以及 t 分布及其在图布局中的应用。

2.1. Force-directed Graph Layouts

Since their first proposition by Tutte, many variants of force-directed layouts have been developed. Among them, spring-electric and stress models are two most popular categories.

自 Tutte 首次提出以来,力导向布局已经发展出许多变体。 其中,弹簧-电荷模型和 stress model 是最流行的两类。

By representing nodes as ring-like joints and edges as springs, the spring-electric model uses attractive forces to pull adjacent nodes together and repulsive forces between all nodes to repel them. When the energy of the system reaches a minimum, it finds the optimal layout. To avoid strong long-range forces, Eades introduced logarithmic spring forces and repulsive forces inverse to the squared distance. To produce layouts with uniform edge lengths, Fruchterman and Reingold redefine attractive forces proportional to the squared distance and repulsive forces reciprocal to the distance. Hu et al. model repulsive forces as power functions of the form f(d)=dp with p being any positive integer and find that forces inversely related to the squared distance work well for most graphs. Noack generalizes this idea to a LinLog energy model which uses constant attractive forces and repulsive forces reciprocal to the distance, yielding well-separated clusters for their tested graphs.

弹簧-电荷模型把节点表示成环状接头,把边表示成弹簧,并使用吸引力拉近相邻节点,同时使用所有节点之间的排斥力将它们推开。 当系统能量达到最小时,模型就得到最优布局。 为了避免过强的长程力,Eades 引入了对数弹簧力和与距离平方成反比的排斥力。 为了生成边长均匀的布局,Fruchterman 和 Reingold 将吸引力重新定义为与距离平方成正比,将排斥力定义为与距离成反比。 Hu 等人把排斥力建模为 f(d)=dp 形式的幂函数,其中 p 为任意正整数,并发现与距离平方成反比的力对大多数图都效果良好。 Noack 将这一思想推广为 LinLog 能量模型,该模型使用常数吸引力和与距离成反比的排斥力,在测试图上能得到分离良好的簇。

As a compromise between LinLog and FR model, ForceAtlas2 sets attractive forces proportional and repulsive forces reciprocal to the distance for better showing local neighborhoods and cluster structures. However, such a configuration might work poorly for complex graphs, because strong short-range repulsive forces often let the corresponding optimization be trapped in local minima. To alleviate this issue, Kumar et al. introduce a heuristic method that imposes an upper bound on the repulsive forces as a stopping condition for visualizing directed acyclic graphs. In contrast, short-range t-forces in our t-FDP model allow us to largely alleviate this issue for general graphs.

作为 LinLog 和 FR 模型之间的折中,ForceAtlas2 将吸引力设为与距离成正比,将排斥力设为与距离成反比,以便更好地展示局部邻域和簇结构。 然而,对于复杂图,这种配置可能效果较差,因为强短程排斥力常常会使优化过程陷入局部极小值。 为缓解这一问题,Kumar 等人提出一种启发式方法,将排斥力上界作为可视化有向无环图时的停止条件。 相比之下,t-FDP 模型中的短程 t-force 可以在一般图上大幅缓解这一问题。

To visualize large graphs, various multilevel approaches have been employed for improving scalability. For example, the Barnes-Hut technique and the fast multipole method are often used for approximating long-range repulsive forces, having an overall time complexity of O(nlogn) for n nodes. Since the Barnes-Hut approximation is easy to implement, it is adopted by many FDP algorithms such as SFDP and ForceAtlas2. Random vertex sampling was proposed by Gove for computing repulsive forces. It generates similar layouts as Barnes-Hut and FMM, but with a time complexity of O(n). Based on FFT acceleration, our t-FDP model also shows a runtime of O(n), but the resulting layouts maintain lower stress errors.

为了可视化大规模图,人们使用了多种多层方法来提升可扩展性。 例如,Barnes-Hut 技术和快速多极子方法常用于近似长程排斥力,对于 n 个节点整体时间复杂度为 O(nlogn) 由于 Barnes-Hut 近似容易实现,它被许多 FDP 算法采用,例如 SFDP 和 ForceAtlas2。 Gove 提出随机顶点采样来计算排斥力。 它能生成与 Barnes-Hut 和 FMM 类似的布局,但时间复杂度为 O(n) 基于 FFT 加速,t-FDP 模型同样具有 O(n) 的运行时间,并且所得布局保持更低的 stress error。

Alternatively, stress models, originally proposed by Kamada and Kawai, associate a spring between each pair of nodes with an ideal length proportional to the length of the shortest path between them. In doing so, this model can yield a layout with a much better global structure than a spring-electric model. However, stress models pay more attention to distant node pairs, resulting in typically poor preservation of local structures. To preserve such local structures better, local versions of stress models have been proposed. The maxent-stress model applies stress only on edges within a specified length, but electric repulsion forces to all nodes. In contrast to this, t-FDP aims at improving spring-electric models by better balancing the representation of local and global structures. Our experimental results show that it performs better than the maxent-stress model for representing global structures while being significantly faster.

另一类方法是 Kamada 和 Kawai 最早提出的 stress model,它在每一对节点之间关联一根弹簧,并使理想长度与二者最短路径长度成正比。 因此,该模型通常能比弹簧-电荷模型得到更好的全局结构。 不过,stress model 更关注远距离节点对,因此往往难以保留局部结构。 为了更好地保留局部结构,人们提出了局部版本的 stress model。 maxent-stress 模型只在指定长度范围内的边上施加 stress,但对所有节点施加电荷排斥力。 与之不同,t-FDP 旨在通过更好地平衡局部结构和全局结构的表示来改进弹簧-电荷模型。 实验结果表明,在表示全局结构方面,t-FDP 优于 maxent-stress 模型,并且速度显著更快。

2.2. t-Distribution and its Applications for Graph Layout

The t-distribution is a symmetric bell-shaped distribution with heavier tails compared with the Gaussian distribution. The function is defined by:

t 分布是一种对称钟形分布,与高斯分布相比具有更重的尾部。 其函数定义为:

f(x)=(1+x22v1)v,

where v is the degree of freedom. With increasing v the distribution becomes closer to the Gaussian distribution. By employing such a distribution to model the similarity between two points in the embedding space, the neighborhood embedding t-SNE not only greatly alleviates the crowding problem of dimensionality reduction but also efficiently preserves local structures.

其中 v 是自由度。 随着 v 增大,该分布会逐渐接近高斯分布。 通过使用这种分布来建模嵌入空间中两个点之间的相似性,邻域嵌入方法 t-SNE 不仅大幅缓解了降维中的 crowding problem,也能有效保留局部结构。

A number of t-distribution based DR methods have been proposed, such as LargeVis, UMAP and TriMap for preserving more global structures. On the other hand, acceleration strategies were developed for improving the scalability of t-SNE, such as the Barnes-Hut approximation, GPU-based texture splatting and interpolation-based FFT. Among them, the interpolation-based FFT method can achieve a near-linear complexity by using the fast Fourier transform to approximate the repulsive force, which is one term of the negative gradient.

已有许多基于 t 分布的降维方法被提出,例如用于更好保留全局结构的 LargeVis、UMAP 和 TriMap。 另一方面,也有多种加速策略用于提升 t-SNE 的可扩展性,例如 Barnes-Hut 近似、基于 GPU 的 texture splatting,以及基于插值的 FFT。 其中,基于插值的 FFT 方法通过使用快速傅里叶变换近似负梯度中的排斥力项,可以达到近似线性的复杂度。

Recently, t-distribution based DR methods have been leveraged for graph layout. A representative approach is tsNET, which utilizes a customized t-SNE for capturing local structures. Compared to the stress model based on multidimensional scaling MDS, tsNET can generate high-quality layouts for a wider variety of graphs. To handle large graphs, Zhu et al. propose DRGraph that further improves the scalability of tsNET by using negative sampling for gradient estimation. In a similar spirit, we use the t-distribution to define the bounded short-range force in our t-FDP model. Furthermore, this formation enables us to employ the FFT for improving the scalability, which is about one order of magnitude faster than DRGraph on the CPU and two orders faster on the GPU.

近来,基于 t 分布的降维方法也被用于图布局。 一个代表性方法是 tsNET,它使用定制化 t-SNE 来捕获局部结构。 与基于多维尺度分析 MDS 的 stress model 相比,tsNET 能为更多类型的图生成高质量布局。 为了处理大图,Zhu 等人提出 DRGraph,通过负采样进行梯度估计,进一步提高 tsNET 的可扩展性。 出于类似思路,作者在 t-FDP 模型中使用 t 分布来定义有界短程力。 此外,这种形式使作者能够使用 FFT 提升可扩展性:CPU 上约比 DRGraph 快一个数量级,GPU 上快两个数量级。

3. Method

Given an undirected graph with n nodes and m edges, the main goal of graph layout is to compute a low-dimensional position for each node that meets a set of structural and aesthetic criteria including evenly distributed nodes, uniform edge lengths, and reflecting symmetry. However, force-directed layout algorithms and especially spring-electric models do not explicitly strive for these goals but are solely based on two design principles proposed by Fruchterman and Reingold:

  • Nodes connected by an edge should be drawn close to each other; and
  • Nodes should not be drawn too close to each other in general.

These two principles, however, cannot ensure that connected nodes become nearest neighbors in the layout for graphs with more than two nodes (see Figure 1(a)). This results in losing important local graph structures. To address this issue, we propose a third design principle:

  • Nodes connected by an edge should be drawn closer to each other than unconnected nodes.

This will enhance local structures and represent local connection patterns more faithfully.

给定一个包含 n 个节点和 m 条边的无向图,图布局的主要目标是为每个节点计算低维位置,使其满足一系列结构和美学标准,例如节点分布均匀、边长一致以及对称性体现。 然而,力导向布局算法,尤其是弹簧-电荷模型,并不会显式优化这些目标,而只是基于 Fruchterman 和 Reingold 提出的两条设计原则:

  • 由边相连的节点应当被绘制得彼此接近;
  • 一般而言,节点不应被绘制得彼此过近。

不过,对于包含两个以上节点的图,这两条原则并不能保证相连节点在布局中成为最近邻(见 图1(a))。 这会导致重要局部图结构丢失。 为解决该问题,作者提出第三条设计原则:

  • 由边相连的节点应当比不相连节点绘制得更近。

这将增强局部结构,并更忠实地表示局部连接模式。

3.1. Revisiting Spring-electric Models

To meet P1 and P2, the spring-electric model employs attractive spring forces fa to pull connected nodes together and electric repulsive forces fr to repel nodes from each other. So far, different functions have been used for defining attraction and repulsion forces and many of them can be written in the form of power functions:

为了满足 P1 和 P2,弹簧-电荷模型使用弹簧吸引力 fa 将相连节点拉近,并使用电荷排斥力 fr 将节点彼此推开。 到目前为止,人们使用了不同函数来定义吸引力和排斥力,其中许多都可以写成幂函数形式:

Fa(i,j)=αxixjp,ij,Fr(i,j)=xixjq,ij.

Here xi is the 2D position of the graph node i in the layout space, α is the weight, and p and q have to be non-negative. The larger p, the stronger the long-range attractive force, while the larger q, the weaker the long-range repulsive force. At each time step, the resultant force will move the node until convergence is reached. The layout quality is mainly influenced by p and q. By using the exponents (2,1) for p and q, respectively, the FR model yields layouts with uniform edge lengths. The LinLog model uses the exponents (0,1) for better revealing clusters in the graph. Recently, Jacomy et al. suggest to use the exponents (1,1) as an intermediate model between these two models, and these values are also set as default in many visualization libraries or systems such as GraphViz or D3.

其中,xi 是图节点 i 在布局空间中的二维位置,α 是权重,pq 必须非负。 p 越大,长程吸引力越强;q 越大,长程排斥力越弱。 在每个时间步,合力会移动节点,直到布局收敛。 布局质量主要受 pq 影响。 FR 模型分别为 pq 使用指数 (2,1),从而得到边长较均匀的布局。 LinLog 模型使用指数 (0,1),以便更好地揭示图中的簇。 最近,Jacomy 等人建议使用指数 (1,1) 作为两类模型之间的中间模型,这些值也被许多可视化库或系统(如 GraphViz 和 D3)设为默认值。

Two problematic cases of traditional FDP
图2:传统 FDP 的两个问题案例:(a) 三个附近但非邻接的节点施加较大的排斥力,把 V1 推出它所在的簇;(b) 一个强长程吸引力把 V2 从局部簇中拉出。t-FDP 避免了这两种情况。

Attractive spring and repulsive electric forces allow to meet P1 and P2 for small and simple graphs. However, for large graphs, such forces based on power functions might not be able to adequately represent graph neighborhoods and cluster structures, as illustrated in Figure 2. First, repulsive forces tend to become extremely large when the Euclidean distance between two nodes approaches zero. As a result, closely connected nodes might be pushed away from each other. Second, long-range attraction forces that are too strong might break a cluster into multiple smaller pieces. Reducing long-range attraction forces or increasing short-range attraction might reduce this issue. However, adjusting the corresponding coefficients in the spring-electric model will always change both, long- and short-range forces at the same time.

对于小而简单的图,弹簧吸引力和电荷排斥力可以满足 P1 和 P2。 然而,对于大图,基于幂函数的这类力可能无法充分表示图邻域和簇结构,如 图2 所示。 首先,当两个节点之间的欧氏距离趋近于零时,排斥力往往会变得极大。 结果,距离很近且相连的节点可能会被彼此推开。 其次,过强的长程吸引力可能会把一个簇拆成多个较小部分。 降低长程吸引力或提高短程吸引力可能会缓解这一问题。 但在弹簧-电荷模型中,调整相应系数总会同时改变长程和短程力。

3.2. The t-Force

One straightforward way to avoid extremely large repulsive forces would be to define a constant force value when two nodes approach each other. However, only manipulating repulsive forces with a constant cannot meet P3, because it is hard to find a good balance to the spring-like attractive forces. Therefore, it is better to increase attraction and reduce repulsion at short-ranges so that neighbors are able to move closer together. To do so, we propose the t-force as a new short-range force that serves as two roles: a new repulsive force and a component of the attractive force for short-ranges.

避免极大排斥力的一种直接方式,是在两个节点彼此靠近时定义一个常数力值。 然而,仅用常数来处理排斥力无法满足 P3,因为它很难与弹簧式吸引力取得良好平衡。 因此,更好的做法是在短程范围内增强吸引、降低排斥,使邻居节点能够更靠近。 为此,作者提出 t-force 作为一种新的短程力,它承担两个角色:一种新的排斥力,以及短程吸引力的一个组成部分。

As a repulsive force, it should be short-range in nature, and thus we require it to be weak at long ranges similar to power-function-based repulsive forces. Moreover, it should have an upper bound at short ranges to avoid extremely large contact forces and improve optimization stability. Finally, to be used as an additional component of attractive forces at short ranges, we further require it to have similar behavior as a linear spring force, which is quite strong compared to other forces.

作为排斥力,它本质上应当是短程力,因此作者要求它在长程范围内像基于幂函数的排斥力一样较弱。 此外,它在短程范围内应当有上界,以避免极大的接触力并提高优化稳定性。 最后,为了作为短程吸引力的额外分量,作者还要求它表现得类似线性弹簧力,而线性弹簧力相对于其他力来说较强。

Suppose the Euclidean distance between two graph nodes is d=xixj, the force f(d) should satisfy the following three requirements:

  • R1: ς>0 such that 0<f(d)ς, d>0;
  • R2: f(d)dq as d, where q>0; and
  • R3: f(d)d as d0.

In other words, such a force has an upper bound ς for short distances between two nodes and smoothly reduces to zero as the distance decreases with different possible decay rates.

设两个图节点之间的欧氏距离为 d=xixj,则力 f(d) 应满足以下三条要求:

  • R1:存在 ς>0,使得 0<f(d)ς, d>0
  • R2:当 d 时,f(d)dq,其中 q>0
  • R3:当 d0 时,f(d)d

换句话说,这种力在两个节点短距离范围内具有上界 ς,并会随着距离变化以不同可能的衰减率平滑地趋近于零。

After investigating a number of functions, we found the following t-distribution based force to meet our requirements. This does not mean that no other possible functions exist, but this one seems to be very simple and provides all required aspects:

在考察了多种函数之后,作者发现下面这种基于 t 分布的力满足需求。 这并不意味着不存在其他可能函数,只是该函数看起来非常简单,并且具备所需的全部性质:

f(d)=2τφ~d(1+τd2)φ~+1.

For simplicity, we set τ to 1 and simplify the notation to the following function:

为简化起见,作者令 τ=1,并将记号简化为如下函数:

f(d)=d(1+d2)φ,φ1.

The function f(d) approaches 1 when |d| is close to zero and gradually decreases to zero as |d| increases. We refer to the heavy-tailed force defined by this equation as t-force. As shown in Figure 3(a), the maximum is always smaller than 1 and located in the d-range of [0,1]. The smaller φ, the larger the maximum and the heavier its tail. No matter what φ is, the function has an upper bound of ς.

|d| 接近零时,函数 f(d) 接近 1;随着 |d| 增大,它会逐渐下降到零。 作者将这个方程定义的重尾力称为 t-force。 图3(a) 所示,其最大值始终小于 1,并位于 [0,1]d 区间内。 φ 越小,最大值越大,尾部也越重。 无论 φ 取何值,该函数都有上界 ς

3.3. The t-FDP model

t-force function graphs
图3:(a) 不同 φ 下 t-force 的函数曲线;(b) FDP 与 t-FDP 中两个相连节点之间的排斥力、吸引力和合力对比。虚线表示 FDP,实线表示 t-FDP。主要差别在短程力:t-FDP 中短程力是有界的。

Using the t-force, we now define the repulsive and attractive forces of our force-based graph layout as:

使用 t-force 后,作者将该力导向图布局中的排斥力和吸引力定义为:

Fr(i,j)=xixj(1+xixj2)γxixjxixj,ij,Fa(i,j)=(xixj+βxixj1+xixj2)xixjxixj,ij.

Here β is the weight for the attractive t-force. Following the suggestion of ForceAtlas2, we use a linear attractive spring force, and an attractive t-force with φ=1. This model has the same attractive spring forces as the original model, but enhances it with an attractive short-range t-force and replaces its repulsive force with a repulsive short-range t-force. The attractive short-range t-force is only exerted on given edges, and is combined with long-range spring forces for better maintaining global structures.

其中,β 是吸引型 t-force 的权重。 遵循 ForceAtlas2 的建议,作者使用线性弹簧吸引力,并使用 φ=1 的吸引型 t-force。 该模型保留了原模型的弹簧吸引力,但用吸引型短程 t-force 对其进行增强,并用排斥型短程 t-force 替换原来的排斥力。 吸引型短程 t-force 只作用于给定边,并与长程弹簧力结合,以更好地维持全局结构。

The parameter γ specifies the extent and magnitude of the repulsive t-force that controls the longest distance of neighbors in the layout, while α and β tune the weight of the attractive long-range and short-range t-forces, respectively. For simplicity, we first investigate their valid ranges from the extreme case with two connected nodes and then provide guidelines for general graphs. Substituting the repulsive and attractive force definitions into the condition that two connected nodes should not collapse yields α(1+β)<1. We can see that α and β cannot be both large, while they cannot be both small so as to balance attractive and repulsive forces. For specifying γ, the attractive force between two connected nodes should be larger than the repulsive force except when nodes approach each other. To prevent the attractive t-force weighted by αβ from being smaller than the repulsive force, the authors require γ>1.

参数 γ 指定排斥型 t-force 的范围和大小,它控制布局中邻居节点的最长距离;αβ 分别调节长程吸引力和短程吸引型 t-force 的权重。 为简化分析,作者先从两个相连节点的极端情况研究参数的有效范围,再给出一般图的设置指南。 将排斥力和吸引力定义代入“两个相连节点不应塌缩”的条件,可以得到 α(1+β)<1 因此,αβ 不能同时很大;同时,为平衡吸引力和排斥力,它们也不能同时很小。 对于 γ,除节点彼此接近的情况外,相连节点之间的吸引力应大于排斥力。 为避免由 αβ 加权的吸引型 t-force 小于排斥力,作者要求 γ>1

Parameter weight analysis
图4:参数 αβ 对布局质量的影响。改变长程吸引力和短程吸引力的比例,会影响邻域保持、簇结构和 stress error。
Exponent analysis
图5:不同 γ 指数对布局结果的影响。更大的 γ 会使排斥型 t-force 范围更短,更有利于局部簇结构,但可能牺牲邻域和距离保持。

Overall, γ specifies the extent and magnitude of the repulsive t-force that controls the longest distance of neighbors in the layout, while α and β tune the weights of the attractive long-range and short-range t-forces, respectively. In our experiments, we find that a configuration of α=0.1, β=8 and γ=2 works well and preserves local as well as global structures. Figure 3(b) shows that the repulsive force is larger than the attractive force when the distance between two nodes is smaller than a certain value; otherwise, the attractive force is larger. This is well aligned with the three requirements R1--R3.

总的来说,γ 决定排斥型 t-force 的作用范围和大小,从而控制布局中邻居节点的最长距离;αβ 分别调节长程吸引力和短程吸引型 t-force 的权重。 在实验中,作者发现 α=0.1β=8γ=2 的配置表现良好,能够同时保留局部和全局结构。 图3(b) 显示,当两个节点之间距离小于某个值时,排斥力大于吸引力;否则,吸引力更大。 这与 R1--R3 三条要求一致。

4. Approximate Calculation of Repulsive Forces

The repulsive t-force is exerted between all pairs of nodes, and directly computing it has quadratic complexity. To scale t-FDP to large graphs, the authors employ an interpolation-based FFT acceleration strategy inspired by the acceleration of t-SNE. The central idea is to interpolate node contributions onto a regular grid, compute the convolution with a kernel by FFT, and then interpolate the force back to node positions. This changes the main cost from all-pairs force evaluation to grid-based convolution and interpolation.

排斥型 t-force 会作用在所有节点对之间,直接计算具有二次复杂度。 为了将 t-FDP 扩展到大规模图,作者采用了受 t-SNE 加速启发的基于插值的 FFT 加速策略。 核心思想是:先把节点贡献插值到规则网格上,再通过 FFT 计算与核函数的卷积,最后将力插值回节点位置。 这样,主要计算开销就从所有节点对之间的力计算,转移为基于网格的卷积与插值。

j=1nK(xi,xj)(xixj)=xij=1nK(xi,xj)j=1nK(xi,xj)xj.

The above decomposition separates the repulsive force into terms that can be approximated with kernel interpolation. The implementation first places interpolation points on an evenly spaced grid and evaluates the kernel on the grid. Because the FFT is highly parallel, the authors can implement the approximation efficiently on GPUs. This implementation keeps the model simple while making it practical for interactive parameter tuning on large graphs.

上述分解把排斥力拆成可以通过核插值近似的若干项。 实现中会先在等距网格上放置插值点,并在网格上计算核函数。 由于 FFT 具有高度并行性,作者可以在 GPU 上高效实现该近似。 这种实现既保持了模型简单,也让大图上的交互式参数调节成为可能。

FFT kernel
图6:FFT 近似中使用的核函数和参数设置示意。该近似利用规则网格和快速傅里叶变换来加速全局排斥力计算。

5. Evaluation

To understand the behavior of t-FDP, the authors evaluate approximation quality, visual quality, convergence behavior, performance, and layout quality against competing methods. The evaluation uses graphs with widely varying sizes, ranging from small examples to graphs with millions of nodes and up to one hundred million edges. The comparison includes classical FDP variants, stress-based methods, neighborhood embedding methods, and scalable graph layout approaches. The metrics cover both global structure preservation and local neighborhood preservation, together with readability-oriented measures.

为了理解 t-FDP 的行为,作者从近似质量、视觉质量、收敛行为、性能和布局质量等方面,将其与竞争方法进行比较。 评估使用规模差异很大的图,从小型示例到拥有数百万节点和多达一亿条边的图都有覆盖。 比较对象包括经典 FDP 变体、基于 stress 的方法、邻域嵌入方法,以及可扩展图布局方法。 指标同时覆盖全局结构保持、局部邻域保持以及面向可读性的度量。

5.1. Comparison of Approximation Methods

Approximation quality comparison
图7:不同近似方法的质量比较。作者比较了近似排斥力计算对布局质量的影响。
Approximation visual comparison
图8:不同近似方法得到的可视化结果对比。该图展示近似策略在视觉结构保持上的差异。

The authors compare the FFT approximation with other strategies for accelerating repulsive force computation. The results in Figure 7 and Figure 8 show that the FFT-based approximation preserves the visual structure well while significantly improving runtime. It avoids the high memory and time cost of exact all-pairs computation, and it is especially suitable for GPU acceleration.

作者将 FFT 近似与其他加速排斥力计算的策略进行比较。 图7图8 的结果表明,基于 FFT 的近似能够很好地保持视觉结构,同时显著提升运行时间。 它避免了精确全节点对计算带来的高内存和高时间成本,并且特别适合 GPU 加速。

Convergence comparison
图9:不同方法的收敛行为对比。t-FDP 在保持布局质量的同时具有稳定的收敛表现。
Performance comparison
图10:不同布局方法的运行性能比较。基于 FFT 的 t-FDP 在 CPU 上快于 SFDP 和 DRGraph,并且 GPU 版本进一步提升速度。

5.2. Comparison of Layout Methods

The authors compare t-FDP with standard force-directed placement, tsNET, DRGraph, stress-based models, and other graph layout baselines, with summary views shown in Figure 11, Figure 12, and Figure 13. The results indicate that t-FDP better balances local and global structure. Compared with classical FDP, it improves local neighborhood preservation. Compared with neighborhood embedding methods, it maintains lower stress errors and better global organization. Compared with stress models, it preserves local structures more effectively and runs faster.

作者将 t-FDP 与标准力导向布局、tsNET、DRGraph、基于 stress 的模型以及其他图布局基线进行比较,汇总视图见 图11图12图13 结果表明,t-FDP 能更好地平衡局部结构和全局结构。 与经典 FDP 相比,它提升了局部邻域保持。 与邻域嵌入方法相比,它保持了更低的 stress error 和更好的全局组织。 与 stress model 相比,它能更有效地保留局部结构,并且运行速度更快。

Evaluation heatmap
图11:不同方法在多种质量指标上的热力图对比。t-FDP 在多数结构与可读性指标上表现稳定。
Quality boxplot
图12:布局质量指标的箱线图比较。该图从统计角度总结了各方法在多图数据集上的表现。
Visual comparison of layout methods
图13:不同布局方法的视觉结果对比。t-FDP 在多个数据集上同时保持局部邻域和整体结构,类别混合更少。

The visual comparison confirms the quantitative findings. Standard FDP often generates layouts where connected vertices are not nearest neighbors in the layout. Neighborhood embedding methods can improve some local neighborhoods, but may sacrifice global distances. The t-FDP model reduces these conflicts by bounding short-range repulsion and introducing a controllable short-range t-force component.

视觉比较进一步验证了定量结果。 标准 FDP 经常生成这样的布局:图中相连的顶点在布局中并不是最近邻。 邻域嵌入方法可以改善部分局部邻域,但可能牺牲全局距离。 t-FDP 通过限制短程排斥力,并引入可控的短程 t-force 分量,减少了这些冲突。

6. Extensions

Beyond producing static layouts, the authors present two extensions of t-FDP for interactive graph exploration. The first extension changes the magnitude of repulsive forces to reveal structures at different scales. This makes it possible to inspect whether clusters remain stable when local repulsion is strengthened or weakened. The second extension supports fisheye-like exploration by locally changing force parameters around nodes of interest. Because the FFT-based implementation is fast, these extensions can be used interactively even for larger graphs, as demonstrated in Figure 14.

除了生成静态布局之外,作者还提出 t-FDP 的两个扩展,用于交互式图探索。 第一个扩展通过改变排斥力大小来揭示不同尺度下的结构。 这使用户能够观察当局部排斥力增强或减弱时,簇结构是否依然稳定。 第二个扩展支持类似鱼眼视图的探索:围绕感兴趣节点局部改变力参数。 由于基于 FFT 的实现速度很快,即使对于较大的图,这些扩展也可以交互式使用,如 图14 所示。

t-FDP extension case
图14:t-FDP 扩展案例。通过局部或全局调整 t-force 参数,用户可以在不同尺度上探索图结构,并对感兴趣区域进行鱼眼式放大。

7. Conclusion and Future Work

In this paper, the authors revisit spring-electric force-directed graph layouts and identify that traditional power-function forces may fail to preserve local neighborhoods and cluster structures. They propose t-FDP, a new force-directed placement model based on a bounded short-range force derived from the t-distribution. The model decouples short- and long-range behavior and allows users to tune local and global structure preservation more flexibly. An FFT-based implementation makes the method scalable and suitable for interactive use. The experiments show that t-FDP improves neighborhood preservation while maintaining low stress errors and strong runtime performance. Future work can further explore richer force functions, parameter steering strategies, and applications to broader graph analysis workflows.

本文重新审视弹簧-电荷式力导向图布局,并指出传统幂函数力可能无法保留局部邻域和簇结构。 作者提出 t-FDP,这是一种基于 t 分布导出的有界短程力的新力导向布局模型。 该模型解耦了短程和长程行为,使用户能够更灵活地调节局部与全局结构保持。 基于 FFT 的实现使该方法具备可扩展性,并适合交互式使用。 实验表明,t-FDP 在保持低 stress error 和高运行效率的同时,改善了邻域保持。 未来工作可以进一步探索更丰富的力函数、参数引导策略,以及在更广泛图分析流程中的应用。