The Effect of Edge Bundling and Seriation on Sensemaking of Biclusters in Bipartite Graphs

Authors

Maoyuan Sun
Jian Zhao
Hao Wu
Kurt Luther
Chris North
Naren Ramakrishnan

Abstract

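Exploring coordinated relationships (e.g., relationships shared between two sets of entities) is an important analysis task in many real-world applications, such as discovering similarly behaving genes in bioinformatics, detecting malware collusion in cybersecurity, and identifying product bundles in marketing. Coordinated relationships can be formalized as biclusters. To support visual exploration of biclusters, bipartite-graph-based visualizations have been proposed that use edge bundling to show biclusters. However, because biclusters may overlap, this approach produces edge crossings, and its impact on users exploring biclusters in bipartite graphs is not yet well understood. To address this, we propose a bicluster-based seriation technique that reduces edge crossings in bipartite graphs. We conducted a user experiment to study the effect of edge bundling and of the proposed technique on visualizing biclusters. We found that edge bundling helps users find more justified answers. In addition, we identified four key trade-offs that can inform the design of future bicluster visualizations. The results suggest that edge bundling is critical for exploring biclusters in bipartite graphs: it helps reduce low-level perceptual problems and supports high-level inference.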

Introduction

Coordinated relationship exploration
bicluster: a group of relationships between two sets of entities (e.g., persons and locations), where each entity in one set is related to all entities in the other
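
The definition above can be checked mechanically. The following is a minimal sketch, assuming relationships are stored as a set of (person, location) pairs; the function name `is_bicluster` and the sample data are illustrative and do not come from the paper.

```python
def is_bicluster(relations, rows, cols):
    """Return True if every entity in `rows` is related to every entity in `cols`."""
    return all((r, c) in relations for r in rows for c in cols)

# Hypothetical person-location relationships.
relations = {
    ("Alice", "Paris"), ("Alice", "Rome"),
    ("Bob", "Paris"), ("Bob", "Rome"),
    ("Carol", "Rome"),
}

print(is_bicluster(relations, {"Alice", "Bob"}, {"Paris", "Rome"}))    # True
print(is_bicluster(relations, {"Alice", "Carol"}, {"Paris", "Rome"}))  # False: Carol-Paris is missing
```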

Trade-off

relationship-centric

BiSet

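Contributions of this paper:

Proposes a novel bicluster-based seriation technique
Presents a detailed study design for the user experiment
Identifies four key trade-offs
Finds that edge bundling is critical for exploring biclusters in bipartite graphs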

Background

Bicluster
- CHARM
- LCM
BiSet
Seriation: an ordering, found heuristically, that makes patterns easier to reveal
- Bertifier
- BiVoc
- Termite
Related Evaluation
- Matrix
- Edge bundling

Seriation in Bipartite Graphs

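Biadjacency matrices preparation
- Build two adjacency matrices, one from each entity list to the bicluster list
Matrices fusion
- Concatenate the two matrices into one
Seriation on a fused matrix
- Apply Correspondence Analysis to the fused matrix to obtain the seriated order
Local order generation
- Split the entities by category while keeping the global order unchanged
Visual mapping
- The position of a bicluster is the average position of the entities it links to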
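
The five steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the fusion is assumed to be a vertical stack of the two entity-by-bicluster matrices, Correspondence Analysis is computed with a plain SVD of the standardized residuals, only the first axis is used for ordering, and every entity is assumed to belong to at least one bicluster. All function names and the sample data are hypothetical.

```python
import numpy as np

def first_ca_axis(F):
    """Row scores on the first correspondence-analysis axis of a non-negative matrix F."""
    P = F / F.sum()                          # correspondence matrix
    r, c = P.sum(axis=1), P.sum(axis=0)      # row and column masses (assumed non-zero)
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, _, _ = np.linalg.svd(S, full_matrices=False)
    return U[:, 0] / np.sqrt(r)              # standard row coordinates on axis 1

def seriate(left, right, biclusters):
    """biclusters: list of (left_members, right_members) pairs of sets."""
    # Steps 1-2: one entity-by-bicluster matrix per side, stacked vertically.
    A = np.array([[float(e in bl) for bl, _ in biclusters] for e in left])
    B = np.array([[float(e in br) for _, br in biclusters] for e in right])
    F = np.vstack([A, B])
    # Step 3: global order of all entities along the first CA axis.
    order = np.argsort(first_ca_axis(F), kind="stable")
    # Step 4: split by entity type while keeping the global order.
    left_order = [left[i] for i in order if i < len(left)]
    right_order = [right[i - len(left)] for i in order if i >= len(left)]
    return left_order, right_order

def bicluster_positions(left_order, right_order, biclusters):
    """Step 5: place each bicluster at the mean index of its linked entities."""
    lpos = {e: i for i, e in enumerate(left_order)}
    rpos = {e: i for i, e in enumerate(right_order)}
    return [float(np.mean([lpos[e] for e in bl] + [rpos[e] for e in br]))
            for bl, br in biclusters]

left, right = ["a", "b", "c", "d"], ["x", "y", "z", "w"]
biclusters = [({"a", "c"}, {"x", "z"}), ({"b", "d"}, {"y", "w"})]
left_order, right_order = seriate(left, right, biclusters)
positions = bicluster_positions(left_order, right_order, biclusters)
print(left_order, right_order, positions)
```

In this toy input the members of each bicluster end up adjacent on both sides. The sign of an SVD axis is arbitrary, so the reversed order is an equally valid result.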

User Experiment Design Rationale

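Three questions:

How do the computed biclusters help users discover complex, domain-specific biclusters?
Compared with a traditional view, does this approach improve users' performance when exploring biclusters?
Are there any trade-offs?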

User task design

closed biclusters
- produced by the algorithm
merged biclusters
- require domain knowledge
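
As an illustration of what "closed" means for algorithmically mined biclusters, here is a minimal sketch. It assumes the usual definition from closed-itemset mining (the family CHARM and LCM belong to): a bicluster is closed when no further entity can be added to either side without breaking the all-to-all relationship. The function name `is_closed` and the sample data are hypothetical.

```python
def is_closed(relations, rows, cols):
    """Return True if (rows, cols) is a bicluster that cannot be extended on either side."""
    if not rows or not cols:
        return False
    if not all((r, c) in relations for r in rows for c in cols):
        return False  # not a bicluster at all
    all_rows = {r for r, _ in relations}
    all_cols = {c for _, c in relations}
    # An outside row related to every column would extend the bicluster.
    if any(all((r, c) in relations for c in cols) for r in all_rows - rows):
        return False
    # An outside column related to every row would extend it as well.
    if any(all((r, c) in relations for r in rows) for c in all_cols - cols):
        return False
    return True

rel = {
    ("Alice", "Paris"), ("Alice", "Rome"),
    ("Bob", "Paris"), ("Bob", "Rome"),
    ("Carol", "Rome"),
}
print(is_closed(rel, {"Alice", "Bob"}, {"Paris", "Rome"}))  # True
print(is_closed(rel, {"Alice", "Bob"}, {"Rome"}))           # False: Carol or Paris can be added
```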

Factors Affecting Task Complexity

The entity and group level factor: entity number
The bicluster level factors: size, overlap and number
The chain and schema level factor: domain number

Evaluation

Participants and apparatus
- 20 graduate students (9 male, 11 female), aged 24-33, from different majors
- 15.4-inch MacBook Pro
- a mouse and a keyboard
Synthetic data
Task
- Working experience based on companies that they worked for
- Travel preference based on their travel history
- Shopping style based on their shopping records
- Learning interests based on the courses they have taken

Visualization and User Interaction

Highlight Propagation

Data Collection

interaction logs (time stamp, interaction type, target object type and target object ID)
- mouse-over or out an entity or a bicluster
- selecting or unselecting an entity or a bicluster
- adding or removing an entity to or from answers
screen recording
observations
interviews
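
The interaction log fields listed above map naturally onto a small record type. The sketch below is only an illustration: the field names, the interaction labels, and the choice to count a mouse-over on an entity as one "entity visit" are assumptions, not details taken from the paper.

```python
from collections import Counter
from typing import NamedTuple

class LogRecord(NamedTuple):
    timestamp: float   # seconds since the session started (assumed unit)
    interaction: str   # e.g. "mouse_over", "mouse_out", "select" (assumed labels)
    target_type: str   # "entity" or "bicluster"
    target_id: str

def entity_visits(log):
    """Count mouse-over events per entity (assumed proxy for entity visits)."""
    return Counter(
        rec.target_id
        for rec in log
        if rec.target_type == "entity" and rec.interaction == "mouse_over"
    )

log = [
    LogRecord(0.8, "mouse_over", "entity", "e1"),
    LogRecord(1.5, "mouse_out", "entity", "e1"),
    LogRecord(2.1, "mouse_over", "bicluster", "b3"),
    LogRecord(3.0, "mouse_over", "entity", "e1"),
    LogRecord(4.2, "select", "entity", "e2"),
]
visits = entity_visits(log)
print(visits["e1"], sum(visits.values()))  # 2 2
```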

Measures and metrics

Variance of Findings
Accuracy of Findings
Connection Based Evidence
Inference Based Evidence
Exploration Cost

User Performance Results

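Under the same ordering condition, edge bundling significantly reduced entity visits and led to more justified answers.
Apart from reducing entity visits, seriation had no effect on answer accuracy or time spent.
Under the random ordering condition, edge bundling made users more likely to find closed biclusters, while merged biclusters were found less often.
Neither edge bundling nor seriation affected the time needed to find justified answers, which suggests that factors other than entity visits (such as layout) may affect time spent.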

Four Trade-Offs

View Simplicity versus Task Complexity
Similarity: Connection-Based versus Semantic-Driven
Connectedness versus Coordinatedness
Highlight Propagation Driven by: Entity versus Bundle

Reflections

Critical thinking
For high-density networks, edge bundling may be less effective than a matrix representation

Creative thinking
Analyze and compare graphs of different sizes

How to apply it to our work
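The BiSet and seriation methods could be used to simplify bipartite graphs
Consider whether the approach can be extended to multipartite graphs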
