软件工程顶级会议中的自动修复工具——JFIX [ISSTA 2017]

前言

本文旨在讲述软工顶会论文中的自动修复工具JFix。

一、SemFix [ISSTA 2017]

1.1 文章名

JFIX: Semantics-Based Repair of Java Programs via Symbolic PathFinder

作者:Xuan-Bach D. Le; Duc-Hiep Chu; David Lo; Claire Le Goues; Willem Visser

1.2 摘要讲了啥

1)Semantics-based APR, on which we focus,
typically uses symbolic execution to infer semantic constraints and
then program synthesis to construct repairs conforming to them.
这个讲的很细,基于语义的不外乎两步:#1 符号执行来推导出语义约束;#2 程序合成来构造修复。

2)While syntactic-based APR techniques have been shown success-
ful on bugs in real-world programs written in both C and Java,
semantics-based APR techniques mostly target C programs. This
leaves empirical comparisons of the APR families not fully explored,
and developers without a Java-based semantics APR technique.

**震惊,这个motivation,确实是我很少看到的一个。但是非常有意义,因为 #1 现在确实还缺少Java这边的语义修复工具 #2 作者的工作量应该是不少的(we demonstrate the semantics-based APR can indeed efficiently and effectively repair a variety of classes of bugs in large real-world Java programs.)但是我认为这些成果都是可以预见的。只要技术上没什么问题,肯定能够修复的。 #3 最给力的是还给了一个 eclipse插件,感觉这个有点6.

3)We present JFix, a semantics-based APR framework that targets Java,
and an associated Eclipse plugin. JFix is implemented atop Sym-
bolic PathFinder, a well-known symbolic execution engine for Java
programs. It extends one particular APR technique (Angelix), and
is designed to be sufficiently generic to support a variety of such
techniques.

是基于Angelix的,还有用了symbolic pathfinder,符号执行这一类的东西。

1.3 Introduction讲了啥

Numerous recent research advances have
brought the once-futuristic idea of APR closer to reality [8, 9, 11, 13–
15, 17].
一些研究成果,现有的,有空看看。

Syntactic APR
techniques typically generate a large number of possible candidate
bug-fixing patches by manipulating or applying repair templates
to the abstract syntax tree of buggy programs. Recent works in
this family include PAR [6], and HDRepair [11] that can repair
many large, real-world Java programs. Meanwhile, semantics-based
techniques typically use symblic reasoning to infer semantics con-
straints, or specifications from behavior on test suites, leveraging
program synthesis to synthesize repaired expressions that conform
to the extracted specifications. A recent semantics-based APR tool
named Angelix [17] has demonstrated good scalability in repair-
ing bugs in large real-world programs written in C, comparable
to those targeted by syntactic approaches.
语义和语法的区别

Only a few
of them have publicly available implementations [4, 11]. With the
exception of Nopol [4], semantics-based techniques target and are
implemented for C programs.
现在的java语义修复工具很少,作者觉得这是一个可以做的点。

[4] Favio DeMarco, Jifeng Xuan, Daniel Le Berre, and Martin Monperrus. 2014.
Automatic repair of buggy if conditions and missing preconditions with smt.

原来大家都认为Nopol是2014年的。

1.4 作者的工作(本质)

Mehlitz, and Neha Rungta. 2013. Symbolic PathFinder: integrating symbolic
execution with model checking for Java bytecode analysis. ASE journal (2013)

We therefore translate and extend Angelix [17], a traditional semantics-based APR technique, such that it can target Java, making use of a selective symbolic execution procedure based on Symbolic PathFinder [19]. We further implement several additional features in support of our primary goals:

感觉主要是在Java上的实现这个idea,技术上来源于Angelix,还有[19] Corina S Păsăreanu, Willem Visser, David Bushnell, Jaco Geldenhuys, Peter

1.5 工具网址

https://xuanbachle.github.io/semanticsrepair/

1.6 实验

We demonstrate JFix on a small set of 47 Java bugs from the In-
troClass benchmark [12], and nine real bugs from large real-world
software. The results show that JFix’s unique ability in leveraging
multiple synthesis engines allows it to repair more bugs as com-
pared to using a single synthesis engine alone. JFix can repair bugs
with various types, including those involving method calls, and at
scale. Also, JFix is able to generate multi-line fixes.

感觉这个实验为什么是用的introclass呢,为什么不用的defects4j呢。

Also, JFix is able to generate multi-line fixes. This promising
result suggests that JFix can enable larger empirical comparison of
repair tools targetting Java programs in the future, and facilitate an
open research environment.

多行代码-修复,很酷。

1.7 缺陷定位

Assume as input
a Java program and a set of JUnit test cases, at least one of which is
failing; the goal is to modify the program such that all test cases pass.
Given the input, the JFix front-end first instruments the program,
and runs tests to collect traces. These traces are then input to the
fault localization module to identify likely-buggy locations.

先要instrument这个缺陷程序,然后收集 traces(这个就是动态缺陷定位技术了,也就是SFL,就是程序谱),然后进行可疑值计算。

这个还是可以懂的。但是到底用了什么语句选择算法呢?

看了之后真的拿不准,不确定是从上到下还是随机,希望能看看源代码,此外,要是能够看到Angelix的代码,也要好好研究下。感觉和Nopol有不一样的处理方式。

1.8 工作原理

软件工程顶级会议中的自动修复工具——JFIX [ISSTA 2017]
flowchart of JFix’s approach

1.9 发现一个问题,很多论文中的 语句选择算法 都没有被作者指出来,感觉很没有存在感,但是反过来看,这是不是一个被忽略的点呢。

2.0 和Angelix的区别

Although the specification inference task in JFix shares the same
spirit with Angelix [17], there are key differences.

有所改进。

JFix automatically infers specifications by leveraging the capa-
bilities of Java PathFinder (JPF) [22].

[22] Willem Visser, Klaus Havelund, Guillaume Brat, SeungJoon Park, and Flavio
Lerda. 2000. Model Checking Programs. Automated Software Engineering (2000).

佩服,这个确实是厉害,这么早的工具能用复现,然后用起来,太厉害了