Counterexamples in Model Checking – A Survey

Hichem Debbi
Department of Computer Science, University of Mohamed Boudiaf, M'sila, Algeria
E-mail: hichem.debbi@gmail.com

Overview paper

Keywords: model checking, counterexamples, debugging

Received: December 9, 2016

Model checking is a formal method used for the verification of finite-state systems. Given a system model and a specification, which is a set of formal properties, the model checker verifies whether or not the model meets the specification. One of the major advantages of model checking over other formal methods is its ability to generate a counterexample when the model falsifies the specification. Although the main purpose of the counterexample is to help the designer find the source of the error in complex system designs, counterexamples have also been used for many other purposes, both within model checking itself and in other domains in which model checking is applied. In this paper, we survey algorithms for counterexample generation, from classical algorithms in graph theory to novel algorithms for producing small and indicative counterexamples. We also show how counterexamples are useful for debugging, and how we can benefit from delivering counterexamples for other purposes.

Povzetek: This overview paper deals with counterexamples in a formal method for checking finite automata, i.e., systems of lower computational power than Turing machines. Counterexamples benefit designers in several ways, above all as a means of checking the correctness of behavior.

1 Introduction

Model checking is a formal method used for the verification of finite-state systems. Given a system model and a specification, which is a set of formal properties in temporal logics like LTL [109] and CTL [28, 52], the model checker verifies whether or not the model meets the specification. One of the major advantages of model checking over other formal methods is its ability to generate a counterexample when the model falsifies the specification. The counterexample is an error trace; by analysing it, the user can locate the source of the error. The original algorithm for counterexample generation was proposed by [31] and was implemented in most symbolic model checkers. This algorithm, which generates linear counterexamples for ACTL, a fragment of CTL, was later extended to handle arbitrary ACTL properties using the notion of tree-like counterexamples [36]. Since then, many works have addressed this issue in model checking. Counterexample generation has its origins in graph theory, through the problems of fair cycle and Strongly Connected Component (SCC) detection, because model checking algorithms for temporal logics employ cycle detection, and technically a finite system model determines a transition graph [32]. The original algorithm for fair cycle detection in LTL and CTL model checking was proposed by [53]. Since then, many variants of this algorithm and new alternatives have been proposed for LTL and CTL model checking. In section 3 we briefly investigate the problem of fair cycle and SCC detection.

While the early works introduced by [28, 52] investigated the problem of generating counterexamples so widely that they led to practical implementations within well-known model checkers, the open problem that emerged was the quality of the generated counterexample and how well it really serves its purpose.
Therefore, in the last decade many papers have considered this issue, first in terms of structure [36], by proposing the notion of tree-like counterexamples to handle ACTL properties, and later through works investigating the quality of the counterexample, mostly in terms of its length, so that it can be useful for debugging. In section 3, we investigate the methods proposed for generating minimal, small and indicative counterexamples in conventional model checking. Model checking algorithms are classified into two main categories, explicit and symbolic. While explicit algorithms are applied directly on the transition system, symbolic algorithms employ specific data structures. Generally, explicit algorithms are adopted for LTL model checking, whereas symbolic algorithms are adopted for CTL model checking. In that section, the algorithms for generating small counterexamples are presented with respect to each type of algorithm.

However, generating small and indicative counterexamples alone is not enough for understanding the error. Therefore, counterexample analysis is inevitable. Many works in model checking have addressed the analysis of counterexamples to better understand the error. In section 4, we investigate the approaches that aim to help the designer localize the source of the error given counterexamples. We consider that most of these methods fall into two main categories: those that are applied on the counterexample itself without any need for other information, and those that require successful runs or witnesses to be compared with the counterexamples.

Probabilistic model checking has appeared as an extension of model checking for analyzing systems that exhibit stochastic behavior. Several case studies in several domains have been addressed, from randomized distributed algorithms and network protocols to biological systems and cloud computing environments. These systems are usually described using Discrete-Time Markov Chains (DTMC), Continuous-Time Markov Chains (CTMC) or Markov Decision Processes (MDP), and verified against properties specified in Probabilistic Computation Tree Logic (PCTL) [78] or Continuous Stochastic Logic (CSL) [9, 10]. In probabilistic model checking (PMC), counterexample generation has a quantitative aspect. The counterexample is a set of paths in which a path formula holds and whose accumulated probability mass violates the probability bound. Due to its specific nature, we devote section 5 to counterexample generation in probabilistic model checking. As in conventional model checking, addressing error explanation in probabilistic model checking is highly required, especially since a probabilistic counterexample consists of multiple paths instead of a single path, and is itself probabilistic. So, in that section we also investigate counterexample analysis in PMC.

The most important thing about the counterexample is that it does not just serve as a debugging tool; it is also used to refine the model checking process itself, through Counterexample-Guided Abstraction Refinement (CEGAR) [37]. CEGAR is an automatic verification method mainly proposed to tackle the state-explosion problem, and it is based on the information obtained from the generated counterexamples. In section 6, we show how counterexamples contribute to this well-known verification method.
Testing is an automated method used to verify the quality of software. When we use model checking to generate test cases, this is called model-based testing. This method has enjoyed great success in industry through the use of well-known model checkers such as SPIN, NuSMV and Java PathFinder. Model checking is used for testing for two main reasons: first, because model checking is fully automated, and second and more importantly, because it delivers counterexamples when the property is not satisfied. In section 7, we show how the counterexample serves as a good tool for generating test cases.

Although counterexample generation is at the heart of model checking, not all model checkers deliver counterexamples to the user. In the last section, we review the well-known tools that generate counterexamples. Section 9 concludes the paper, where some brief open problems and future directions are presented.

2 Preliminaries and definitions

Kripke Structure. A Kripke structure is a tuple M = (AP, S, s0, R, L) that consists of a set AP of atomic propositions, a set S of states, an initial state s0 ∈ S, a total transition relation R ⊆ S × S, and a labelling function L : S → 2^AP that labels each state with a set of atomic propositions.

Büchi Automaton. A Büchi automaton is a tuple B = (S, s0, E, Σ, F), where S is a finite set of states, s0 ∈ S is the initial state, E ⊆ S × S is the transition relation, Σ is a finite alphabet, and F ⊆ S is the set of accepting or final states. We use Büchi automata to define sets of infinite words over an alphabet. A path is a sequence of states (s0 s1 … sk), k ≥ 1, such that (si, si+1) ∈ E for all 0 ≤ i < k. A path (s0 s1 … sk) is a cycle if sk = s0; the cycle is accepting if it contains a state in F. A path (s0 s1 … sk … sl), where l > k, is accepting if sk … sl forms an accepting cycle. We call a path that starts at the initial state and reaches an accepting cycle an accepting path or counterexample (see Figure 1). A minimal counterexample is an accepting path with a minimal number of transitions.

Strongly Connected Component. A graph is a pair G = (V, E), where V is a set of vertices and E ⊆ V × V is the set of edges. A path is a sequence of vertices (v1, …, vk), k ≥ 1, such that (vi, vi+1) ∈ E for all 1 ≤ i < k. Let π be a path; the length of π, denoted |π|, is defined as its number of transitions. We say that a vertex u is reachable from a vertex v if there exists a path from v to u. We define a Strongly Connected Component (SCC) as a maximal set of vertices C ⊆ V such that every pair of vertices u, v ∈ C is mutually reachable. An SCC C is trivial if C = {v}; otherwise C is non-trivial, i.e., for every u, v ∈ C there is a non-trivial path from u to v.

Discrete-Time Markov Chain (DTMC). A Discrete-Time Markov Chain (DTMC) is a tuple D = (S, s_init, P, L), where S is a finite set of states, s_init ∈ S is the initial state, P : S × S → [0, 1] is the transition probability matrix, and L : S → 2^AP is a labelling function that assigns to each state s ∈ S the set L(s) of atomic propositions. An infinite path π is a sequence of states s0 s1 s2 …, where P(si, si+1) > 0 for all i ≥ 0. A finite path is a finite prefix of an infinite path. We denote the set of paths starting from a state s0 by Paths(s0). The probability of a finite path is calculated as follows:

P(π ∈ Paths(s0) | s0 s1 … sn is a prefix of π) = ∏ 0 ≤ i < n P(si, si+1)
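To make the path-probability definition concrete, the following minimal sketch (illustrative only; it assumes the DTMC transition matrix is encoded as a nested dictionary of transition probabilities, and all names are hypothetical) computes the probability of a finite path:

    from functools import reduce

    def path_probability(P, path):
        """Probability of a finite DTMC path s0 s1 ... sn, i.e. the product
        of the one-step transition probabilities P[si][si+1] along it."""
        return reduce(lambda acc, step: acc * P[step[0]].get(step[1], 0.0),
                      zip(path, path[1:]), 1.0)

    # Example: path_probability({"s0": {"s1": 0.6, "s2": 0.4},
    #                            "s1": {"s1": 1.0}}, ["s0", "s1", "s1"])
    # returns 0.6 * 1.0 = 0.6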
Probabilistic Computation Tree Logic (PCTL). PCTL state formulas Φ are formed according to the following grammar:

Φ ::= true | a | ¬Φ | Φ1 ∧ Φ2 | P⋈p(φ)

where a ∈ AP is an atomic proposition, φ is a path formula, ⋈ is a comparison operator, and p is a probability threshold. The path formulas φ are formed according to the following grammar:

φ ::= Φ1 U Φ2 | Φ1 W Φ2 | Φ1 U≤n Φ2 | Φ1 W≤n Φ2

where Φ1 and Φ2 are state formulas and n ∈ ℕ. As in CTL, the temporal operators (U for strong until, W for weak until (unless) and their bounded variants) are required to be immediately preceded by the operator P. A PCTL formula is a state formula, where path formulas only occur inside the operator P. The operator P can be seen as a quantification operator subsuming both ∀ (universal quantification) and ∃ (existential quantification), since the properties represent quantitative requirements.

The semantics of a PCTL formula over a state s (or a path π) in a DTMC model D = (S, s_init, P, L) is defined by a satisfaction relation denoted ⊨. The satisfaction of P⋈p(φ) on a DTMC depends on the probability mass of the set of paths satisfying φ. This set is a countable union of cylinder sets, so its measurability is ensured. The semantics of PCTL state formulas for DTMCs is defined as follows:

s ⊨ true
s ⊨ a ⟺ a ∈ L(s)
s ⊨ ¬Φ ⟺ s ⊭ Φ
s ⊨ Φ1 ∧ Φ2 ⟺ s ⊨ Φ1 and s ⊨ Φ2
s ⊨ P⋈p(φ) ⟺ P(s ⊨ φ) ⋈ p

Given a path π = s0 s1 … in D and an integer j ≥ 0, where π[j] = sj, the semantics of PCTL path formulas for DTMCs is defined as in CTL:

π ⊨ Φ1 U Φ2 ⟺ ∃ j ≥ 0 : π[j] ⊨ Φ2 ∧ (∀ 0 ≤ k < j : π[k] ⊨ Φ1)

Tree-like counterexamples for ACTL consist of finite paths of two types: ⟨s0, …, sn⟩ (a path descriptor) and ⟨s0, …, sn, s0⟩ (a loop descriptor), where S{desc1, desc2} describes a finite path leading to a cycle. The tree-like counterexample is then SQ, where Q refers to the set of descriptors generated by the CEX algorithm. The set of descriptors for the example in Figure 3 would be ⟨…⟩, ⟨…⟩ and ⟨…⟩.

procedure CEX(K, s0^i, φ)
  case φ of
    φ1 ∨ φ2:
      CEX(K, s0^i, φ1)
      CEX(K, s0^i, φ2)
    ⋀ i≥1 φi:
      select j such that K, s0 ⊭ φj
      CEX(K, s0^i, φj)
    ∀O(ψ1, …, ψk):
      determine π = s0, …, sN, …, sN+M such that K, π ⊭ O(ψ1, …, ψk)
      desc1 := ⟨s0^i, unravel(C, s1, …, sN)⟩
      desc2 := ⟨unravel(C + N, sN, …, sN+M)⟩
      return desc1 and desc2
      C := C + N + M + 1
      for all states p ∈ S{desc1, desc2} do
        for j ∈ {1, …, k} do
          if K, p ⊭ ψj then
            CEX(K, p, ψj)
          end if
        end for
      end for
  end case
end procedure

Figure 4: The generic counterexample algorithm for ACTL [36].

After these works of Clarke et al., many works addressed the issue of computing short counterexamples in symbolic model checking [117, 29, 108]. Schuppan et al. [117] proposed criteria that a Büchi automaton should meet in order to accept shortest counterexamples. They proved that these criteria are satisfied by the approach proposed in [29] only for future-time LTL specifications, and thus they proposed an approach that meets the criteria for LTL specifications with past. The proposed algorithm employs a breadth-first reachability check within a Binary Decision Diagram (BDD)-based symbolic model checker. The authors of [108] proposed a black-box-based technique that masks some parts of the system in order to give an understandable counterexample to the designer. So the work does not just tend to produce minimal counterexamples; it also delivers small indicative counterexamples of good quality, to be analyzed in order to locate the source of the error. The major drawback of this method is that generalizing counterexample generation from symbolic model checking to black-box model checking could lead to non-uniform counterexamples that do not match the intended behavior of the system.
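As a concrete illustration of minimality for safety properties, the following sketch (illustrative only; explicit-state, with the model given as a hypothetical successor map and the property as a state predicate; liveness properties, which need accepting cycles, are out of its scope) finds a minimal counterexample, i.e. a shortest path from the initial state to a violating state, by breadth-first search:

    from collections import deque

    def shortest_counterexample(succ, s0, violates):
        """Breadth-first search for a shortest error trace: `succ` maps a
        state to its successors, `violates(s)` is True iff state s breaks
        the safety property. Returns a minimal-length trace or None."""
        parent, frontier = {s0: None}, deque([s0])
        while frontier:
            s = frontier.popleft()
            if violates(s):
                trace = []
                while s is not None:      # reconstruct the path back to s0
                    trace.append(s)
                    s = parent[s]
                return trace[::-1]
            for t in succ(s):
                if t not in parent:       # first visit = shortest distance
                    parent[t] = s
                    frontier.append(t)
        return None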
While all of these works apply to unbounded model checking [117, 108], the works [122, 120, 113] consider bounded model checking, through lifting assignments produced by a SAT solver, where the quality of the generated counterexample depends on the SAT solver in use. Other works have investigated the use of heuristic algorithms for generating counterexamples [124, 50]. Although heuristics were not widely used, they gave quite good results and were also used later for generating probabilistic counterexamples.

4 Counterexamples analysis and debugging

One of the major advantages of model checking over other formal methods is its ability to generate a counterexample when the model falsifies the specification. The counterexample represents an error trace; by analyzing it the user can locate the source of the error, and as Clarke wrote: "The counterexamples are invaluable in debugging complex systems. Some people use model checking just for this feature" [27].

However, generating small and indicative counterexamples alone is not enough for understanding the error. Therefore, counterexample explanation is inevitable. Error explanation is the task of discovering why the system exhibits a given error trace. Many works have addressed the automatic analysis of counterexamples to better understand the failure. Error explanation falls into two main categories. The first is based on the error trace itself, by considering the small number of changes that have to be made so that the given counterexample is no longer exhibited; these changes then represent the sources of the error. The second is based on comparing successful executions with the erroneous one in order to find the differences; those differences are considered candidate causes for the error. Kumar et al. [97] introduced a careful analysis of the complexity of each type. For the first type, they showed, using three models (Mealy machines, extended finite state machines, and pushdown automata), that the problem is NP-complete. For the second type, they provided a polynomial algorithm using Mealy machines and pushdown automata, but solving the problem remained difficult with extended finite state machines.

Error explanation methods have been successfully integrated into model checkers such as SLAM [12] and Java PathFinder (JPF) [16]. SLAM takes less execution time than JPF and can achieve completeness in finding the causes, but according to Groce [67], this could also be harmful. The error explanation process has many drawbacks. The main one is that the counterexample usually consists of a huge number of states and transitions and involves many variables. The second is that the model checker usually floods the designer with multiple counterexamples, without any kind of classification, which makes the task of choosing a helpful counterexample for debugging challenging. Besides, a single counterexample might not be enough to understand the behavior of the system. Analyzing a set of counterexamples together is an option, but it requires much effort, and even then, the set of counterexamples to be analyzed could contain the same diagnostic information, which may make analyzing it a waste of time.
The last and most important problem in error explanation is that not all the events that occur on the error trace are of importance to the designer, so locating critical events is the goal behind error explanation. In this section, we survey some works with respect to the two categories.

4.1 Computing the minimal number of changes

Jin et al. [92] proposed an algorithm for analyzing counterexamples based on local information, by segmenting the events of the counterexample into two kinds of segments, fated and free. The fated segments refer to events that unavoidably have to occur in the executions, and the free segments refer to events that should be avoided for the error not to occur, and are thus candidates for being causes. Fated and free segments are computed with respect to the input variables of the system, which are classified into controlling and non-controlling. While controlling variables are considered critical and have more control over the environment, non-controlling variables have less importance. Thus, fated segments are determined with respect to controlling variables, whereas free segments are determined with respect to non-controlling ones.

Wang et al. [132] also proposed a method that works only on the failed execution path, without considering successful ones. The idea is to look at the predicates that are candidates for causing the failure in the error trace. To do so, they use weakest precondition computation, a technique widely used in predicate abstraction. This computation aims to find the minimal number of conditions that should be met so that the program does not violate the assertion. It results in a set of predicates that contradict each other. By examining how these predicates contradict each other, we can find the cause of the failed assertion and map it back to the real code. Many similar works provided error explanation methods in the context of C programs [137, 127, 138].

Using the notion of causality by Halpern and Pearl [74], Beer et al. [88] introduced a polynomial-time algorithm for explaining LTL counterexamples that was implemented as a feature in the IBM formal verification platform RuleBase PE. Given the error trace, the causes for the violation are highlighted visually as red dots on the error trace itself. The question asked is: what values of signals on the trace cause it to falsify the specification? Following the definition of Halpern and Pearl, they refer to a set of state-variable pairs as bottom-valued pairs whose values should be switched to make a state-variable pair critical. A pair is said to be critical if changing the value of the variable in that state no longer produces a counterexample. Such a pair represents the cause for the first failure of the LTL formula along the error trace, where they argue that the first failure is the most relevant to the user. Nevertheless, the algorithm computes an over-approximation of the set of causes, not just the first cause that occurred.

Let φ be an LTL formula in Negation Normal Form (NNF) and π = s0, s1, …, sk a counterexample for it. The algorithm for computing the approximate set of causes given φ and π is depicted in Figure 5. The procedure repeatedly invokes a function val for evaluating sub-formulas of φ on the path. The procedure is executed recursively on the formula φ until it reaches the proposition level, where the cause is finally rendered as a pair ⟨si, p⟩ / ⟨si, ¬p⟩, where si refers to the current state.
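The notion of a critical pair can be made concrete with a small sketch (illustrative only; `holds(trace)` is a hypothetical procedure standing in for any checker that evaluates the LTL formula on a finite trace, and states are dictionaries of Boolean signal values):

    def is_critical(trace, i, var, holds):
        """A pair (s_i, var) is critical if flipping var's value in state
        s_i makes the failing trace satisfy the formula again. `trace` is
        a list of dicts mapping signal names to Boolean values."""
        flipped = [dict(s) for s in trace]    # copy, keep original intact
        flipped[i][var] = not flipped[i][var]
        return holds(flipped)                 # True => (s_i, var) critical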
Let us consider the formula G((¬START ∧ ¬STATUS_VALID ∧ END) → [¬START U STATUS_VALID]). The result of executing the RuleBase PE implementation of the algorithm on this formula is shown in Figure 6. The red dots refer to the relevant causes of the error. While some variables are not critical for the failure, others can be critical, which means that switching their values alone could remove the violation. For instance, in state 9, START precedes STATUS_VALID; by switching the value of START from 1 to 0 in state 9, the formula would no longer fail on this counterexample.

4.2 Comparing counterexamples with successful runs

This is the most widely adopted method for error explanation, successfully featured in model checkers such as SLAM and Java PathFinder. Groce et al. [70] proposed an approach for counterexample explanation based on computing a set of faulty runs called negatives, in which the counterexample is included, and comparing it to a set of correct runs called positives. Analyzing the common features and differences can yield useful diagnostic information. Their algorithms were implemented in Java PathFinder. Based on Lewis's counterfactual theory of causality [105] and distance metrics, Groce [68] proposed a semi-automated approach for isolating errors in ANSI C programs, by considering the alternative worlds as program executions and the events as propositions about those executions. The approach relies on finding causal dependencies between the predicates of a program. A predicate a is causally dependent on b in the faulty execution if and only if the executions in which the removal of the cause a also removes the effect b are more likely than the executions where a and b do not appear together. For finding those traces that are as close as possible to the faulty one, the authors employed a distance metric.

Figure 5: The algorithm for computing the approximate set of causes of the violation of an LTL formula φ given a counterexample π [88].

For the DTMC example of Figure 12, the strongest path is s0 s1, which is included in the most indicative probabilistic counterexample.

5.1 Probabilistic counterexample generation

Various approaches for probabilistic counterexample generation have been proposed. Aljazzar et al. [1, 3] introduced an approach to counterexample generation for DTMCs and CTMCs against timed reachability properties using heuristics and directed explicit state-space search. Since resolving the nondeterminism in an MDP results in a DTMC, in complementary work [4] Aljazzar and Leue proposed an approach for counterexample generation for MDPs based on the existing methods for DTMCs. Aljazzar and Leue then presented a complete framework in [5] for generating counterexamples for DTMCs and CTMCs in the form of what they call diagnostic sub-graphs. All these works on generating indicative counterexamples led to the development of the K* algorithm [6], an on-the-fly, heuristics-guided algorithm for the k-shortest-paths problem. Compared to classical k-shortest-paths algorithms, K* has two main advantages: it works on-the-fly, avoiding the exploration of the entire graph, and it can be guided using heuristic functions.
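The underlying idea of these k-shortest-paths approaches can be sketched as follows (illustrative only: a best-first enumeration of loop-free paths in decreasing probability order, collecting the strongest evidences until their mass exceeds the bound of a violated reachability property; real algorithms such as K* also handle paths with loops and work on-the-fly with heuristic guidance):

    import heapq, math

    def strongest_evidences(P, s0, targets, bound):
        """Collect the most probable paths (evidences) from s0 to a target
        state until their accumulated probability mass exceeds `bound`.
        P maps each state to a dict of successor -> probability."""
        evidences, mass = [], 0.0
        queue = [(0.0, (s0,))]                  # (-log probability, path)
        while queue and mass <= bound:
            neglog, path = heapq.heappop(queue) # most probable prefix first
            state = path[-1]
            if state in targets:
                prob = math.exp(-neglog)
                evidences.append((list(path), prob))
                mass += prob
                continue                        # evidences end at the target
            for succ, p in P.get(state, {}).items():
                if p > 0 and succ not in path:  # loop-free paths only (sketch)
                    heapq.heappush(queue,
                                   (neglog - math.log(p), path + (succ,)))
        return evidences, mass

Run on the toy DTMC of the earlier sketch with targets {"s1"} and bound 0.5, it returns the single evidence (["s0", "s1"], 0.6), whose mass already exceeds the bound.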
Based on all these works, the authors built the DiPro tool [2] for generating indicative counterexamples for DTMCs, CTMCs and MDPs. The tool can be used jointly with the model checkers PRISM [81] and MRMC [94], and can render counterexamples in text format as well as in a graphical mode. These heuristics-based algorithms have shown great efficiency in terms of counterexample quality. Nevertheless, on large models the DiPro tool, which implements these algorithms, usually takes a long time to produce a counterexample. Running DiPro on a PRISM model of the DTMC presented in Figure 12 against the same property, we obtain the most indicative counterexample CX3. The graphical representation of CX3 as rendered by DiPro is depicted in Figure 13. The diamonds refer to the final or end states (s1, s3, s5), whereas the circles represent the simple nodes s2 and s4. The user can navigate through the counterexample and inspect all values.

Figure 13: A counterexample generated by DiPro.

Similar to the previous works, [75] proposed the notion of the smallest most indicative counterexample, whose computation reduces to the problem of finding k shortest paths. In a weighted digraph transformed from the DTMC model, and given the initial state and the target states, the strongest evidences that form the counterexample are selected using extensions of k-shortest-paths algorithms for an arbitrary number k. Instead of generating path-based counterexamples, [134] proposed a novel approach for DTMCs and MDPs based on critical subsystems, using SMT solvers and mixed integer linear programming. A critical subsystem is simply a part of the model (states and transitions) considered relevant because of its contribution to exceeding the probability bound. This problem has been shown to be NP-complete. Another work, also based on the notion of critical subsystems, delivers abstract counterexamples with a smaller number of states and transitions using a hierarchical refinement method. Based on all of these works, Jansen et al. proposed the COMICS tool for generating the critical subsystems that induce the counterexamples [90].

There are also many other works that address special cases of counterexample generation in PMC. The authors of [8] proposed an approach for finding sets of evidences for bounded probabilistic LTL properties on MDPs that behave differently from each other, giving significant diagnostic information. While their method is also based on k shortest paths, the main contribution lies in selecting the evidences or witnesses with respect to five main criteria in addition to high probability. While all of the previous works on counterexample generation are explicit-based, the authors of [133] proposed a symbolic method using bounded model checking. In contrast to the previous methods, this method lacks the selection of the strongest evidences first, since the selection is performed in arbitrary order. Another approach to counterexample generation that uses bounded model checking has been proposed in [15]. Unlike the previous work, which uses conventional SAT solvers, the authors used an SMT-solving approach in order to put constraints on the selected paths and obtain a more abstract counterexample consisting of the strongest paths. Counterexample generation for probabilistic LTL model checking has been addressed in [116], and probabilistic CEGAR has also been addressed [80]. A comprehensive representation of counterexamples using regular expressions has been addressed in [44].
Since regular expressions deliver compact representations, they can help deliver short counterexamples. Besides, they are widely known and easily understandable, so they bring additional benefits as a tool for error explanation.

5.2 Probabilistic counterexample analysis

Instead of relying on the state space resulting from the parallel composition of the modules, [135] suggests relying directly on the guarded command language used by the model checker, which is closer to the designer's view and more helpful for debugging purposes. To do so, the authors employ the critical subsystem technique [134] to identify the smallest set of guarded commands contributing to the error.

To analyze probabilistic counterexamples, Debbi and Bourahla [48, 47] proposed a diagnostic method based on the definition of causality by Halpern and Pearl [74] and the notion of responsibility [25]. The proposed method takes the probabilistic counterexample generated by the DiPro tool and the probabilistic formula as input, and returns a set of state-variable pairs as candidate causes for the violation, ordered with respect to their contribution to the error. So, in contrast to the previous methods, this method does not aim to generate indicative counterexamples; it acts directly on indicative counterexamples already generated. Another similar approach for debugging probabilistic counterexamples was introduced in [46]. It adopts the same definition of causality by Halpern and Pearl to reason formally about the causes, and then transforms the causality model into a regression model using Structural Equation Modeling (SEM). SEM is a comprehensive analytical method used for testing and estimating causal relationships between variables embedded in a theoretical causal model. This method helps in understanding the behavior of the model by quantifying the causal effect of the variables on the violation and the causal dependencies between them.

The same definition of causality has also been applied to event orders for generating fault trees from probabilistic counterexamples, where the selection of the traces forming the fault tree is restricted by a minimality condition [102]. To do so, Leitner-Fischer and Leue proposed the event order logic to reason about Boolean conditions on the occurrence of events, where the cause of the hazard in their context is presented as an Event Order Logic (EOL) formula, which is a conjunction of events. In [57], they extended their approach by integrating causality into an explicit-state model checking algorithm to give a causal interpretation for sub- and super-sets of execution traces, which could help the designer gain a better insight into the behavior of the system. They proved the applicability of their approach on many industrial-size PROMELA models. They extended the causality checking approach to probabilistic counterexamples by computing the probabilities of event combinations [101], but they still consider the use of causality checking on qualitative PROMELA models.

6 Counterexample guided abstraction refinement (CEGAR)

The main challenge in model checking is the state-explosion problem. Dealing with this issue has been at the heart of model checking since its beginnings, and it is not finished yet. Many methods have been proposed to tackle it; the most famous are symbolic algorithms, Partial Order Reduction (POR), Bounded Model Checking (BMC) and abstraction.
Among these techniques, abstraction is considered the most general and flexible for handling the state-explosion problem [30]. Abstraction is about hiding or simplifying some details of the system to be verified, even removing from it some parts that are considered irrelevant for the property under consideration. The central idea is that verifying a simplified or abstract model is more efficient than verifying the entire model. Evidently, this abstraction has a price, which is the loss of some information, and the best abstraction methods are those that control this loss. Over-approximation and under-approximation are the two key concepts for this problem. Many abstraction methods have been proposed [42, 65, 106]; the last received the most attention and was adopted in the symbolic model checker NuSMV.

An abstraction can be defined by a set of abstract states Ŝ, an abstraction mapping function h that maps the states of the concrete model to Ŝ, and the set of atomic propositions AP labeling these states. Regarding the choice of Ŝ, h and AP, we distinguish three main types of abstraction: predicate abstraction [66, 115], localization reduction [99] and data abstraction [39]. Predicate abstraction is based on eliminating some variables from the program and replacing them by predicates that still carry the information about these variables. Each predicate has a corresponding Boolean variable, and the resulting abstract states Ŝ are the valuations of these variables. Both the abstraction mapping h between the concrete and abstract states and the set of atomic propositions AP are determined with respect to the truth values of these predicates. The entire abstract model can then be defined through existential abstraction. To this end, we can use BDDs, SAT solvers or theorem provers, depending on the size of the program. Localization reduction and data abstraction are essentially extensions of predicate abstraction. Localization reduction aims to define a small set of variables that are considered relevant to the property to be verified; these variables are called visible, while the remaining variables, which have no importance with respect to the property, are called invisible. We should mention that this technique does not apply any abstraction to the domains of the visible variables. Data abstraction deals mainly with the domains of variables, by constructing an abstract domain for each variable, so the abstract model is built with respect to the abstract values. For more detail on abstraction techniques, we refer to [71].
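Existential abstraction itself is simple in the finite, explicit-state case; the following sketch (illustrative only, for explicitly enumerated state sets, with h an arbitrary abstraction mapping) builds the abstract transition relation from the concrete one:

    def existential_abstraction(states, transitions, h):
        """Existential abstraction: an abstract transition (a, b) exists
        iff some concrete transition (s, t) satisfies h(s) = a and
        h(t) = b. The result over-approximates the concrete behavior, so
        universal properties that hold on it also hold on the concrete
        model, but counterexamples may be spurious."""
        abstract_states = {h(s) for s in states}
        abstract_transitions = {(h(s), h(t)) for (s, t) in transitions}
        return abstract_states, abstract_transitions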
Given the possible loss of information caused by abstraction, devising refinement methods for the abstract model is necessary. The best-known method for abstraction refinement is Counterexample-Guided Abstraction Refinement (CEGAR), proposed by [30] as a generalization of the localization reduction approach. A prototype implementation of this method in NuSMV has also been presented. In this approach, the counterexample plays the crucial role in finding the right abstract model. The CEGAR process consists of three main steps. The first is to generate an abstract model using one of the abstraction techniques [30, 23, 33], given a formula φ. The second step is to check the satisfaction of φ: if it is satisfied, the model checker stops and reports that the concrete (original) model satisfies the formula; if it is not, a counterexample is generated. The generated counterexample lives in the abstract model, so we have to check whether it is also a valid counterexample in the concrete model, because the abstract model behaves differently from the concrete one. Otherwise, the counterexample is called spurious, and the abstraction must be refined based on this counterexample. So, a spurious counterexample is an erroneous counterexample that exists only in the abstract model, not in the concrete model. The final step is to refine the model until no spurious counterexample is found (see Figure 14). This is how the technique gets its name: refining the abstract model using the spurious counterexample. Refinement is an important part of CEGAR that can make the process faster and give the appropriate results. To refine the abstract model, different partitioning algorithms are applied to the abstract states. Like the abstraction itself, partitioning the abstract states in order to eliminate the spurious counterexample can be carried out in other ways than BDDs [30]: SAT solvers [24], or linear programming and machine learning [34], can be used to define the most relevant variables to be considered for the next abstraction.

In the literature, we find many extensions of CEGAR depending on the type of predicates and the application domain: large program executions [96], non-disjunctive abstractions [107] and propositional circumscription [89]. The CEGAR technique itself has been used to find bugs in complex and large systems [14]. The idea is based on gathering and saving information during the abstract model checking process in order to generate short counterexamples in the case of failure. This can be helpful for large models that make generating counterexamples using standard BMC intractable. CEGAR is currently implemented in many tools, such as NuSMV [26], SLAM and BLAST [13].

Figure 14: Counterexample-Guided Abstraction Refinement process.
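The three steps can be summarized in a short sketch of the CEGAR loop (purely illustrative: the four callbacks `abstract`, `check`, `validate` and `refine` are hypothetical interfaces standing in for a BDD-, SAT- or ILP-based implementation):

    def cegar(concrete_model, phi, abstract, check, validate, refine):
        """Counterexample-guided abstraction refinement loop (cf. Figure 14)."""
        A = abstract(concrete_model, phi)
        while True:
            holds, cex = check(A, phi)          # model-check abstract model
            if holds:
                return True, None               # concrete model satisfies phi
            if validate(concrete_model, cex):   # does cex replay concretely?
                return False, cex               # genuine counterexample
            A = refine(A, cex)                  # eliminate the spurious cex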
7 Counterexamples for test cases generation

Counterexample generation gives model checking the opportunity to be adopted and used in different domains; one of the domains in which model checking has been adopted is test case generation. Roughly speaking, testing is an automated method used to verify the quality of software. When we use model checking to generate test cases, this is called model-based testing. The use of model checking for testing is mainly constrained by the size of the software to be tested, because a suitable model must be guaranteed. The central idea of using model checking for testing [20, 55] is to interpret the counterexamples generated by the model checkers as test cases; test data and some expected results are then extracted from these tests using some execution framework. Counterexamples are mainly used to help the designer find the source of the error; however, they are also very useful as test cases [60]. A test describes the behavior of the intended test case: the final state, the states that should be traversed to reach the final state, and so forth. In practice, it might not be possible to execute all test cases, since the software to be tested usually has a large number of behaviors. Nevertheless, there exist techniques that help us measure the reliability of testing. These techniques fall into two main categories: first, deterministic methods (given an initial state and some input, the output is known with certainty); the most famous methods in this category are coverage analysis and mutation analysis. Second, statistical analysis, where the reliability of the test is measured with respect to some probability distribution.

In coverage-based testing, the test purpose is specified in temporal logic and then converted into what is called a never-claim by negation, to assert that the test purpose never becomes true. The counterexample generated by the verification process then describes how the never-claim is violated, which is a description of how the test purpose is fulfilled. Many approaches for creating never-claims based on coverage criteria (called "trap properties") [61] have been proposed. Coverage criteria aim to determine how a system is exercised given a specification, in order to identify the states that were not traversed during the test; in this context, we call this specification a test suite. Full coverage is achieved if all the states of the system are covered. To create a test suite that covers all states, we need a trap property for each possible state; for example, claiming that the value of a variable is never 0: G ¬(a = 0). A counterexample to such a trap property is any trace that reaches a state where a = 0 (see the sketch at the end of this section).

With regard to trap properties, there are many variations. Gargantini and Heitmeyer addressed the coverage of Software Cost Reduction (SCR) specifications [61]. SCR specifications are defined by tables over the events that represent the change of a value in a state and lead to a new state, and conditions defined on the states. Formally, an SCR model is defined as a quadruple (S, S0, Em, T), where S is the set of states, S0 is the initial state set, Em is the set of input events, and T is the transform function that maps an input event and the current state to a new state. SCR requirement properties can be used as never-claims, first by converting SCR into model checker languages such as the SPIN language (PROMELA) or the SMV language, and then transforming the SCR tables into an if-else construct in the case of SPIN, or a case statement in the case of SMV. Another approach, by Heimdahl et al., addressed the coverage of transition systems globally [79], using RSML-e as the specification language. A simple example of a transition coverage criterion is of the form G(A ∧ C → ¬B), where A represents a system state s, B represents the next state, and C is the condition that guards the transition from A to B. A counterexample for this property is thus a trace that reaches the state B when C evaluates to true, or a trace that reaches a state other than B when C evaluates to false.

Figure 15: Coverage-based test case generation [60].

Hong and Lee [87] proposed an approach based on control and data flow, where they use the SMV model checker to generate counterexamples during the model checking of state-charts. The generated counterexample can be mapped into a test sequence that carries information about which initial and stable states are considered. Another approach, based on abstract state machines, has also been introduced [62]; the trap properties here are defined over a set of rules for guarded function updates. We can see that all coverage-based approaches deal with the same thing, namely trap properties, and differ from each other in the formalism adopted.
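The trap-property idea is easy to mechanize; the following sketch (illustrative only; it emits LTL-style never-claims as plain strings) generates one trap property per value of a variable's domain, so that each counterexample returned by the model checker is a test case covering one state:

    def trap_properties(var, domain):
        """One trap property per value: the claim 'var is never v'.
        A counterexample to G !(var = v) is a trace reaching a state
        with var = v, i.e. a test case covering that state."""
        return [f"G !({var} = {v})" for v in domain]

    # trap_properties("a", range(3))
    # -> ['G !(a = 0)', 'G !(a = 1)', 'G !(a = 2)']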
Another approach, using requirement properties as test cases, was introduced in [54]; in this approach, each requirement has a set of tests. Trap properties can easily be derived from requirement properties under property-coverage criteria [125]. A method that is completely different from coverage-based analysis is mutation-based analysis [18]. Mutation analysis consists of creating a set of mutants, which are obtained by making small modifications to the original program, in such a way that the mutants correspond to realistic faults. Mutants are distinguished by their scores; a mutant with a high score indicates high fault sensitivity. Evidently, deriving a mutant that is equivalent to the original program incurs a high computational cost [91], because we have to apply all the test cases to each mutant, all mutants should be considered, and for each mutant the model checker must be invoked. Fraser et al. [59] reviewed most of these techniques in detail, proposed several effective techniques to improve the quality of the test cases generated in model-checking-based testing, especially requirements-based testing, and applied them to different types of properties in many industrial case studies.

8 Counterexamples generation tools

Practically all successful model checkers are able to output counterexamples in varying formats [38]. In this section, we survey the tools supporting counterexample generation and discuss their effectiveness. A set of model checkers with their features is presented in Table 1.

The Berkeley Lazy Abstraction Software Verification Tool (BLAST) [13] is a software model checking tool for C programs. BLAST has the ability to generate counterexamples and, furthermore, employs CEGAR. BLAST is not just a CEGAR-based model checker; it can also be used for generating test cases. BLAST shows promising results on safety properties of programs of medium size.

CBMC [35] is a well-known bounded model checker for ANSI C and C++ programs. CBMC performs symbolic execution of the programs and employs a SAT solver in the verification procedure; when the specification is falsified, a counterexample is rendered to the user in the form of states, with the variable valuations leading to these states.

Java PathFinder (JPF) [131] is a well-known software model checking tool for Java programs. Java PathFinder is an effective virtual-machine-based tool that verifies the program along all possible executions. Since it runs at the bytecode level, it can deal with most Java language features, and it can generate a detailed report on the error in case a property is violated. Besides, the tool offers the ability to generate test cases.

SPIN [86] is a model checker mostly known for the verification of systems that exhibit a high degree of interaction between processes. The systems are described using the Process Meta Language (PROMELA) and verified against properties specified in LTL. By applying a depth-first search algorithm on the intersection product of the model and the Büchi automaton representing the LTL formula, a counterexample is generated if an accepting cycle is detected. SPIN offers an interactive simulator that helps in understanding the cause of the failure by showing the processes and their interactions in order.

Table 1: Model checkers with their features.
Name   | Systems/programs                      | Algorithms/methods                | Modeling language | Specification language | Visualization            | Counterexample form
BLAST  | C programs                            | Predicate lazy abstraction, CEGAR | C                 | BLAST                  | No                       | Set of predicates
CBMC   | C programs                            | BMC, SAT solving                  | C/C++             | Assertions             | No                       | Variables and valuations
JPF    | Java programs                         | Explicit state, POR               | Java              | No                     | No                       | Error report
SPIN   | Concurrent, distributed, asynchronous | Nested depth-first search, POR    | PROMELA           | LTL                    | Yes                      | Execution path
NuSMV  | Synchronous, asynchronous             | BDD-based, SAT-based BMC          | SMV               | LTL, CTL               | No                       | States and valuations
UPPAAL | Real-time                             | On-the-fly, symbolic              | Timed automata    | TCTL                   | Yes                      | Sequence of states
PRISM  | Probabilistic, real-time              | Graph-based, numerical            | PRISM             | PCTL, CSL, PTCTL       | No (via DiPro)           | Graph
MRMC   | Probabilistic                         | Numerical                         | PRISM, PEPA       | PCTL, CSL, PRCTL, CSRL | No (via DiPro or COMICS) | Graph

NuSMV [26] is a symbolic model checker that appeared as an extension of the Binary Decision Diagram (BDD)-based model checker SMV. NuSMV includes both LTL and CTL for specification analysis, and combines SAT and BDD techniques for the verification. NuSMV can deliver a counterexample in XML format, indicating the states of the trace and the variables whose new values cause the transitions.

UPPAAL [100] is a verification framework for real-time systems. The systems are modeled as networks of timed automata extended with data types and synchronization channels, and the properties are specified using Timed CTL (TCTL). UPPAAL can find and display counterexamples in a graphical mode, as message sequence charts that show the events in their order of occurrence.

PRISM [81] is a probabilistic model checker used for the analysis of systems that exhibit stochastic behavior. The systems are described as DTMCs, CTMCs or MDPs using a guarded command language, and verified against probabilistic properties expressed in PCTL and CSL, possibly extended with rewards. Another successful probabilistic model checker extended with rewards is the Markov Reward Model Checker (MRMC) [94]. MRMC is mainly used for performance and dependability analysis. It takes models as input files in two formats: the PRISM language or Performance Evaluation Process Algebra (PEPA). Although both model checkers have shown high effectiveness, they lack a mechanism for generating probabilistic counterexamples. Nevertheless, they have been used by recent tools (DiPro [2] and COMICS [90]) for generating and visualizing probabilistic counterexamples.

9 Conclusions and future directions

In this paper we surveyed counterexamples in model checking from different aspects. In the early days of model checking, counterexamples were not treated as a subject of their own, but as a problem related to fair cycle detection algorithms, as presented in section 3. More recently, the quality of the generated counterexamples has been treated as a standalone and fundamental problem. Many works try to deliver short and indicative counterexamples to be used for debugging purposes. Concerning their structure, tree-like counterexamples have been proposed for the ACTL fragment as an alternative to linear counterexamples; however, this approach has not been adopted in model checkers, which are still based on generating simple non-branching counterexamples.
For debugging, we can conclude that approaches that require additional successful runs may have some advantages over methods based on a single trace, in that they compare many good traces to restrict the set of candidate causes. However, these methods usually take considerable execution time to select the appropriate set of traces, and besides, those traces may contain the same diagnostic information. Regardless of the debugging method in use, the challenges of visualizing the error traces and of the quality of the diagnoses generated to facilitate debugging remain open issues.

For counterexample generation in PMC, we have seen that the principle of counterexample generation is completely different from conventional model checking, and that the presentation of the counterexample differs from one work to another, from the smallest and most indicative set of paths to the most critical subsystems. Despite the notable advancement in generating probabilistic counterexamples, which led to important tools like DiPro and COMICS, this advancement is unfortunately still insufficient for debugging. Actually, it is more than important to see the techniques for counterexample generation and counterexample analysis integrated into probabilistic model checkers so that their benefits can be realized. All these techniques act on the verification results of probabilistic model checkers like PRISM, so performing counterexample generation and counterexample analysis during the model checking process itself is still an open problem. Solving it could have a great impact on probabilistic model checking.

We have also seen the usefulness of counterexamples for purposes other than debugging, like CEGAR and test case generation. For CEGAR, we have seen different approaches to both abstraction and refinement, and we have seen that SAT solvers and theorem provers can be of benefit on both sides, so they are very useful for CEGAR. Fast elimination of spurious counterexamples is still an active research topic. We also expect to see more works on CEGAR in PMC.

For testing, we have seen that the most useful approaches using model checking are those based on coverage and trap properties. Other approaches to testing, like requirement-based analysis and mutation-based analysis, have received less attention due to the limitations presented. Currently, coverage-based techniques are widely used in industry. In the future, we expect to see new approaches that enable us to test newly emerging systems, which will require new transformation mechanisms to enable trap properties to be verified by model checkers in order to generate the counterexamples. We should mention that such techniques can benefit from one another. For instance, new efficient CEGAR techniques will have an impact not only on conventional model checking, but on probabilistic model checking as well. We may also see in the future the use of probabilistic model checkers like PRISM for testing probabilistic systems. Since PRISM does not generate counterexamples, any advancement in generating indicative counterexamples could benefit the testing of probabilistic systems. Likewise, techniques based on counterexamples, like CEGAR, can directly benefit from any advancement in generating small and indicative counterexamples in reasonable time.
In addition to all of this, we expect to see more works in other domains that adapt model checking techniques just for the sake of obtaining counterexamples. In previous works we have seen, for instance, that counterexamples can be mapped to UML sequence diagrams describing states and events in the original model [51]; they can be used to generate attack graphs in network security [123] and for the fragmentation of services in Service-Based Applications (SBAs) [21]; and they have also been used to enforce synchronizability and realizability in distributed services integration [72].

References

[1] Aljazzar, H., Hermanns, H., and Leue, S. Counterexamples for timed probabilistic reachability. In FORMATS (2005), LNCS, vol. 3829, Springer, Berlin, Heidelberg, pp. 177–195.

[2] Aljazzar, H., Leitner-Fischer, F., Leue, S., and Simeonov, D. DiPro - a tool for probabilistic counterexample generation. In 18th International SPIN Workshop (2011), LNCS, vol. 6823, Springer, Berlin, Heidelberg, pp. 183–187.

[3] Aljazzar, H., and Leue, S. Extended directed search for probabilistic timed reachability. In FORMATS (2006), LNCS, vol. 4202, Springer, Berlin, Heidelberg, pp. 33–51.

[4] Aljazzar, H., and Leue, S. Generation of counterexamples for model checking of Markov decision processes. In International Conference on Quantitative Evaluation of Systems (QEST) (2009), pp. 197–206.

[5] Aljazzar, H., and Leue, S. Directed explicit state-space search in the generation of counterexamples for stochastic model checking. IEEE Trans. on Software Engineering 36, 1 (2010), 37–60.

[6] Aljazzar, H., and Leue, S. K*: A heuristic search algorithm for finding the k shortest paths. Artificial Intelligence 175, 18 (2011), 2129–2154.

[7] Aloul, F., Ramani, A., Markov, I., and Sakallah, K. PBS: A backtrack search pseudo-Boolean solver. In Symposium on the Theory and Applications of Satisfiability Testing (SAT) (2002), pp. 346–353.

[8] Andres, M. E., D'Argenio, P., and van Rossum, P. Significant diagnostic counterexamples in probabilistic model checking. In Haifa Verification Conference (2008), pp. 129–148.

[9] Aziz, A., Sanwal, K., Singhal, V., and Brayton, R. Model-checking continuous-time Markov chains. ACM Transactions on Computational Logic 1, 1 (2000), 162–170.

[10] Baier, C., Haverkort, B., Hermanns, H., and Katoen, J.-P. Model checking algorithms for continuous-time Markov chains. IEEE Transactions on Software Engineering 29, 7 (2003), 524–541.

[11] Ball, T., Naik, M., and Rajamani, S. From symptom to cause: Localizing errors in counterexample traces. In ACM Symposium on the Principles of Programming Languages (2003), pp. 97–105.

[12] Ball, T., and Rajamani, S. The SLAM project: Debugging system software via static analysis. In ACM Symposium on the Principles of Programming Languages (2002), pp. 1–3.

[13] Beyer, D., Henzinger, T., Jhala, R., and Majumdar, R. The software model checker BLAST: Applications to software engineering. International Journal on Software Tools for Technology Transfer (STTT) 9, 5 (2007), 505–525.

[14] Bjesse, P., and Kukula, J. Using counterexample guided abstraction refinement to find complex bugs. In Design, Automation and Test in Europe Conference and Exhibition (2004), pp. 156–161.

[15] Braitling, B., and Wimmer, R. Counterexample generation for Markov chains using SMT-based bounded model checking. In Formal Techniques for Distributed Systems (2011), LNCS, vol. 6722, Springer, Berlin, Heidelberg, pp. 75–89.
[16] Brat, G., Havelund, K., Park, S., and Visser, W. Java PathFinder: a second generation of a Java model checker. In Workshop on Advances in Verification (2000).

[17] Bryant, R. E. Graph-based algorithms for Boolean function manipulation. IEEE Trans. Comput. 35, 8 (1986), 677–691.

[18] Budd, T., and Gopal, A. Program testing by specification mutation. Journal Computer Languages 10, 1 (1985), 63–73.

[19] Burch, J. R., Clarke, E. M., McMillan, K. L., Dill, D. L., and Hwang, L. J. Symbolic model checking: 10^20 states and beyond. Information and Computation 98, 2 (1992), 142–170.

[20] Callahan, J., Schneider, F., and Easterbrook, S. Automated software testing using model checking. In SPIN Workshop (1996).

[21] Chabane, Y., Hantry, F., and Hacid, M. Querying and splitting techniques for SBA: A model checking based approach. In Emerging Intelligent Technologies in Industry (2011), SCI 369, Springer-Verlag, Berlin, Heidelberg, pp. 105–122.

[22] Chaki, S., and Groce, A. Explaining abstract counterexamples. In SIGSOFT'04/FSE (2004), pp. 73–82.

[23] Chauhan, P., Clarke, E., Kukula, J., Sapra, S., Veith, H., and Wang, D. Automated abstraction refinement for model checking large state spaces using SAT based conflict analysis. In Formal Methods in Computer Aided Design (FMCAD) (2002), LNCS, vol. 2517, Springer, Berlin, Heidelberg, pp. 33–51.

[24] Chauhan, P., Clarke, E., Kukula, J., Sapra, S., Veith, H., and Wang, D. Automated abstraction refinement for model checking large state spaces using SAT based conflict analysis. In FMCAD 2002 (2002), LNCS, vol. 2517, Springer-Verlag, Berlin, Heidelberg, pp. 33–51.

[25] Chockler, H., and Halpern, J. Y. Responsibility and blame: a structural model approach. Journal of Artificial Intelligence Research (JAIR) 22, 1 (2004), 93–115.

[26] Cimatti, A., Clarke, E., Giunchiglia, F., and Roveri, M. NuSMV: a new symbolic model verifier. In Proceedings Eleventh Conference on Computer-Aided Verification (CAV 99) (1999), LNCS, vol. 1633, Springer, Berlin, Heidelberg, pp. 495–499.

[27] Clarke, E. The birth of model checking. In Grumberg, O., Veith, H. (eds.) 25 Years of Model Checking (2008), LNCS, Springer, Berlin, Heidelberg, pp. 1–26.

[28] Clarke, E., and Emerson, A. Design and synthesis of synchronization skeletons using branching-time temporal logic. In Logic of Programs (1982), Springer-Verlag, pp. 52–71.

[29] Clarke, E., Grumberg, O., and Hamaguchi, K. Another look at LTL model checking. Formal Methods in System Design 10, 1 (1997), 47–71.

[30] Clarke, E., Grumberg, O., Jha, S., Lu, Y., and Veith, H. Counterexample-guided abstraction refinement for symbolic model checking. Journal of the ACM (JACM) 50, 5 (2003), 752–794.

[31] Clarke, E., Grumberg, O., McMillan, K., and Zhao, X. Efficient generation of counterexamples and witnesses in symbolic model checking. In Proc. of the Design Automation Conference (1995), ACM Press, pp. 427–432.

[32] Clarke, E., Grumberg, O., and Peled, D. Model Checking. MIT Press, 1999.

[33] Clarke, E., Gupta, A., Kukula, J., and Strichman, O. SAT based abstraction refinement using ILP and machine learning techniques. In Computer-Aided Verification (CAV) (2002), LNCS, vol. 2404, Springer, Berlin, Heidelberg, pp. 137–150.

[34] Clarke, E., Gupta, A., Kukula, J., and Strichman, O. SAT based abstraction refinement using ILP and machine learning techniques. In CAV 2002 (2002), LNCS, vol. 2404, Springer-Verlag, Berlin, Heidelberg, pp. 265–279.
[35] Clarke, E., Kroening, D., and Lerda, F. A tool for checking ANSI-C programs. In TACAS 2004 (2004), LNCS, vol. 2988, Springer, Berlin, Heidelberg, pp. 168–176.
[36] Clarke, E., Lu, Y., Jha, S., and Veith, H. Tree-like counterexamples in model checking. In Proc. of the 17th Annual IEEE Symposium on Logic in Computer Science (2002), pp. 19–29.
[37] Clarke, E., Grumberg, O., Jha, S., Lu, Y., and Veith, H. Counterexample-guided abstraction refinement. In CAV (2000), pp. 154–169.
[38] Clarke, E., and Veith, H. Counterexamples revisited: Principles, algorithms and applications. In Grumberg, O., Veith, H. (eds.) 25 Years of Model Checking (2008), LNCS, Springer, Berlin, Heidelberg, pp. 1–26.
[39] Clarke, E. M., Grumberg, O., and Long, D. Model checking and abstraction. ACM Transactions on Programming Languages and Systems 16, 5 (1994), 1512–1542.
[40] Cleve, H., and Zeller, A. Locating causes of program failures. In ACM/IEEE International Conference on Software Engineering (ICSE) (2005), pp. 342–351.
[41] Copty, F., Irron, A., Weissberg, O., Kropp, N., and Kamhi, G. Efficient debugging in a formal verification environment. Int. J. Softw. Tools Technol. Transfer 4 (2003), 335–348.
[42] Cousot, P., and Cousot, R. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In ACM Symposium on Principles of Programming Languages (1977), pp. 238–252.
[43] Couvreur, J. On-the-fly verification of linear temporal logic. In FM (1999), LNCS, vol. 1708, Springer, Berlin, Heidelberg, pp. 253–271.
[44] Damman, B., Han, T., and Katoen, J. Regular expressions for PCTL counterexamples. In Quantitative Evaluation of Systems (QEST) (2008), pp. 179–188.
[45] de Alfaro, L., Henzinger, T., and Mang, F. Detecting errors before reaching them. In CAV (2000), LNCS, vol. 2725, Springer, Berlin, Heidelberg, pp. 186–201.
[46] Debbi, H. Diagnosis of probabilistic models using causality and regression. In Proceedings of the 8th International Workshop on Verification and Evaluation of Computer and Communication Systems (2014), pp. 33–44.
[47] Debbi, H. Systems Analysis using Model Checking with Causality. PhD thesis, University of M'sila, 2015.
[48] Debbi, H., and Bourahla, M. Causal analysis of probabilistic counterexamples. In Eleventh ACM-IEEE International Conference on Formal Methods and Models for Codesign (MEMOCODE) (2013), pp. 77–86.
[49] Dong, G., and Pei, J. Sequence Data Mining. Springer, 2007.
[50] Edelkamp, S., Leue, S., and Lluch-Lafuente, A. Directed explicit-state model checking in the validation of communication protocols. International Journal on Software Tools for Technology Transfer 5, 2 (2004), 247–267.
[51] Elamkulam, J., Glazberg, Z., Rabinovitz, I., Kowlali, G., Chandra, S., Kohli, S., and Dattathrani, S. Detecting design flaws in UML state charts for embedded software. In HVC 2006 (2006), LNCS, vol. 4383, Springer-Verlag, Berlin, Heidelberg, pp. 109–121.
[52] Emerson, E., and Halpern, J. Decision procedures and expressiveness in the temporal logic of branching time. In STOC '82: Proceedings of the Fourteenth Annual ACM Symposium on Theory of Computing (1982), ACM Press, pp. 169–180.
[53] Emerson, E. A., and Lei, C.-L. Efficient model checking in fragments of the propositional mu-calculus. In Proceedings of the First Annual Symposium on Logic in Computer Science (1986), pp. 267–278.
[54] Engels, A., Feijs, L., and Mauw, S. Test generation for intelligent networks using model checking. In Third International Workshop on Tools and Algorithms for the Construction and Analysis of Systems (TACAS'97) (1997), LNCS, vol. 1217, Springer, Berlin, Heidelberg, pp. 384–398.
[55] Engels, A., Feijs, L., and Mauw, S. Test generation for intelligent networks using model checking. In Third International Workshop on Tools and Algorithms for the Construction and Analysis of Systems (TACAS) (1997), LNCS, vol. 1217, Springer, Berlin, Heidelberg, pp. 384–398.
[56] Fey, G., and Drechsler, R. Finding good counterexamples to aid design verification. In First ACM and IEEE International Conference on Formal Methods and Models for Co-Design (MEMOCODE'03) (2003), pp. 51–52.
[57] Leitner-Fischer, F., and Leue, S. Causality checking for complex system models. In Verification, Model Checking, and Abstract Interpretation (VMCAI) (2013), LNCS, vol. 7737, Springer, Berlin, Heidelberg, pp. 248–276.
[58] Fisler, K., Fraer, R., Kamhi, G., Vardi, M., and Yang, Z. Is there a best symbolic cycle-detection algorithm? In TACAS 2001 (2001), LNCS, vol. 2031, Springer, Berlin, Heidelberg, pp. 420–434.
[59] Fraser, G. Automated Software Testing with Model Checkers. PhD thesis, IST - Institute for Software Technology, 2007.
[60] Fraser, G., Wotawa, F., and Ammann, P. E. Testing with model checkers. Journal of Software Testing, Verification and Reliability 19, 3 (2009), 215–261.
[61] Gargantini, A., and Heitmeyer, C. Using model checking to generate tests from requirements specifications. In ESEC/FSE'99: 7th European Software Engineering Conference, Held Jointly with the 7th ACM SIGSOFT Symposium on the Foundations of Software Engineering (1999), LNCS, vol. 1687, Springer, Berlin, Heidelberg, pp. 146–162.
[62] Gargantini, A., Riccobene, E., and Rinzivillo, S. Using SPIN to generate tests from ASM specifications. In Abstract State Machines 2003. Advances in Theory and Practice: 10th International Workshop, ASM (2003), LNCS, vol. 2589, Springer, Berlin, Heidelberg, pp. 263–277.
[63] Gastin, P., and Moro, P. Minimal counterexample generation for SPIN. In 14th International SPIN Workshop (2007), LNCS, vol. 4595, Springer, Berlin, Heidelberg, pp. 24–38.
[64] Gastin, P., Moro, P., and Zeitoun, M. Minimization of counterexamples in SPIN. In SPIN 2004 (2004), LNCS, vol. 2989, Springer, Berlin, Heidelberg, pp. 92–108.
[65] Graf, S., and Saidi, H. Construction of abstract state graphs with PVS. In CAV (1997), LNCS, vol. 1254, Springer, Berlin, Heidelberg, pp. 72–83.
[66] Graf, S., and Saidi, H. Construction of abstract state graphs with PVS. In CAV 97 (1997), LNCS, vol. 1254, Springer-Verlag, Berlin, Heidelberg, pp. 72–83.
[67] Groce, A. Error Explanation and Fault Localization with Distance Metrics. PhD thesis, School of Computer Science, Carnegie Mellon University, 2005.
[68] Groce, A., Chaki, S., Kroening, D., and Strichman, O. Error explanation with distance metrics. International Journal on Software Tools for Technology 4, 3 (2006), 229–247.
[69] Groce, A., Kroening, D., and Lerda, F. Understanding counterexamples with explain. In Alur, R., Peled, D. A. (eds.) Computer Aided Verification (CAV 2004) (2004), vol. 3114 of Lecture Notes in Computer Science, Springer, pp. 453–456.
[70] Groce, A., and Visser, W. What went wrong: Explaining counterexamples. In SPIN Workshop on Model Checking of Software (2003), pp. 121–135.
[71] Grumberg, O. Abstraction and refinement in model checking. In FMCO 2005 (2006), LNCS, vol. 4111, Springer-Verlag, Berlin, Heidelberg, pp. 219–242.
[72] Gudemann, M., Salaun, G., and Ouederni, M. Counterexample guided synthesis of monitors for realizability enforcement. In ATVA 2012 (2012), LNCS, vol. 7561, Springer-Verlag, Berlin, Heidelberg, pp. 238–253.
[73] Guo, L., Roychoudhury, A., and Wang, T. Accurately choosing execution runs for software fault localization. In 15th International Conference on Compiler Construction (2006), LNCS, vol. 3923, Springer, Berlin, Heidelberg, pp. 80–95.
[74] Halpern, J., and Pearl, J. Causes and explanations: A structural-model approach. Part I: Causes. In 17th UAI (2001), pp. 194–202.
[75] Han, T., and Katoen, J. Counterexample generation in probabilistic model checking. IEEE Trans. on Software Engineering 35, 2 (2009), 72–86.
[76] Hansen, H., and Geldenhuys, J. Cheap and small counterexamples. In Software Engineering and Formal Methods (SEFM '08) (2008), IEEE Computer Society Press, pp. 53–62.
[77] Hansen, H., and Kervinen, A. Minimal counterexamples in O(n log n) memory and O(n^2) time. In ACSD 2006 (2006), IEEE Computer Society Press, pp. 131–142.
[78] Hansson, H., and Jonsson, B. A logic for reasoning about time and reliability. Formal Aspects of Computing 6, 5 (1994), 512–535.
[79] Heimdahl, M., Rayadurgam, S., and Visser, W. Specification centered testing. In Second International Workshop on Automated Program Analysis, Testing and Verification (2000).
[80] Hermanns, H., Wachter, B., and Zhang, L. Probabilistic CEGAR. In Computer Aided Verification (CAV) (2008), LNCS, vol. 5123, Springer, Berlin, Heidelberg, pp. 162–175.
[81] Hinton, A., Kwiatkowska, M., Norman, G., and Parker, D. PRISM: A tool for automatic verification of probabilistic systems. In TACAS (2006), LNCS, vol. 3920, Springer, Berlin, Heidelberg, pp. 441–444.
[82] Hojati, R., Brayton, R. K., and Kurshan, R. P. BDD-based debugging of designs using language containment and fair CTL. In Fifth Conference on Computer Aided Verification (CAV 93) (1993), LNCS, vol. 697, Springer, Berlin, Heidelberg, pp. 41–58.
[83] Hojati, R., Brayton, R. K., and Kurshan, R. P. BDD-based debugging of designs using language containment and fair CTL. In CAV 93 (1993), LNCS, vol. 697, Springer, Berlin, Heidelberg, pp. 41–58.
[84] Hojati, R., Touati, H., Kurshan, R. P., and Brayton, R. K. Efficient ω-regular language containment. In Computer Aided Verification (1992), LNCS, vol. 1708, Springer, Berlin, Heidelberg, pp. 371–382.
[85] Holzmann, G., Peled, D., and Yannakakis, M. On nested depth first search. In SPIN '96 (1996).
[86] Holzmann, G. J. The model checker SPIN. IEEE Transactions on Software Engineering 23, 5 (1997), 279–295.
[87] Hong, H. S., and Lee, I. Automatic test generation from specifications for control-flow and data-flow coverage criteria. In International Conference on Software Engineering (ICSE) (2003).
[88] Beer, I., Ben-David, S., Chockler, H., Orni, A., and Trefler, R. Explaining counterexamples using causality. Formal Methods in System Design 40, 1 (2012), 20–40.
[89] Janota, M., Grigore, R., and Marques-Silva, J. Counterexample guided abstraction refinement algorithm for propositional circumscription. In JELIA'10: Proceedings of the 12th European Conference on Logics in Artificial Intelligence (2010), LNCS, vol. 6341, Springer, Berlin, Heidelberg, pp. 195–207.
[90] Jansen, N., Abraham, E., Volk, M., Wimmer, R., Katoen, J., and Becker, B. The COMICS tool - computing minimal counterexamples for DTMCs. In ATVA (2012), LNCS, vol. 7561, Springer, Berlin, Heidelberg, pp. 249–253.
[91] Jia, Y., and Harman, M. An analysis and survey of the development of mutation testing. IEEE Transactions on Software Engineering 37, 5 (2011), 649–678.
[92] Jin, H., Ravi, K., and Somenzi, F. Fate and free will in error traces. International Journal on Software Tools for Technology Transfer 6, 2 (2004), 102–116.
[93] Kashyap, S., and Garg, V. Producing short counterexamples using crucial events. In CAV 2008 (2008), LNCS, vol. 5123, Springer, Berlin, Heidelberg, pp. 491–503.
[94] Katoen, J.-P., Khattri, M., and Zapreev, I. S. A Markov reward model checker. In QEST (2005), pp. 243–244.
[95] Kesten, Y., Pnueli, A., and Raviv, L.-o. Algorithmic verification of linear temporal logic specifications. In International Colloquium on Automata, Languages, and Programming (ICALP-98) (1998), LNCS, vol. 1443, Springer, Berlin, Heidelberg, pp. 1–16.
[96] Kroening, D., Groce, A., and Clarke, E. Counterexample guided abstraction refinement via program execution. In 6th International Conference on Formal Engineering Methods (ICFEM) (2004), LNCS, vol. 3308, Springer, Berlin, Heidelberg, pp. 224–238.
[97] Kumar, N., Kumar, V., and Viswanathan, M. On the complexity of error explanation. In Verification, Model Checking, and Abstract Interpretation (VMCAI) (2005), LNCS, vol. 3385, Springer, Berlin, Heidelberg, pp. 448–464.
[98] Kumazawa, T., and Tamai, T. Counterexample-based error localization of behavior models. In NASA Formal Methods (2011), pp. 222–236.
[99] Kurshan, R. P. Computer-Aided Verification of Coordinating Processes: The Automata-Theoretic Approach. Princeton University Press, 1994.
[100] Larsen, K. G., Pettersson, P., and Wang, Y. UPPAAL in a nutshell. Int. J. Software Tools for Technology Transfer 1, 1 (1997), 134–152.
[101] Leitner-Fischer, F., and Leue, S. On the synergy of probabilistic causality computation and causality checking. In SPIN 2013 (2013), LNCS, vol. 7976, Springer-Verlag, Berlin, Heidelberg, pp. 246–263.
[102] Leitner-Fischer, F., and Leue, S. Probabilistic fault tree synthesis using causality computation. IJCCBS 4, 2 (2013), 119–143.
[103] Leue, S., and Befrouei, M. T. Counterexample explanation by anomaly detection. In SPIN (2012), vol. 7385 of Lecture Notes in Computer Science, Springer, pp. 24–42.
[104] Leue, S., and Befrouei, M. T. Mining sequential patterns to explain concurrent counterexamples. In SPIN (2013), vol. 7976 of Lecture Notes in Computer Science, Springer, pp. 264–281.
[105] Lewis, D. Causation. Journal of Philosophy 70 (1973), 556–567.
[106] Long, D. Model Checking, Abstraction and Compositional Verification. PhD thesis, School of Computer Science, Carnegie Mellon University, 1993.
[107] McMillan, K., and Zuck, L. Abstract counterexamples for non-disjunctive abstractions. In Reachability Problems (2009), LNCS, vol. 5797, Springer, Berlin, Heidelberg, pp. 176–188.
[108] Nopper, T., Scholl, C., and Becker, B. Computation of minimal counterexamples by using black box techniques and symbolic methods. In Computer-Aided Design (ICCAD) (2007), IEEE Computer Society Press, pp. 273–280.
[109] Pnueli, A. The temporal logic of programs. In 18th Annual Symposium on Foundations of Computer Science (1977), IEEE, pp. 46–57.
[110] Pytlik, B., Renieris, M., Krishnamurthi, S., and Reiss, S. P. Automated fault localization using potential invariants. In AADEBUG'2003, Fifth International Workshop on Automated and Algorithmic Debugging (2003), pp. 273–276.
[111] Ravi, K., Bloem, R., and Somenzi, F. A comparative study of symbolic algorithms for the computation of fair cycles. In Third International Conference, FMCAD 2000 (2000), LNCS, vol. 1954, Springer, Berlin, Heidelberg, pp. 162–179.
[112] Schwoon, S., and Esparza, J. A note on on-the-fly verification algorithms. In TACAS 2005 (2005), LNCS, vol. 3440, Springer, Berlin, Heidelberg, pp. 174–190.
[113] Ravi, K., and Somenzi, F. Minimal assignments for bounded model checking. In TACAS (2004), LNCS, vol. 2988, Springer, Berlin, Heidelberg, pp. 31–45.
[114] Renieris, M., and Reiss, S. Fault localization with nearest neighbor queries. In ASE (2003), IEEE Computer Society, pp. 30–39.
[115] Saidi, H., and Shankar, N. Abstract and model check while you prove. In CAV 99 (1999), LNCS, vol. 1633, Springer, Berlin, Heidelberg, pp. 443–454.
[116] Schmalz, M., Varacca, D., and Volzer, H. Counterexamples in probabilistic LTL model checking for Markov chains. In International Conference on Concurrency Theory (CONCUR) (2009), LNCS, vol. 5710, Springer, Berlin, Heidelberg, pp. 587–602.
[117] Schuppan, V., and Biere, A. Shortest counterexamples for symbolic model checking of LTL with past. In 11th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (2005), LNCS, vol. 3440, Springer, Berlin, Heidelberg, pp. 493–509.
[118] Shen, S., Qin, Y., and Li, S. Bug localization of hardware system with control flow distance minimization. In 13th IEEE International Workshop on Logic and Synthesis (IWLS 2004) (2004).
[119] Shen, S., Qin, Y., and Li, S. Localizing errors in counterexample with iteratively witness searching. In ATVA (2004), LNCS, vol. 3299, Springer, Berlin, Heidelberg, pp. 456–469.
[120] Shen, S., Qin, Y., and Li, S. Minimizing counterexample with unit core extraction and incremental SAT. In Verification, Model Checking, and Abstract Interpretation (2005), LNCS, vol. 3385, Springer, Berlin, Heidelberg, pp. 298–312.
[121] Shen, S., Qin, Y., and Li, S. Localizing errors in counterexample with iteratively witness searching. In ATVA 2004 (2004), LNCS, vol. 3299, Springer, Berlin, Heidelberg, pp. 459–464.
[122] Shen, S.-Y., Qin, Y., and Li, S. A fast counterexample minimization approach with refutation analysis and incremental SAT. In Conference on Asia South Pacific Design Automation (2005), pp. 451–454.
[123] Sheyner, O., Haines, J., Jha, S., Lippmann, R., and Wing, J. Automated generation and analysis of attack graphs. In IEEE Symposium on Security and Privacy (2002), pp. 273–284.
[124] Tan, J., Avrunin, G., and Leue, S. Heuristic-guided counterexample search in FLAVERS. In 12th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2004), pp. 201–210.
[125] Tan, L., Sokolsky, O., and Lee, I. Specification-based testing with linear temporal logic. In Proceedings of IEEE International Conference on Information Reuse and Integration (2004), pp. 493–498.
[126] Tarjan, R. E. Depth-first search and linear graph algorithms. SIAM Journal on Computing 1, 2 (1972), 146–160.
[127] Tip, F., and Dinesh, T. A slicing-based approach for locating type errors. ACM Transactions on Software Engineering and Methodology 10, 1 (2001), 5–55.
[128] Touati, H. J., Brayton, R. K., and Kurshan, R. P. Testing language containment for ω-automata using BDDs. In International Workshop on Formal Methods in VLSI Design (1991), pp. 371–382.
[129] Valmari, A., and Geldenhuys, J. Tarjan's algorithm makes on-the-fly LTL verification more efficient. In Jensen, K., Podelski, A. (eds.) TACAS (2004), LNCS, vol. 2988, Springer, Berlin, Heidelberg, pp. 205–219.
[130] Vardi, M., Wolper, P., and Yannakakis, M. Memory-efficient algorithms for the verification of temporal properties. Formal Methods in System Design 1, 2 (1992), 275–288.
[131] Visser, W., Havelund, K., Brat, G., Park, S., and Lerda, F. Model checking programs. Automated Software Engineering Journal 10, 2 (2003), 203–222.
[132] Wang, C., Yang, Z., Ivancic, F., and Gupta, A. Whodunit? Causal analysis for counterexamples. In 4th International Symposium, ATVA (2006), LNCS, vol. 4218, Springer, Berlin, Heidelberg, pp. 82–95.
[133] Wimmer, R., Braitling, B., and Becker, B. Counterexample generation for discrete-time Markov chains using bounded model checking. In Verification, Model Checking, and Abstract Interpretation (2009), LNCS, vol. 5403, Springer, Berlin, Heidelberg, pp. 366–380.
[134] Wimmer, R., Jansen, N., Abraham, E., Becker, B., and Katoen, J. Minimal critical subsystems for discrete-time Markov models. In TACAS (2012), LNCS, vol. 7214, Springer, Berlin, Heidelberg, pp. 299–314.
[135] Wimmer, R., Jansen, N., and Vorpahl, A. High-level counterexamples for probabilistic automata. In Quantitative Evaluation of Systems (QEST) (2013), LNCS, vol. 8054, Springer, Berlin, Heidelberg, pp. 39–54.
[136] Xie, A., and Beerel, P. A. Implicit enumeration of strongly connected components. In International Conference on Computer-Aided Design (1999), pp. 37–40.
[137] Zeller, A. Yesterday, my program worked. Today, it does not. Why? In ACM Symposium on the Foundations of Software Engineering (1999), pp. 253–267.
[138] Zeller, A. Isolating cause-effect chains for computer programs. In ACM Symposium on the Foundations of Software Engineering (2002), pp. 1–10.