Informatica 33 (2009) 285-296

Optimization of Actions in Activation Timed Influence Nets

M. Faraz Rafi, Abbas K. Zaidi and Alexander H. Levis
System Architectures Laboratory, ECE Dept., George Mason University, Fairfax, VA, USA
E-mail: {mrafil, szaidi2, alevis}@gmu.edu

P. Papantoni-Kazakos
EE Dept., University of Colorado Denver, Denver, CO, USA
E-mail: Titsa.Papantoni@cudenver.edu

Keywords: influence net, activation timed influence net, Bayesian net

Received: October 12, 2008

A sequential evolution of actions, in conjunction with the preconditions of their environment and their effects, is depicted by Activation Timed Influence Nets. In this paper, we develop two algorithms for the optimal selection of such actions, given a set of preconditions. A special case of the two algorithms is also considered, in which the selection of actions is further constrained by the use of dependencies among them. The two algorithms are based on two different optimization criteria: one maximizes the probability of a given set of target effects, while the other maximizes the average worth of the effects' vector.

Povzetek: Two algorithms are presented for the optimization of actions in time-dependent nets.

1 Introduction

We consider the scenario¹ where a sequence of actions needs to be initiated towards the materializing of some desirable effects. As depicted in Figure 1, each action is supported by a set of preconditions and gives rise to a set of effects; the latter then become the preconditions of the following action(s) which, in turn, give rise to another set of effects. Such a sequential evolution of actions is termed an Activation Timed Influence Net (ATIN), where the action performers may be humans. ATINs are an extension of an earlier formalism called Timed Influence Nets (TINs) [6-12, 20-27, 30, 31] that integrate the notions of time and uncertainty in a network model. TINs are comprised of nodes that represent propositions (i.e., pre- and post-conditions of potential actions, as well as assertions of events which may indirectly describe such actions), connected via causal links that represent relationships between the nodes, without any explicit representation of actions. TINs have been experimentally used in the area of Effects Based Operations (EBOs) for evaluating alternative courses of action and their effectiveness in achieving mission objectives in a variety of domains, e.g., war games [20-22, 25] and coalition peace operations [24, 27], to name a few. A number of analytical tools [6-12, 23, 24, 27, 30] have also been developed over the years for TIN models to help an analyst update conditions/assertions represented as nodes in a TIN, map a TIN model to a Time Sliced Bayesian Network for incorporating feedback evidence, determine the best set of preconditions for both timed and un-timed versions of Influence Nets, and assess temporal aspects of the influences between nodes. A recent work [31] on TINs, their underlying constructs, and the computational algorithms provides a comprehensive analytical underpinning of the modeling and analysis approach.

¹ This work was supported by the Air Force Office of Scientific Research (AFOSR) under Grants FA9550-05-1-0106 and FA9550-05-1-0388.
Figure 1: Network representation of an Activation Timed Influence Net (ATIN); c: preconditions, a: actions, e: effects.

In contrast to their predecessors (i.e., TINs), ATINs explicitly incorporate as nodes the mechanisms and/or actions that are responsible for changes in the state of a domain; the remaining nodes represent the preconditions and effects of actions. A set of preconditions may support a number of different actions, each of which may lead to the same effects, albeit with different probabilities and different costs/awards. The objective is to select an optimal set of actions, where optimality is determined via a pre-selected performance criterion. In this paper, we present two algorithms which attain such an objective. We note that an effort to develop an action selection algorithm is also presented in [1].

The organization of the paper is as follows: In Section 2, we present the core formalization of the problem, including two different optimization criteria. In Section 3, we derive the two algorithms which address these criteria. In Section 4, we express the extensions of the two algorithms to the network propagation scenario. In Section 5, we include numerical evaluations, while in Section 6, we draw some conclusions.

1.1 Related Work

ATINs include action planning. In the domain of action planning, classical planners assume that the effects of an action are known with certainty and generate a set of actions that will achieve the desired goals [19]. Some planners do monitor for errors as actions are executed, but no action adaptations are incorporated [29]. Other planners assign probabilities to the effects of actions [2, 13, 14, 16, 28], but provide no mechanisms for reacting to changes in the environment. Reactive planners [5, 15, 17, 18] are designed to select and execute actions in response to the current state of the world but, with a few exceptions [3], [4], they do not use probabilistic information to determine the likelihood of success of the actions. In [1], probabilistic information is used in an effort to deal with environmental uncertainties, but no optimal action selection strategies are considered or proposed.

The ATIN formalism in this paper is similar to an earlier work by Sugato Bagchi et al. [1] on planning under uncertainty. The similarity, however, stops at the graph representation of preconditions, actions, and their effects. Similar parallels can also be drawn with other graph-based planning approaches, e.g., GraphPlan (http://www.cs.cmu.edu/~avrim/graphplan.html). The approach in this paper represents a new formalism and is based on well-established statistical results.

2 Problem formalization - core

In this section, we consider a modular core problem. We initially isolate a single action with its supporting preconditions and its resulting effects, as depicted in Fig. 2. We use the following notation:

- $X_1^n = [X_1, \ldots, X_n]^T$: the status random vector of the preconditions, where $X_i = 1$ if precondition $c_i$ is present and $X_i = 0$ if it is absent; $x_1^n$ denotes binary vector value realizations of $X_1^n$.
- $Y_1^m = [Y_1, \ldots, Y_m]^T$: the status random vector of the effects, where $Y_i = 1$ if effect $e_i$ is present and $Y_i = 0$ if effect $e_i$ is absent; $y_1^m$ denotes binary vector value realizations of $Y_1^m$.
- $p_j(x_1^n)$: the probability of success for action $a_j$, given that the value of the precondition status vector is $x_1^n$; $P(\text{success for action } a_j \mid x_1^n)$.
- $q_j(y_1^m)$: the probability that the value of the effects' status vector is $y_1^m$, given that action $a_j$ is taken; $P(y_1^m \mid a_j \text{ taken})$.
- $q_0(y_1^m)$: the probability that the value of the effects' status vector is $y_1^m$, given that no action is taken; $P(y_1^m \mid \text{no action taken})$.
- $U_j(y_1^m)$: the utility of the value $y_1^m$ of the effects' status vector, when action $a_j$ is taken.
- $U_0(y_1^m)$: the utility of the value $y_1^m$ of the effects' status vector, when no action is taken.

We note that the utility function $U_j(y_1^m)$ measures the net worth of the effects' vector value $y_1^m$ when action $a_j$ is taken; thus, $U_j(y_1^m)$ is computed as the worth of $y_1^m$ minus the cost of deployment for action $a_j$.

Let us now assume mutually exclusive actions, which are supported by the same preconditions and lead to the same set of effects (as shown in Fig. 3). Let $\{a_j\}_{1 \le j \le k}$ denote these actions. When the selection of actions is constrained, the dependencies among the actions are expressed via an Action Dependency Matrix (ADM) $[a_{ij}]$, which defines the dependency of each action on every other one: positive dependency is depicted by $a_{ij} = 1$ and negative dependency by $a_{ij} = 0$.

3 Solutions of the core problem

The solutions of the core problem, under the two optimization criteria, are given by the following theorem.

Theorem 1

a. Given $x_1^n$ and the set of mutually exclusive actions $\{a_j\}_{1 \le j \le k}$, the probability $P(y_1^m \mid x_1^n)$ is maximized:

by action $a_{j^*}$, if

$$q_{j^*}(y_1^m)\, p_{j^*}(x_1^n) = \max_{1 \le j \le k} q_j(y_1^m)\, p_j(x_1^n) > q_0(y_1^m) \quad (1)$$

where then $\max P(y_1^m \mid x_1^n) = q_{j^*}(y_1^m)\, p_{j^*}(x_1^n)$;

by no action, if

$$q_0(y_1^m) > \max_{1 \le j \le k} q_j(y_1^m)\, p_j(x_1^n) \quad (2)$$

where then $\max P(y_1^m \mid x_1^n) = q_0(y_1^m)$.

If more than one action satisfies the maximum in (1), then one of these actions may be selected randomly.

b. Given $x_1^n$ and a set of actions $\{a_j\}_{1 \le j \le k}$ as above, the average utility of the effects' vector is maximized:

by action $a_{j^*}$, if

$$A_{j^*}(x_1^n) = p_{j^*}(x_1^n) \sum_{y_1^m} q_{j^*}(y_1^m)\, U_{j^*}(y_1^m) = \max_{1 \le j \le k} p_j(x_1^n) \sum_{y_1^m} q_j(y_1^m)\, U_j(y_1^m) > \sum_{y_1^m} q_0(y_1^m)\, U_0(y_1^m) \quad (3)$$

by no action, if

$$\sum_{y_1^m} q_0(y_1^m)\, U_0(y_1^m) > \max_{1 \le j \le k} p_j(x_1^n) \sum_{y_1^m} q_j(y_1^m)\, U_j(y_1^m) \quad (4)$$

$A_{j^*}(x_1^n)$ in (3) is the award assigned to action $a_{j^*}$; it is also the worth assigned to the precondition vector value $x_1^n$ by the action $a_{j^*}$. If more than one action attains the maximum award $A_{j^*}(x_1^n)$ in (3), one of them is selected randomly.
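To make the selection rules of Theorem 1 concrete, the following sketch implements parts (a) and (b) directly from (1)-(4). It is an illustration only: the dictionary-based tables, the encoding of "no action" as None, and all names are our own assumptions, not constructs from the paper.

```python
# A minimal sketch of the Theorem 1 selection rules; the table layout and all
# names are illustrative assumptions, not constructs from the paper.

def select_max_prob(y, x, q0, q, p):
    """Part (a): return the action maximizing P(y | x); None means 'no action'.

    q0[y]   -- P(effects = y | no action taken)
    q[j][y] -- P(effects = y | action a_j taken)
    p[j][x] -- P(action a_j succeeds | preconditions x)
    """
    best_j, best_val = None, q0[y]              # 'no action' baseline, Eq. (2)
    for j in q:
        val = q[j][y] * p[j][x]                 # candidate value of Eq. (1)
        if val > best_val:
            best_j, best_val = j, val
    return best_j, best_val

def select_max_utility(x, q0, q, p, U, U0):
    """Part (b): return the action maximizing the average utility, Eqs. (3)-(4).

    U[j][y] -- utility of effects y under action a_j (worth minus cost)
    U0[y]   -- utility of effects y under no action
    """
    best_j = None
    best_val = sum(q0[y] * U0[y] for y in q0)   # 'no action' side of Eq. (4)
    for j in q:
        award = p[j][x] * sum(q[j][y] * U[j][y] for y in q[j])  # A_j(x), Eq. (3)
        if award > best_val:
            best_j, best_val = j, award
    return best_j, best_val

# Tiny usage example: two actions, two effect vectors.
x = (1, 1)
q0 = {(0, 1): 0.2, (1, 1): 0.1}
q = {1: {(0, 1): 0.5, (1, 1): 0.4}, 2: {(0, 1): 0.3, (1, 1): 0.6}}
p = {1: {x: 0.8}, 2: {x: 0.9}}
print(select_max_prob((1, 1), x, q0, q, p))     # -> (2, 0.54), up to rounding
```

Ties, which Theorem 1 resolves by a random pick, are resolved here in favor of the earlier candidate; a random tie-break would be an equally valid choice.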
4 Solutions of the network propagation problem

In this section, we generalize the core problem solutions expressed in Theorem 1, Section 3, to the sequence of actions depicted by the ATIN in Fig. 1.

Problem 1 (The Optimal Path Problem)

In the ATIN in Fig. 1, we fix the preconditions vector value $x_1^n(1)$ at time 1 and the effects' vector value $y_1^m(N)$ at time N. We then search for the sequence of actions that maximizes the probability $P(y_1^m(N) \mid x_1^n(1))$. The solution to this problem follows a dynamic programming approach, where $x_1^n(l) = y_1^m(l-1)$, $2 \le l \le N$, in our notation. The proof of the step evolution is included in the Appendix.

Step 1. For each $y_1^m(1) = x_1^n(2)$ value, find

$$r(y_1^m(1)) = \max\left[\, q_0(y_1^m(1)),\ \max_j q_j(y_1^m(1))\, p_j(x_1^n(1)) \,\right]$$

and the action index $j^*(y_1^m(1))$ that attains $r(y_1^m(1))$.

Step $l$. The values $r(y_1^m(l-1)) = \max P(y_1^m(l-1) \mid x_1^n(1))$, for each $y_1^m(l-1)$ value, are in memory, as well as the actions that attain them. At step $l$, the values

$$r(y_1^m(l)) = \max_{y_1^m(l-1)} \left\{ r(y_1^m(l-1)) \cdot \max\left[\, q_0(y_1^m(l)),\ \max_j q_j(y_1^m(l))\, p_j(y_1^m(l-1)) \,\right] \right\}$$

are maintained, as well as the sequences of actions leading to them.

The complexity of this problem is polynomial with respect to the number of links: assuming that a given ATIN model has N levels and each level has k links, the complexity is O(N × k).
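The forward recursion of Steps 1 and $l$ can be sketched as follows; the per-level table layout and all names are our own illustrative assumptions. For transparency, the sketch enumerates all binary effect vectors at each level, whereas the paper's O(N × k) count refers to the links actually present in the net.

```python
import itertools

def optimal_path(x1, levels, m):
    """Problem 1 forward recursion: r(y) = max P(y(N) | x1) per final vector y,
    together with an attaining action sequence (None entries mean no action).

    x1     -- initial precondition vector, a binary tuple
    levels -- one (q0, q, p) triple per step, with q0[y] = P(y | no action),
              q[j][y] = P(y | a_j taken), p[j][x] = P(a_j succeeds | x)
    m      -- number of effects per level (assumed equal across levels here)
    """
    ys = list(itertools.product((0, 1), repeat=m))

    def best_step(y, x, q0, q, p):
        # max[q0(y), max_j q_j(y) p_j(x)] and the attaining action index
        best_val, best_j = q0[y], None
        for j in q:
            v = q[j][y] * p[j][x]
            if v > best_val:
                best_val, best_j = v, j
        return best_val, best_j

    # Step 1: r(y(1)) and j*(y(1)) for every effects vector y(1).
    q0, q, p = levels[0]
    r, seq = {}, {}
    for y in ys:
        r[y], j = best_step(y, x1, q0, q, p)
        seq[y] = [j]

    # Step l (l >= 2): r(y(l)) = max over y(l-1) of r(y(l-1)) times the step
    # term, remembering the action sequence that leads to each value.
    for q0, q, p in levels[1:]:
        r_new, seq_new = {}, {}
        for y in ys:
            vals = {yp: r[yp] * best_step(y, yp, q0, q, p)[0] for yp in ys}
            yp_best = max(vals, key=vals.get)
            r_new[y] = vals[yp_best]
            seq_new[y] = seq[yp_best] + [best_step(y, yp_best, q0, q, p)[1]]
        r, seq = r_new, seq_new
    return r, seq
```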
Problem 2 (The Average Utility Maximization)

In the ATIN in Fig. 1, we fix the value of the precondition vector at time 1, denoted $x_1^n(1)$. For each value $y_1^w(N)$ of the effects' vector at time N, we assign a worth function $U(y_1^w(N))$. For each action $a_j(l)$ at time $l$, we assign a deployment cost $c_j(l)$. The utility of the effects' vector value $y_1^w(N)$, when action $a_j(N)$ is taken, is then equal to $U_j(y_1^w(N)) = U(y_1^w(N)) - c_j(N)$, while the utility of the same value, when no action is taken, equals $U_0(y_1^w(N)) = U(y_1^w(N))$.

We seek the sequence of actions which leads to the maximization of the average utility. The evolving algorithm, from part (b) of Theorem 1, back propagates as follows; the proof is in the Appendix.

Step 1. Compute the action awards (including that of no action), with the notation of Figure 1, as follows:

$$A_j(x_1^l(N-1)) = p_j(x_1^l(N-1)) \sum_{y_1^w(N)} q_j(y_1^w(N))\, U_j(y_1^w(N)); \quad 0 \le j \le r,$$

with $p_0(x_1^l(N-1)) = 1$. Select $j^*$:

$$A_{j^*(x_1^l(N-1))}(x_1^l(N-1)) = \max_j A_j(x_1^l(N-1)),$$

for each $x_1^l(N-1)$ value. Take action $a_{j^*(x_1^l(N-1))}(N)$ for the preconditions vector value $x_1^l(N-1)$ and simultaneously assign the worth $A_{j^*(x_1^l(N-1))}(x_1^l(N-1))$ to $x_1^l(N-1)$. That is, assign:

$$U(x_1^l(N-1)) = A_{j^*(x_1^l(N-1))}(x_1^l(N-1)) \quad (5)$$

Step 2. Back propagate to the preconditions at time N-2, as in Step 1, starting with the worth assignments in (5) and the subsequent utilities

$$U_j(x_1^l(N-1)) = \max\left[\, A_{j^*(x_1^l(N-1))}(x_1^l(N-1)) - c_j(N-1),\ 0 \,\right]$$

Step n. As in Steps 1 and 2, for the subsequent levels, the above described algorithm generates the optimal sequence of actions for given initial preconditions $x_1^n(1)$. The optimal such preconditions can also be found via maximization of the utility $U_j(x_1^k(2))$ with respect to $x_1^n(1)$.

The complexity of this problem is also polynomial with respect to the number of links.
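A sketch of this back propagation follows, under the same illustrative table conventions as the earlier sketches; every name here is an assumption of ours, not the paper's notation.

```python
def back_propagate(levels, U_final):
    """Problem 2 back propagation (Steps 1 through n).

    levels  -- one (q0, q, p, c) tuple per step, ordered from step 1 to N-1:
               q0[y], q[j][y], p[j][x] as before; c[j] = deployment cost of
               the action a_j taken at that step
    U_final -- worth U(y) of every effects vector y at level N
    Returns the selected action per precondition value at every level (from
    level 1 up) and the worth table propagated back to level 1.
    """
    U = dict(U_final)              # worth of the effects' vectors at level N
    plan = []
    for q0, q, p, c in reversed(levels):
        xs = {x for j in p for x in p[j]}             # precondition values
        new_U, choice = {}, {}
        for x in xs:
            best_j = None
            best_val = sum(q0[y] * U[y] for y in q0)  # 'no action' award
            for j in q:
                # U_j(y) = max[U(y) - c_j, 0], as in the experimental setup
                award = p[j][x] * sum(q[j][y] * max(U[y] - c[j], 0.0)
                                      for y in q[j])
                if award > best_val:
                    best_j, best_val = j, award
            new_U[x], choice[x] = best_val, best_j    # U(x) = A_j*(x), Eq. (5)
        plan.append(choice)
        U = new_U                                     # back propagate the worth
    plan.reverse()
    return plan, U
```

Running the loop from level N-1 down to level 1 reproduces Steps 1, 2, and n; the worth table returned at the end can then be maximized with respect to $x_1^n(1)$ to find the optimal initial preconditions, as noted above.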
Problems 3a, 3b (Optimization with Constrained Actions)

Problems 3a and 3b impose dependency constraints on the actions in the ATIN network. As explained in Section 2, an ADM defines the dependency of one action on every other one, where positive dependency is depicted by 1 and negative dependency is depicted by 0. The dependency constraints are taken into account when, at a certain level, an optimal action is finalized: at any given level, only positively related actions are considered in the calculations. As described in Step 1 of Problem 1 (see Section 4), for the first level, $r(y_1^m(1))$ is calculated in the same way for constrained actions as well; for the remaining levels, however, it is calculated differently. Consider

$$r(y_1^m(l)) = \max_{y_1^m(l-1)} \left\{ r(y_1^m(l-1)) \cdot \max\left[\, q_0(y_1^m(l)),\ \max_j q_j(y_1^m(l))\, p_j(y_1^m(l-1)) \,\right] \right\}$$

The term $r(y_1^m(l-1))$ corresponds to an action selected for execution at level $l-1$, whose dependent actions are known from the ADM. In this way, those combinations of actions which are not allowed by the ADM are eliminated from the calculation of $r(y_1^m(l))$, hence eliminating all links to and from the actions exhibiting negative dependencies. As a result, this yields a network with fewer links and eases the determination of the optimal sequence of actions; a sketch of the resulting filtered step is given at the end of Section 5.1.

5 Numerical evaluations

In this section, we focus on numerical scenarios. We first state the experimental setup; we then evaluate and discuss a specific experimental scenario. We only state the experimental setups for Problems 1 and 2, since those of Problems 3a and 3b are straightforward modifications of the former.

5.1 Experimental Setups

Experimental Setup for Problem 1

Assign the probabilities $\{q_j(x_1^k(l))\}$ and $\{p_j(x_1^k(l))\}$ as in Problem 2. Given these probabilities:

a. Compute first

$$r(y_1^m(1)) = \max\left[\, q_0(y_1^m(1)),\ \max_j q_j(y_1^m(1))\, p_j(x_1^n(1)) \,\right]$$

and the action $j^*(y_1^m(1))$ that attains $r(y_1^m(1))$.

b. For each $l$, $2 \le l \le N$, maintain in memory the values $r(y_1^m(l-1)) = \max P(y_1^m(l-1) \mid x_1^n(1))$, for each $y_1^m(l-1)$ value, and the actions that attain them. Then compute and maintain the values

$$r(y_1^m(l)) = \max_{y_1^m(l-1)} \left\{ r(y_1^m(l-1)) \cdot \max\left[\, q_0(y_1^m(l)),\ \max_j q_j(y_1^m(l))\, p_j(y_1^m(l-1)) \,\right] \right\}$$

Also maintain the actions that attain the values $r(y_1^m(l))$.

Experimental Setup for Problem 2

Considering the network in Fig. 1, assign:

a. Probabilities $p_j(x_1^k(l)) = P(\text{action } j \text{ succeeds} \mid x_1^k(l) \text{ preconditions})$ at all levels from 1 to N-1, where $p_0(x_1^k(l)) = 1$ for all $l$;

b. Probabilities $q_j(x_1^k(l)) = P(x_1^k(l) \text{ occurring} \mid \text{action } j \text{ at step } l-1)$ at all levels from 2 to N, where $q_0(x_1^k(l)) = P(x_1^k(l) \text{ occurring} \mid \text{no action at step } l-1)$ at all levels from 2 to N;

c. Implementation/deployment costs $c_j(l)$ for all actions, at all levels from 2 to N;

d. A worth function $U(y_1^w(N))$ for all values $y_1^w(N)$ of the effects' status vector, at level N.

Given the above assignments:

a. Compute first

$$A_j(x_1^l(N-1)) = p_j(x_1^l(N-1)) \sum_{y_1^w(N)} q_j(y_1^w(N))\, U_j(y_1^w(N))$$

where $p_0(x_1^l(N-1)) = 1$ and $U_j(y_1^w(N)) = \max[\, U(y_1^w(N)) - c_j(N),\ 0 \,]$, and select

$$A_{j^*(x_1^l(N-1))}(x_1^l(N-1)) = \max_{0 \le j \le k} A_j(x_1^l(N-1))$$

for all $x_1^l(N-1)$ values.

b. Take action $a_{j^*(x_1^l(N-1))}$ for each precondition vector value $x_1^l(N-1)$. Assign the worth $A_{j^*(x_1^l(N-1))}(x_1^l(N-1))$ to $x_1^l(N-1)$, as $U(x_1^l(N-1)) = A_{j^*(x_1^l(N-1))}(x_1^l(N-1))$.

Repeat steps (a) and (b) for level N-1 and back propagate to level N-2. Continue the back propagation down to level 1.
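As anticipated in the discussion of Problems 3a and 3b above, the constrained setups only modify the inner maximization: actions negatively related (ADM entry 0) to the action selected at the previous level are skipped. A sketch follows, under the same illustrative conventions and under our reading of the ADM as adm[i][j] = 1 when action a_j may follow action a_i.

```python
# The ADM-filtered step term for Problems 3a/3b; None denotes no action, which
# we assume is always permitted. The adm encoding is our interpretation of the
# ADM described in Section 2.

def constrained_step(y, x, prev_action, q0, q, p, adm):
    """max[q0(y), max over ADM-allowed j of q_j(y) p_j(x)]."""
    best_j, best_val = None, q0[y]
    for j in q:
        if prev_action is not None and adm[prev_action][j] == 0:
            continue            # negative dependency: link effectively removed
        v = q[j][y] * p[j][x]
        if v > best_val:
            best_j, best_val = j, v
    return best_j, best_val
```

Substituting this step term into the recursion of Problem 1 (or into the awards of Problem 2) realizes the link elimination described above.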
5.2 A Specific Experimental Scenario

In this section, we illustrate the use of Activation Timed Influence Nets with the help of an example ATIN and present the results of the algorithms included in this paper when applied to this ATIN. The model used in this section was derived from a Timed Influence Net presented by Wagenhals et al. in 2001 [27] (developed with the help of a team of subject matter experts) to address the internal political instabilities in Indonesia in the context of East Timor. For purposes of illustration, we have selected a part of this network, as shown in Fig. 4.

Example ATIN: The model provides detailed information about the religious, ethnic, governmental, and non-governmental organizations of Indonesia. In this section, the propositions and actions referred to are given in italic text. According to the model, a rebel militia formed by a minority group poses the main concern; it has captured a large number of people within its secured territory. Among these people, some are against the rebels and are considered to be at risk in case the negotiations with the local government do not work. For this example, consider the initial conditions when the rebels are getting local support, the community is in unrest, and the local administration is losing control. Based on the data provided, only one action can be executed from a possible set of actions at a given time, i.e., either the Indonesian press, or the provincial authority, or the minister of interior would declare resolve to keep peace. Depending upon this selected action and the data provided for the effects, only a specific set of events can result. For instance, the rebels may or may not start thinking that they are getting publicity, the GOI (Government of Indonesia) war may or may not expand, and the GOI chances of intervention and international attention may increase or decrease. Similarly, this specific set of events forms the set of possible preconditions for a later time. Depending upon which conditions actually become true, a second action can be selected for execution from another set of actions, i.e., the Security Council and General Assembly may or may not pass resolutions, or the UN may or may not declare resolve to keep peace. Depending upon this action and the data provided for the effects, a coalition may or may not form, the rebels may or may not contemplate talks, GOI support may increase, decrease, or not change at all, and the GOI may or may not allow the coalition into its territories. Ultimately, the coalition may authorize the use of force, which might compel the rebels to negotiate, and the humanitarian assistance (HA) may start preparing for the worst case. Depending upon which conditions are met, the coalition may declare resolve to keep peace or may declare war on the rebels. This may affect the chances of military confrontation, the rebels' popularity, and the chances of a negotiated settlement, which represent the final effects in the network.

Table 1 lists some of the parameters (and their values) required by the network in Fig. 4. The parameters in the table are listed by their abbreviated labels in addition to the phrases shown inside the network nodes in the figure. For the sake of brevity, we do not list all the values.

Solutions to Problems

Solution to Problem 1 (Optimal Path Problem): Considering the example scenario described earlier, we need to identify an optimal path (i.e., the sequence of actions) resulting in the final effects where the chances of military confrontation are reduced, the rebels start losing local support, and the chances of negotiation start increasing. This set of effects (post-conditions) leads to the following output state in the ATIN model:

- Reduction in the chances of military confrontation (i.e., $Y_{12} = 0$);
- Decrease in local support and popularity for the Rebels (i.e., $Y_{13} = 1$);
- Increase in the chances of a negotiated settlement (i.e., $Y_{14} = 1$).

The above defined conditions lead to a post-condition vector $[0, 1, 1]^T$ at level 4, i.e., $y_{12}^{14}(4)$. After fixing the post-condition vector, we define the initial preconditions, when the rebels have been getting local support, the community has been in unrest, and the local administration has started losing control. This set of preconditions, given by $x_1^3(1)$, results in a vector value of $[1, 1, 1]^T$, where

- $X_1 = 1$ represents the condition Rebels are getting Local Support;
- $X_2 = 1$ represents the condition There is unrest in the Community;
- $X_3 = 1$ represents the condition Local Administration is losing Local Control.

We want to find the sequence of actions which achieves the desired effects $y_{12}^{14}(4)$ given the initial preconditions $x_1^3(1)$. Technically, we want to identify the sequence of actions which maximizes the probability $P(y_{12}^{14}(4) \mid x_1^3(1))$. Applying the optimal path algorithm (see Section 4) shows that if the provincial authority and the UN declare resolve to keep peace, and the coalition takes no other action but instead declares resolve to keep peace, then the desired effects will be achieved, resulting in lower chances of military confrontation, a reduction in local support for the rebels, and higher chances of a negotiated settlement.

Figure 4: Example ATIN.
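As a cross-check of the first stage of this solution, the Step 1 comparison can be reproduced directly from the Level 1 entries of Table 1 below; the numeric values are read off the table, and the labels are abbreviated.

```python
# Step 1 of the optimal path computation at x = [1,1,1]^T, using the Level 1
# entries of Table 1: q_0(y) = 6.45% and the (q_j, p_j) pairs of a1-a3.
candidates = {
    "no action": 0.0645,     # q_0(y)
    "a1": 0.68 * 0.80,       # Indonesian press declares resolve:     0.5440
    "a2": 0.74 * 0.86,       # Provincial Authority declares resolve: 0.6364
    "a3": 0.56 * 0.20,       # Minister of interior declares resolve: 0.1120
}
j_star = max(candidates, key=candidates.get)
print(j_star, candidates[j_star])   # -> a2 0.6364, i.e., r(y(1)) = 63.64%
```

The maximizer a2 is consistent with the provincial authority's declaration appearing as the first action of the optimal sequence reported above.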
Table 1: Parameter values in the Example ATIN.

Level 1, with $q_0(y_1^4) = 6.45\%$ and $x_1^3 = [1, 1, 1]^T$:

- Indonesian press declares resolve to keep peace (a1): $q_1(y_1^4) = 68.00\%$, $p_1(x_1^3) = 80.00\%$, $p_1 q_1 = 54.40\%$
- Provincial Authority declares resolve to keep peace (a2): $q_2(y_1^4) = 74.00\%$, $p_2(x_1^3) = 86.00\%$, $p_2 q_2 = 63.64\%$
- Minister of interior declares resolve to keep peace (a3): $q_3(y_1^4) = 56.00\%$, $p_3(x_1^3) = 20.00\%$, $p_3 q_3 = 11.20\%$

with $r(y_1^4) = 63.64\%$, attained by action a2.

Level 2, with $q_0(y_1^4) = 0.95\%$:

- Resolution is passed in Security Council (a4): $q_4(y_1^4) = 14.00\%$; $p_4([0,0,0,0]^T) = 16.00\%$ ($p_4 q_4 = 2.24\%$), $p_4([1,0,1,1]^T) = 91.00\%$ ($p_4 q_4 = 12.74\%$), $p_4([1,1,1,1]^T) = 18.00\%$ ($p_4 q_4 = 2.52\%$)
- UN declares resolve to keep peace (a5): $q_5(y_1^4) = 42.00\%$; $p_5([0,0,0,0]^T) = 66.00\%$ ($p_5 q_5 = 27.72\%$), $p_5([1,0,1,1]^T) = 81.00\%$ ($p_5 q_5 = 34.02\%$), $p_5([1,1,1,1]^T) = 13.41\%$ ($p_5 q_5 = 5.63\%$)
- Resolution is passed in General Assembly (a6): $q_6(y_1^4) = 39.00\%$; $p_6([0,0,0,0]^T) = 43.05\%$ ($p_6 q_6 = 16.79\%$), $p_6([1,0,1,1]^T) = 48.20\%$ ($p_6 q_6 = 18.80\%$), $p_6([1,1,1,1]^T) = 24.87\%$ ($p_6 q_6 = 9.70\%$)

with $r(y_1^4) = 34.02\%$, attained by action a5.

Appendix

Proof of the Network Propagation - Problem 1

Using the notation in Section 4, Problem 1, and via the use of the Theorem of Total Probability and the Bayes Rule, we obtain:

$$r(y_1^m(N)) = \max_{\text{sequence of actions}} P(y_1^m(N) \mid x_1^n(1)) = \max_{\text{sequence of actions}} \sum_{y_1^m(N-1)} P(y_1^m(N),\, y_1^m(N-1) \mid x_1^n(1)) =$$

$$= \max_{\text{sequence of actions}} \sum_{y_1^m(N-1)} P(y_1^m(N) \mid y_1^m(N-1))\, P(y_1^m(N-1) \mid x_1^n(1)) \le \max_{y_1^m(N-1)} \left[ \max_{\text{action}} P(y_1^m(N) \mid y_1^m(N-1)) \right] r(y_1^m(N-1))$$

where $P(\text{no action succeeds} \mid \text{no action taken}) = 1$ and where the value $r(y_1^m(N-1))$ attaining the maximum is selected. The above proves the general step in the network propagation of Problem 1.

Proof of the Network Propagation - Problem 2

Using the notation in Section 4, Problem 2, and via the use of the Theorem of Total Probability and the Bayes Rule, we obtain:

$$\max_{\text{sequence of actions}} \sum_{y_1^w(N)} U(y_1^w(N))\, P(y_1^w(N) \mid x_1^n(1)) = \max_{\text{sequence of actions}} \sum_{y_1^w(N)} U(y_1^w(N)) \sum_{x_1^l(N-1)} P(y_1^w(N),\, x_1^l(N-1) \mid x_1^n(1)) =$$

$$= \max_{\text{sequence of actions}} \sum_{x_1^l(N-1)} P(x_1^l(N-1) \mid x_1^n(1)) \sum_{y_1^w(N)} U(y_1^w(N))\, P(y_1^w(N) \mid x_1^l(N-1)) =$$

$$= \max_{\text{sequence of actions}} \sum_{x_1^l(N-1)} P(x_1^l(N-1) \mid x_1^n(1))\, A_{j^*(x_1^l(N-1))}(x_1^l(N-1))$$

where, per Theorem 1(b), no action is selected at a level whenever $\sum_{y_1^m} q_0(y_1^m)\, U_0(y_1^m) > \max_j p_j(x_1^n) \sum_{y_1^m} q_j(y_1^m)\, U_j(y_1^m)$. The latter expression proves the back propagation property and the steps in the algorithm.
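As a numerical illustration of the Problem 1 recursion proved above, the following sketch compares, on a small randomly generated three-step chain, the dynamic-programming values r(y(N)) with a brute-force maximization over all intermediate effect vectors. The construction and all names are illustrative assumptions of ours.

```python
import itertools, random

random.seed(1)
ys = list(itertools.product((0, 1), repeat=2))   # two binary effects per level
actions = [1, 2]
x1 = (1, 1)

def rand_q():                                    # a random conditional table
    w = [random.random() for _ in ys]
    s = sum(w)
    return {y: v / s for y, v in zip(ys, w)}

steps = []                                       # (q0, q, p) per step; 3 steps
for l in range(3):
    xs = [x1] if l == 0 else ys
    steps.append((rand_q(),
                  {j: rand_q() for j in actions},
                  {j: {x: random.random() for x in xs} for j in actions}))

def term(y, x, q0, q, p):                # max[q0(y), max_j q_j(y) p_j(x)]
    return max([q0[y]] + [q[j][y] * p[j][x] for j in actions])

# Dynamic program (Section 4, Problem 1).
r = {y: term(y, x1, *steps[0]) for y in ys}
for st in steps[1:]:
    r = {y: max(r[yp] * term(y, yp, *st) for yp in ys) for y in ys}

# Brute force over all intermediate effect vectors.
brute = {y: max(term(y2, x1, *steps[0]) * term(y3, y2, *steps[1])
                * term(y, y3, *steps[2])
                for y2 in ys for y3 in ys) for y in ys}
assert all(abs(r[y] - brute[y]) < 1e-12 for y in ys)
print("r(y(N)) matches the brute-force maximum for all", len(ys), "vectors")
```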